Kandidatuppsats Statistiska institutionen

Bachelor thesis, Department of Statistics

Nr 2014:3

The State of the Pre-election Poll: The Making and Reporting of Political Polls in Sweden

Valundersökningens tillstånd: Om utförande och rapporterande av politiska

opinionsundersökningar i Sverige

Jonas Hallberg och Mearaf Berhane Teclab

Självständigt arbete 15 högskolepoäng inom Statistik III, VT2014

Handledare: Lars Lyberg


Abstract

Pre-election opinion polls play an important role in society. They are used as sources of information, to prop up speculation and to pressure politicians. Despite this, the polls and their commercial stakeholders are rarely scrutinised in Sweden. This thesis reviews the underlying theory of producing pre-election polls and contrasts it with the conduct of selected pollsters and newspapers in Sweden. We also examine different approaches to evaluating the performance of polls and compare the results of Swedish pollsters with those of international counterparts. In doing this, we find a lack of transparency with regard to how the polls are conducted and substantial shortcomings in how the results are reported by the media. With these issues as the basis, we make recommendations on how the media and the pollsters can improve.

Keywords: Pre-election polls, opinion polls, polls, frequentist, Bayesian, survey, media

Sammanfattning

(Swedish summary, translated.) Political opinion polls play an important role in society. They are used as sources of information, to underpin speculation and to put pressure on politicians. Despite this, the polls and their commercial stakeholders are rarely scrutinised in Sweden. This thesis examines the theoretical basis for how pre-election polls are produced and reported, and contrasts this with selected polling institutes and newspapers. We also review different ways of evaluating the performance of the polls and compare it with that of international counterparts. We find a lack of openness about how the polls are conducted, as well as substantial shortcomings in how the results are reported by the media. With these problems as a starting point, we issue recommendations to the media and the polling institutes on how they can improve.

Keywords (Swedish, translated): pre-election polls, election polls, party preference polls, opinion polls, frequentist statistics, Bayesian statistics, survey, media


Preface

We would like to thank the people who have been kind enough to give us their time.

Lars Lyberg, our tutor, who has been very helpful and supportive.

Simon Hedström, who has assisted with proofreading.

The people at the pollsters and the newspapers who have answered our questions: Felix Åberg at United Minds, Johanna Laurin Gulled at Ipsos, Toivo Sjörén at TNS Sifo, Camilla Sandberg at Demoskop, Lena Melin at Aftonbladet and Lina Modin at Expressen.


Index

1. Introduction
2. The pre-election poll
   2.1. Designing polls
   2.2. Sampling
   2.3. Survey error
   2.4. Weighting
   2.5. Bayesian and frequentist statistics
   2.6. Accuracy and forecasting
3. The media
   3.1. Standards
   3.2. Previous studies
   3.3. Analysis of the reporting of polls
   3.4. Data
   3.5. Results: How are political polls reported in Sweden?
   3.6. The use of graphics
4. Analysis
   4.1. Discussion
   4.2. Conclusions and recommendations
5. References


1. Introduction

In this thesis we evaluate the state of the pre-election poll. We review the process of conducting pre-election polls in a Swedish context and examine four of Sweden's biggest newspapers' reporting of polls. On the basis of these analyses, we make recommendations on how to improve upon current shortcomings.

The roots of polls can be traced back to censuses. However, the first known example of an opinion poll was a local straw poll, an unofficial vote, conducted by the newspaper The Harrisburg Pennsylvanian in 1824 (Moon 1999). But extensive application of survey research techniques did not begin until the 1930s, when researchers, among them George Gallup, began to collect and publish opinion data (Traugott & Lavrakas 2007). In 1948, the poor performance of pre-election polls in the U.S. sparked great controversy. An inquiry into the practice of polling, led by the distinguished statistician Frederick Mosteller, deemed the methods unscientific, but also presented designs and suggestions on how to improve the field (Mosteller 1949). Since 1948, the scope and influence of political polls have certainly increased. In the U.S., Traugott (2005) estimated that the period 1984-2000 saw a 900% increase in trial heat polls conducted between Labor Day and Election Day. Around 2008, the polling community was transformed as websites publishing aggregations of polls became very popular. The blog FiveThirtyEight, affiliated with the New York Times, was one of these. Nate Silver, author and founder of FiveThirtyEight, became something as unusual as a superstar statistician as "his website reported 3.63 million unique visitors, 20.57 million site visits, and 32.18 million page views in October alone" (Hillygus 2011). By weighting selected polls by a number of criteria, as well as adjusting the data in a number of ways, he managed to produce very accurate predictions in the presidential elections. In 2008 FiveThirtyEight correctly predicted 49 of 50 states and in 2012, a perfect 50 out of 50.

Pre-election polls survey the opinions of a sample of the population eligible to vote, allowing interested parties to use the results to speculate about an upcoming election, evaluate policy or simply serve as a background for discussions of current events. Why is it important to analyse polls? In the ideal case, polls strongly benefit


the citizen. When properly applied, polls are the best tool at society's disposal for representing the voice of the people. Politicians take heed of poll results. Prospective voters also look to polls for information. If a party risks losing its place in parliament, voters might consider voting tactically to keep it in. If the polls are a fair appraisal of public opinion, it can be argued that they constitute an invaluable instrument of democratic influence. To the other main stakeholders (the media, the politicians and the pollsters) they have the potential to be of great importance. The media use pre-election polls both to create and interpret news and to put pressure on politicians. "In Sweden, most of the 'barometers' of party preferences are commissioned by various media. The media thereby play a triple role in connection with polls: they commission them and then report and interpret the results." (Peterson et al. 2006). The politicians need them to evaluate and design policy. In a recent article, 89% of the surveyed Swedish politicians agree that polls[1]

influence voting behaviour, 85% agree that they are taken into account when shaping electoral strategy and 75% agree that polls are considered when determining policy (Aalberg and van Aelst 2014). For the pollsters, the position of influence outlined above makes pre-election polls a symbol of status and power, and a driving force for other products and services. As the results of these polls seem to exert a substantial influence on the political decision-making process, it becomes vitally important that the methods used in producing these results are sound and that the reporting of them is fair. We would like the system to be set up so as to produce quality information. But the incentives of those conducting and reporting the results are not convincingly aligned with the interests of those using the results to make decisions. Nor does reality seem reassuring: "the media do not use polls mainly to give voice to the people, but rather to serve the media's own needs for unique news stories and for information that can be used to inform their horse race coverage, the framing of politics as a game, and as a help when deciding on and legitimising their issue coverage" (Strömbäck 2009). One could hope that the pollsters would discipline the media in order to make sure that poll results do not get misconstrued, but the pollsters want to sell their services and products to the

[1] The response rate for members of the Swedish parliament was 45%.

media. The media want to make money, which is generated by the reporting of interesting news. If reporting of questionable quality results in more revenue for the media outlets as well as more business for the pollsters, it seems far-fetched to think that the equilibrium arising in this situation will generate the best possible information for the voters. In his 2005 article "Political Polling and the New Media Culture: A Case of More Being Less", Tom Rosenstiel depicts a dreary situation: "In the end, the landscape of political polling and the press might be thought of as an ecosystem in distress. The press culture represents a marketplace. With fewer barriers to entry, there are more pollsters who will move in whatever ways they can to fill that market" (Rosenstiel 2005). In order to create a situation more geared towards producing quality information, we would like to both evaluate and guide the system. As it turns out, evaluating the current situation is not straightforward. The polling industry uses a range of methods to adjust its estimates, and pollsters are not entirely open about what they do. "AAPOR president Peter Miller explained: Despite decades of work, transparency in public opinion and survey research remains an elusive goal" (Hillygus 2011). The Swedish pollsters interviewed in this study, with the exception of TNS Sifo, have been reluctant to release any details about their methods.

Trying to force the system might be even harder. Hopefully, there is room for excellence. In the article by Aalberg and van Aelst (2014), 46% of the Swedish journalists surveyed view both the results and the media's reporting of opinion polls as unreliable. Coupled with the fact that Nate Silver and others have shown that there is an appetite for more rigorous analysis, market incentives for quality reporting might be starting to form. And if the demand from the media changes, so will the material the pollsters provide.

In any case, it is vitally important that both the pollsters and the media reporting the results are put under scrutiny. 2014 is being called a "super election year" by the Swedish media, as the election to the European parliament[2] precedes the national election[3] by only about four months. With the success of new parties, like Feministiskt Initiativ, and the troubles of long-established ones, like Folkpartiet and Centerpartiet, it seems reasonable to think that polls will play an essential role in deciding the outcome of both elections as voters turn to polls for information. If the conducting and reporting of pre-election polls is not done responsibly, they might not be the instruments of democracy they have the potential to be.

[2] Held every fifth year.

Our objective in this thesis is to conduct a review of the production and reporting of pre-election polls. The thesis is divided into three parts. The first part (The pre-election poll) reviews the pre-election poll in a Swedish context. We detail and evaluate the main sources of uncertainty, and apply measures of accuracy to the outcomes of the last three Swedish elections to parliament. The primary goal of this part of the thesis is to evaluate the origin and consequences of the survey error. In the second part (The media) we inspect how the media handle the reporting of polls. In addition to reviewing previous studies, we present our own content analysis of 200 news articles about pre-election polls, from four of Sweden's biggest newspapers. The objective of this segment is to determine whether the reporting of polls meets conservative statistical standards. The concluding section of the thesis (Analysis) uses the results from these analyses to make recommendations to which we feel both pollsters and newspapers should adhere, in order to ensure that the information reaching the voter/reader is both statistically defensible and comprehensible. We aim for these recommendations to be sensible and possible to implement.

Our main focus is pre-election polls in the Swedish context, but as most of the research on our main topics of interest is not conducted in Sweden, some of our discussions will naturally take place in an international context.

[3] Held every fourth year.


2. The pre-election poll

Sweden was one of the first countries to popularise opinion polls. In its 1940s election coverage, the newspaper Dagens Nyheter (DN) made references to polls in its editorials on average once a week. The vast expansion of polls seen in, for example, the U.S. during the last decades (Traugott 2005) is not as apparent in Sweden, where the 1990s may even have been something of a peak in frequency. But with the exception of the post-war period, pre-election polls have been a strong presence in the news cycle (Peterson et al. 2008, pp. 137-138). In principle, a poll and a survey are the same thing. "Poll" is a term used to indicate a survey conducted by a commercial organisation. Generally, a poll contains a standardised questionnaire with relatively few questions and is commonly conducted within a narrow time frame (Traugott & Lavrakas 2008).

In this section of the thesis, we review the process of conducting pre-election polls and provide an overview of how they are evaluated. We have conducted interviews with four representatives of pollsters active in Sweden today: David Ahlin (Ipsos), Felix Ågren (United Minds), Toivo Sjörén (Sifo) and Camilla Sandberg (Demoskop). The reason for this selection is that these pollsters are affiliated with the Swedish newspapers DN, Aftonbladet (AB), SvD and Expressen, which we have examined with regard to their reporting on polls. The method, data and results of this review are presented later in the thesis.

2.1. Designing polls

Some of the main elements of poll design are illustrated with the help of Figure 1.


The target population (A) is the entire group a researcher is interested in drawing conclusions about. In pre-election polls the target population is usually those who are eligible to vote (e.g. Swedish citizens aged 18 or above). The sampling frame (B), from which the sample (G) is drawn, is an important tool when designing polls, as it determines how well the target population is represented. A less than perfect frame leads to coverage errors, manifested as either over-coverage (D) or under-coverage (H), which in turn can lead to shortcomings in representativeness. The sample population (C), i.e. the intersection between the target population and the sampling frame, is normally a subset of the population.

To collect the data, the surveyor chooses from a number of modes: telephone interviewing, face-to-face interviewing, mail questionnaires and web questionnaires. The choice of method has historically been influenced by both theoretical aspects and practical considerations. The prevalence of the internet now pushes the field towards more web-based approaches. In Sweden today, the general praxis is to conduct pre-election polls via a so-called dual frame, consisting of both landline and cell phone

[Figure 1. Diagram of the target population (A), the sampling frame (B), the sample population (C), over-coverage (D), under-coverage (H) and the sample (G), with the target units sampled and those missing from the sample indicated.]

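The coverage concepts in Figure 1 can be sketched with Python sets. This is a toy illustration: the names and groups are invented for the example.

```python
# Illustrating Figure 1's coverage concepts with sets (toy data).
target = {"anna", "bo", "cecilia", "david"}   # eligible voters (A)
frame = {"bo", "cecilia", "david", "erik"}    # e.g. a telephone register (B)

sample_population = target & frame            # (C): reachable part of the target
over_coverage = frame - target                # (D): in the frame but not eligible
under_coverage = target - frame               # (H): eligible but unreachable

print(sorted(sample_population))  # ['bo', 'cecilia', 'david']
print(sorted(over_coverage))      # ['erik']
print(sorted(under_coverage))     # ['anna']
```

Any unit drawn from the frame can thus fall outside the target population (over-coverage), while part of the target population can never be drawn at all (under-coverage).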


numbers. Three of the four pollsters we have examined (Ipsos, Sifo and Demoskop) use this method. The web access panel, the method used by United Minds, our fourth pollster, is on the rise. It is faster and cheaper, but the vast majority of today's web-based approaches use non-probability sampling. Thus, they do not enable the pollster to make probability-based inference.[4]

When the data have been collected, they are processed. Data processing involves the steps taken in turning data into information: coding, data capture, editing and quality control. These steps have to be carried out appropriately to reduce the errors in the data file.

2.2. Sampling

Broadly speaking, there are three versions of sampling: probability, non-probability and a combination of the two. In probability sampling, every unit has a known nonzero probability of being selected. In non-probability samples, the relationship between the sample and the population is unknown. Whether or not the sample has been selected randomly will affect the precision of the estimates it produces (Biemer and Lyberg 2003). Both probability and non-probability sampling will generate a survey error, but probability methods are less likely to lead to selection bias. The significance of this was realised in the 1936 U.S. presidential election campaign, when the magazine Literary Digest conducted a mail-in survey (non-probability sampling) that generated over two million responses. Despite the large sample, the survey incorrectly predicted Alf Landon as the winner over Franklin Roosevelt (Wang et al. 2013).
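The Literary Digest episode can be mimicked with a small simulation. This is a stylized sketch: the population shares and frame inclusion probabilities are invented for illustration, not historical figures.

```python
import random

random.seed(1)

# Toy electorate: 55% support candidate R(oosevelt), 45% L(andon).
population = ["R"] * 55_000 + ["L"] * 45_000

# A biased frame (cf. Literary Digest's car and telephone lists):
# L-supporters are three times as likely to appear in it.
frame = [v for v in population if random.random() < (0.15 if v == "R" else 0.45)]

big_biased_sample = random.sample(frame, 20_000)       # huge n, biased frame
small_random_sample = random.sample(population, 1_000)  # small n, probability sample

share_R_biased = big_biased_sample.count("R") / len(big_biased_sample)
share_R_random = small_random_sample.count("R") / len(small_random_sample)

print(round(share_R_biased, 3))  # roughly 0.29: predicts L despite 20 000 respondents
print(round(share_R_random, 3))  # close to 0.55: the small probability sample wins
```

The point is that a large sample drawn from a biased frame converges on the wrong answer, while a far smaller probability sample does not.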

The high telephone coverage of modern times has provided pollsters with a means of contacting random samples of the population at relatively low cost. But concerns like increasing non-response, in addition to ever-expanding internet access, have spawned discussions on whether non-probability sampling methods could be an acceptable alternative (Baker et al. 2013). The fact that a survey is based on a probability sample does not mean it is a reliable reflection of the population it

[4] Web panels can make use of probability sampling, but it is unusual and eliminates much of the cost and speed advantages.

purports to measure. In the area of pre-election polls, non-probability samples have actually been shown to be capable of producing results as good as, and sometimes even better than, probability-based samples (Baker et al. 2013).

Ipsos, Sifo and Demoskop use some form of probability sampling. Despite this, none of them report response rates. As to why, TNS Sifo replies that it is because of their sampling method. The review of their method that they have shared with us explains that non-contacts (in this case either people not answering the phone or people not present in a household at the time of the poll) are replaced by new targets or another member of the household. This is done to achieve a daily quota of approximately 250 responses. As such, their method departs from probability sampling, and the response rate cannot be calculated in a conventional manner. It seems reasonable to think that a similar explanation applies to the other pollsters, though as they have not shared any methodological details with us, we cannot know for sure.

As previously implied, it is certainly possible to conduct web surveys using probability sampling, and quality reviews of such surveys have been encouraging. They are however unusual (Baker et al. 2010). United Minds uses a non-probability web access panel approach. Apart from meaning that their results might not command the same respect as those based on probability sampling methods, this also makes them harder to describe. Concepts like the margin of error and the response rate cannot be used in relation to non-probability approaches. Thus, United Minds do not report any measure of uncertainty, which is as it should be. There are however standards for online panels, like ISO 26362:2009, and guidelines aimed at clients of online panel providers, such as "ESOMAR: 28 Questions to Help Buyers of Online Samples". However, "There currently is no generally accepted theoretical basis from which to claim that survey results using samples from nonprobability online panels are projectable to the general population" (Baker et al. 2010). The absence of reported uncertainty should decidedly not be confused with these surveys being less erroneous. If anything, for a newspaper like AB, using United Minds' non-probability sampling as the basis for their pre-election polls increases the burden of explaining the inherent uncertainty of the polls. As they cannot quantify the error of the polls in a


theoretically acceptable way, they should take it upon themselves to emphasise the uncertainty of the polls in other ways.

2.3. Survey error

The total survey error can be divided into two parts: the sampling error and the non-sampling errors. The following subsections discuss these errors in the context of pre-election polls.

2.3.1. Sampling error

“Classical sampling theory assumes that sampling is conducted with a perfect frame, that every unit selected for the sample responds fully and accurately, and that no errors are introduced in the data from data processing.” (Biemer and Lyberg 2003). Three of the Swedish pollsters affiliated with the newspapers in our study rely on the assumptions in the above citation; that is, they use the sampling error as the sole source of uncertainty when reporting their results.[5] As United Minds (affiliated with AB) uses non-probability sampling, they are excluded from the subsequent discussion.

The sampling error arises from the fact that we are only surveying a part of a population, but it can only be quantified if the sample is drawn in accordance with probability sampling. So how precise are pre-election polls when the only source of error considered is the sampling error? Swedish pollsters typically interview between one and two and a half thousand people in their polls.[6] As the sample size gets larger, the sampling error gets smaller according to the simple formula:

    Sampling error = 1.96 × √(p(1 − p) / n)    (1.0)

where 1.96 is the z-value for the sampling error to form a 95% confidence interval around (±) the point estimate, and the second factor is the standard error.

[5] Ipsos uses a revised sampling error based on what they call the "efficient base" instead of the "actual base". They state that they do this to prevent the weighting from creating false significances in the results.

[6] Ipsos typically report having contacted about 2 200 individuals, Sifo just below 2 000, and Demoskop approximately 1 250. United Minds, though not quite relevant to this section, report the lowest number of respondents, normally around 1 100.

In a sample of 1 500, this means that an estimated proportion of 0.5 (or 50%) for a political party has a sampling error of about 2.5 percentage points and a confidence interval of 47.5-52.5%. But let us also look at this from another perspective. What sample size would be required to discover a 1% change in support for the aforementioned 50% party? Quite large, it turns out: "…to give a researcher an even-money chance of detecting a 1 percentage point change using a 90% confidence level, the researcher requires roughly 13,500 respondents per sample." (Jackman 2005). For a 95% chance of detecting an underlying change of 1% at a 95% level of confidence, the sample would be required to include around 65 000 people.
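Formula (1.0) and the sample-size figures above can be checked numerically. A minimal sketch: the 65 000 figure is reproduced here under the interpretation of a 95% chance of detection at the 95% confidence level, comparing two independent polls at a 50% baseline as in Jackman (2005).

```python
import math

def sampling_error(p, n, z=1.96):
    """Half-width of the confidence interval, formula (1.0)."""
    return z * math.sqrt(p * (1 - p) / n)

# The example in the text: n = 1 500, p = 0.5 gives about +/- 2.5 points.
e = sampling_error(0.5, 1500)
print(round(100 * e, 1))                                      # 2.5
print(round(100 * (0.5 - e), 1), round(100 * (0.5 + e), 1))   # 47.5 52.5

def n_required(d, z_alpha, z_beta, p=0.5):
    """Sample size per poll to detect a change d between two polls."""
    return (z_alpha + z_beta) ** 2 * 2 * p * (1 - p) / d ** 2

# "Even-money chance" (50% power, z_beta = 0) at the 90% level:
print(round(n_required(0.01, 1.645, 0.0)))    # 13530
# A 95% chance of detection at the 95% level:
print(round(n_required(0.01, 1.96, 1.645)))   # 64980
```

The two computed sample sizes land on roughly 13 500 and 65 000, matching the figures quoted in the text.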

[Figure 2. Each line represents the sample size required to detect differences (from a 50% baseline) with a given probability. For each case, H0: not separated from 50%, and H1: separated from 50%, using a 95% confidence level (Jackman 2005).]



Reading from the above figure (Jackman 2005), we can see that the Swedish pollsters, reporting samples of approximately 2 000 respondents, have something like a fifty-fifty chance of detecting a true change in voter support of about three percent. In comparison, the largest pre-election poll in Sweden, conducted twice a year by Statistics Sweden with around 5 000 respondents, would detect a true change of three percent with much greater certainty, between 90 and 95%.

The above examples all apply to proportions around 0.5. The sampling error is relatively higher for subgroups (in the Swedish context, voters supporting smaller parties), since each party is such a small part of the sampling frame. In the reporting of polls in Sweden, great emphasis is put on comparisons, for example with previous polls. Comparing two proportions, e.g. the estimates for party X in March and April, requires another formula:

    z = (p1 − p2) / √(pc(1 − pc)/n1 + pc(1 − pc)/n2)    (1.1)

where p1 and p2 are the proportions, pc is the pooled proportion and n1 and n2 are the respective sample sizes. The pooled proportion is retrieved by the formula:

    pc = (X1 + X2) / (n1 + n2)    (1.2)

where X1 and X2 are the numbers of respondents supporting the party in each sample.

Consider the following example, in which the formulas are employed. Kristdemokraterna (KD) is a party that has recently been getting results indicating they risk losing their place in the Swedish parliament, for which you need 4% of the votes. If we assume a constant sample size of a thousand respondents and an initial support of 4%, how much would the next poll result need to differ to indicate a shift in support with 95% certainty? To get an absolute z-value over 1.96, the result (point estimate) would need to fall to approximately 2.45%. As we can see, citing "shifts" in support based on minor (insignificant) changes quickly becomes ridiculous when dealing with these smaller parties. A "change" in support of 0.5% would in this case mean that five more or fewer people have declared their support for KD.



However, hovering around that threshold of 4%, a 0.5% "shift" might seem important to a prospective voter if such a claim is not distinctly declared insignificant.
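Formulas (1.1) and (1.2) and the KD example can be verified directly. A minimal sketch of the calculation described in the text:

```python
import math

def two_prop_z(p1, n1, p2, n2):
    """z-test for the difference of two poll proportions, formulas (1.1)-(1.2)."""
    pc = (p1 * n1 + p2 * n2) / (n1 + n2)   # pooled proportion (1.2)
    se = math.sqrt(pc * (1 - pc) / n1 + pc * (1 - pc) / n2)
    return (p1 - p2) / se

# The KD example: 4.0% in one poll of 1 000, 2.45% in the next.
print(round(two_prop_z(0.040, 1000, 0.0245, 1000), 2))   # 1.96: right at the threshold

# A 0.5-point "shift" (4.0% -> 3.5%) is nowhere near significant:
print(round(two_prop_z(0.040, 1000, 0.035, 1000), 2))    # 0.59
```

A drop from 4.0% to 2.45% just reaches |z| = 1.96, confirming the threshold quoted in the text, while the 0.5-point "shift" produces a z-value far below it.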

Another thing worth mentioning is that, for large populations, the sampling error is not sensitive to the size of the population. The sampling error for a sample of a thousand does not differ between Sweden and the U.S., despite the voting population of the U.S. being well over a hundred million people larger.
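This can be illustrated with a finite population correction, which for electorates of this size is negligible. The electorate sizes below are rough, illustrative figures, not official counts.

```python
import math

def sampling_error_fpc(p, n, N, z=1.96):
    """Formula (1.0) multiplied by the finite population correction factor."""
    fpc = math.sqrt((N - n) / (N - 1))
    return z * math.sqrt(p * (1 - p) / n) * fpc

n = 1000
se_sweden = sampling_error_fpc(0.5, n, 7_500_000)    # roughly Sweden's electorate
se_us = sampling_error_fpc(0.5, n, 230_000_000)      # roughly the U.S. electorate

print(round(100 * se_sweden, 3))  # 3.099
print(round(100 * se_us, 3))      # 3.099
```

To three decimals the two sampling errors are identical: once the population is orders of magnitude larger than the sample, only the sample size matters.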

2.3.2. Non-sampling errors

Non-sampling errors can be classified as random or systematic. Random errors are unavoidable but generally cancel each other out in large samples. Systematic errors accumulate over the entire sample and often lead to bias in the estimates. In principle, systematic errors could be avoided, but they are commonly difficult both to control and to measure.
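The contrast can be shown with a stylized simulation: all numbers are invented, and responses are modelled simply as the true support plus an error term.

```python
import random

random.seed(2)

true_support = 0.40
n = 100_000  # deliberately large, so random error all but vanishes

# Random error: each response is right on average (noise with mean zero).
noisy = [true_support + random.gauss(0, 0.05) for _ in range(n)]
est_random = sum(noisy) / n

# Systematic error: every response is shifted by the same +0.02
# (e.g. a question wording that flatters one party) -- it does not average away.
biased = [true_support + 0.02 + random.gauss(0, 0.05) for _ in range(n)]
est_systematic = sum(biased) / n

print(round(est_random, 3))      # close to 0.400
print(round(est_systematic, 3))  # close to 0.420
```

Increasing the sample shrinks the random component towards zero but leaves the systematic two-point bias fully intact, which is why bias, not noise, dominates large polls.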

The task of assessing the impact of non-sampling errors is tricky and to a large extent subjective. In regard to election polls, the non-sampling errors of primary interest are specification, coverage, measurement and non-response errors. These are seen as the main causes of poor performance in election polls (Ruiz 2010; Fumagalli and Sala 2011).

2.3.3. Specification error

Specification error commonly occurs in the planning stage of the survey, due to a difference between the concept implied by the survey question and the concept the question aims to measure. Determining voters' political preference is not a simple task; e.g., nuances in how a question is formulated, or where it is placed in the questionnaire, can affect the results. In the context of pre-election polls, some of the potential specification errors are the wording of the question(s), difficulty in interpreting survey instructions on the part of the respondents, and the general challenge of measuring abstract concepts. Even though the questionnaires in pre-election polls are standardised, short, simple and have been more or less unchanged for years,


there can still exist specification issues. Question wording is probably the greatest source of bias and error in the data (Newport et al. 1997).

However, even when the wording accurately reflects the pollster's intention, differences in the respondents' interpretation might create problems. As an example, consider the main question posed by the Swedish pollsters examined in this study7 (United Minds, Sifo, Ipsos and Demoskop): "How would you vote if the election were held today?"8 The concept of the question has an abstract element. The pollsters' intention with the formulation "If the election were held today…" is to get people to answer how they would vote in an imminent hypothetical election. However, it is reasonable to assume that some will interpret the true intention of the question as being "How will you vote in the upcoming election?", especially as polls are widely used for speculation about future election outcomes. This might evoke a number of considerations not intended by the researcher. In addition, the differences in interpretation probably depend heavily on the timing of the poll. With three years left until the election, most people will surely try to imagine the election being held the day of the poll, while a week before the election everyone will undoubtedly perceive the question as "How will you vote in the upcoming election?"

The question could also be interpreted as asking which party the respondent wishes to see win the election, as speculation about this is often what the results are used for. This may also skew the responses away from what the pollsters try to measure: the current political preferences of the voters.


7. All the pollsters pose a second question to respondents answering "do not know". The wording is somewhat different but the sentiment is "towards which party do you lean?". TNS Sifo and Demoskop also pose a third question if needed, asking roughly "which party do you think is less/least bad?". The answers to the questions are totalled to form the basis for the estimates.

8. Precise wording, authors' translation. UM: "How would you vote if the election to the national parliament were held today?". TNS Sifo: "For which party would you vote if the election to the national parliament were held today?". Demoskop: "For which party would you vote if the election were held today?". Ipsos: "If the election to the national parliament were held today, for which party would you vote?"


2.3.4. Coverage (frame) error

Coverage or frame errors manifest as either over- or under-coverage. Over-coverage in pre-election polls occurs when invalid respondents are included in the sample. In the Swedish context, this includes those who are not eligible to vote (e.g. underage persons or non-citizens) and duplicated elements, i.e. individuals who appear in the frame more than once and thus have a higher probability of being sampled. Under-coverage refers to elements of the target population that are of interest but that are not included in the sampling frame.

Pre-election polls are most commonly conducted via telephone. Internationally, the most widely used method to design the sampling frame is known as Random Digit Dialling (RDD). As the name implies, the gist of the method is random number creation. RDD is popular mainly because it facilitates probability sampling and includes unlisted numbers (avoiding under-coverage). It is rather hard to implement RDD in the Swedish case (Petersson and Holmberg 2008, Bülow 2009), but TNS Sifo (and possibly others) use a method utilising number creation. They draw a random sample of numbers from a phone register. From each of the sampled numbers they create nine additional numbers. They then randomly select one of these ten numbers for dialling, thus including secret/unlisted numbers (Forsman 2010). Statistics Sweden instead randomly selects samples from the population register. The randomly selected individuals are then pursued by various means. Like RDD, this approach avoids the problem of unlisted numbers. Another approach is simply to draw a random sample from a phone directory; this method runs the risk of substantial under-coverage.
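The number-creation step can be sketched roughly as follows. This is our simplification of Forsman's (2010) description: we assume the nine extra numbers are formed by varying the final digit, which is one common variant.

```python
import random

def dialling_number(listed_number: str, rng: random.Random) -> str:
    """From one number drawn from the phone register, form ten candidates by
    varying the last digit (the listed number itself plus nine variants) and
    randomly select one for dialling. Unlisted numbers can thereby be reached."""
    base = listed_number[:-1]
    candidates = [base + digit for digit in "0123456789"]  # ten numbers in total
    return rng.choice(candidates)

print(dialling_number("0812345678", random.Random(2014)))
```

Because every number in a block of ten has the same inclusion probability, the design remains a probability sample even though most of the dialled numbers were never in the register.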

Cell phones have become prevalent, partly at the expense of landlines. For example, 65% of Swedes in the age group 21-30 do not have a landline (PTS 2013). To overcome under-coverage it is therefore necessary to use a dual frame sample design. If the sampling were conducted using a cell-phone-only frame, groups relatively less prone to own a mobile phone, for example the elderly, would be under-covered. Conducting a poll based on a landline frame would instead fail to represent other segments of the population. Problems like these lead to selection bias, as some units have larger probabilities of being sampled, which might result in


misleading inferences (Baker et al. 2013). In Sweden, the combined coverage of landline and mobile phones is 99% (PTS 2013:20). Uncritically using all the information in these frames would lead to over-coverage, but this can be managed by making sure that an individual can only be represented by one number; i.e., either a landline or a cell phone. However, the dual frame method is not without complications, as research suggests that respondents with dual service respond differently depending on whether they are contacted on their cell or landline phone (Lavrakas et al. 2010). Thus, it is debatable how well these methods actually go together.

Pre-election polls are also conducted using web access panels, overwhelmingly constructed using non-probability methods. As an example of the problems an online approach poses, elderly people use the internet less frequently (PTS 2013:20). These panels could thus lead to substantial under-coverage of this group of voters. The resulting bias could in addition be intensified by the well-documented fact that the elderly are also more likely to vote (Oscarsson & Holmberg 2010). Pollsters using web panels try to circumvent these problems by weighting their samples to achieve representativeness.

2.3.5. Measurement error

Measurement errors originate from a number of sources. They can be caused by the questionnaire, the data-collection method, the interviewer and the respondent. As pre-election polls are often short and fairly easily processed, our interest is mainly focused on the errors caused by the respondent; for our context, this is the primary source of measurement error. Research has shown that survey questions concerning political issues can lead to response instability, as many respondents will answer more or less randomly (Sniderman 2004). However, this might be more of an issue if the questionnaires are long and complicated.

In Sweden, there are currently eight parties holding seats in parliament. That means that pollsters interviewing via telephone must choose whether or not to list the parties to the respondent. As it turns out, respondents tend to be more prone to pick the parties at the beginning and end of a list of provided alternatives. These effects


are known as the primacy (beginning) and recency (end) effects and can severely bias survey results (Berg et al. 2010). Listing eight options plus an "other" option might give rise to primacy and recency effects. To compensate for this, the pollster might only provide the list when requested and/or randomise the order in which the alternatives are read. Interestingly, the three pollsters we have interviewed have all chosen different approaches: TNS Sifo always lists the alternatives in randomised order, Demoskop provides randomised alternatives upon request, and Ipsos neither lists nor provides options upon request (Institute Interviews 2014).
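Randomising the read-out order can be sketched like this; the abbreviated party list and the decision to keep "Other" in a fixed last position are our own assumptions, not a description of any particular pollster's script.

```python
import random

PARTIES = ["S", "M", "SD", "MP", "C", "V", "FP", "KD"]  # the eight parliamentary parties

def read_order(rng: random.Random) -> list[str]:
    """Shuffle the party alternatives afresh for each interview so that no
    party systematically benefits from primacy or recency; keep 'Other' last."""
    order = PARTIES.copy()  # copy so the master list is untouched
    rng.shuffle(order)
    return order + ["Other"]

print(read_order(random.Random(1)))
```

Over many interviews, each party appears equally often at the favoured beginning and end positions, so the order effects average out rather than accumulating on one party.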

Another well-known psychological effect arising in polling situations is social desirability bias: people are more prone to give answers in line with what is socially acceptable. In Sweden, this could for example lead to the controversial party Sverigedemokraterna being underestimated by pollsters using telephone interviewing.

Aside from these mostly subconscious behaviours, there are also more deliberate behaviours affecting the pollsters' estimates. Voters might consider voting tactically but might not want to reveal this in polls, as they only wish to do so if the polls show that it is necessary. In other instances, respondents might want to send a message by declaring their allegiance to another party, in response to a campaign event or a policy change.

All this naturally makes it hard to know exactly what is actually measured.

2.3.6. Non-response error

Non-response errors occur when responses from parts of the selected sample are missing from the results. Reasons for this type of error may be that the respondent is unavailable or temporarily absent, that the respondent refuses to participate in the survey, or that the pollster is unable to reach the respondent within the time frame of the poll. The problem of non-response is central especially when conducting pre-election polls, as the response rate is generally low. This could produce biased results, as those who responded may not necessarily represent those who did not respond. Time constraints may also affect the availability of


respondents; for instance, the polls of the institutes examined in this study have field periods of one to two weeks (Petersson and Holmberg 2008).

In recent decades, the rate of non-response has increased significantly, from about 15% in the early 1980s to almost 40% in 2012 (Statistics Sweden 2012). It is common in Sweden to use a dual frame where both landline and cell phone numbers are included. Landline is often used as the main sampling frame, to which cell phone numbers are added. To avoid duplication, respondents reached on their cell phones are asked if they have a landline. If they do, the interview is ended, in the hope of reaching them on their landline. The problem with this is that one does not know whether the respondent can in practice be reached equally well via both types of service; i.e., how often, if at all, they will answer a call on a particular service they have. This is an example of a non-contact-related non-response issue (Lavrakas et al. 2010). The accuracy of a survey can be affected by non-response error if the group that is interviewed holds considerably different opinions and intentions from the group that was also sampled but not interviewed. Thus, the interviewed respondents may not accurately represent the target population. However, a low response rate does not necessarily mean that a survey is inaccurate (Traugott and Lavrakas 2008).
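The mechanism can be illustrated with a simple calculation (all figures below are invented): if responders and non-responders differ, the estimate is biased no matter how large the sample is.

```python
def nonresponse_bias(p_responders, p_nonresponders, response_rate):
    """Bias of the naive poll estimate when support differs between responders
    and non-responders: the poll sees only p_responders, while the population
    value is a response-rate-weighted mix of both groups."""
    p_population = response_rate * p_responders + (1 - response_rate) * p_nonresponders
    return p_responders - p_population

# 60% response rate; a party has 30% support among responders, 24% among non-responders
print(f"bias = {nonresponse_bias(0.30, 0.24, 0.60):+.3f}")  # +0.024, i.e. 2.4 points
```

Note that the sample size does not appear in the formula: increasing n shrinks the sampling error but leaves this bias untouched, which is exactly why a low response rate is worrying only when the two groups differ.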

2.4. Weighting

Weighting is an important and common practice in pre-election polls. It is done to ensure that the sample more precisely reflects the characteristics of the population from which it was drawn. Weighting is mainly used to account for under-coverage and non-response. It is also used to identify likely voters.

To reduce some under-coverage concerns, random digit dialling (RDD) may be used, as the inclusion probabilities of listed and unlisted numbers are then the same. An alternative approach is post-stratification. In this case, weights are used to adjust for respondents' unequal probabilities of selection in the sampling frame, so that the data represent known population characteristics such as age, gender, region and education. This is possible because the sample can be compared to known characteristics of the population obtained from an external source, such as Statistics Sweden. Weighting is also done to adjust for non-response in a final sample whose


demographic characteristics do not match those of the target population very well (Traugott and Lavrakas 2008). This adjustment of the final sample is made because some demographic groups tend to be over- or underrepresented. It is reasonable to weight the data, as the weighting procedure may succeed in decreasing the extent of non-coverage and non-response bias (Fumagalli and Sala 2011).
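A post-stratification sketch with a single weighting variable follows; the strata, counts and party are invented for illustration, and real pollsters would use several variables (age, gender, region, education) simultaneously.

```python
def poststratify(sample_counts, supporters, population_shares):
    """Post-stratified estimate of a party's support: each stratum's sample
    proportion is weighted by its known population share instead of its
    (possibly skewed) share of the sample."""
    return sum(supporters[g] / sample_counts[g] * share
               for g, share in population_shares.items())

# young voters make up 20% of the electorate but only 10% of this sample
sample_counts = {"18-29": 100, "30+": 900}
supporters    = {"18-29": 40,  "30+": 270}    # respondents supporting Party X
population    = {"18-29": 0.20, "30+": 0.80}  # known, e.g. from Statistics Sweden

naive = sum(supporters.values()) / sum(sample_counts.values())
print(f"unweighted {naive:.3f} -> weighted {poststratify(sample_counts, supporters, population):.3f}")
```

Here the unweighted estimate is 0.310, while restoring the young stratum to its known 20% population share lifts the estimate to 0.320.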

In pre-election polls, pollsters should give greater weight to the answers of those respondents who are most likely to vote. Otherwise one is assuming that all respondents will vote on Election Day with the same probability; if this assumption is incorrect, it will introduce bias into the final estimates (Traugott and Lavrakas 2008). In the U.S., voter turnout is below 60% (Census Bureau 2012), so the issue of assessing how likely a respondent is to actually vote is a big concern. It is therefore common for pollsters to use a combination of questions to identify likely voters. The voter turnout in Sweden is above 80% (Statistics Sweden 2012), but the likely voter approach can still be useful to reduce the bias in the final estimates. For polls regarding the national vote, all the pollsters interviewed in this study use weights based on previous voting behaviour to adjust their estimates, but nothing more advanced is done to identify the likely voter. Demoskop states that in polls regarding the European Parliament (where turnout is much lower) they do "…a Bayesian weighting, by the self-reported intention to vote…"9 (Pollster interviews 2014).

2.5. Bayesian and frequentist statistics

Different aspects of election polls have been discussed in the earlier sections. Once the data are ready for statistical inference, an election prediction can be made using different approaches to statistical analysis. With regard to statistical analysis based on survey data, there exist two competing schools of thought: the Bayesian and the frequentist. A brief historical summary is given below, followed by a description of the approaches.


9. Authors' translation.


Bayes' theorem was conceived in the 1740s by Reverend Thomas Bayes, but it was not published until two years after his death, in 1763 (Bellhouse 2004). The theorem was given its modern mathematical form and scientific application by Pierre Simon Laplace in 1774. Later, in 1810, Laplace introduced the central limit theorem and came to the conclusion that if large amounts of data were available, the Bayesian and the frequentist approaches would give the same results (Hald 2008). This conclusion favoured the use of the frequentist approach due to its simplicity compared to the Bayesian approach. Thus the world's first known Bayesian became a frequentist, in turn leading to a decline in the use of the Bayesian approach. This persisted until around the end of the 19th century; in fact, in 1891 George Chrystal, a famous Scottish mathematician, claimed that Bayes' theorem was "dead" and "buried" (McGrayne 2011). The popularity of the frequentist approach soared even higher when Jerzy Neyman published his classic paper in 1934. His work provided a platform for subsequent research to further strengthen the frequentist approach.

In the tight 1960 presidential race between Nixon and Kennedy, a Bayesian approach was used to correctly predict Kennedy as the winner, making the method more attractive (McGrayne 2011). The Bayesian approach nevertheless lost popularity in favour of the frequentist in the early 1980s, and it took almost 30 years for it to make any real comeback (McGrayne 2011). In 2008 it did, when Nate Silver used the Bayesian approach to make remarkably accurate predictions in the U.S. presidential election, a feat he repeated in 2012.

In simple terms, the frequentist method analyses the existing data and makes predictions based solely on those data. In the Bayesian approach, the data are amended with prior information to get posterior, or updated, information. In terms of election polls, prior information could be any number of things; e.g., opinion trends, future campaign spending, historical voting behaviour, economic aspects, expert judgment or earlier studies. Bayesian inference thus provides the opportunity to take into account information that is available before the data are collected. A technical description of the two approaches is given below.


Bayesian methods are based on the idea that population parameters are random variables and unknown quantities. The distributions of these parameters are called prior distributions and contain our knowledge about a given parameter, before we add the insight coming from our sampled data.

p(H | E) = p(E | H) p(H) / p(E)    (2.0)

where p(H | E) is the posterior, p(E | H) is the likelihood, p(H) is the prior and p(E) is a normalising constant.

This formula is a diachronic (through time) interpretation of Bayes' theorem, indicating that if you have new evidence (E), you can update your confidence in the hypothesis (H) using the term on the right. In this sense the data are fixed but the parameter (the voters' preference) can vary. The prior information can be used along with current polling data to arrive at the posterior distribution (using Bayes' theorem) of the probability of, e.g., a party winning the election. The posterior distribution is a transformation of the prior distribution that embodies an updating of beliefs after observing some data. The likelihood function is established from the data (it expresses the distribution of the data given the parameter of interest) and the posterior distribution is designed in such a way that the likelihood function dominates the prior distribution. The procedure given by Bayes' theorem can conveniently be summarised as:

Posterior ∝ Prior × Likelihood    (2.1)

Frequentist methods regard the population value as a fixed, unvarying but unknown quantity without a probability distribution. Frequentists then calculate confidence intervals for this quantity, or make significance tests of hypotheses concerning it. In such an approach, analysts estimate parameters, often with an assumed distribution, and standard errors around them. In regard to pre-election polls, a frequentist approach assumes that there is a fixed parameter that represents the true state of the voters. In doing so, this approach employs well-established statistical data analysis methods, such as regression approaches and decision trees.
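As a toy numerical illustration of the Bayesian update in (2.1), all figures invented: assume a conjugate Beta prior on a single party's support share; with a binomial likelihood the posterior is again a Beta distribution, so the update reduces to simple arithmetic.

```python
# Conjugate Beta-Binomial sketch of Posterior ∝ Prior × Likelihood (2.1).
# Prior Beta(a, b) on a party's support share; poll data: k of n respondents.
prior_a, prior_b = 30, 70              # prior belief centred on 30% support
k, n = 280, 1000                       # hypothetical poll: 280 of 1 000 respondents

post_a, post_b = prior_a + k, prior_b + (n - k)   # Beta posterior parameters
prior_mean = prior_a / (prior_a + prior_b)
posterior_mean = post_a / (post_a + post_b)
print(f"prior mean {prior_mean:.3f} -> posterior mean {posterior_mean:.3f}")
```

With n = 1 000 the likelihood dominates, pulling the posterior mean (about 0.282) close to the poll's 0.280, in line with the description above.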


The approaches are different, but both ways of thinking have advantages and disadvantages. Bayesian inference is a powerful technique that builds on prior knowledge to estimate the probability of an outcome. However, a major difficulty with the method is deciding on the prior distribution. This choice will influence the prediction of, for example, the election outcome. Yet it is an inherently subjective synthesis of the available information, which means that the same data analysed by different people can lead to different conclusions. Another difficulty is that Bayesian methods may lead to intractable computational problems. However, considering the pace at which computational power has increased and continues to increase, this issue seems to be rapidly decreasing in relevance.

The approaches used in the frequentist method are much better established in scientific research. In this respect, the frequentist approach is more convenient and practical. The major disadvantage is the fact that prior information cannot be incorporated in the analysis. In a runaway election, such an approach can deliver a confident result, but in a close race, like for example the Swedish 2010 election (Oscarsson & Holmberg 2010), this approach becomes unreliable. In such instances, a priori (existing) information can be of decisive importance. Nate Silver, who correctly predicted 50 out of 50 states in the 2012 U.S. election, is an avid proponent of the Bayesian approach, although not everyone thinks he plays that role properly. In a blog post titled "Nate Silver is a Frequentist: Review of 'The Signal and the Noise'", Larry Wasserman, professor of statistics, claims that Silver has misunderstood what constitutes the Bayesian and frequentist methods, since "…his goal is clearly to have good long run frequency behaviour.", which Wasserman deems indicative of a frequentist approach (Wasserman 2012). Wasserman got some pushback on proclaiming Silver a frequentist, from among others Gelman (2013), which goes to show that the separation between the two "schools" is not necessarily clear-cut. We would put it like this: if there is a lot of prior information, it is probably a good idea to consider how it might be of use.


2.6. Accuracy and forecasting

How can we begin to assess the "real" error of these polls? The general challenge revolves around the fact that we very seldom have a true value against which we can evaluate our estimates. In the Swedish context, such a value is available only once every four years.10

In addition, if we accept that the voters' preferences are not completely stable in the run-up to the election, we can only evaluate the final polls. One way of solving this problem is to estimate a true value by aggregating the available polls.

The obvious problem with this is that "this procedure uses the result of a poll to help estimate a 'reality' against which that poll's results are to be judged" (Lau 1994). Another problem is that if many of these polls share the same systematic errors due to similar methodology, this will not be detected. An evaluation of Swedish pollsters using a type of poll of polls as an estimate of the true value has been conducted by Bergman (2013). The issue of evaluating estimates against an aggregate of estimates can be addressed in different ways; for example, Lau (1994) excludes the pollsters that are to be evaluated from the estimation of the true value. But basically, one cannot know how good these true value estimates are.

Another approach to analysing polls is to try to assess how much of the variation leading up to an election actually originates from changing voter preferences, and how much is due to randomness. Erikson and Wlezien (1999, 2002) have done this under U.S. conditions. In their review of 15 presidential elections in the period 1944-2000 they found that "the ratio of true variance to error variance is about four to one. We know this because the variance we observe is about four times what we would expect from sampling error alone..." (Erikson and Wlezien 2002). This approach gives an indication of the "true variation" in voter preference, but it is very rough and varies wildly between different elections.

To reliably evaluate the precision of polls we have to look at the pollsters' final estimates and compare those to the election outcome. In 1996, Everett Carll Ladd, Jr. claimed that the performance of polls in the 1996 U.S. election was so poor that "the entire enterprise should be reviewed by a blue-ribbon panel of


10. The Swedes cast their votes for the national parliament every four years.


experts" (Mitofsky 1998). Mosteller (1949) proposed eight measures for evaluating poll errors. Since then, other measures have been developed, but in many instances the initial measures are still used. Warren J. Mitofsky confronted the claim made by Ladd by analysing the polling performance using a number of measures from Mosteller (1949). He arrived at the conclusion that the claims of abysmal performance were much overstated: "...by any of the measures reviewed here, 1996 was not the best but was far from the worst year for the polls". The review stated that election polling was fairly average, though troubled by a clear Democratic bias (Mitofsky 1998). In 2001, Traugott followed up on Mitofsky's article by examining the 2000 election, using Measures 3 and 5, which were also the measures considered most appropriate by Mitofsky. Measure 3 is calculated as the average difference between each estimate and the outcome. Measure 5 is the difference between the estimated and the actual margin between the two leading candidates.11 The 2000 U.S. election was a much tighter race12 and, in conflict with the actual outcome, most pollsters predicted Bush to receive a majority of the votes.13 Despite this, the analysis shows that the polling accuracy was about average when compared to other modern-day elections14 (Traugott 2001).

11. So if Candidate 1 is estimated to have a 5% lead over Candidate 2, and the outcome is a 2% win, Measure 5 would be 3.

12. Famously won by the Republican George W. Bush, despite receiving fewer votes than the Democrat Al Gore.

13. I.e., a Republican bias was present in the estimates.

14. 1956-2000.

With the shortcomings of Mosteller's measures as a springboard, Martin et al. (2005) set out to create an improved measure. The result was A, a measure of accuracy based on the log odds ratio of the results of a poll and the outcome of the election:

A = ln( (p_a / p_b) / (P_A / P_B) )    (3.0)

where p_a and p_b are the estimated proportions and P_A and P_B the election outcome proportions.

The interpretation is simple. "The farther from one an odds ratio is, the worse the poll performed at predicting relative preferences in the election." (Martin et al. 2005). This measure has the strengths of being amenable to multivariate analysis, less sensitive to the size of the group of undecided voters, standardised (comparable over elections) and possible to use for assessing pollsters as well as overall polling accuracy. After the 2004 election, Traugott again reviewed the performance of polling in the U.S., now using A as well as Measures 3 and 5, a setup that has since been common. He concluded that the pollsters' performance was mostly fair, with a majority indicating Bush as the likely winner, and that they performed above the historical average (Traugott 2005). In the article proposing Measure A, the authors state that it is not limited to two-party situations and suggest how one might proceed to include a third party.
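A minimal sketch of Measure A for a two-alternative race follows; the poll and outcome figures are invented purely for illustration.

```python
import math

def measure_A(pa, pb, PA, PB):
    """Martin et al. (2005): log of the odds ratio between a poll's
    two-candidate split (pa, pb) and the election outcome (PA, PB).
    A = 0 means no bias; the farther from 0, the worse the poll."""
    return math.log((pa / pb) / (PA / PB))

# invented example: poll 48%-44%, outcome 47%-46%
print(f"A = {measure_A(0.48, 0.44, 0.47, 0.46):+.3f}")
```

A positive value means the poll overstated the first alternative relative to the second; because only ratios enter the formula, the measure is unaffected by, say, a large "undecided" share that shrinks both poll proportions equally.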

The 2008 election garnered unprecedented interest in polling, in no small part due to the rising popularity of poll-aggregating websites in general, and Nate Silver's Fivethirtyeight in particular, at the time published by the New York Times. A lot of the discussion (and controversy) revolved around new techniques being used, but the general performance seems to have improved. In their final estimates, all pollsters showed Obama ahead, and their confidence intervals all captured the actual election result. In addition, evaluating the accuracy using Measures 3 and 5 from Mosteller (1949) indicates improved performance, substantially better than the historical average. Analysing bias using A revealed no significant results (Panagopoulos 2009).

In the Swedish context, the available data are too sparse to make any assertive claims. We have compiled a summary of the accuracy of pollsters producing estimates in the last three elections, using election data from Statistics Sweden and historical data on pollster estimates compiled by the Swedish pollster Novus. For the 2002 polls we have not had access to the actual sample sizes. Sample sizes of 1 500 have been used to calculate the results for TEMO and Gallup. For Sifo, Skop, Ruab and Demoskop the sizes have been estimated from what we know about the pollsters' sample sizes in other elections.


We have done the calculations by grouping the parties in the way that has been more or less formally defined during this period, so as to portray a left- and a right-leaning alternative,15 as the measures are not suited for multi-party applications. With A calculated as

A = ln( (p_r / p_l) / (P_R / P_L) )    (3.2)

where p_r and p_l are the estimated proportions of the right- and left-leaning alternatives, and P_R and P_L are the corresponding election outcomes, a positive number suggests a right-leaning (r) bias, and a negative number a left-leaning (l) bias.

Table 1.
      Pollster        Mosteller 3     Mosteller 5     Measure A            C.I. includes outcome
                      (avg. error %)  (avg. error %)  (log of odds-ratio)  (left / right alternative)
2010  Skop            0.20            0.40             0.008               Yes / Yes
      Novus           0.35            0.70            −0.016               Yes / Yes
      United Minds    1.40            2.80            −0.059               Not applicable
      Sifo            1.10            1.20            −0.028               Yes / Yes
      Synovate        0.45            0.50             0.012               Yes / Yes
      Demoskop        1.45            2.90             0.061               Yes / Yes
      Total           0.83            1.42            −0.003
2006  Synovate Temo   2.35            4.70             0.099               No / Yes
      Sifo            0.25            0.50             0.010               Yes / Yes
      Skop            0.85            1.50            −0.032               Yes / Yes
      Ruab            0.75            0.10            −0.003               Yes / Yes
      Demoskop        1.65            3.30             0.069               Yes / Yes
      Total           1.17            2.02             0.029
2002  Ruab            2.75            5.50             0.115               Yes / Yes
      Skop            2.05            4.10             0.087               Yes / Yes
      Sifo            1.90            3.80             0.080               Yes / Yes
      TEMO            1.70            3.40             0.072               Yes / Yes
      Gallup          0.85            1.70             0.035               Yes / Yes
      Demoskop        2.90            5.80             0.120               Yes / No
      Total           1.74            3.47             0.07                93.8% of all C.I.'s include the outcome

15. In the left-leaning alternative: Socialdemokraterna, Vänsterpartiet and Miljöpartiet. In the right-leaning alternative: Moderaterna, Centerpartiet, Folkpartiet and Kristdemokraterna. In neither: Sverigedemokraterna.

The performance of pollsters in Sweden seems to have improved


over the course of the three elections: from producing an average Measure 3 error of 1.74 in 2002, beating six of the thirteen U.S. elections reviewed in Traugott (2005), to an average error of 0.83 in 2010, better than all the U.S. elections. Measure 5 shows a similar development, going from 3.47, just above the U.S. historical average of 3.3, to 1.42. Looking at Measure A in the Swedish context, pollsters seem to have consistently overestimated the right-leaning alternative in 2002, while 2006 and 2010 show only small biases in different directions. The final estimates from pre-election polling seem to be holding up fairly well, even though it is hard to draw any conclusions about Sweden due to the scarcity of data.
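The two Mosteller measures used here can be sketched as follows, following the worked example in footnote 11; the specific percentages are our own invention.

```python
def mosteller_3(est_a, est_b, out_a, out_b):
    """Average absolute error, in percentage points, on the two leading alternatives."""
    return (abs(est_a - out_a) + abs(est_b - out_b)) / 2

def mosteller_5(est_a, est_b, out_a, out_b):
    """Absolute error in the estimated margin between the two leading alternatives."""
    return abs((est_a - est_b) - (out_a - out_b))

# Candidate 1 estimated 5 points ahead (51-46); actual outcome a 2-point win (49.5-47.5)
print(mosteller_5(51.0, 46.0, 49.5, 47.5))  # -> 3.0, as in footnote 11
print(mosteller_3(51.0, 46.0, 49.5, 47.5))  # -> 1.5
```

Measure 5 punishes errors in the margin between the alternatives, so two errors in the same direction can cancel, while Measure 3 averages the per-alternative errors regardless of direction.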

So are polls the best predictor at our disposal? On this topic, there is substantial disagreement. Though other approaches have been suggested, we focus on comparing polling with market-based predictions and with the poll-based aggregation and modelling approach popularised primarily by Nate Silver.

Markets like the Iowa Electronic Markets (IEM) offer the chance of buying and selling contracts whose liquidations are tied to some future event, for example an election. As with any market, the prices of the contracts are considered to be the result of all available information. Berg et al. (2001, 2008) note that markets seem to be consistently more accurate than polls, an advantage that appears even greater farther from the election. Several points can be raised to contest the relevance of this comparison. One the authors raise themselves is the fact that markets and polls measure different things. Polls usually ask the question “How would you vote if the election were held today?”, while election markets imply the question “What do you think the outcome of the election will be?” (Berg et al. 2001). Erikson and Wlezien (2008) question the superiority of markets from this very perspective: “Since market prices reflect forecasts of what will happen on Election Day and trial-heat polls only register preferences on the day of the poll, it is inappropriate to naively compare them on any given day in advance of an election” (Erikson and Wlezien 2008). They claim that when polls are adjusted for this fact, they outperform the IEM, though it should be noted that bigger markets might accomplish better results in the future. Rothschild (2009), in turn, criticises Erikson and Wlezien (2008) for compensating for a poll bias while ignoring an easily corrected bias in


market predictions. He goes on to compare Fivethirtyeight's predictions in the 2008 election with Intrade and finds that while Fivethirtyeight's debiased 16

forecasts beat the market, the reverse is true when the market forecasts are also debiased. It seems safe to say that comparing raw predictions favours markets, while more in-depth analyses show some conflicting results. Theoretically, markets should have the upper hand, since they can encompass the data available from polls as well as other types of information.

The comparison between poll-based and market-based predictions is not yet applicable to Sweden. Poll-based aggregation methods exist in some form: Henrik Ekengren Oscarsson, professor of political science at Gothenburg University, regularly produces a poll of polls that weighs the included polls by size and date, and Novus, a Swedish pollster, also publishes a poll of polls. But nothing with the sophistication of Fivethirtyeight has yet appeared. It is possible to bet on election outcomes via odds betting sites like Unibet and Ladbrokes, but a dedicated political market does not exist. It should also be said that the political landscape in Sweden is ill suited for “winner” predictions. As it stands, there are two political alternatives in the form of two groups of more or less affiliated parties, but this might change depending on the election outcome. It is not a clear-cut winner-takes-all scenario as in, for example, the U.S. presidential election.
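A size- and date-weighted poll of polls of the kind just described could be sketched roughly as below. The weighting scheme (sample size multiplied by an exponential recency decay with a 30-day half-life) and all figures are our own illustrative assumptions, not the method actually used by Oscarsson or Novus.

```python
from datetime import date

def poll_of_polls(polls, today, half_life_days=30):
    """Weighted average of party shares: each poll is weighted by its
    sample size, discounted by its age with the given half-life."""
    estimates, total_weight = {}, 0.0
    for poll in polls:
        age = (today - poll["date"]).days
        weight = poll["n"] * 0.5 ** (age / half_life_days)
        total_weight += weight
        for party, share in poll["shares"].items():
            estimates[party] = estimates.get(party, 0.0) + weight * share
    return {party: s / total_weight for party, s in estimates.items()}

# Two invented polls; the newer, larger poll dominates the average.
polls = [
    {"date": date(2014, 3, 1),  "n": 1000, "shares": {"S": 32.0, "M": 28.0}},
    {"date": date(2014, 3, 25), "n": 2000, "shares": {"S": 34.0, "M": 26.0}},
]
print(poll_of_polls(polls, today=date(2014, 4, 1)))
```

More elaborate aggregators, such as Fivethirtyeight, additionally adjust for pollster accuracy and house effects and build forecasts on top of the aggregation.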

3. The media

In this section, in addition to reviewing previous literature and describing current standards in the field, we outline our own study of four of Sweden's biggest newspapers, examining their reporting of pre-election polls. To support this discussion, we have conducted interviews with some of the included newspapers, in addition to their affiliated pollsters (as mentioned earlier). We have received answers from Lena Melin of AB and Lina Modin of Expressen; Caspar Opitz of DN has declined to answer (due to time constraints) and, despite several attempts, we have not been able to get any response from SvD.


16. Another prediction market. Closed as of March 2013; a relaunch of a 2.0 version is pending.


3.1. Standards

The field of polling has come a long way in the last half century. The increasing importance of pre-election polls in shaping the discourse of political debate has both helped popularise and professionalise the industry. In the U.S., this has led many of the leading news organisations to put a lot of effort into polling and the reporting thereof. As previously mentioned, one such effort by The New York Times (NYT), the blog Fivethirtyeight, became very famous in the run-up to the 2008 election because of its take on polls and election forecasting. The NYT also holds its reporting to fairly extensive standards: “If we get it wrong, we have not only misled our readers, but also damaged our credibility.” (NYT 2011). The standards use straightforward language to explain both how to assess the quality of the information in a poll and what to be careful about when writing about it, like understanding the pitfalls of interpreting the sampling error; for example, pointing out that comparisons between polls must be made using a separate formula (NYT 2011). The “ABC News' Polling Methodology and Standards” is yet more ambitious, or at least more technical. In great detail, Gary Langer of Langer Research Associates, the primary polling provider to ABC News, explains both how polls should be viewed in general and the methods used when conducting polls for ABC (ABC 2013). But ABC (again through Gary Langer) also provides a lighter guide for its viewers/readers on how to think about polls (ABC 2008). The National Council on Public Polls (NCPP), an association of polling organisations, publishes a list of twenty discerning questions. They claim that “By asking these 20 questions, the journalist can seek the facts to decide how to report any poll that comes across the news desk.” (NCPP 2014). The list of questions and associated topics of discussion is indeed very thorough. The European Society for Opinion and Marketing Research (ESOMAR) has published a number of documents with regard to polling. Their guide on opinion polls states: “The first objective of these guidelines is to ensure that polling organisations take all possible technical steps to ensure that polls published close to the vital decision point for voters are an objective guide to the state of public opinion and voting


17. The name Fivethirtyeight refers to the number of electors in the Electoral College (Fivethirtyeight.com 2014).


intentions.” (ESOMAR 2009). The publication provides guidance on publishing as well as specific advice for pollsters producing pre-election polls. The Swedish pollster Novus (not included in our survey) is a member of the organisation. None of the pollsters included in our study (United Minds, TNS Sifo, Ipsos and Demoskop) are, as far as we can tell, members of ESOMAR. They are members of the Swedish Market Research Companies (SMIF), which applies the ICC/ESOMAR Code, “designed primarily as a framework for self-regulation”, outlining ethical standards for market research institutions and their clients (ESOMAR 2007). The World Association for Public Opinion Research (WAPOR) also makes recommendations very similar to, and often issued together with, ESOMAR's. Another alternative is the ISO 20252:2012 standard for “Market, opinion and social research - Vocabulary and service requirements”, developed in response to an initiative from market research organisations (ISO 2006).

Specifically Swedish standards or guidelines are harder to find. Radio Sweden has published a document called “Checklist for Surveys/Opinion Polls”, which has surely been made with the best of intentions, but makes several very problematic assertions. The document deems surveys with more than 50% non-response to suffer from “...substantial legitimacy issues”, despite the research on the effects of non-response being far from one-sided. A survey should have over a thousand respondents to be considered for reporting, which seems like a very arbitrarily chosen threshold. A statement asserting that Radio Sweden “...aside from Statistics Sweden only approve 5 pollsters” to deliver pre-election polls concludes the document. The checklist was met with protests, and the updated version is somewhat improved: a passage about the possibility of making exceptions was included, as well as some other modifications. However, the issues mentioned previously were not removed, though the language is moderately softened (Radio


18. Authors' translation.

19. International Chamber of Commerce.

20. Authors' translation.

21. Authors' translation.

22. Authors' translation.


Sweden 2013a, Radio Sweden 2013b). The Swedish public service television company (SVT) has produced a similar policy document that is slightly better, but mainly because it is a bit less assertive.

In relation to our review of the reporting on pre-election polls by the newspapers AB, SvD, Expressen and DN, we have asked both the newspapers and their affiliated pollsters if they abide by any formalised guidelines or policy documents concerning polls. None of them have provided us with any formal guide of their own. Lena Melin states that AB has a verbally agreed policy, and Lina Modin says that Expressen has rules concerning which pieces of information should always be published (Melin 2014, Modin 2014). Of the pollsters, only Sifo refers to a formal policy, the ESOMAR Code, which they claim to follow. Demoskop and United Minds only state that certain types of information should always be included, though the requirements on the extent of the information to be published are lower for AB/United Minds than for Expressen/Demoskop. Ipsos (DN) refers to its latest report as an example of how the reporting should look.

3.2. Previous studies

Even though awareness of the shortcomings of poorly conceived polls is much higher today than historically, the opportunity to conduct and publish such polls has increased dramatically. As underlined in the introduction to the thesis, the incentives of the commercial stakeholders are hardly conducive to methods of best practice. Although the conditions in the U.S. cannot be directly translated to Sweden, the underlying problems are of course similar. Cliff Zukin, a former president of AAPOR, highlights this dilemma: “A lot of pollsters are not following best practices and they know it,” he said, “including pollsters at some of the biggest news organisations, because there is a demand for it.” (Rosenthiel 2005).

The way that polls are handled by the media seems to be put in the spotlight fairly seldom in Sweden. To be fair, the newspapers included in our survey have some


23. Outlined in the next section.


articles discussing the uncertainty of pre-election polls, but as our study shows (later in the thesis), these types of articles are very uncommon. Perhaps this is because the media would probably gain little by highlighting the flaws of their own reporting and its underlying material. It might also be because the subject has not been given proper attention by researchers. There are, however, a number of studies worth highlighting. A 2008 publication from the SNS Democracy Policy Council broadly investigates the role of the opinion poll in society. It examines the frequency with which the newspapers Aftonbladet, Expressen, SvD and DN include basic statistical information in their reporting, and finds considerable shortcomings in this regard: the margin of error is reported in just 0, 5, 40 and 50 percent of cases, respectively. It is clear, the report concludes, that the newspapers live up neither to the WAPOR/ESOMAR standards nor to the standards they themselves claim to uphold (Peterson et al. 2008, pp. 178-179). Strömbäck (2009) paints a similar picture: both newspapers and television broadcasters fall severely short in providing the reader/viewer with sufficient information. In 2006, the newspapers mentioned whether a stated change was inside or outside the margin of error in only 18.3 percent of the cases. Strömbäck (2009) concludes that “Taken together, the results of this study thus show that if the media indeed use opinion polls to give voice to the people, they fail.” Not only is proper reporting of the margin of error all too unusual; when it is reported, it is almost always estimated by the sampling error alone. This is surely an understatement, as it implies that non-sampling errors have no influence on the margin of error, which, after reviewing the pre-election poll process, seems to us wholly unlikely. Not mentioning this gives a false impression of quality.

Looking outside Sweden, there are several reports on the media's relationship with polls. The international results seem to be in line with those from Sweden, in that they find a lot to be concerned about. In an article in Public Opinion Quarterly, Stephanie Greco Larson concludes that “Reporters seem to have received the message that they should report the margin of error, yet they are confused about what it means.” (Larson 2003). In a study of the 1997 Canadian election, Robert


24. SNS (Centre for Business and Policy Studies in English) is a Swedish policy research institute.


Andersen states: “Polls were typically treated as matters of fact, with their limitations rarely discussed.” (Andersen 2000). In a review of six Flemish newspapers, Sonck and Loosveldt (2008) state that “poll results are used in quite an uncritical way by the Flemish newspapers”.

It seems that the media both in Sweden and abroad are coming up short in their reporting of polls. That some reporters are not sufficiently well versed in statistics to make certain interpretations is one thing, but the omission of statistical information vital for the reader/viewer/listener to be able to assess the results must mainly be considered an editorial decision.

In quite a number of countries, concerns about the role of the pre-election poll have prompted legislation regulating its use. We do not perceive this discussion to be especially prominent in Sweden, but the survey by Aalberg and Aelst (2014) reveals that 45% of the Swedish politicians and 33% of the Swedish journalists who answered the survey supported a ban on publishing polls just before an election.

3.3. Analysis of the reporting of polls

In writing this thesis, we want to give a broad and complete understanding of how polls are produced and reported. Though there are several studies investigating the media's reporting of polls, some of which are cited in the preceding sections, we have not found any study completely suited to our purpose. We have therefore conducted a quality analysis, from a statistical standpoint, of 200 articles on pre-election polls from four of Sweden's biggest newspapers: DN, Aftonbladet, SvD and Expressen. We aim to assess both whether the most basic pieces of information are in place and more subtle notions, such as whether the reporting facilitates understanding of the underlying concepts and whether the message of the text and the statistical results are well aligned. The overall results, detailed in the following sections, are discouraging. In accordance with several previous studies, the reporting effectively masks much of the uncertainty of the polls, in a way that we feel gives the results a false sense of precision. We expect that the continued


25. It should be noted that the newspapers, in some respects, differ substantially from one another, though we would not consider the result of any one paper to be acceptable.


discrepancy between this downplayed uncertainty and the actual uncertainty of the polls will help fuel further skepticism towards survey research amongst the public.

3.4. Data

The data we have collected concern the media reporting of pre-election polls in the period 2011-2014. We have sampled 50 articles each from four of Sweden's largest newspapers: Aftonbladet (AB), Expressen, Dagens Nyheter (DN) and Svenska Dagbladet (SvD). Two of them are evening newspapers (AB and Expressen) and the other two morning newspapers. We chose between including Göteborgsposten (GP) and SvD, as they both use TNS Sifo as “their” pollster. Despite GP having a somewhat larger circulation (Dagspress 2013), we went with SvD due to GP's strong geographical association with the city of Gothenburg (Göteborg). We decided not to examine every article concerning pre-election polls, but rather a subset that we deemed more suitable for the kind of analysis we had in mind. Our approach boils down to judging each article by a set of criteria. The criteria we use are not suitable for speculative, longer-form texts, e.g. most columns by political reporters. An article should fit the following description to be included.

The main purpose of the article in question is to report the results of a pre-election poll. The poll is about a European, national, regional or local election. It is a poll estimating the support for more than one of the involved parties.

Thus, topics like “...which issue is most important in the upcoming election?”, “…who is most trustworthy?” or the like have not been included.

To find the articles, we have used each newspaper's webpage search function. In doing this we have found three things complicating the selection: 1) The search functions work differently, making it difficult to conduct the searches in a completely uniform way. 2) It is hard to assess if the newspapers are using the same


26. Though they are nowadays available from early in the day.

27. The election to the European Parliament; not, for example, an election in another European country.


wording with similar frequency. 3) The terms “poll” and/or “opinion poll” were often excluded when the reporting concerned the newspapers' “own” results, i.e. the results from the newspapers' affiliated pollsters.

With respect to the above, we employed the following strategy. We searched each web archive for articles including the name of the pollster affiliated with the newspaper and/or the term “opinion poll”. Thus we got a mix of the newspapers' reporting on their “own” polls as well as their reporting of polls in general. As we have sampled 50 articles from each outlet, and since the frequency of fitting articles is not exactly the same across the newspapers, the time spans covered by the articles differ. DN, AB and Expressen have similar spans, from mid-2011 to the beginning of April 2014. SvD has a higher frequency and thus its articles span a shorter time: September 2012 to the beginning of April 2014. The newspaper/pollster partnerships are as follows: AB/United Minds, Expressen/Demoskop (previous partner Skop), SvD/TNS Sifo (earlier Sifo) and DN/Ipsos (previous partner Synovate, which was acquired by Ipsos in the fall of 2011).

These are the criteria used to judge each of the articles:

1. Is the margin of error mentioned in either the text or the graphical presentation? Alternatively, is language like “statistically certain” (or equivalent) used?

2. Is the margin of error quantified?

3. Is the meaning of the margin of error further explained in some way?

4. Is the message of the statistical results, as compared to the message of the text, conflicting or confusing? Here are some examples of conflicting/


28. For example, the terms “opinion poll” and “opinion measurement” are used interchangeably.

29. Authors' translation of the commonly used Swedish word “opinionsundersökning”.

30. The only difference between the searches is that some of the search functions allowed us to do a combined search and some forced us to do two separate searches.

31. The wording “not statistically secured” (rough translation) is much more commonly used in Swedish media than “within the margin of error”, which is common in the U.S. and other English-speaking countries, though both are used.


confusing: claiming that a party would lose its place in parliament if the election were held today, when the confidence interval includes the support needed; making claims about increasing or decreasing support for a party when the changes are not significant; stating that a party is, e.g., “the third biggest party in Sweden” when this is not statistically significant in the poll results.

5. Is there sufficient information about the poll within the confines of the article? We consider information about who has conducted the poll, the number of respondents and the period the poll was in the field to be sufficient in this context. Compared with the ESOMAR/WAPOR guidelines, this is very lenient.

6. Is the rate of non-response reported?

Each criterion is coded “0 = No” and “1 = Yes”, with two amendments. The first concerns Criterion 4. In many cases, little information about the poll is given, and we might then not be able to determine whether the message of the text is in conflict with the message of the poll. In some cases, where a simple calculation of the sampling error can be done to determine the statistical validity of a suspicious claim, we have done this. When we would have to go outside the article to retrieve the necessary information to do so, we instead use the coding “2 = Cannot be determined”.

The second amendment is a minor adjustment to the last criterion, concerning non-response. In some cases, the reporting refers to a so-called “poll of polls”, which is an aggregate of several different polls. In this case, we can understand that non-response is not reported, as the possibility of doing so depends on the included pollsters providing this information.
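The “simple calculation of the sampling error” referred to above can be sketched as follows. This is the standard simple-random-sample approximation at the 95% level; it ignores design effects and all non-sampling errors, and the figures are purely illustrative.

```python
import math

def moe_proportion(p, n, z=1.96):
    """95% sampling margin of error for one estimated proportion,
    assuming a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

def moe_change(p1, n1, p2, n2, z=1.96):
    """Margin of error for the change between two independent polls:
    the standard errors combine in quadrature, so a change is noisier
    than either single estimate."""
    se1 = math.sqrt(p1 * (1 - p1) / n1)
    se2 = math.sqrt(p2 * (1 - p2) / n2)
    return z * math.sqrt(se1 ** 2 + se2 ** 2)

# A party polling 30% in two polls of 1,000 respondents each:
print(round(100 * moe_proportion(0.30, 1000), 1))          # prints 2.8 (points)
print(round(100 * moe_change(0.30, 1000, 0.30, 1000), 1))  # prints 4.0 (points)
```

This is also why a reported change of, say, two percentage points between two such polls, although it may sound substantial, is not statistically significant.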


32. Many times a sweeping statement like “no changes are significant” is included in the text. We still feel that this is both confusing and conflicting.


3.5. Results: How are political polls reported in Sweden?

Table 2 summarises the results from our look at how AB, SvD, Expressen and DN report on pre-election polls. As AB's reporting of its “own” polls is based on United Minds' non-probability design, AB can only really comment on the margin of error when reporting on polls conducted by other pollsters. Therefore, we have separated AB from the other newspapers in the table. The general conclusions below thus refer to DN, SvD and Expressen. We summarise our view on AB at the bottom of the section.

The fulfilment of Criterion 1 is very easily determined: if there is any mention of the margin of error (or equivalent), the criterion is considered to be met. However, this also means that many articles in which the meaning of the margin of error is not at all made clear are included. 70% of the articles meet Criterion 1. Criteria 2 and 3 are extensions of Criterion 1, in that to meet them the articles have to

Table 2.

                                                      AB      DN      SvD     Exp.    Total (DN, SvD, Exp.)
Criterion 1: The margin of error (MoE)
is mentioned?                                  Yes    22 %    76 %    68 %    66 %    70 %
Criterion 2: The MoE is quantified?            Yes     4 %     4 %    10 %     4 %     6 %
Criterion 3: The MoE is explained?             Yes     4 %     0 %     8 %     2 %     3 %
Criterion 4: The text and the poll
results are in conflict?                       No      8 %    32 %    22 %    14 %    23 %
                                               Yes    66 %    34 %    34 %    60 %    43 %
                                  Cannot be assessed  26 %    34 %    44 %    26 %    35 %
Criterion 5: There is sufficient
information about the poll?                    Yes    72 %    60 %    30 %    76 %    55 %
Criterion 6: Non-response is reported?         No     98 %    94 %    90 %    90 %    91 %
                                               Yes     0 %     0 %     0 %     4 %     1 %
                                      Does not apply   2 %     6 %    10 %     6 %     7 %


somehow quantify (Criterion 2) or explain (Criterion 3) the statistical uncertainty. Across the board, this is very unusual. SvD is better than the others, but hardly good. That the “best” newspaper in this regard quantifies and explains the margin of error in 10% and 8% of the cases, respectively, was certainly surprising to us. It should be mentioned that any attempt to explain or quantify the margin of error (even for just one party) qualifies the article to meet the criterion in question. It is a very generous design (towards the newspapers).

In contrast to the previous criteria, Criterion 4 is highly subjective. The intention is to note when the message of the text and the statistics underpinning its narrative are not in alignment; simply put, when the text says something that is not supported by the statistical results. The most frequent example of this is the use of “none of the changes are statistically verified” as a sort of cover for still using the poll results to make claims. We realise that some might think that this sort of sweeping qualification should be enough. However, we feel that claiming “The poll shows us that X, Y and Z have happened” and then “clarifying” that “We cannot really say that anything has happened” is confusing to the reader. We have tried to reward the use of cautious language. For example, we consider “In this poll, C does not reach 4%” to be better than “According to this poll, C would not get into parliament if the election were held today”, as the latter clumsily generalises the poll result to the population. If the more cautious language heavily outweighs the too assertive language, we have given the article a pass. When information is sparse and we simply cannot assess the claims made, the answer is coded as “2” (also a bad “grade”). The overall “pass rate” is 23%, but the newspapers' results differ substantially. DN has the best performance with 32%, while only seven articles from Expressen (14%) are considered to meet Criterion 4. Criterion 5 asks if information about the poll in question is present within the confines of the article. Expressen has the best result, including sufficient poll information in 76% of the articles. Expressen also generally publishes more complete information about its polls than DN and SvD.


34. The results are not statistically verified.


None of the newspapers regularly report non-response. A possible explanation for this is given in the “Non-response error” section; namely, that the sampling methods used by the pollsters hinder conventional calculation of the response rate. But aside from TNS Sifo (which provided a review of its method), we do not really know why. Research conducted in the USA has shown that it is highly unusual for news stories to report the response rate or mention non-response error. The reason why non-response is not reported could be that there is no demand for such information to be released (Traugott and Lavrakas 2008). The most thorough poll in Sweden, conducted by Statistics Sweden, does report non-response. Ironically, the outcome of this is that the only articles mentioning non-response are those reporting on the results of the Statistics Sweden survey, which sadly might lead people, in this specific regard, to doubt these results relatively more than those of the commercial pollsters.
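For reference, the conventional response-rate calculation that Statistics Sweden can report, and that the commercial pollsters' sampling methods are said to hinder, can be sketched as follows. This is a simplified rate in the spirit of AAPOR's RR1, and the case counts are invented.

```python
def response_rate(completes, refusals, noncontacts, unknown_eligibility):
    """Completed interviews divided by all (potentially) eligible sampled
    cases, in the spirit of AAPOR's RR1. Quota and self-recruited panel
    designs lack a fixed denominator of sampled cases, which is why this
    calculation breaks down for them."""
    return completes / (completes + refusals + noncontacts + unknown_eligibility)

# Invented probability-sample outcome: 1,000 sampled cases, 480 interviews.
print(response_rate(480, 260, 200, 60))  # prints 0.48
```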

In our view, the consistent omission of margins of error and/or confidence intervals can hardly be defended. Without carefully describing the uncertainty of the poll, the reader is not given the opportunity to form a well-founded opinion. Without quantifying (94% overall), without explaining (97% overall) and, in a surprising number of cases, without even mentioning the statistical uncertainty inherent in these polls (30% overall), the reader is invited to believe that the results of the poll are more precise than they actually are. There is, however, information other than the point estimate in the reporting. Instead of presenting the reader with margins of error or confidence intervals, the practice among the included newspapers is to include the change in percentage points from the previous point estimate. It is not hard to understand why: it is most often in the changes that the news lies. Nonetheless, this does not facilitate understanding; rather, changes that are not significant (and often far from it) are highlighted. Lina Modin of Expressen explains that they feel that what is important to the reader is whether or not the changes are significant, not the size of the margin of error (Modin 2014). This would be a completely acceptable position if Expressen always qualified cited


35. Several articles give the impression that the response rate is low, a topic that never surfaces in the reporting of other polls.


changes as significant or not, but in 34% of the cases they do not even mention the margin of error. In many other cases, statements about significant changes are mixed with statements about insignificant changes in a very confusing way. Unfortunately, DN and SvD have not answered our questions.

As a media outlet, one could perhaps argue (though this is not something we have seen anyone do) that since the size of the total survey error is uncertain, it is irresponsible to give the reporting too much of a scientific air, and that presenting the results with a sampling error would give them an unwarranted sense of reliability. But the sampling error might be a very conservative approximation of the survey error, which would mean that the promised precision was not delivered. Might it be better just to report the point estimates and emphasise that they are highly uncertain best guesses? The problem is that the newspapers make very little room for highlighting uncertainty whatsoever. All in all, we feel that if the primary interest of the newspapers were to inform their readers, their reporting of polls would not look like this. Pre-election polling is currently partly reporting and partly producing news. What makes this unusually contentious is the fact that the influence it exerts on the democratic process seems to be far from negligible.

With respect to AB, a different kind of analysis is needed to thoroughly assess the quality of the reporting of their “own” polls (the polls conducted by United Minds). Since there are no accepted theoretical measures to describe and quantify the uncertainty of non-probability polls, one might have to examine the use of language more carefully. As previously mentioned, we feel that AB, by using polls based on non-probability sampling, should take on a greater responsibility to explain the shortcomings of polls in general and non-probability polls in particular. The impression we got from our review of their articles (though assessing this was not the purpose of the review) is that they do not do this very well.

3.6. The use of graphics

Three graphical styles (A, B and C in Figure 3) are prevalent in the reporting of the included newspapers. C is used only by DN; of the remaining styles, B is by far the most frequent.


Sometimes the presentation is complemented with other graphical elements, but these types of figures are usually the main focus when the newspapers report on their ”own” polls, that is, the polls produced by their affiliated pollsters.


Also included in the figures are the numerical results. Almost without exception these are given as point estimates followed, in parentheses, by the change in percentage points since the last poll; e.g., 46% (-0.4).

The purpose of presenting statistical results with the aid of graphics should always be to enhance understanding. In many cases, properly employed graphs and diagrams can help dismantle the complexity of the statistics; the visual aids should bring out the true meaning of the statistical results. All of the graphics presented by the newspapers in our study do this very poorly. Firstly, none of them use diagrams that highlight the uncertainty of the results; i.e., the columns, or half pie chart pieces in the case of DN, have sharp cut-offs, with neither graphical nor numerical cues to indicate the interval nature of the statistical results. What the polls actually tell us, in frequentist terms (which is the methodology that at least Ipsos, Demoskop and Sifo seem to apply [36]), is that the (fixed) true parameter lies within an interval at a specific level of confidence chosen by the researcher. Instead of depicting this reality, all results are shown with distinct cut-offs. The half pie chart is especially improper in this regard, as adding confidence intervals to that presentation would be hard, while it could easily be done with a bar chart. Secondly, instead of presenting the margin of error in parentheses next to the point estimate, all the newspapers report the change from the previous poll. Only very seldom are the margin of error or the statistical significance of changes reported.

Figure 3. The three graphical styles A, B and C.

[36] United Minds seem to use a lot of prior information to adjust their estimates, and cite no frequentist concepts in the description of their method.
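The interval nature of the estimates follows from the standard formula for a proportion's margin of error. Below is a minimal sketch in Python, assuming simple random sampling and a 95% confidence level; in practice, weighting and non-sampling errors mean the true uncertainty is larger than this formula suggests.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Margin of error for an estimated proportion p_hat from a simple
    random sample of size n, at the level implied by the normal
    quantile z (1.96 for 95% confidence)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def confidence_interval(p_hat, n, z=1.96):
    """Confidence interval: the point estimate plus/minus the margin of error."""
    me = margin_of_error(p_hat, n, z)
    return (p_hat - me, p_hat + me)

# A party polling at 44% in a poll of 1,500 respondents:
me = margin_of_error(0.44, 1500)
print(f"44 (±{me * 100:.1f}) %")        # reported as 44 (±2.5) %
print(confidence_interval(0.44, 1500))  # roughly (0.415, 0.465)
```

This is why a change of a few tenths of a percentage point, as routinely reported in parentheses by the newspapers, is typically well within the noise of a poll of this size.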

We feel that graphics could definitely be used to facilitate the understanding of pre-election polls. But rather than doing this, the graphics used by the newspapers we have evaluated help to conceal the uncertainty of the polls, lending the results a false air of certainty.

4. Analysis

4.1. Discussion

The importance of pre-election polls is apparent. Two of the major groups of stakeholders, journalists and politicians, add fuel to that fire by agreeing that polls are influential in determining both political strategy and policy (Aalberg and van Aelst 2014). At the same time, several studies, including ours, show that the media fall very short when it comes to reporting the basic information needed to assess the polls. The field of pre-election polling is developing fast. The fairly homogeneous landscape of the past, i.e., interviewing a random sample of eligible voters via telephone and essentially only having to worry about coverage and non-response, is changing. Technological and theoretical progress, the advent of big data, the success of innovators like Nate Silver, as well as other factors, have driven the rising popularity of Bayesian statistics.

Non-response rates have been increasing steadily for decades, and this, together with a changing technological landscape, raises questions about the representativeness of the responses that remain. It is an exciting time, but also a frustrating one. There is room both for encouragement and discouragement. Most prominent Swedish pollsters, and three of the four in our study, use well-established techniques like probability sampling and telephone interviewing. However, non-response and other errors make their samples not fully representative. In addition, there are indications of the pollsters not complying fully with probability sampling [37]. All this opens the door for both Bayesian approaches and non-probability sampling. Judging from the last three Swedish national parliamentary elections, the accuracy seems reasonable and also seems to have improved over time, though this pattern could easily be coincidental. 93.8% (30 of 32) of the pollsters' confidence intervals regarding the two party groups include the election outcome, which is good, but also very uncertain, as the data are sparse.
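The interval-coverage figure above comes from checking, for each final poll, whether the election outcome falls inside the poll's 95% confidence interval. A minimal sketch of that check, assuming simple random sampling; the poll figures below are hypothetical, not the actual data from our evaluation:

```python
import math

def covers_outcome(p_hat, n, outcome, z=1.96):
    """True if the election outcome lies inside the z-level confidence
    interval around the poll's point estimate."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return abs(outcome - p_hat) <= margin

# Hypothetical final polls for one party bloc vs. the election result:
polls = [(0.47, 1500, 0.459),   # within the margin of error
         (0.52, 1000, 0.480)]   # misses by more than the margin
hits = sum(covers_outcome(p, n, out) for p, n, out in polls)
print(f"{hits} of {len(polls)} confidence intervals cover the outcome")
```

With 95% intervals and no bias, one would expect roughly 95% coverage in the long run, which is why 30 of 32 is encouraging but, given so few observations, far from conclusive.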

The task of evaluating pre-election polls is not straightforward. Only a fairly small number of institutes produce election polls, and the validity of the estimates can only reliably be evaluated once every four years for each type of election. How accurate the estimates are in between elections simply cannot be answered. In addition, non-sampling errors are very hard, and sometimes downright impossible, to measure. Due to the commercial importance of the polls, the methodology is moreover shrouded in secrecy, tucked away in black boxes, which makes it even more difficult to properly assess the quality. The pollsters do not report non-response, which makes it hard to evaluate the performance of their sampling method, and they have not presented us with details about their quality control. None of them publishes any detailed methodological description, though United Minds publishes a somewhat more ambitious general description than the others. TNS Sifo has provided us with a fairly detailed review of their method, but the other pollsters (United Minds, Ipsos, Demoskop) have declined our requests.

When it comes to the newspapers we have studied, there are few positive things to say. They publish information about the polls in accordance with our admittedly very conservative requirements in only a little over half of the cases (an average of 55%, 60% when including AB). Had we held them to the ESOMAR/WAPOR guide standards, none of the newspapers' articles would have passed, though some of Expressen's would have come close [38]. Some mention of the statistical uncertainty is often made, but in our opinion the mix of unqualified and qualified statements all too often becomes confusing. A measure of uncertainty is almost never quantified or explained, which is especially worrying. Presented with this fact, Lina Modin of Expressen says that they assume that the readers understand these things, which to us seems very unlikely. In fairness, Modin also expressed that it might be something they could consider starting to include (Modin 2014). The message of the text and what the statistics actually support with any confidence often make for a confusing whole. Though DN (32%) is much better at avoiding this than the others (SvD 22%, Expressen 14%, AB 8%), producing a clear and easily understandable text in about a third of the cases is still a very low rate. We also feel that the graphics used help to hide the uncertainty of the results. We would recommend visuals that downplay the importance of the point estimate and highlight the margin of error and confidence intervals. The contents of Figure 4 might serve better as educational material for journalists than for publishing [39], but we think some of the ideas should be implemented in the reporting. Just graphically highlighting the confidence intervals of the estimates and quantifying the margin of error would go a long way towards educating the readers.

[37] Exemplified by the review of TNS Sifo's method discussed earlier in the thesis.

[38] The “ESOMAR/WAPOR Guide to Opinion Polls and Published Surveys” states that the publishing of pre-election polls should include “the sampling method used (and in the case of full random probability samples the response rate achieved)”. Expressen's reporting implies full random sampling but does not report the response rate.

[39] It is conceived as an example of a policy adopted by a hypothetical newspaper.

[Figure 4: “THE 1, 2, 3, 4 OF PRE-ELECTION POLLS” — a mock-up bar chart for three parties, A, B and C, where each column shows a point estimate XX % with a brighter area spanning from -Y % to +Y % around a dotted line, and a line marking the 4% threshold. The accompanying text reads:]

1. The midpoint of the brighter area is called the Point Estimate (the dotted line). It is the percentage of people saying they would vote for the party in this specific poll.

2. The brighter area in the column highlights what the poll is actually telling us. From measuring the characteristics of a relatively small number of people, we can make claims about the entire population. In this case, the support for the party lies with 95% certainty in the brighter area. It is quite aptly called the Confidence Interval. The Confidence Interval is calculated by adding and subtracting the Margin of Error from the point estimate. In text, the margin of error is often reported in parentheses next to the point estimate, e.g. 44 (±2.5) %.

3. When the Confidence Intervals of two parties overlap a lot, like in the example, it is not certain that we can claim that the parties are of different size. To avoid confusion, we never claim that a party is bigger than another if this cannot be verified by the poll.

4. Some examples and useful facts:
- We cannot claim that “Party B is larger than Party A, according to the survey”. We can say that B is larger than A amongst the people included in the poll, but statements like these can easily be confused with statements concerning the population, so it is wise to avoid them.
- We can assert that both Party A and Party B are larger than Party C.
- We cannot claim that Party C has lower than 4% support, according to the poll. The Point Estimate is under 4%, but this number only concerns the people included in the poll, that is, ca 1,500 of several million voters.
- The Confidence Intervals of larger parties are wider than those of smaller parties, as can easily be seen by comparing the confidence interval of Party C with that of Party B or Party A.
- For the support of a party to have changed in comparison to a previous poll, the difference has to be statistically significant. When we claim that a party has grown, we mean that the increase in its support has been shown to be statistically significant. If it is not, we cannot claim that the polls are telling us that the support for the party has shifted.
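The rule in Figure 4 that a change must be statistically significant before a shift is claimed can be made concrete with a simple two-proportion z-test between two consecutive polls. This is a sketch under the assumption of two independent simple random samples; pooled variants and design-effect corrections exist, and overlapping panel samples would require a different treatment.

```python
import math

def change_is_significant(p1, n1, p2, n2, z=1.96):
    """Unpooled two-proportion z-test: is the change from p1 (sample
    size n1) to p2 (sample size n2) statistically significant at the
    level implied by z (1.96 for the 5% level)?"""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return abs(p2 - p1) / se > z

# A move from 30% to 32% between two polls of 1,000 respondents
# is well within the noise...
print(change_is_significant(0.30, 1000, 0.32, 1000))  # False
# ...while a move from 30% to 36% is not.
print(change_is_significant(0.30, 1000, 0.36, 1000))  # True
```

Note that the standard error of a difference between two polls is larger than the margin of error of either poll alone, which makes the routine headline treatment of small changes even harder to defend.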


In summation: the reader is more often than not given a poor basis from which to assess the information. It is hard not to think that this is because the reporting of change, substantiated or not, is more enticing to the newspapers than providing the reader with robust information. ”As used by the media, opinion polls very seldom serve as vox populi [voice of the people]. Rather, opinion polls serve as vox media [voice of the media]. Even if we assume that opinion polls do a good job at measuring public opinion, which often is highly questionable, the media do a poor job at using opinion polls to give voice to the people.” (Strömbäck 2009).

What should be done? We feel that restricting the use of polls by law is not the right way to go. Surprisingly (to us at least), 45% of the surveyed Swedish politicians in Aalberg and van Aelst (2014) agree with the claim that ”opinion polls just before the elections should be forbidden”. In addition to effectively withholding potentially vital information from the voters, such a ban would be a substantial intrusion on the freedom of the press. Regulating how certain types of polls must be reported could be a way to improve the current situation, but would likely be very hard to implement. We think there is a way in which best-practice methods can line up with the incentives of both pollsters and newspapers: a race to the top. Examples like the story of Nate Silver show that rigorous statistical analysis, bringing forth the complexity of polling rather than hiding it, can be presented in an entertaining way, and we think that there is a lot to gain for the newspapers and/or pollsters that aim for the pole position.

4.2. Conclusions and recommendations

The objective of this thesis has been to assess the state of the pre-election poll in a Swedish context. We have done this by reviewing the theoretical foundations of polls and then contrasting these with how polls are produced and reported.

In analysing how polls are conducted in Sweden, we find that while most pollsters seem to rely on well-established techniques, a proper evaluation is difficult as many are reluctant to release details. We also note that both non-probability approaches and Bayesian methods seem to be gaining ground.


Evaluating the survey error is hard, thought-provoking and frustrating. The non-sampling errors highlighted in this thesis, i.e., specification, measurement, non-response and coverage error, might be sources of substantial bias. However, as both the errors themselves and the techniques for measuring them overlap, the specifics and sizes of the errors are very difficult to determine. The solution to the problem of non-sampling errors employed by the pollsters in our study is weighting, which seems to be working fairly well. The accuracy of Swedish pre-election polls seems to be fair and to have improved over the last elections. Applying well-established measures to Swedish polling data yields results in line with international counterparts, though the different context requires us to label this comparison speculative.
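One example of a well-established accuracy measure from the literature in our references is the predictive accuracy measure A of Martin, Traugott and Kennedy (2005): the log of the odds ratio between the two leading parties' (or blocs') shares in the poll and in the election. A sketch with hypothetical numbers; zero means perfect relative accuracy, and the sign shows which bloc the poll overestimated.

```python
import math

def accuracy_A(poll_a, poll_b, vote_a, vote_b):
    """Martin-Traugott-Kennedy A: the natural log of the odds ratio
    between the poll's a-to-b share ratio and the election's a-to-b
    share ratio. A > 0 means bloc a was overestimated relative to b."""
    return math.log((poll_a / poll_b) / (vote_a / vote_b))

# Hypothetical final poll (47% vs 43%) against the result (45.9% vs 43.6%):
print(round(accuracy_A(0.47, 0.43, 0.459, 0.436), 3))  # 0.038
```

Because A is a relative measure between two blocs, it can be averaged and compared across polls and countries in a way that raw percentage-point errors cannot.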

In line with many previous studies, we find substantial shortcomings in the media's reporting of polls. There is an awareness that certain basic information should be included in the reporting, as well as a rudimentary adherence to this. But a commitment to illustrating the polls' inherent uncertainty, or otherwise educating the readers, is virtually non-existent.

Though there are several successful examples of using a more rigorous approach to the reporting of polls as a competitive advantage, the Swedish newspapers we have examined so far seem focused on explaining as little, rather than as much, as possible. There seems to be quite sizeable support among Swedish politicians for regulating the publishing of polls. However, we feel that the way forward is to highlight the flaws in the producing and reporting of polls and to make recommendations on how to improve. We think there are large gains for the newspapers and pollsters that take the lead in developing the field.

1. Recommendations aimed at the Swedish pollsters examined in our thesis.

We think that it is of paramount importance for the public's confidence in polls, as well as for the development of polling, that these steps, or others like them, are taken.


a) Take the ICC/ESOMAR Code seriously, especially article 4e: ”Researchers shall ensure that market research projects are designed, carried out, reported and documented accurately, transparently and objectively.” (ICC/ESOMAR 2007). The ISO 20252:2012 standard on Market, Opinion and Social Research is an alternative.

b) Publish a detailed methodological description that allows researchers to properly evaluate the methods.

c) Partner with academia to initiate research that furthers the field. Examples of interesting research topics include comparing web panel and landline/cellphone approaches, better utilising new technology to aid sampling, and assessing the specification error.

d) Take an active role in educating the media on how to report survey results (for example pre-election polls) without distorting their statistical meaning.

2. Recommendations aimed at the newspapers examined in our thesis.

a) Formulate a policy, modelled on U.S. standards, that can serve as a basis for all reporting on polls.

b) Report and quantify the margin of error. For certain articles it might not always be necessary, but for the types of articles we have examined, the rate should be close to 100%. It should also be made clear that the sampling error is only an approximation of the total error and that the real error might be notably larger. In Aftonbladet's case we recommend ISO 26362:2009 on Access panels.

c) Always state whether or not claims are supported by statistically significant results, instead of the current practice of making sweeping statements about significance.

d) Use graphics that highlight the inherent uncertainty of the point estimates, for example by including confidence intervals in the presentation.


e) Educate your readers. Regularly publish articles that explain the method of polling, highlighting its strengths and weaknesses.

We hope that this thesis can inspire the interested public to demand better reporting, entice newspapers to compete for the market-leader position, and encourage pollsters to increase transparency and partner with academia to improve the field.


5. References

ABC. "ABC News' Guide to Polls & Public Opinion". January 18 2008. Web. April 10 2014. http://abcnews.go.com/Politics/PollVault/story?id=43943

ABC. "ABC News' Polling Methodology and Standards". May 3 2013. Web. April 10 2014. http://abcnews.go.com/US/PollVault/abc-news-polling-methodology-standards/story?id=145373

Andersen, Robert. "Reporting public opinion polls: The media and the 1997 Canadian election." International Journal of Public Opinion Research 12.3 (2000): 285-298.

Baker, Reg, et al. "Research Synthesis: AAPOR Report on Online Panels." Public Opinion Quarterly 74.4 (2010): 711-781.

Baker, Reg, et al. "Summary Report of the AAPOR Task Force on Non-probability Sampling." Journal of Survey Statistics and Methodology 1.2 (2013): 90-143.

Bellhouse, David R. "The Reverend Thomas Bayes, FRS: A biography to celebrate the tercentenary of his birth." Statistical Science 19.1 (2004): 3-43.

Berg, Joyce E., Forrest D. Nelson, and Thomas A. Rietz. "Prediction market accuracy in the long run." International Journal of Forecasting 24.2 (2008): 285-300.

Berg, Joyce, et al. "Results from a dozen years of election futures markets research." Handbook of experimental economic results 1 (2001): 486-515.

Bergman, Jakob. "Spelar det någon roll vem som mäter opinionen?" Lund Business Review. 27 November 2013. Web. 16 Apr 2014. http://review.ehl.lu.se/spelar-det-nagon-roll-vem-som-mater-opinionen/

Biemer, Paul P., and Lars E. Lyberg. Introduction to survey quality. Vol. 335. John Wiley & Sons, 2003.

Bülow, Erik. "Use and Theory of Random Digit Dialing in Sweden." Baltic-Nordic-Ukrainian Summer School on Survey Statistics (2009).

Converse, Philip E., and Michael W. Traugott. "Assessing the accuracy of polls and surveys." Science 234.4780 (1986): 1094-1098.


Erikson, Robert S., and Christopher Wlezien. "Are political markets really superior to polls as election predictors?" Public Opinion Quarterly 72.2 (2008): 190-215.

Erikson, Robert S., and Christopher Wlezien. Patterns of Poll Movement. Nuffield College, Oxford University (2002).

Erikson, Robert S., and Christopher Wlezien. "Presidential polls as a time series: the case of 1996." Public Opinion Quarterly (1999): 163-177.

European Society for Opinion and Market Research (ESOMAR). ESOMAR/WAPOR Guide to opinion polls and published surveys. 2009. https://www.esomar.org/uploads/public/knowledge-and-standards/codes-and-guidelines/WAPOR-ESOMAR_Guidelines.pdf (Accessed 2014-05-03)

Fumagalli, Laura, and Emanuela Sala. The total survey error paradigm and pre-election polls: The case of the 2006 Italian general elections. No. 2011-29. ISER Working Paper Series, 2011.

Gelman, Andrew. "If you're already using sophisticated non-Bayesian methods...". Statistical Modeling, Causal Inference, and Social Science. 27 Jan 2013. Web. April 20 2014. http://andrewgelman.com/2013/01/27/if-youre-already-using-sophisticated-non-bayesian-methods-such-as-those-of-tibshirani-efron-and-others-that-bayes-is-more-of-an-option-than-a-revolution-but-if-youre-coming-out-of-a-pure-hypo/

Hald, Anders. A history of parametric statistical inference from Bernoulli to Fisher, 1713-1935. Springer, 2008.

Hillygus, D. Sunshine. "The evolution of election polling in the United States." Public Opinion Quarterly 75.5 (2011): 962-981.

ISO, 2006. New ISO International Standard for the market research industry. 9 May 2006. Web. 20 Apr 2014. http://www.iso.org/iso/home/news_index/news_archive/news.htm?refid=Ref1005

Jackman, Simon. "Pooling the polls over an election campaign." Australian Journal of Political Science 40.4 (2005): 499-517.


Larson, Stephanie Greco. "Misunderstanding Margin of Error: Network News Coverage of Polls during the 2000 General Election." The International Journal of Press/Politics 8.1 (2003): 66-80.

Lau, Richard R. "An analysis of the accuracy of "trial heat" polls during the 1992 presidential election." Public Opinion Quarterly 58.1 (1994): 2-20.

Lavrakas, P. J., S. Blumberg, and M. Battaglia. "New considerations for survey researchers when planning and conducting RDD telephone surveys in the US with respondents reached via cell phone numbers." Deerfield, IL: American Association for Public Opinion Research, 2010.

Martin, Elizabeth A., Michael W. Traugott, and Courtney Kennedy. "A review and proposal for a new measure of poll accuracy." Public Opinion Quarterly 69.3 (2005): 342-369.

McGrayne, Sharon Bertsch. The theory that would not die: how Bayes' rule cracked the enigma code, hunted down Russian submarines, & emerged triumphant from two centuries of controversy. Yale University Press, 2011.

Mitofsky, Warren J. "Review: was 1996 a worse year for polls than 1948?" Public Opinion Quarterly 62.2 (1998): 230-249.

Moon, Nick. Opinion polls: History, theory and practice. Manchester University Press, 1999.

Mosteller, Frederick. The Pre-election Polls of 1948: The Report to the Committee on Analysis of Pre-election Polls and Forecasts. Vol. 60. Social Science Research Council, 1949.

Neyman, Jerzy. "On the Two Different Aspects of the Representative Method: The Method of Stratified Sampling and the Method of Purposive Selection." Journal of the Royal Statistical Society (1934).

Newport, Frank. "Questions Answered about Gallup's Presidential Election Tracking Poll." Gallup. 4 October 2000. Web. 22 August 2004. http://www.gallup.com/poll/4651/questions-answered-about-gallups-presidential-election-tracking-poll.aspx

NYT 2011. "The New York Times Polling Standards". 2011. Web. 1 May 2014. http://www.documentcloud.org/documents/286713-nytimes-polling-standards-2011.html


Oscarsson, Henrik, and Sören Holmberg. "Swedish voting behavior." Gothenburg: Swedish Election Studies Program (2010).

Panagopoulos, Costas. "Polls and elections: preelection poll accuracy in the 2008 general elections." Presidential Studies Quarterly 39.4 (2009): 896-907.

Post- och telestyrelsen (PTS). Svenskarnas användning av telefoni och internet. PTS individundersökning 2013:20. http://pts.se/upload/Rapporter/Tele/2013/individundersokning-pts-er-2013_20.pdf

Radio Sweden. "Checklista för enkätundersökningar/opinionsundersökningar". 2013a. Web. 20 April 2014. http://sverigesradio.se/diverse/appdata/isidor/files/4097/13300.pdf

Radio Sweden. "Checklista för enkätundersökningar/opinionsundersökningar". 2013b. Web. 20 April 2014. http://sverigesradio.se/diverse/appdata/isidor/files/4097/13497.pdf

Rosenstiel, Tom. "Political polling and the new media culture: A case of more being less." Public Opinion Quarterly 69.5 (2005): 698-715.

Rothschild, David. "Forecasting elections comparing prediction markets, polls, and their biases." Public Opinion Quarterly 73.5 (2009): 895-916.

Ruiz Restrepo, Jaime. "Fundamentals of Polling." La Sociología en sus escenarios 19 (2010).

Sniderman, Paul M., ed. Studies in Public Opinion: Attitudes, Nonattitudes, Measurement Error, and Change. Princeton University Press, 2004.

Sonck, Nathalie, and Geert Loosveldt. "Research Note: Making News Based on Public Opinion Polls: The Flemish Case." European Journal of Communication 23.4 (2008): 490-500.

Statistics Sweden. "Democracy Statistics Report no 13, One hundred years of Swedish voter turnout". 2012.

Studieförbundet Näringsliv och Samhälle (SNS), Olof Petersson och Sören Holmberg. Svenska partibarometrar. 2008.

Traugott, Michael W. "Assessing poll performance in the 2000 campaign." Public Opinion Quarterly 65.3 (2001): 389-419.


Traugott, Michael W. "The accuracy of the national preelection polls in the 2004 presidential election." Public Opinion Quarterly 69.5 (2005): 642-654.

Traugott, Michael W., and Paul L. Lavrakas. The voter's guide to election polls. Rowman & Littlefield Publishers, 2007.

Tidningsutgivarna. "Fakta om marknad och medier". Svenska Mediehus 2013/14. Web. 27 April 2014. http://www.dagspress.se/images/stories/TU_Svenska_Mediehus_2013-14.pdf

United States Census Bureau. Statistical Abstract. 2012. Web. 16 May 2014. http://www.census.gov/compendia/statab/2012/tables/12s0397.pdf

Wang, Wei, et al. "Forecasting Elections with Non-Representative Polls." Submitted to International Journal of Forecasting (2013).

Wasserman, Larry. "Nate Silver is a Frequentist: Review of 'The Signal and the Noise'". 2012. Web. 6 May 2014. http://normaldeviate.wordpress.com/2012/12/04/nate-silver-is-a-frequentist-review-of-the-signal-and-the-noise/

Interviews

Modin, Lina. (Expressen). E-mail interview. (08-05-2014).

Melin, Lena. (Aftonbladet). E-mail interview. (07-05-2014).

Pollster interviews. Felix Åberg (United Minds), Johanna Laurin Gulled (Ipsos), Toivo Sjörén (TNS Sifo), Camilla Sandberg (Demoskop). Phone and e-mail interviews. 2014.

Bibliography

Biemer, Paul P., and Lars E. Lyberg. Introduction to survey quality. Vol. 335. John Wiley & Sons, 2003.

Biemer, Paul P., and Sharon L. Christ. "Weighting survey data." International handbook of survey methodology, 2008.

Biemer, Paul P. Latent class analysis of survey error. Vol. 556. John Wiley & Sons, 2010.


Crespi, Irving. Pre-election polling: Sources of accuracy and error. Russell Sage Foundation, 1988.

Groves, Robert M. Survey errors and survey costs. Vol. 536. John Wiley & Sons, 2004.

Krzanowski, Wojtek. Statistical principles and techniques in scientific and social research. 2007.

Lambert, Langer, McMenemy. "Cell Phone Sampling: An Alternative Approach". Paper presented at the annual conference of the American Association for Public Opinion Research, Chicago, IL, May 14, 2010.

Lepkowski, James M., et al. Advances in telephone survey methodology. Vol. 538. John Wiley & Sons, 2007.

Newport, Frank, Lydia Saad, and David Moore. "How are polls conducted." Where America Stands (1997).

Stoltenberg, Emil Aas. "Bayesian Forecasting of Election Results in Multiparty Systems." (2013).
