Quick and Painless Introduction to Survey Methodology

Post on 21-Jan-2016


R. Michael Alvarez

PS 120

Testing Theories or Models

• Experimental data: expensive, and has validity problems

• Quasi-experimental data: aggregate election statistics, other data. Suffers from various problems.

• Survey data: data about individual voters

Fundamentals of Surveying

• Population: all elements of interest, usually in a geographic area

• Sample: subset of population

• Sample frame: list from which the sample is drawn (addresses, phone numbers, email addresses, etc.)

Basic Typology of Surveys

• Probability designs: population elements have a known (at least in theory) probability of selection into the sample.

• Nonprobability designs: population elements have an unknown probability of selection into the sample.

All of the statistical tools we use to study survey data are based on probability designs!

Literary Digest 1936: What Went Wrong?

[Figure: Roosevelt vote (axis 0 to 70) in the election result compared with the Literary Digest, Gallup Poll, Crossley Poll, and Fortune forecasts.]

Literary Digest Methodology

• Sent out 10 million straw ballots, using a list drawn from auto registration lists and telephone books.

• 2.3 million were returned, a response rate of about 23%.

• Flawed sample (overrepresented the rich and Republicans)

• Low response rate
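A minimal sketch of how a flawed frame biases an estimate, no matter how many ballots come back. The two groups, their sizes, their support levels, and their chances of appearing on the mailing list are all invented for illustration; they are not the 1936 figures.

```python
# Hypothetical two-group electorate. Every number below is an assumption
# made up for this illustration, not historical data.
GROUPS = {
    # name: (share of electorate, share supporting Roosevelt, chance of being on the list)
    "higher income": (0.30, 0.40, 0.60),
    "lower income": (0.70, 0.70, 0.10),
}


def true_support(groups):
    """Support in the whole electorate: group support weighted by group size."""
    return sum(share * support for share, support, _ in groups.values())


def frame_support(groups):
    """Expected support among people who actually appear on the list."""
    covered = sum(share * cover for share, _, cover in groups.values())
    supporters = sum(share * cover * support for share, support, cover in groups.values())
    return supporters / covered


if __name__ == "__main__":
    print(f"electorate: {true_support(GROUPS):.1%}")
    print(f"list-based: {frame_support(GROUPS):.1%}")
```

With these made-up inputs the list-based figure falls well below the electorate figure, and mailing more ballots to the same list would not close the gap.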

Literary Digest Fiasco Reforms Polling

• The underlying flaw of the Literary Digest straw polls was revealed: they did not use a scientific sampling procedure

• Others, especially Gallup, Roper and Crossley began to work to find better ways of generating samples …

• The Literary Digest soon folded!

Problems Continue, 1948

[Figure: vote shares (axis 0 to 60) for Truman, Dewey, Thurmond, and Wallace according to Crossley, Gallup, Roper, and the election result.]

New Sampling Techniques Were Flawed!

• Before 1948, they used “quota sampling”

• Each interviewer is assigned a fixed quota of subjects to interview from certain demographic categories … gender, age, education, residential location.

• Once they met their quota, the interviewer could select anyone they desired until they conducted all their required interviews

Quota Sampling

• It’s not necessarily a stupid idea, as long as the underlying data (Census data?) used to construct the parameters of the sample are okay.

• But what can happen is that interviewers end up talking with people who are easy to contact. In 1948 that tended to be people in nice neighborhoods, with fixed addresses and phones (i.e., Republicans).

Random Sampling

• In the 1950s, most scientific surveys shifted to the use of random sampling

• For example, Gallup moved to random selection methods in 1956 and seems to have generated more accurate presidential election forecasts thereafter

Gallup’s Track Record

[Figure: Difference (axis -8 to 6) for each presidential election year from 1936 to 1992.]

Basic Introduction to Sampling

• Concept: The population (or universe or target population).

• The population is the entire set of units to which a survey will be applied. Individual members of the population are called units or elements.

More on sampling ...

• Next, we need a list of population units from which we can draw a sample.

• This list is called the SAMPLE FRAME

• The basic property of a sample frame is that every unit in the population has some known chance of being selected into the sample by whatever method is used to select units

Then ...

• Probability sample: units are selected using a method that ensures that each unit has a known, nonzero probability of being included.

• Nonprobability sample: units are selected and inclusion probabilities are unknown (quota sampling …)

Simple Random Sampling

• All elements of population have equal probability of being sampled

• Cluster sampling: population is divided into clusters or groups, and clusters are sampled. Why? Cost and simplicity.

• Stratified sampling: population is divided into subpopulations, or strata, and sampling occurs within strata. Why? Strata might be of interest or require different methods of analysis.
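The three designs above can be sketched in a few lines. This is an illustration under assumptions, not a production sampler: the population of 1,000 ID numbers, the precinct and region groupings, and the function names are all hypothetical, and the cluster design here keeps every unit in each drawn cluster (one-stage clustering).

```python
import random


def simple_random_sample(units, n, rng):
    """Simple random sampling: every unit has the same chance of selection."""
    return rng.sample(units, n)


def cluster_sample(clusters, n_clusters, rng):
    """Cluster sampling: draw whole clusters at random, keep every unit in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


def stratified_sample(strata, n_per_stratum, rng):
    """Stratified sampling: draw a separate simple random sample in each stratum."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}


# Hypothetical population of 1,000 voter IDs, grouped two different ways.
population = list(range(1000))
precincts = {f"precinct-{i:02d}": population[i * 50:(i + 1) * 50] for i in range(20)}
regions = {"west": population[:400], "east": population[400:]}

rng = random.Random(120)  # fixed seed so the draws are repeatable
srs = simple_random_sample(population, 100, rng)
clustered = cluster_sample(precincts, 2, rng)
stratified = stratified_sample(regions, 50, rng)

print(len(srs), len(clustered), {k: len(v) for k, v in stratified.items()})
```

Each design yields 100 units here, but the selection probabilities differ. In the stratified draw a western unit is selected with probability 50/400 and an eastern unit with probability 50/600: unequal, yet still known, which is what makes it a probability design.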

Sampling Error

• Best way to think of survey error is in the context of proportions (percent saying “yes” or “no”).

• Standard error of a proportion in SRS: se(p) = sqrt[ ( p(1-p) )/( n - 1 )]
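A short sketch of the formula above, assuming simple random sampling. The function names and the default multiplier of 2 are choices made for this illustration; the multiplier mirrors the ± 2 standard deviation example on the following slide.

```python
import math


def se_proportion(p, n):
    """Standard error of a proportion under SRS: sqrt(p(1-p) / (n-1))."""
    return math.sqrt(p * (1 - p) / (n - 1))


def margin_pct(p, n, z=2.0):
    """Half-width of a +/- z standard error band, in percentage points."""
    return 100 * z * se_proportion(p, n)


if __name__ == "__main__":
    for n in (100, 400, 1000, 2000):
        print(n, round(margin_pct(0.5, n), 3))
```

For p = .5 and n = 100 the band is about ± 10.05 percentage points. Because n sits under a square root, quadrupling the sample size only halves the band.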

An Example of Survey Error

[Figure: ± 2 standard deviation example; standard deviation of P (axis 0 to 12) against sample size, for P=.5, P=.7, and P=.9. The plotted values, in percentage points, follow.]

Sample size   P=.5    P=.7    P=.9
100           10.05   9.211   6.03
200           7.089   6.497   4.253
300           5.783   5.3     3.47
400           5.006   4.588   3.004
500           4.477   4.103   2.686
600           4.086   3.745   2.452
700           3.782   3.467   2.269
800           3.538   3.242   2.123
900           3.335   3.057   2.001
1000          3.164   2.9     1.898
1100          3.016   2.765   1.81
1200          2.888   2.647   1.733
1300          2.775   2.543   1.665
1400          2.674   2.45    1.604
1500          2.583   2.367   1.55
1600          2.501   2.292   1.5
1700          2.426   2.224   1.456
1800          2.358   2.161   1.415
1900          2.295   2.103   1.377
2000          2.237   2.05    1.342

An Example of Nonresponse Error? March 2001 CSLP RDD

NES Response and Refusal Rates

[Figure: NES response and refusal rates (axis 0 to 90) for 1952, 1960, 1964, 1968, 1972, 1976, 1980, 1984, 1988, 1992, 1996, and 2000; series: Response, Refusal.]

Response rate: interviews net of refusals and respondents who cannot provide an interview (e.g., because of language)

Misreporting: Voting in Recent Federal Elections

[Figure: turnout (axis 0 to 90) every four years from 1964 to 2004; series: Official Turnout, CPS Turnout, NES Turnout, McDonald-Popkin.]

Note: Percentage of voting age population

Item Nonresponse

• A “don’t know” option is necessary in any survey, so that people can tell you if they don’t have an opinion

• Item nonresponse is due to uncertainty, vague questions, or respondent unwillingness to answer some questions

Should Gov’t Provide More Services?

[Figure: number of respondents (axis 0 to 500) answering Fewer, More, or No Opinion; 1996 NES.]

Certainty of Responses?

[Figure: Senator Position on Abortion Scale, Alvarez and Franklin 1993; scale positions 1 to 7 (axis 0 to 50), with responses grouped by certainty: Not, Pretty, Certain.]

Question Wording and Order?

• Would you say that traffic contributes more or less to air pollution than industry? (45% named traffic the primary contributor; 32% named industry.)

• Would you say that industry contributes more or less to air pollution than traffic? (57% named industry the primary contributor; 24% named traffic.)

Wanke et al. 1995

Types of Surveys

• Self-administered questionnaires (mail, web): cheap, but with low response rates and uncertainty about who completes the questionnaire

Types of Surveys

• Telephone (RDD/CATI): quick, random? Uncertainty about respondent, difficult to ask complex questions, must be short

Types of Surveys

• Face-to-face (on doorstep, exit polls): highly accurate and high response rates, but very expensive to implement, and interviewer biases are problematic

Internet surveying --- the future?

• Cheap to implement

• Quick in the field, quick with analysis

• Can implement complex designs, for example, use multimedia

Basic types of Internet surveys

• Probability designs

• Nonprobability designs

• Mixtures of probability and nonprobability

Probability-based Internet surveys

• Intercept-based surveys of visitors to particular web sites

• Surveys drawn from known email lists (students, etc.)

Nonprobability Internet surveys

• Entertainment surveys

• Self-selected surveys

• Volunteer survey panels

Surveys are not perfect!

• Sampling error (difference between sample and population)

• Coverage error (deviation between frame and population)

• Systematic sampling error; error in frame

• Nonresponse (unit) bias

• Nonresponse (item) bias

• Question wording or ordering effects

• Interviewer error; coding mistakes

How do I evaluate survey results?

• Sample size

• Sampling methodology (probability or non-probability)

• Estimated sampling error

• Survey response rate

• Questionnaire design and question wording

• Item response rates

• Intuition: do the results make sense?

Caltech’s National Public Relations Initiatives

March 11, 2003

Brief recap of survey methodology

• Survey conducted by ICR

• Wednesday, February 12-Sunday February 15

• Omnibus survey

• N=1010

• Tabulation presents weighted results, weighted to map to American adult population
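One common way to weight a sample toward a population is post-stratification: each group's weight is its population share divided by its sample share. The sketch below assumes that approach purely for illustration; the slides do not say how ICR built its weights, and the age groups, counts, and population shares are invented (only the total of 1,010 matches the survey).

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight for each group = population share / sample share."""
    n = sum(sample_counts.values())
    return {g: population_shares[g] / (sample_counts[g] / n) for g in sample_counts}


def weighted_share(yes_counts, sample_counts, weights):
    """Weighted proportion answering yes across all groups."""
    total = sum(weights[g] * sample_counts[g] for g in sample_counts)
    yes = sum(weights[g] * yes_counts[g] for g in sample_counts)
    return yes / total


# Hypothetical inputs: the sample over-represents the older group.
sample_counts = {"under 55": 600, "55 and over": 410}
population_shares = {"under 55": 0.70, "55 and over": 0.30}
aware_counts = {"under 55": 240, "55 and over": 246}

weights = poststratification_weights(sample_counts, population_shares)
unweighted = sum(aware_counts.values()) / sum(sample_counts.values())
weighted = weighted_share(aware_counts, sample_counts, weights)

print(f"unweighted: {unweighted:.1%}, weighted: {weighted:.1%}")
```

In this made-up case the weighted figure is lower than the unweighted one, because the group with higher awareness was over-sampled and gets weighted down.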

Questions

1 Considering what you might have seen or heard about the California Institute of Technology, also known as Caltech, in Pasadena, California, which of the following best describes your opinion of Caltech’s reputation. Would you say Caltech’s reputation is excellent, good, fair, or poor?

Questions

2 How did you hear about Caltech? (Not asked of those unable to answer 1.)

3 What do you think Caltech is best known for? (Not asked of those unable to answer 1.)

Questions

4 Now, as I read each of the following topics, please tell me, generally speaking, whether or not you are interested in the topic: voting, the brain, climate changes, astronomy, earthquakes, nano-technology, detecting gravity waves

Questions

5 And considering those topics in which you said you had an interest, how do you usually get news and information about these topics? (asked to only those who were interested in at least one topic)

National Awareness of Caltech

[Figure: Level of Awareness of Caltech Among General Public; Aware versus Unaware (axis 0 to 60).]

Caltech Awareness Successes

[Figure: Aware versus Unaware (axis 0 to 80) among High Income, College, West, and 55-64.]

Caltech Awareness Challenges

[Figure: Aware versus Unaware (axis 0 to 70) among North East and > 64.]

Caltech’s Reputation

[Figure: Caltech's Reputation as Judged by Those Aware of the Institute; Excellent/Good versus Fair/Poor (axis 0 to 90).]

Media Relations Focus

• National and northeast TV - visit them, pitch them, invite them to campus.

• Households with children

• Senior-oriented media

Evaluate the Caltech Awareness Survey

• Technical evaluation

• Substantive evaluation

• Policy evaluation