1 4 where do we get the data

Date post: 13-Dec-2014
Upload: ken-kretsch
Page 1: 1 4 where do we get the data

Where do we get the data?

• Census vs sample

• Observations

– “Watching” real activity and collecting data

– Opinion polls

• Experiments

– Running the activity and measuring the results

– Relatively easy to control

Page 2: 1 4 where do we get the data

For Example

TV watching and test scores

• Observation

– Use a survey that asks your sampled students about their TV-watching habits and their test scores.

• Experiment

– Design varied TV-watching schedules for your samples

– Design and/or administer a test to measure learning

Car crashworthiness and make

• Observation

– Collect accident data and auto repair data

• Experiment

– Deliberately crash cars and measure the results

Page 3: 1 4 where do we get the data

Live Example

Movie popularity

• Observation

• Experiment

Cell Phone Reception

• Observation

• Experiment

Page 4: 1 4 where do we get the data

Variables

• A variable is any characteristic that could affect an outcome being tested.

– Variables have to be measurable

• What characteristics affect SAT scores?

• What characteristics affect car crashworthiness?

Page 5: 1 4 where do we get the data

Varying and Controlling

• In a statistical study, we test whether one variable really has an effect on the outcome.

• We will vary the test variable

– Change the value to see if the outcome also changes

• To prevent confounding, we will control the other variables

– Confounding: The effects of two or more variables cannot be distinguished

– Control: Samples with similar values for the other variables may be grouped

Page 6: 1 4 where do we get the data

Treatment

• When running an experiment that tests a variable:

– The sample will be split into groups

– Each group will be administered one level of the variable

– Who or what is assigned to each group is randomly determined.

• In some experiments the test variable is all or none.

– E.g., a drug

– One group, the treatment group, receives all (called the treatment)

– The other group, the control group, receives nothing or a pretend treatment called a placebo
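The random assignment step can be sketched in a few lines of Python. This is only an illustration under assumptions: the subject IDs, the helper name `assign_groups`, and the seed are hypothetical and not part of the slides.

```python
import random

def assign_groups(subjects, n_groups=2, seed=None):
    """Randomly split subjects into n_groups groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Deal the shuffled subjects round-robin so group sizes differ by at most one.
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Hypothetical subject IDs, made up for this example.
subjects = [f"subject_{i}" for i in range(1, 11)]

# The seed is fixed only so the run is repeatable.
treatment, control = assign_groups(subjects, n_groups=2, seed=42)
print("treatment:", treatment)
print("control:  ", control)
```

Because the shuffle decides who lands in which group, neither the experimenter nor the subjects choose the assignment.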

Page 7: 1 4 where do we get the data

Placebo Effect

• Subjects, especially those in the control group, might think they are being given the treatment and start to act accordingly.

• If the experiment is blinded, the subjects are not told whether they are receiving the real treatment or the placebo.

– The subjects should also not be told the outcome

• If the experiment is double blinded, the people administering the experiment are also not told who receives the real treatment

Page 8: 1 4 where do we get the data

Sampling

• Sampling: picking a subset of a population

• Sample’s characteristics should reflect the population’s in the same proportion

• E.g., our school’s demographic break-down is

         Frosh   Sophomore   Junior   Senior
Male     13%     12%         12%      13%
Female   13%     13%         11%      13%
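As a rough sketch of what "in the same proportion" means, the snippet below turns the school breakdown above into target counts for a sample. The sample size of 200 and the helper name `proportional_counts` are assumptions made for illustration.

```python
# Population shares from the school breakdown (percent of all students).
shares = {
    ("Male", "Frosh"): 13, ("Male", "Sophomore"): 12,
    ("Male", "Junior"): 12, ("Male", "Senior"): 13,
    ("Female", "Frosh"): 13, ("Female", "Sophomore"): 13,
    ("Female", "Junior"): 11, ("Female", "Senior"): 13,
}

def proportional_counts(shares, sample_size):
    """Target number of sample members per cell, mirroring the population."""
    total = sum(shares.values())
    # Rounding can leave the total off by a few for sizes that do not divide evenly.
    return {cell: round(sample_size * pct / total) for cell, pct in shares.items()}

# 200 is an assumed sample size, not a figure from the slides.
counts = proportional_counts(shares, sample_size=200)
for (sex, grade), n in counts.items():
    print(f"{sex:6} {grade:10} {n}")
```

A sample that matches these counts has the same sex-by-grade mix as the school.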

Page 9: 1 4 where do we get the data

Sample Scheme Characteristics

• Random sample

– Each member of the population has an equal chance of being selected

• Simple random sample

– Each subset of the population of the same size has an equal chance of being selected.
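A simple random sample can be drawn with Python's standard `random.sample`, which picks without replacement so that every subset of the requested size is equally likely. The population of student IDs here is hypothetical.

```python
import random

# Hypothetical population: 100 made-up student IDs.
population = list(range(1000, 1100))

# The seed is fixed only so the run is repeatable.
rng = random.Random(0)

# Draw 10 distinct students; each 10-student subset has the same chance.
sample = rng.sample(population, k=10)
print(sample)
```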

Page 10: 1 4 where do we get the data

Sampling Strategies

• Self-selected

– Population members volunteer

– E.g., Call-in phone lines

– Easy to implement

– Difficult to get a proportional sample

– Susceptible to bias

• Convenience sampling

– Whoever happens by

– E.g., Mall surveys

– Also susceptible to bias

Page 11: 1 4 where do we get the data

Sampling Strategies

• Random sample

– Members of the population are selected at random

– E.g., Generate random student IDs

• Systematic sampling

– Population is put into some order

– Select some starting point, then select every nth individual in the population

– The starting point and maybe the interval (n) are picked at random
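Systematic sampling as described above might look like the following sketch. The roster, the interval of 10, and the helper name `systematic_sample` are assumptions. Only the starting point is randomized here, and the interval is fixed.

```python
import random

def systematic_sample(ordered_population, interval, seed=None):
    """Pick a random start in the first interval, then every interval-th member."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random starting index in 0..interval-1
    return ordered_population[start::interval]

# Hypothetical roster of 100 students, put into a fixed order.
roster = sorted(f"student_{i:03d}" for i in range(1, 101))

# The seed is fixed only so the run is repeatable.
sample = systematic_sample(roster, interval=10, seed=1)
print(sample)
```

With 100 students and an interval of 10, the sample always has 10 members, evenly spaced through the ordered roster.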

Page 12: 1 4 where do we get the data

More Sampling Selection and Collection

• Stratified sampling

– Divide the population into groups.

• Groups are determined by control variables

– Randomly sample within each group

• Cluster sample

– Divide the population into clusters, randomly pick a cluster, then sample all (or most) members of the cluster
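The difference between the two schemes can be shown side by side. In this sketch the students, grades, and homerooms are invented, and `stratified_sample` and `cluster_sample` are hypothetical helper names. Grade plays the role of the control variable for the strata, and homerooms serve as the clusters.

```python
import random
from collections import defaultdict

def stratified_sample(members, stratum_of, per_stratum, rng):
    """Group members by stratum, then randomly sample within each group."""
    strata = defaultdict(list)
    for m in members:
        strata[stratum_of(m)].append(m)
    return {name: rng.sample(group, per_stratum) for name, group in strata.items()}

def cluster_sample(clusters, rng):
    """Randomly pick one cluster and take all of its members."""
    chosen = rng.choice(sorted(clusters))
    return chosen, list(clusters[chosen])

rng = random.Random(7)  # fixed seed only so the run is repeatable

# Hypothetical students as (id, grade, homeroom) tuples.
grades = ["Frosh", "Sophomore", "Junior", "Senior"]
students = [(i, grades[i % 4], f"room_{i % 5}") for i in range(40)]

# Stratified: three students from every grade.
by_grade = stratified_sample(students, lambda s: s[1], per_stratum=3, rng=rng)

# Cluster: one whole homeroom.
homerooms = defaultdict(list)
for s in students:
    homerooms[s[2]].append(s)
room, room_members = cluster_sample(homerooms, rng)

print(by_grade)
print(room, room_members)
```

Stratified sampling guarantees every grade is represented, while the cluster sample covers one homeroom completely and the others not at all.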

Page 13: 1 4 where do we get the data

Example: Student Opinion Poll

• Self-selecting

• Random sampling

• Systematic sampling

• Convenience sampling

• Stratified sampling

• Cluster sample

Page 14: 1 4 where do we get the data

Example: Crashworthiness

• Self-selecting

• Random sample

• Systematic sampling

• Convenience sampling

• Stratified sampling

• Cluster sample

Page 15: 1 4 where do we get the data

Bias

• Sampling members of a population…

– With a specific characteristic

– That will give a specific outcome

– “Rigging the game”

• Selection and undercoverage bias

– E.g., FOX news and health care

• Non-response bias

– Counting non-response as one answer

• Voluntary response bias
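A small arithmetic sketch shows how counting non-response as one answer can move a result. The poll counts below are made up for illustration and do not come from any real survey.

```python
# Hypothetical poll results; these counts are invented for the example.
yes, no, no_response = 120, 80, 200

answered = yes + no

# Share of "yes" among people who actually answered.
share_among_respondents = yes / answered

# Share of "yes" if every non-response is counted as a "no".
share_if_silence_is_no = yes / (answered + no_response)

print(f"yes among respondents:      {share_among_respondents:.0%}")  # 60%
print(f"yes if non-response = 'no': {share_if_silence_is_no:.0%}")   # 30%
```

The same raw answers support a majority or a minority depending on how the silent half of the sample is treated.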

Page 16: 1 4 where do we get the data

More on Bias

• If I want my test to support the claim that watching too much TV hurts SAT scores, how do I rig the sample?

• If I want my test to support the claim that US cars are safer than Japanese cars, how do I rig the sample?

