Date post: | 18-Dec-2014 |
Upload: | ken-kretsch |
Basic Definitions: Population
• Population– The entire group to be studied
• Census– A collection of data from every member of the population
• Parameter– A numeric measurement or calculation of census data
– Quantifies an attribute of the population
Basic Definitions: Sample
• Fundamental concept– A census may not be possible or practical
• Sample– (Noun) A subset of a population that (we hope) represents the population
– (Verb) The process of collecting data from the subset.
• Statistic– A numeric measurement or calculation of sample data
– Estimates the parameter of a population
Another Fundamental Concept
• We are never 100% sure that our sample exactly represents the population
• So a statistic is just an estimate
• We will learn many techniques to deal with this uncertainty
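The gap between a parameter and a statistic can be seen in a small simulation. This is only a sketch: the population of test scores is made up, and the function name `sample_mean_vs_population_mean` is hypothetical.

```python
import random
import statistics


def sample_mean_vs_population_mean(population, sample_size, seed=0):
    """Return (parameter, statistic): the census mean and the mean of one random sample."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return statistics.mean(population), statistics.mean(sample)


# Hypothetical population: test scores for 1,000 students (made-up numbers).
score_rng = random.Random(42)
scores = [score_rng.randint(40, 100) for _ in range(1000)]

parameter, statistic = sample_mean_vs_population_mean(scores, sample_size=50)
print(f"parameter (census mean): {parameter:.2f}")
print(f"statistic (sample mean): {statistic:.2f}")
```

The two numbers are usually close but not equal. Changing `seed` draws a different sample and gives a different estimate of the same parameter.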
Where do we get the data?
• Census vs sample
• Observations
– “Watching” real activity and collecting data
– Opinion polls
• Experiments– Running the activity and measuring the results
– Relatively easy to control
For Example
TV watching and test scores
• Observation– Use a survey that asks your sampled students their TV watching habits and their test scores.
• Experiment– Design varied TV-watching schedules for your samples
– Design and/or administer a test to measure learning
Car crashworthiness and make
• Observation– Collect accident data and auto
repair data
• Experiment– Deliberately crash cars and
measure the results
Live Example
Movie popularity
• Observation
• Experiment
Cell Phone Reception
• Observation
• Experiment
Homework
• Describe an experiment to gather data that tests the following claims.
– Reading books improves school performance
– Blondes have more fun
Variables
• Variable refers to any characteristic that could affect an outcome being tested.
– Variables have to be measurable
• What characteristics affect SAT scores?
• What characteristics affect car crashworthiness?
Varying and Controlling
• In a statistics study, we test whether one variable really has an effect on the outcome.
• We will vary the test variable
– Change the value to see if the outcome also changes
• To prevent confounding, we will control the other variables
– Confounding: The effects of two or more variables cannot be distinguished
– Control: Samples with similar values for the other variables may be grouped
C and A: How do you raise a smart kid?
• Economics professor has correlated test scores with family characteristics
– Educated parents
– High socio-economic status
– 30 year old mom
– Books in home
– English in the house
– PTA participation
– Birth weight
– Adopted
Your Turn, Home Work.
• Let's assume we are designing a study of car crashworthiness. Your assignment is to do the following.
1. List 6 variables of a car or driver that you feel affect crashworthiness.
2. Of these variables, pick one that you would like to test.
3. Using the control variables, describe three groups of cars and/or drivers you would create to test your variable.
Treatment
• When running an experiment that tests a variable:
– The sample will be split into groups
– Each group will be administered one level of the variable
– Who or what is assigned to each group is randomly determined.
• In some experiments the test variable is all or none.
– E.g., a drug
– One group, the treatment group, receives all (called the treatment)
– The other group, the control group, receives nothing or a pretend treatment called a placebo
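Random assignment to a treatment group and a control group can be sketched as a shuffle followed by a split. The function name `assign_groups` and the subject labels are hypothetical, and an even two-way split is an assumption (the slides do not fix group sizes).

```python
import random


def assign_groups(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # chance, not the experimenter, decides who goes where
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical subjects, labelled S1..S10.
groups = assign_groups([f"S{i}" for i in range(1, 11)], seed=7)
print("treatment:", groups["treatment"])
print("control:  ", groups["control"])
```

Because the shuffle decides membership, the experimenter cannot (even unintentionally) steer particular subjects into the treatment group.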
Placebo Effect
• Subjects, especially those in the control group, might think they are being given the treatment and start to act accordingly.
• If the experiment is blinded, the subjects are not told whether they are receiving the real treatment or the placebo.
– The subjects should also not be told the outcome
• If the experiment is double-blinded, the people administering the experiment are also not told
Your turn/homework
• You are charged with testing a new SAT prep course
1. Describe how the placebo effect might come into play in your experiment
2. Describe how you would counteract that effect
Sampling
• Sampling: picking a subset of a population
• Sample’s characteristics should reflect the population’s in the same proportion
• E.g., our school’s demographic break-down is

          Frosh   Sophomore   Junior   Senior
Male       13%       12%        12%      13%
Female     13%       13%        11%      13%
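To see what "the same proportion" means in practice, the table above can be turned into head counts for a sample. This is a sketch: the sample size of 200 is an assumption chosen for illustration, and `proportional_quota` is a hypothetical helper name.

```python
# Percentages copied from the school's demographic break-down table.
SCHOOL_BREAKDOWN = {
    ("Male", "Frosh"): 13, ("Male", "Sophomore"): 12,
    ("Male", "Junior"): 12, ("Male", "Senior"): 13,
    ("Female", "Frosh"): 13, ("Female", "Sophomore"): 13,
    ("Female", "Junior"): 11, ("Female", "Senior"): 13,
}


def proportional_quota(breakdown, sample_size):
    """How many students each group should contribute to a proportional sample."""
    return {group: round(sample_size * pct / 100) for group, pct in breakdown.items()}


quotas = proportional_quota(SCHOOL_BREAKDOWN, sample_size=200)
for (sex, year), count in quotas.items():
    print(f"{sex:6} {year:10} {count}")
```

For sample sizes where the percentages do not divide evenly, rounding can make the quotas add up to slightly more or less than the requested size.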
Sample Scheme Characteristics
• Random sample– Each member of the population has an equal chance of being selected
• Simple random sample– Each subset of the population of a given size has an equal chance of being selected.
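A simple random sample can be drawn with Python's standard `random.sample`, which picks without replacement so that every subset of the requested size is equally likely. The student ID range below is made up for illustration, and `simple_random_sample` is a hypothetical wrapper name.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)


# Hypothetical student IDs 1000..1499.
student_ids = range(1000, 1500)
print(simple_random_sample(student_ids, 5, seed=1))
```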
Sampling Strategies
• Self-selected– Population members volunteer
– E.g., Call-in phone lines
– Easy to implement
– Difficult to get a proportional sample
– Susceptible to bias
• Convenience sampling– Whoever happens by
– E.g., Mall surveys
– Also susceptible to bias
Sampling Strategies
• Random sample– Each member of the population is selected at random
– E.g., Generate random student IDs
• Systematic sampling– Population is put into some order
– Select some starting point, then select every nth individual in a population
– The starting point and maybe the interval (n) are picked at random
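Systematic sampling can be sketched in a few lines. Here only the starting point is random and the interval is passed in by the caller; that choice, the function name `systematic_sample`, and the roster are all assumptions for illustration.

```python
import random


def systematic_sample(ordered_population, interval, seed=None):
    """Pick a random start within the first interval, then take every interval-th member."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random starting point: 0 .. interval-1
    return ordered_population[start::interval]


# Hypothetical roster of 100 students, already in some order (e.g., alphabetical).
roster = [f"student_{i:03d}" for i in range(100)]
print(systematic_sample(roster, interval=10, seed=4))
```

If the ordering itself follows a pattern that lines up with the interval (say, every 10th seat is an aisle seat), the sample can be skewed, so the ordering deserves a second look.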
More Sampling Selection and Collection
• Stratified sampling– Divide the population into groups.
– Groups are determined by control variables
– Randomly sample within each group
• Cluster sample– Divide the population into clusters, randomly pick a
cluster, then sample all (or most) members of the cluster
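The difference between the two schemes is easier to see side by side. The sketch below uses hypothetical names (`stratified_sample`, `cluster_sample`) and a made-up student list; taking the same number from every stratum and sampling the whole chosen cluster are simplifying assumptions.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Group members by stratum, then randomly sample within each group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


def cluster_sample(clusters, seed=None):
    """Randomly pick one cluster and return its name with all of its members."""
    rng = random.Random(seed)
    chosen = rng.choice(sorted(clusters))
    return chosen, list(clusters[chosen])


# Hypothetical students as (id, class year) pairs.
years = ["Frosh", "Sophomore", "Junior", "Senior"]
students = [(i, years[i % 4]) for i in range(40)]

print(stratified_sample(students, stratum_of=lambda s: s[1], per_stratum=2, seed=1))

# Hypothetical clusters: homerooms, each holding a list of student ids.
homerooms = {"Room A": [0, 1, 2], "Room B": [3, 4, 5], "Room C": [6, 7, 8]}
print(cluster_sample(homerooms, seed=1))
```

Stratified sampling touches every group; cluster sampling touches only the chosen cluster, which is cheaper but relies on that cluster resembling the whole population.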
Example: Student Opinion Poll
• Self-selecting
• Random sampling
• Systematic sampling
• Convenience sampling
• Stratified sampling
• Cluster sample
Example: Crashworthiness
• Self-selecting
• Random sample
• Systematic sampling
• Convenience sampling
• Stratified sampling
• Cluster sample
Bias
• Sampling members of a population…
– With a specific characteristic
– That will give a specific outcome
– “Rigging the game”
• Selection and undercoverage bias– E.g., FOX news and health care
• Non-response bias– Counting non-response as one answer
• Voluntary response bias– Only people who feel strongly might respond to a survey.
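A quick arithmetic sketch shows how much non-response bias can move a result. The survey counts below are invented for illustration, and `share_yes` is a hypothetical helper.

```python
def share_yes(yes, no, nonresponse=0, count_nonresponse_as_no=False):
    """Fraction answering yes, optionally treating every non-response as a 'no'."""
    total = yes + no + (nonresponse if count_nonresponse_as_no else 0)
    return yes / total


# Hypothetical survey: 100 people asked, 30 say yes, 20 say no, 50 never answer.
print(share_yes(30, 20, 50))                                # respondents only
print(share_yes(30, 20, 50, count_nonresponse_as_no=True))  # silence counted as "no"
```

With these made-up counts, the same survey reads as 60% in favor among respondents but only 30% in favor once silence is counted as a "no".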
More on Bias
• If I want my test to support the claim that watching too much TV hurts SAT scores, how do I rig the sample?
• If I want my test to support the claim that European cars are safer than Japanese cars, how do I rig the sample?