Post on 01-Apr-2015
"In earlier times they had no statistics, and so they had to fall back on lies. Hence the huge exaggerations of primitive literature - giants or miracles or wonders! They did it with lies and we do it with statistics; but it is all the same."
Stephen Leacock (1869-1944)
"Facts are stubborn, but statistics are more pliable."
Mark Twain (1835-1910)
“POLLS ARE FOR DOGS”
John G. Diefenbaker (13th Prime Minister of Canada)
What is Statistics?
Statistics is the gathering, organizing, analyzing, and presenting of numerical information.
The data gathered by statistical studies are used to guide decisions, explain events, predict future courses of action, or provide the basis for a solution to a problem.
Population vs. Sample
Once you have decided on the topic you wish to study, the first major step of your study involves gathering the data. From whom you are going to gather the data is your first decision.
Population
all individuals who belong to a group being studied
Group being studied
Sample
a selection of individuals taken from a population
People that are actually asked or polled
Identify the population for each of the following questions.
a) Whom do you plan to vote for in the next Ontario election?
Answer: All Canadian citizens of voting age who live in Ontario
b) Do women prefer to wear ordinary glasses or contact lenses?
Answer: Women who require corrective eyewear
Determine if each of the following is a sample or a population.
a) A representative from each hockey team is asked to complete a survey on game times.
Answer: Sample
b) The Canadian census
Answer: Population
c) One in every 10 bottles of pop is tested for defects in a factory.
Answer: Sample
Types of Data and Sampling
Once you have determined the population that you are considering for your study, the next step is obtaining a sample that best represents your population.
Sample selection is one of the key factors that will determine if your survey is valid and will produce legitimate conclusions.
Types of Data
Raw Data
This is the name given to data that has not yet been analyzed, only collected.
Discrete Data
The data can only fall into a limited set of categories. Ex: The soft drink size at the movie theatre.
There are only 4 categories (sizes) and it is not possible to go in between them.
Continuous Data
The data can take on any value within a range, including decimal values to any number of decimal places.
Discrete Data:
Population numbers
Counts of physical objects where fractions don’t make sense (people)
Continuous Data:
Time (can win a race in 3 seconds or 3.4 seconds or 3.148 seconds, etc.)
Length
Mass
4 Types of Data
Nominal Data
Ordinal Data
Interval Data
Ratio Data
Nominal Data (Discrete)
This is data that can be sorted into categories, but those categories cannot be ranked or quantified.
Ex: A survey asks what type of food you prefer: Chinese, Italian, American or Indian.
Ordinal Data (Discrete)
Data is organized into rankings.
Ex: Rank your top five favourite movies. Matrix = 1, Batman Begins = 2, etc.
The actual numbers don’t matter as long as they keep the data in the ranking that you want.
Ex: Matrix = 100, Batman Begins = 300
Interval Data (Discrete)
Data is categorized into numerical groupings in which the distance between these groupings is the same.
The initial or zero point is arbitrary.
Ex: The interval 2006-2007 is the same length as the interval 2005-2006
Ex: IQ intervals
Ratio Data (Continuous)
Continuous measurements such as time, length and mass are ratio data.
The name comes from the fact that ratios of values are meaningful, because the scale has a true zero (a time of 20 s is twice a time of 10 s).
Ex: Your time in the 100 m dash
Sampling
The method used to collect sample data from a population is very important and can mean the difference between a credible conclusion and a biased one.
Simple Random Sampling
Gives all the elements of the population an equal chance of being a part of the sample.
The selection must be as impartial as possible and must not favour one element over another.
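A simple random sample can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample`, the made-up student list, and the optional `seed` (used only so the run can be repeated) are assumptions, not part of the lesson.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Pick sample_size elements so every element has an equal chance."""
    rng = random.Random(seed)  # seed is optional; it only makes the run repeatable
    return rng.sample(population, sample_size)  # sampling without replacement

# Hypothetical population: 880 students identified by a label
students = [f"student_{i}" for i in range(1, 881)]
sample = simple_random_sample(students, 220, seed=1)
print(len(sample))
```

Because `random.sample` draws without replacement, no student can be picked twice.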
Systematic Sample
Selecting a sample from a population is done systematically or through a constant counting process
Ex: picking every 100th person from a phone book
To determine whether you should choose every 5th or every 100th item, find the ratio of the population size to the sample size.
If you wanted a tenth of the population then select every 10th item.
Ex: A telephone company is planning a marketing survey of its 760 000 customers. For budget reasons, the company wants a sample size of about 250.
a) Determine the interval that should be used for a systematic sample.
interval = population size ÷ sample size
= 760 000 ÷ 250
= 3040
Therefore the company should select every 3040th customer for their survey.
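The interval calculation for the telephone company can be checked with a short Python sketch. The function name `systematic_sample` is hypothetical, customers are represented simply by the numbers 0 to 759 999, and starting at a random position within the first interval is an added assumption (the example above only fixes the interval).

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Return the interval and every interval-th element of the population."""
    interval = len(population) // sample_size  # population size / sample size
    rng = random.Random(seed)
    start = rng.randrange(interval)            # assumed: random starting point
    chosen = population[start::interval][:sample_size]
    return interval, list(chosen)

customers = range(760_000)  # stand-in for the customer list
interval, sample = systematic_sample(customers, 250, seed=0)
print(interval, len(sample))
```

Every selected customer is exactly `interval` positions after the previous one, which is what makes the sample systematic.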
Stratified Sample
Takes into account that a population is made up of many demographics that tend to react differently.
If a population of turtles has more females than males, and the sample is purposely weighted with more females than males in proportion to the population, then it is a stratified sample.
To determine how many subjects to select from each subgroup, find the fraction of the population that the subgroup makes up and multiply by the number desired in the sample.
number selected from subgroup = (subgroup size ÷ population size) × sample size
Ex: Before booking bands for the school dances, the students’ council at Statsville H.S. wants to survey the music preferences of the student body. The following table shows the enrolment at the high school.
a) Design a stratified sample for a survey of 25% of the student body
Grade # Students
9 255
10 232
11 209
12 184
Total 880
25% of the student body is
880 x 0.25 = 220
For grade 9:
(255 ÷ 880) x 220 = 63.75
= 64 gr 9's should be selected
Complete this step for each grade and you should get that there should be:
• 64 gr 9’s
• 58 gr 10’s
• 52 gr 11’s
• 46 gr 12’s
To check, they should add up to 220.
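The grade-by-grade calculation for Statsville H.S. can be reproduced with a small Python sketch. The function name `stratified_allocation` is hypothetical, and rounding each result to the nearest whole student is an assumption that happens to work for these numbers; in general, rounded subgroup counts may not add up exactly to the sample size.

```python
def stratified_allocation(strata, sample_size):
    """Number to select from each subgroup, in proportion to its size."""
    population = sum(strata.values())
    return {
        name: round(size * sample_size / population)  # subgroup / population x sample
        for name, size in strata.items()
    }

enrolment = {9: 255, 10: 232, 11: 209, 12: 184}        # grade -> number of students
sample_size = round(0.25 * sum(enrolment.values()))    # 25% of 880
allocation = stratified_allocation(enrolment, sample_size)
print(allocation, sum(allocation.values()))
```

The final sum is the same check suggested above: the subgroup counts should add up to the total sample size.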
Cluster Sample
Takes advantage of groups that have characteristics similar to other groupings in the population.
Ex: Randomly selecting whole classes, assuming the classes themselves are random mixes of students
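A cluster sample of whole classes might look like the following Python sketch. The class names, the student labels and the function name `cluster_sample` are all made up for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # pick cluster names at random
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical school: 4 classes of 3 students each
classes = {
    "class_A": ["a1", "a2", "a3"],
    "class_B": ["b1", "b2", "b3"],
    "class_C": ["c1", "c2", "c3"],
    "class_D": ["d1", "d2", "d3"],
}
chosen, members = cluster_sample(classes, 2, seed=3)
print(chosen, members)
```

The randomness applies to the choice of classes only; once a class is chosen, everyone in it is surveyed.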
Multi-Stage Sample
Uses compound randomization
A study that determines passenger safety in cars randomly picks a car manufacturer (stage 1), then randomly picks a vehicle type like a van, compact, truck (stage 2), then randomly picks a type of car in that class (stage 3).
Ex: Suppose that your population consisted of all Ontario households. How would you create a Multi-Stage Sample?
You could first randomly select from the different towns/cities in Ontario.
Then randomly select a sample of blocks or subdivisions within the selected cities.
Finally, you could randomly select individual homes on those blocks.
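The three stages for Ontario households can be sketched in Python as nested random choices. Everything here is hypothetical: the function name `multi_stage_sample`, the tiny made-up set of cities, blocks and homes, and the number picked at each stage.

```python
import random

def multi_stage_sample(region, n_cities, n_blocks, n_homes, seed=None):
    """Stage 1: cities, stage 2: blocks in each city, stage 3: homes on each block."""
    rng = random.Random(seed)
    selected = []
    for city in rng.sample(sorted(region), n_cities):
        blocks = region[city]
        for block in rng.sample(sorted(blocks), n_blocks):
            selected.extend(rng.sample(blocks[block], n_homes))
    return selected

# Made-up data: 4 cities, 5 blocks per city, 10 homes per block
region = {
    f"city_{c}": {
        f"block_{c}_{b}": [f"home_{c}_{b}_{h}" for h in range(10)]
        for b in range(5)
    }
    for c in range(4)
}
homes = multi_stage_sample(region, n_cities=2, n_blocks=2, n_homes=3, seed=7)
print(len(homes))
```

Each stage narrows the population before the next random selection, so only the chosen cities and blocks ever need a complete list of homes.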
Voluntary-Response Sample
Depends on the initiative of the sample itself
Internet and mail polls
Elements selected for the sample may or may not respond
This creates a potential bias
Convenience Sample
Samples local elements that are nearby or elements that are accessible with little or no cost
Telephone or internet
Homework
Pg 117 #4, 6, 8, 9, 11
Pg 123 #1-6