Study vs. Experiment
Observational Study
  Based on data in which no manipulation of factors has been employed
Experiment
  Manipulates factors to create treatments
  Randomly assigns subjects to the treatments
  Compares the responses of the subjects across treatment levels
Slide 3
Study or Experiment?
Researchers have linked an increase in the incidence of breast cancer
in Italy to dioxin released by an industrial accident in 1976. The
study identified 981 women who lived near the site of the accident and
were under age 40 at the time. Fifteen of the women had developed
breast cancer at an unusually young average age of 45. Medical records
showed that they had heightened concentrations of dioxin in their
blood and that each tenfold increase in dioxin level was associated
with a doubling of the risk of breast cancer.
Answer: Observational study
Slide 4
Study or Experiment?
Is diet or exercise effective in combating insomnia? Some believe that
cutting out desserts can help alleviate the problem, while others
recommend exercise. Forty volunteers suffering from insomnia agreed to
participate in a month-long test. Half were randomly assigned to a
special no-desserts diet; the others continued desserts as usual. Half
of the people in each of these groups were randomly assigned to an
exercise program, while the others did not exercise. Those who ate no
desserts and engaged in exercise showed the most improvement.
Answer: Experiment
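A minimal Python sketch of the random assignment in the insomnia experiment, assuming the 40 volunteers are simply numbered 1 to 40. The function name, the group labels, and the fixed seed are illustrative assumptions, not details from the study. Shuffling once and cutting the list into four equal groups is one way to get the same two-factor layout (diet by exercise) that the slide describes.

```python
import random

def assign_treatments(n_subjects=40, seed=0):
    """Randomly split subjects into the four diet x exercise groups."""
    rng = random.Random(seed)            # fixed seed only so the sketch is repeatable
    subjects = list(range(1, n_subjects + 1))
    rng.shuffle(subjects)                # a random order gives a random assignment
    treatments = [
        ("no desserts", "exercise"),
        ("no desserts", "no exercise"),
        ("desserts", "exercise"),
        ("desserts", "no exercise"),
    ]
    size = n_subjects // len(treatments)  # 10 subjects per treatment
    return {
        treatment: sorted(subjects[i * size:(i + 1) * size])
        for i, treatment in enumerate(treatments)
    }

groups = assign_treatments()
for treatment, members in groups.items():
    print(treatment, members)
```

Each treatment combination ends up with 10 subjects, so every treatment is replicated 10 times.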
Slide 5
The Cycle of Statistics
Population, Sample, Statistic, Parameter
Slide 6
Principles of Experimental Design
Control aspects of the experiment that we know may have an effect on
  the response, but that are not the factors being studied.
Randomize to even out effects that we cannot control.
Replicate over as many subjects as possible.
Slide 7
Types of Sampling
Random Sample
  Simple Random Sample
  Stratified Random Sample
  Probability Random Sample
Slide 8
Random Sampling
Simple Random Sample (SRS)
  Every member of the population (and every possible sample of the
  same size) has an equal chance of being chosen for the sample
Method
  Assign a random number to each individual in the sampling frame
  Select only those whose random numbers satisfy some rule
Slide 9
Simple Random Sample Example
There are 80 students enrolled in an introductory Statistics course;
you are to select a sample of 5.
Sampling frame: the roster of all students enrolled in the course
Label each student 01 - 80
Use a random number generator and choose the first 5 students from
the list that match the random numbers.
Ignore numbers not on the list and repeats.
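The procedure above can be sketched in Python. This is only an illustration: the function name and the seed are assumptions, and two-digit random numbers from 00 to 99 stand in for whatever random number table or calculator is used in class.

```python
import random

def simple_random_sample(frame_size=80, sample_size=5, seed=1):
    """Pick labels 01..frame_size by drawing two-digit random numbers."""
    rng = random.Random(seed)            # fixed seed only so the sketch is repeatable
    chosen = []
    while len(chosen) < sample_size:
        number = rng.randint(0, 99)      # a two-digit random number, 00 to 99
        # Ignore numbers that are not on the roster and ignore repeats.
        if 1 <= number <= frame_size and number not in chosen:
            chosen.append(number)
    return chosen

sample = simple_random_sample()
print([f"{label:02d}" for label in sample])
```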
Slide 10
Stratified Random Sample
Population is divided into similar groups of individuals
  These are called strata
Then an SRS is completed in each stratum
These are combined for the overall sample
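A short Python sketch of a stratified random sample. The two strata below are made up for illustration (80 student labels split into two hypothetical groups), and the function name and seed are assumptions as well.

```python
import random

def stratified_sample(strata, per_stratum, seed=2):
    """Take an SRS of the same size inside every stratum, then combine."""
    rng = random.Random(seed)            # fixed seed only so the sketch is repeatable
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))  # SRS within one stratum
    return sample

# Hypothetical strata: labels 1-40 in one group, 41-80 in the other.
strata = {
    "group A": list(range(1, 41)),
    "group B": list(range(41, 81)),
}
combined = stratified_sample(strata, per_stratum=3)
print(combined)
```

Because an SRS is drawn inside each stratum, every stratum is guaranteed to appear in the overall sample.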
Slide 11
Probability Random Sample
A sample is chosen by chance
Each sample has a probability of being chosen
We have to know this probability
Slide 12
What is the population? What is the sample? Which random sample was
used?
A company packaging snack foods maintains quality control by randomly
selecting 10 cases from each day's production and weighing the bags.
Then they open one bag from each case and inspect the contents.
Population: All snack foods produced at the company
Sample: 10 cases from each day's production
Random Sample: SRS
Slide 13
What is the population? What is the sample? Which random sample was
used?
Dairy inspectors visit farms unannounced and take samples of the milk
to test for contamination. If the milk is found to contain dirt,
antibiotics, or other foreign matter, the milk will be destroyed and
the farm re-inspected until purity is restored.
Population: All milk at the dairy (in the tank)
Sample: Sample from the milk tank
Random Sample: SRS
Slide 14
Terminology
Experimental Units: Individuals on which the experiment is done
Subjects: Human experimental units
Treatment: Specific experimental condition applied
Control group: Group that receives no treatment or a placebo
Placebo: A treatment known to have no effect
Factor: What is being manipulated
Response: What is being measured
Slide 15
Analyzing Experiments
Aspirin Study
Subjects/Units: 1000 male volunteers
Treatment: Aspirin
Blinding: Patients do not know which pill they are taking
Control: A group will take a placebo pill
Randomization: The men will be randomly assigned to either the
  treatment group or placebo group.
Replication: Each treatment will be replicated 500 times
Factor: Aspirin
Response: Number of heart attacks
Levels: Low dose and none (placebo)
Slide 16
Slide 17
Types of Variables
Categorical: places an individual into one of several groups or
  categories. Ex: Eye color, favorite food
Quantitative: takes a range of numeric values. Ex: Height, weight,
  income
  Discrete: finite possible values. Ex: Number of goals in soccer
  Continuous: infinite possible values. Ex: Height of males at Enloe
Slide 18
What kind of variable? Does it make sense to average the values?
  Gender
  Telephone area code
  Amount of electricity used
  Zip code
  Ticket sales at Miley Cyrus concert
  Number of chicken eggs hatched on Nov. 17, 2006 at 3:00 am
Categorical, Quantitative (C), Categorical, Quantitative (D)
Slide 19
Every graph I ever make will always
  Have a title
  Axes labeled
  Units identified
  Legend (for categorical data)
Slide 20
Bar Chart
Graphs for categorical data
Bars never touch!
Slide 21
Pie Chart
Graphs for categorical data
Used for comparing parts to a whole
Slide 22
Create a graph for the following
A survey was conducted of 1000 individuals regarding their favorite
color. The results are as follows:
Red     367
Yellow  100
Green    68
Blue    159
Purple  200
Grey     26
Pink     80
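Before drawing a bar chart or pie chart it can help to turn the counts into relative frequencies. The Python sketch below does that for the survey counts and prints a rough text bar chart; the one-mark-per-20-responses scale is an arbitrary choice made for illustration.

```python
counts = {
    "Red": 367, "Yellow": 100, "Green": 68, "Blue": 159,
    "Purple": 200, "Grey": 26, "Pink": 80,
}

total = sum(counts.values())             # should match the 1000 people surveyed
relative = {color: n / total for color, n in counts.items()}

# Print colors from most to least popular with a simple text bar.
for color, n in sorted(counts.items(), key=lambda item: item[1], reverse=True):
    bar = "#" * round(n / 20)            # roughly one '#' per 20 responses
    print(f"{color:<7}{n:>4}  {relative[color]:6.1%}  {bar}")
```

The relative frequencies are the "parts of a whole" that a pie chart displays, while the bars correspond to a bar chart of the same data.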
Slide 23
Data Representation of Favorite color survey
Slide 24
Graphs for Quantitative Variables
Dot Plot
  Useful for small sets of data
Stem and Leaf Plot
  Useful for small sets of data
  More information than dot plot
Histograms
Box Plots
  More about these tomorrow!
Slide 25
Creating a Stem and Leaf plot
Sort the data
Identify the min and max values to establish what kind of stems and
  leaves to use
If leaves become too long, split them
Create a legend
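The steps above can be sketched in Python. The data are the beginning pulse readings that appear later in these slides; the function name is an assumption, and the sketch always uses the tens digit as the stem and the ones digit as the leaf (it does not split long rows).

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return the rows of a stem and leaf plot (stem = tens, leaf = ones)."""
    stems = defaultdict(list)
    for value in sorted(data):           # step 1: sort the data
        stems[value // 10].append(value % 10)
    rows = []
    # Steps 2-3: the min and max decide which stems are needed.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        rows.append(f"{stem:>2} | {leaves}")
    return rows

pulse = [50, 50, 55, 60, 62, 66, 72, 72, 76, 76, 76, 79,
         80, 80, 81, 82, 85, 87, 91, 96, 108, 110, 110]

plot = stem_and_leaf(pulse)
print("\n".join(plot))
print("Key: 5 | 0 means 50")             # step 4: the legend
```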
Slide 26
Back to back stem and leaf plots
Used for comparing two similar sets of data
Stems are in the middle and the leaves expand to the left for one
  data set and to the right for the other data set
Slide 27
Histograms
Groups nearby values and displays frequencies
[Figure: National SAT scores 2007]
Slide 28
How to construct Histograms
Determine the bin size
  Divide the range into equal sections
  Min of 5 bins
Create a frequency table
Draw the graph
Slide 29
Wake County 2008 SAT scores
1633 1590 1607 1622 1304 1394 1324
1766 1514 1412 1680 1544 1378 1662
1531 1646 1604 1472 1568 1541
1. Sort the data
2. Identify the range of the data
3. Identify a bin size that makes sense and will produce at least 5 bins
Slide 30
Relative Frequency Table
Score        Count
1300 - 1399
1400 - 1499
1500 - 1599
1600 - 1699
1700 - 1799
Use this table to help draw your histogram!
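One way to check a hand-made frequency table is a short Python sketch like the one below. It assumes the run of digits on the Wake County slide splits into twenty four-digit scores, and it uses the five bins of width 100 listed in the table; the function name and its parameters are illustrative.

```python
scores = [1633, 1590, 1607, 1622, 1304, 1394, 1324,
          1766, 1514, 1412, 1680, 1544, 1378, 1662,
          1531, 1646, 1604, 1472, 1568, 1541]

def frequency_table(data, start=1300, width=100, bins=5):
    """Count how many values fall in each equal-width bin."""
    table = []
    for i in range(bins):
        low = start + i * width
        high = low + width - 1           # e.g. 1300 - 1399
        count = sum(low <= x <= high for x in data)
        table.append((low, high, count, count / len(data)))
    return table

table = frequency_table(scores)
for low, high, count, rel in table:
    print(f"{low} - {high}  count={count:>2}  relative={rel:.2f}")
```

The heights of the histogram bars are the counts (or the relative frequencies) in each row.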
Slide 31
Slide 32
Graphs can be MISLEADING!
Number of deaths in Iraq as published by AOL News in March of 2006
Slide 33
Slide 34
Slide 35
Slide 36
Describing Data
Shape: Mound, symmetrical, skewed, single peak, multiple peaks
Outlier: Any observation that appears to not belong with the others
Center: The middle of the data
Spread: Min value to max value (including or excluding outliers)
Slide 37
Describing Graphs (Shape)
Symmetric: If the right and left sides of the histogram are
  approximately mirror images
Skewed right: If the right side has a longer tail (values or outliers
  stretching out to the right)
Skewed left: If the left side has a longer tail (values or outliers
  stretching out to the left)
Bi-modal: If there are 2 peaks
Uniform: There are about the same number of observations for each
  value
Slide 38
Measures of center
Median: Exact middle of a set of data
Mean: Arithmetic average of all of the observations in a data set
Slide 39
Ex: 1, 2, 3, 4, 5, 6, 7, 8, 9
What is the median? 5
What is the mean?
What if 10 is added to the data set? What is the median and mean?
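The example can be checked with Python's standard `statistics` module. This is just a quick verification sketch; the variable names are arbitrary.

```python
from statistics import mean, median

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(median(data), mean(data))          # 5 5

with_ten = data + [10]                   # add the value 10 to the data set
print(median(with_ten), mean(with_ten))  # 5.5 5.5
```

With an even number of observations the median is the average of the two middle values, here (5 + 6) / 2.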
Slide 40
Resistant measures
Def: A measure is resistant if it is not easily influenced by extreme
  observations
Is the median a resistant measure? Yes
Is the mean a resistant measure? No
Slide 41
Measures of spread
Standard deviation
  Find this in your calculator under 1 variable stats!
Quartiles
IQR (Interquartile Range) = Q3 - Q1
  These are found in your 5-number summary!
Range = Max - Min
Slide 42
5 number summary
Min
Q1: quartile 1, median of the lower half
Median (Q2)
Q3: quartile 3, median of the upper half
Max
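A small Python sketch of the 5 number summary using the "median of each half" definition above. The function name is an assumption, and the sketch leaves the overall median out of both halves when the number of observations is odd, which is one common convention. The data are the beginning pulse readings from a later slide.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]                # lower half (middle value left out if n is odd)
    upper = values[half + 1:] if n % 2 else values[half:]
    return (values[0], median(lower), median(values), median(upper), values[-1])

beginning_pulse = [50, 50, 55, 60, 62, 66, 72, 72, 76, 76, 76, 79,
                   80, 80, 81, 82, 85, 87, 91, 96, 108, 110, 110]

summary = five_number_summary(beginning_pulse)
print(summary)                           # (50, 66, 79, 87, 110)
```

Software packages sometimes use other quartile rules, so their Q1 and Q3 can differ slightly from the values found this way.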
Slide 43
Components of a box plot
5 number summary: Min, Q1, Median, Q3, Max
Outliers: observations below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR)
Slide 44
[Figure: box plot labeled Min, Q1, Med, Q3, Max]
Slide 45
Where's the data?
[Figure: box plot with sections labeled 25%, 50%, 25%]
Slide 46
What about outliers?
[Figure: box plot labeled Min, Q1, Med, Q3, Max; the whiskers end at
the smallest observation that is not an outlier and the largest
observation that is not an outlier]
Slide 47
Slide 48
Standard deviation
Gives a measure of how far the data varies from the mean on average
Is only used if the mean is the chosen measure of center
Is the standard deviation a resistant measure? No!
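A quick Python sketch of why the standard deviation is not resistant. It uses the end-of-class pulse readings from a later slide and compares the summaries with and without the largest reading, 109. `statistics.stdev` computes the sample standard deviation (dividing by n - 1), which is assumed here to be the value a calculator reports as Sx.

```python
from statistics import mean, median, stdev

end_pulse = [54, 55, 58, 58, 63, 64, 65, 67, 68, 68, 69, 70,
             70, 70, 72, 74, 76, 76, 79, 80, 80, 85, 87, 109]
without_109 = [x for x in end_pulse if x != 109]

for label, data in (("all readings", end_pulse), ("without 109", without_109)):
    print(f"{label:<13} mean={mean(data):.2f}  "
          f"median={median(data)}  sd={stdev(data):.2f}")
```

Removing the single extreme reading leaves the median unchanged but lowers both the mean and the standard deviation.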
Slide 49
Beginning pulse in class (n=23)
50 50 55 60 62 66 72 72
76 76 76 79 80 80 81 82
85 87 91 96 108 110 110
Min = 50  Max = 110  Median = 79  Q1 = 66  Q3 = 87
Slide 50
End pulse in class (n=24)
54 55 58 58 63 64
65 67 68 68 69 70
70 70 72 74 76 76
79 80 80 85 87 109
Min = 54  Max = 109  Median = 70  Q1 = 64.5  Q3 = 77.5
Slide 51
Outliers
Interquartile range (IQR) = Q3 - Q1
An observation is an outlier if it lies more than 1.5(IQR) above Q3
  or more than 1.5(IQR) below Q1
End Class Data
Q1 = 64.5  Q3 = 77.5
IQR = 77.5 - 64.5 = 13
1.5(13) = 19.5
Q1 - 1.5(IQR) = 64.5 - 19.5 = 45
Q3 + 1.5(IQR) = 77.5 + 19.5 = 97
Slide 52
Outliers
Any observation below 45 or above 97 will be an outlier
109 is an outlier
54 55 58 58 63 64 65
67 68 68 69 70 70 70
72 74 76 76 79 80 80
85 87 109
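The 1.5(IQR) rule can be written as a short Python sketch. The function name is an assumption; Q1 = 64.5 and Q3 = 77.5 are the quartiles given for the end-of-class pulse data.

```python
def outlier_fences(q1, q3):
    """Return the cutoffs (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

end_pulse = [54, 55, 58, 58, 63, 64, 65, 67, 68, 68, 69, 70,
             70, 70, 72, 74, 76, 76, 79, 80, 80, 85, 87, 109]

low, high = outlier_fences(64.5, 77.5)
outliers = [x for x in end_pulse if x < low or x > high]
print(low, high, outliers)               # 45.0 97.0 [109]
```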
Slide 53
What is Normal?
A bell shaped curve
Standard Normal distribution is when
  Mean = 0
  Standard Deviation = 1
Slide 54
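68-95-99.7 Rule
The normal curve can give us an idea of how extreme a value is based
on how far away from the mean it is.
[Figure: normal curve with 68%, 95%, and 99.7% of the data within 1,
2, and 3 standard deviations of the mean; axis marked from -3 to 3
standard deviations with the mean at 0]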
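The three percentages in the rule can be reproduced with a few lines of Python. The sketch relies on the standard fact that, for a standard Normal variable Z, P(|Z| < k) = erf(k / sqrt(2)); the function name is an assumption.

```python
from math import erf, sqrt

def within_k_sd(k):
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {within_k_sd(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

Rounded to the nearest tenth of a percent (or whole percent), these are the 68%, 95%, and 99.7% of the rule.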
Slide 55
Homework
P. 65 # 12, 13
Make graph (box plot for #12 and histogram for #13)
Describe the shape
Find any outliers
Find mean and median
Find range, standard deviation and IQR