Date post: | 26-Dec-2015 |
Upload: | dominick-newman |
STATISTICS For Research
Why Statistics?
A Researcher Can:
1. Quantitatively describe and summarize data
2. Draw conclusions about large sets of data by sampling only small portions of them
3. Objectively measure differences and relationships between sets of data.
Random Sampling
• Samples should be taken at random
• Each measurement has an equal opportunity of being selected
• Otherwise, sampling procedures may be biased
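A minimal sketch of drawing a random sample in Python, where every item has an equal opportunity of being selected. The population of 100 numbered plots and the helper name `draw_random_sample` are assumptions for illustration; they do not come from the slides.

```python
import random

# Hypothetical population: 100 numbered field plots (illustrative only).
population = list(range(1, 101))

def draw_random_sample(items, k, seed=None):
    """Return k items chosen without replacement; each item is equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(items, k)

sample = draw_random_sample(population, 10, seed=42)
print(sample)
```

Passing a seed is only for repeatability when checking work; leave it as `None` for an actual field draw.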
Sampling Replication
• A characteristic CANNOT be estimated from a single data point
• Replicated measurements should be taken, at least 10.
Mechanics
1. Write down a formula
2. Substitute numbers into the formula
3. Solve for the unknown.
The Null Hypothesis
• Ho = There is no difference between 2 or more sets of data
  – any difference is due to chance alone
  – Commonly set at a probability of 95% (P ≤ .05)
The Alternative Hypothesis
• HA = There is a difference between 2 or more sets of data
  – the difference is due to more than just chance
  – Commonly set at a probability of 95% (P ≤ .05)
Averages
• Population Average = mean ( x̄ )
• a Population mean = ( μ )
  – take the mean of a random sample from the population ( n )
Population Means
To find the population mean ( μ ),
• add up ( Σ ) the values ( x = grasshopper mass, tree height )
• divide by the number of values ( n ):

$$\mu = \frac{\sum x}{n}$$
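The two steps above (add up the values, then divide by how many there are) can be sketched in Python. The grasshopper masses are made-up numbers for illustration, and the mean of a sample like this is an estimate of the population mean.

```python
def mean(values):
    """Add up the values (sum of x) and divide by the number of values (n)."""
    if not values:
        raise ValueError("need at least one value")
    return sum(values) / len(values)

# Hypothetical grasshopper masses in grams (illustrative, not from the slides).
grasshopper_mass_g = [1.2, 0.9, 1.5, 1.1, 1.3]
print(mean(grasshopper_mass_g))
```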
Measures of Variability
• Calculating a mean gives only a partial description of a set of data
  – Set A = 1, 6, 11, 16, 21
  – Set B = 10, 11, 11, 11, 12
    • Means for A & B ???
• Need a measure of how variable the data are.
Range
• Difference between the largest and smallest values
  – Set A = 1, 6, 11, 16, 21
    • Range = ???
  – Set B = 10, 11, 11, 11, 12
    • Range = ???
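One way to check the answers for Set A and Set B is a short Python sketch that computes both the mean and the range. The helper names `mean` and `value_range` are chosen here for illustration.

```python
set_a = [1, 6, 11, 16, 21]
set_b = [10, 11, 11, 11, 12]

def mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)

def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

for name, data in (("A", set_a), ("B", set_b)):
    print(name, "mean =", mean(data), "range =", value_range(data))
# A mean = 11.0 range = 20
# B mean = 11.0 range = 2
```

Both sets share the same mean, yet their ranges differ by a factor of ten, which is why the mean alone is only a partial description.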
Standard Deviation
• A measure of the deviation of data from their mean.
The Formula

$$SD = \sqrt{\frac{N \sum X^2 - \left(\sum X\right)^2}{N (N - 1)}}$$
SD Symbols
SD = Standard Dev
√ = Square Root
∑X² = Sum of x²'d
(∑X)² = Sum of x's, then squared
N = # of samples
The Formula

$$SD = \sqrt{\frac{N \sum X^2 - \left(\sum X\right)^2}{N (N - 1)}}$$

X       X²
297     88,209
301     90,601
306     93,636
312     97,344
314     98,596
317     100,489
325     105,625
329     108,241
334     111,556
350     122,500
∑X = 3,185     ∑X² = 1,016,797
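As a sketch of the "substitute numbers into the formula" step, the SD formula can be applied to the ten X values in the table with plain Python. The function name `sd_computational` is an illustrative choice.

```python
import math

x = [297, 301, 306, 312, 314, 317, 325, 329, 334, 350]

def sd_computational(values):
    """SD = sqrt((N * sum(X^2) - (sum X)^2) / (N * (N - 1)))."""
    n = len(values)
    sum_x = sum(values)                  # sum of x's
    sum_x2 = sum(v * v for v in values)  # sum of x squared
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

sd = sd_computational(x)
print(sum(x), sum(v * v for v in x), round(sd, 2))
# 3185 1016797 16.24
```

The two sums match the totals in the table, and the result agrees with the sample standard deviation that a calculator or Python's `statistics.stdev` reports.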
Once You've got the Idea:
You can use your calculator to find SD!
The Normal Curve
SD & the Bell Curve
% Increments
Skewed Curves
[Figure: skewed curves, with the median labeled]
Critical Values
Standard Deviations ≥ 2 SD above or below the mean = due to MORE THAN CHANCE ALONE.
The data lies outside the 95% confidence limits for probability.
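A rough sketch of the 2 SD rule in Python, assuming the reference data are approximately normal so that about 95% of values fall within 2 SD of the mean. The function name `beyond_two_sd` and the two test values (350 and 355) are assumptions for illustration; the reference data are the ten X values used in the SD example.

```python
import statistics

reference = [297, 301, 306, 312, 314, 317, 325, 329, 334, 350]

def beyond_two_sd(value, data):
    """True if value lies 2 or more sample SDs above or below the mean of data."""
    center = statistics.mean(data)
    sd = statistics.stdev(data)
    return abs(value - center) >= 2 * sd

print(beyond_two_sd(350, reference))  # False: 31.5 from the mean, 2 SD is about 32.5
print(beyond_two_sd(355, reference))  # True: 36.5 from the mean
```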
Chi-Square
χ²
Chi-Square Test Requirements
• Quantitative data
• Simple random sample
• One or more categories
• Data in frequency (%) form
• Independent observations
• All observations must be used
• Adequate sample size (≥ 10)
Example
Table 1 - Color Preference for 150 Customers for Thai's Car Dealership

Category Color   Observed Frequencies   Expected Frequencies
YELLOW           35                     30
RED              50                     45
GREEN            30                     15
BLUE             10                     15
WHITE            25                     45
Chi-Square Symbols

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

O = Observed Frequency
E = Expected Frequency
Σ = sum of
d f = degrees of freedom (n-1)
χ² = Chi Square
Chi-Square Worksheet

CATEGORY   O    E    (O - E)   (O - E)²   (O - E)² / E
YELLOW     35   30     5         25         0.83
RED        50   45     5         25         0.56
GREEN      30   15    15        225        15
BLUE       10   15    -5         25         1.67
WHITE      25   45   -20        400         8.89
                                       χ² = 26.95
Chi-Square Analysis
Table value for Chi Square = 9.49
4 d f
P = .05 level of significance
Is there a significant difference in car preference?
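The worksheet and the comparison against the table value can be sketched in Python using the observed and expected frequencies from Table 1. The function name `chi_square` is an illustrative choice, and the critical value 9.49 is simply the table value quoted above. Computed without intermediate rounding the statistic is about 26.94; the worksheet's 26.95 comes from adding the rounded row values.

```python
observed = {"YELLOW": 35, "RED": 50, "GREEN": 30, "BLUE": 10, "WHITE": 25}
expected = {"YELLOW": 30, "RED": 45, "GREEN": 15, "BLUE": 15, "WHITE": 45}

def chi_square(obs, exp):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((obs[k] - exp[k]) ** 2 / exp[k] for k in obs)

chi2 = chi_square(observed, expected)
df = len(observed) - 1   # degrees of freedom: number of categories minus 1
CRITICAL_VALUE = 9.49    # table value for 4 d f at P = .05, as quoted above

print(round(chi2, 2), df, chi2 > CRITICAL_VALUE)
# 26.94 4 True
```

Because the calculated value exceeds the table value, the null hypothesis of no difference in color preference would be rejected at the .05 level.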
T-Tests
For populations that do follow a normal distribution
Drawing conclusions about similarities or differences between population means ( μ )
• Is average plant biomass the same in two different geographical areas ???
• Two different seasons ???
• COMPLETELY confident answer =
  – measure all plant biomass in each area
• Is this PRACTICAL ???
Instead:
• Take one sample from each population
• Infer from the sample means and SD whether the populations have the same or different means.
Analysis
• SMALL t values = high probability that the two population means are the same
• LARGE t values = low probability (means are different)
t calculated > t critical = reject Ho
[Figure: curve with t critical marked on each side]
We will be using computer analysis to perform the t-test
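The slides do not give a t formula or name a software package, so the following is only a sketch of what a computer analysis might do. It assumes a two-sample, equal-variance (pooled) Student's t-test, and the biomass numbers are invented for illustration. The critical value 2.101 is the standard two-tailed table value for 18 degrees of freedom at P = .05.

```python
import math
import statistics

# Hypothetical plant biomass samples (g) from two areas; not from the slides.
area_1 = [12.1, 13.4, 11.8, 14.2, 12.9, 13.7, 12.5, 13.1, 14.0, 12.3]
area_2 = [10.2, 11.5, 10.9, 12.0, 11.1, 10.4, 11.8, 10.7, 11.3, 10.1]

def pooled_t(a, b):
    """Return (t, degrees of freedom) for a pooled two-sample t-test."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / std_error, df

t_calculated, df = pooled_t(area_1, area_2)
T_CRITICAL = 2.101  # two-tailed table value for 18 d f at P = .05

print(round(t_calculated, 2), df, abs(t_calculated) > T_CRITICAL)
# 6.0 18 True
```

With these made-up samples t calculated exceeds t critical, so Ho would be rejected; real data and a different test variant could of course give a different result.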
Simpson's Diversity Index
Nonparametric Testing
• For populations that do NOT follow a normal distribution
  – includes most wild populations
Answers the Question
• If 2 individuals are taken at RANDOM from a community, what is the probability that they will be the SAME species? D, as defined in the formula that follows, is 1 minus this probability, so it is the chance that the two individuals are DIFFERENT species.
The Formula

$$D = 1 - \frac{\sum n_i (n_i - 1)}{N (N - 1)}$$
Example

Species   Abundance, ni   Relative Abundance, Pi
1         50              50/85 = 0.588
2         25              25/85 = 0.294
3         10              10/85 = 0.118
Total: 3 species, N = 85 individuals
Example

$$D = 1 - \frac{50(49) + 25(24) + 10(9)}{85(84)}$$

D = 0.56
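The hand calculation above can be checked with a short Python sketch. The function name `simpson_diversity` is an illustrative choice; the abundances 50, 25 and 10 are the ones from the example.

```python
def simpson_diversity(abundances):
    """D = 1 - sum(n_i * (n_i - 1)) / (N * (N - 1)), with N the total count."""
    total = sum(abundances)
    if total < 2:
        raise ValueError("need at least two individuals")
    # Probability that two randomly drawn individuals are the same species.
    p_same = sum(n * (n - 1) for n in abundances) / (total * (total - 1))
    return 1 - p_same

abundances = [50, 25, 10]
d = simpson_diversity(abundances)
print(round(d, 2))
# 0.56
```

As a sanity check on the formula, a community made up of a single species gives D = 0, because any two individuals drawn are certain to be the same species.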
Analysis
• Closer to 1.0 =
  – more Heterogeneous community
• Closer to 0 =
  – more Homogeneous community
• You can calculate by hand to find "D"
• School Stats package MAY calculate it.
© 2000 Anne F. Maben
All rights reserved