Avoiding Damned Lies – Understanding Statistical Ideas, Alan Dix www.meandeviation.com 26
measures of averages and variation
• lies, damn lies and ...
  mode(s), median and mean
• square people
  variance and standard deviation
average?
three typical measures:
• mode(s): “more people use dogo than any other dog food”
• median: “half of all salaries are greater than £15000 p.a.”
• mean: “if salaries were divided evenly . . .”
mode(s)
[Figure: nos of people plotted against income (£), showing three modes]
• not widely used
• may have more than one mode
• the bump may be anywhere!
• sensitive
sensitivity of mean

J. Bloggs          3500
F. Mole            5600
K. Giles           8000
J. Smith           8300
B. Roberts         8450
S. Claus           8450
A. Jones           8680
H. Lee            15750
M. Warren         17500
T. Smyth-Boule   200000

median salary: £8450
mean salary: £28423!

• one big value ...
• union quotes median
• employer the mean
• lies, damn lies ...
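The two averages for the salary list can be checked with a short Python sketch using the standard-library `statistics` module. The variable names are illustrative and not part of the slides; the second half, which drops the top earner, is an added illustration of how one big value moves the mean but not the median.

```python
from statistics import mean, median

# Salaries (£) from the slide, lowest to highest.
salaries = [3500, 5600, 8000, 8300, 8450, 8450, 8680, 15750, 17500, 200000]

median_salary = median(salaries)  # middle of the sorted list
mean_salary = mean(salaries)      # total shared out evenly

# Added illustration: drop the one very large salary and recompute.
without_top = salaries[:-1]
median_without_top = median(without_top)
mean_without_top = mean(without_top)

print("median:", median_salary, "mean:", mean_salary)
print("without top earner:", median_without_top, round(mean_without_top))
```

The median stays at £8450 either way, while the mean falls from £28423 to roughly £9359 once the largest salary is left out.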
why use the mean?
• median is more robust
• mean is more manipulable

                 number of people   mean salary   median salary
group 1                 10             15000          12500
group 2                 10             23000          16000
grp 1 & grp 2                          19000            ?
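One way to read "manipulable" is that group means can be combined without going back to the raw data. The sketch below shows this for the two groups in the table; the function name `combined_mean` is hypothetical.

```python
def combined_mean(groups):
    """Mean of several groups pooled together.

    groups: list of (number_of_people, group_mean) pairs.
    """
    total_people = sum(n for n, _ in groups)
    total_salary = sum(n * m for n, m in groups)
    return total_salary / total_people

# Figures from the slide: two groups of 10 people each.
overall = combined_mean([(10, 15000), (10, 23000)])
print(overall)  # 19000.0
```

No similar shortcut exists for the median: the group medians 12500 and 16000 do not fix the combined median, which depends on the individual salaries.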
measures of variation

inter-quartile range = 14–9

data   difference from mean   square of difference
  2            10                    100
  7             5                     25
  8             4                     16
 10             2                      4
 11             1                      1
 12             0                      0
 12             0                      0
 13             1                      1
 13             1                      1
 15             3                      9
 18             6                     36
 23            11                    121

mean 12   average difference 4   variance 26

standard deviation σ = √variance
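The figures in the table can be reproduced with the Python sketch below. Taking the quartiles as the medians of the lower and upper halves of the data is an assumption on my part; it happens to give 9 and 14, but other quartile conventions give slightly different values.

```python
from statistics import mean, median

data = [2, 7, 8, 10, 11, 12, 12, 13, 13, 15, 18, 23]

m = mean(data)
differences = [abs(x - m) for x in data]   # difference from mean
squares = [(x - m) ** 2 for x in data]     # square of difference

average_difference = sum(differences) / len(data)
variance = sum(squares) / len(data)        # average squared difference
std_dev = variance ** 0.5

# Quartiles as medians of the lower and upper halves (assumed convention,
# written for an even number of data items).
ordered = sorted(data)
half = len(ordered) // 2
q1 = median(ordered[:half])
q3 = median(ordered[-half:])
iqr = q3 - q1

print(m, round(average_difference), round(variance), round(std_dev, 1), iqr)
```

Rounded, this gives an average difference of 4, a variance of 26 and a standard deviation of 5.1.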
which is best?
a bit like averages . . .
• inter-quartile range is robust
• variances add up
• standard deviations meaningful
square people
if data is people buying ‘dogo’
variance is 26 square people!
standard deviation
σ = √variance = 5.1 people
the ‘real’ world
• the sample – actual measured data
• the population
  – large set from which the data is drawn
  – especially for surveys etc.
• the ideal
  – the ‘typical’ user, the fair coin
  – unrepeatable events – the fall of a raindrop
  – a theoretical distribution
the job of statistics
[Diagram: the real world yields sample data through measurement; statistics! works back from the sample data to the real world]
different means
1. average of the measured data ~ sample mean
2. average of the ‘real’ world ~ population mean
3. theoretical mean of the ‘distribution’
   e.g. mean die score = 3.5
estimating the mean
[Diagram: real mean µ, with the sample mean as its estimator]
sample mean estimates real (population) mean
strange but true
[Diagram: real mean µ, with the sample mean as its estimator]
the mean of the mean is the mean
i.e. theoretical mean of sample mean is real mean!!!!!
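For a small case the claim can be checked exactly rather than by simulation. The sketch below uses a fair six-sided die (real mean 3.5, as on the earlier slide), lists every equally likely sample of n rolls, and averages their sample means. The function name is hypothetical, and the fair die is an assumed example.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]
real_mean = Fraction(sum(faces), len(faces))  # 7/2, i.e. 3.5

def mean_of_sample_means(n):
    """Average the sample mean over every equally likely sample of n rolls."""
    samples = list(product(faces, repeat=n))
    sample_means = [Fraction(sum(s), n) for s in samples]
    return sum(sample_means) / len(sample_means)

for n in (1, 2, 3):
    print(n, mean_of_sample_means(n))  # 7/2 each time
```

Individual samples give means above or below 3.5, but averaged over all possible samples the sample mean comes out at exactly the real mean, whatever the sample size.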
law of large numbers
[Diagram: real mean µ, with the sample mean as its estimator]
if samples are independent (or nearly so)
bigger sample ⇒ better estimate
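A rough way to see "bigger sample ⇒ better estimate" is to compute, for independent flips of a fair coin, the exact chance that the sample mean lands within 0.1 of the real mean 0.5. The fair coin, the tolerance of 0.1 and the function name are all assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb

def prob_mean_within(n, tolerance=Fraction(1, 10)):
    """Exact chance that the mean of n fair coin flips (head=1, tail=0)
    lies within `tolerance` of the real mean 1/2."""
    half = Fraction(1, 2)
    favourable = sum(
        comb(n, heads)
        for heads in range(n + 1)
        if abs(Fraction(heads, n) - half) <= tolerance
    )
    return Fraction(favourable, 2 ** n)

for n in (10, 100, 1000):
    print(n, float(prob_mean_within(n)))
```

With 10 flips the chance is 21/32 (about 0.66); it rises above 0.95 for 100 flips and is very close to 1 for 1000.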
how good an estimate
• each data item has some variability
  head=1 / tail=0: 0 0 0 1 1 1 0 1 1 1 0 1 1 1 0 0 1 0 1 1
• sums of data items have variability
  nos of heads: 12 11 9 13 8 8 8 11 8 11
• means of data items have variability
  averages: 0.6 0.55 0.45 0.65 0.4 0.4 0.4 0.55 0.4 0.55
better = less variability
variability of sums
variances add up*:
  variance(sum of 100 items) = 100 × variance(each item)
standard deviation = √variance
  s.d. of sum of 100 items = 10 × s.d. each item
square root rule: σ(n items) = √n σ(each item)
i.e. bigger, but proportionately less
* only if items are independent (actually closely related to Pythagoras' theorem!)
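The adding-up of variances can be verified exactly for a small case. The sketch below assumes independent rolls of a fair die (an example not taken from the slide) and computes the variance of the total of n rolls by listing every outcome; `variance_of_sum` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]

def variance_of_sum(n):
    """Exact variance of the total of n independent fair die rolls."""
    totals = [sum(rolls) for rolls in product(faces, repeat=n)]
    mean = Fraction(sum(totals), len(totals))
    return sum((t - mean) ** 2 for t in totals) / len(totals)

single = variance_of_sum(1)  # 35/12 for one die
for n in (1, 2, 3, 4):
    v = variance_of_sum(n)
    # The variance grows like n, so the s.d. grows like the square root of n.
    print(n, v, float(v) ** 0.5)
```

The variance of four rolls is four times that of one roll, while the standard deviation only doubles: bigger, but proportionately less.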
variability of mean
mean is sum / nos. of items:
  σ(mean of 100 items) = σ(sum of 100 items) / 100 = σ(each item) / 10
square root rule for means:
  σ(mean of n items)* = σ(each item) / √n
* called standard error (s.e.) of mean
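The rule can be compared with the coin-flip counts shown on the earlier slide. The sketch below assumes that each of the ten samples was 20 flips of a fair coin (so a single flip has s.d. 0.5); both are inferences from the listed data rather than something the slides state.

```python
from math import sqrt
from statistics import stdev

flips_per_sample = 20                                # assumed sample size
heads_counts = [12, 11, 9, 13, 8, 8, 8, 11, 8, 11]   # nos of heads from the slide

# One fair coin flip (head=1, tail=0) has variance 1/4, so s.d. 1/2.
sd_item = 0.5
sd_sum_theory = sqrt(flips_per_sample) * sd_item     # square root rule for sums
se_mean_theory = sd_item / sqrt(flips_per_sample)    # standard error of the mean

# Spread actually seen across the ten samples.
sd_sum_observed = stdev(heads_counts)
sample_means = [h / flips_per_sample for h in heads_counts]
se_mean_observed = stdev(sample_means)

print(round(sd_sum_theory, 3), round(sd_sum_observed, 3))
print(round(se_mean_theory, 3), round(se_mean_observed, 3))
```

With only ten samples the observed spread is itself a rough estimate, so it should be near the theoretical value rather than equal to it.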
so what?
experiments, data collection etc. ...
to halve the variation need 4 times as many subjects
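The factor of 4 follows directly from the square root rule for means. A minimal sketch, with hypothetical function names and a made-up s.d. of 5.0 for a single measurement:

```python
from math import ceil, sqrt

def standard_error(sd_item, n):
    """Standard error of the mean of n independent items."""
    return sd_item / sqrt(n)

def subjects_needed(sd_item, target_se):
    """Smallest n whose standard error is no larger than target_se."""
    return ceil((sd_item / target_se) ** 2)

sd_item = 5.0  # hypothetical spread of a single measurement
for n in (25, 100, 400):
    print(n, standard_error(sd_item, n))
```

Going from 25 to 100 subjects halves the standard error, and halving it again takes 400.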
solved it?
1. seeing through randomness
   use sample mean as estimator
2. knowing when you have
   σ(mean) = σ(item) / √n
?  what is σ(item)
   estimate it from sample!
estimating σ(item)
[Diagram: real variance σ² estimated from the sample data by Σ(x − µ)² / (n − 1), where µ is the sample mean]
use sample variance / s.d. as estimate of real variance
N.B. only an estimate
OK . . . but a tad bit small on average (biased estimator)
✰ that’s why stats. formulae are full of √(n−1)
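The effect of dividing by n − 1 rather than n can be checked exactly on a small case. The sketch below assumes a fair die as the "real" distribution and averages both versions of the variance estimate over every equally likely sample of three rolls; `average_estimate` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]
real_mean = Fraction(sum(faces), len(faces))
real_variance = sum((x - real_mean) ** 2 for x in faces) / len(faces)  # 35/12

def average_estimate(n, divisor):
    """Average of sum((x - sample mean)^2) / divisor over every
    equally likely sample of n die rolls."""
    samples = list(product(faces, repeat=n))
    total = Fraction(0)
    for sample in samples:
        sample_mean = Fraction(sum(sample), n)
        total += sum((x - sample_mean) ** 2 for x in sample) / divisor
    return total / len(samples)

n = 3
print(average_estimate(n, n))      # 35/18: too small on average
print(average_estimate(n, n - 1))  # 35/12: matches the real variance
```

Dividing by n gives an estimate that is too small on average, because the differences are measured from the sample mean rather than the real mean; dividing by n − 1 corrects this for the variance.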
drunkard's walk
• a drunk wanders home
  sometimes he takes one step forwards
  sometimes one step back
? after n steps how far is he from where he started
! another example of √n behaviour
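The √n behaviour can be made concrete with an exact calculation. The sketch below assumes each step is forwards or back with equal chance and independently of the others (the slide only says "sometimes"), and measures the typical distance as the root-mean-square distance from the start; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb, sqrt

def mean_square_distance(n):
    """Exact average of (distance from start)^2 after n steps, each
    one step forwards or one step back with equal chance (assumed)."""
    total = sum(comb(n, forwards) * (2 * forwards - n) ** 2
                for forwards in range(n + 1))
    return Fraction(total, 2 ** n)

for n in (1, 4, 16, 100):
    typical = sqrt(mean_square_distance(n))  # root-mean-square distance
    print(n, typical)
```

The average squared distance after n steps is exactly n, so the typical distance is √n: after 100 steps the drunk is typically about 10 steps from where he started.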