
Basic Statistics. “I always find that statistics are hard to swallow and impossible to digest. The...

Date post: 31-Mar-2015
Upload: marco-stokely
Transcript
Page 1: Basic Statistics. “I always find that statistics are hard to swallow and impossible to digest. The only one I can remember is that if all the people who.

Basic Statistics

Page 2:

“I always find that statistics are hard to swallow and impossible to digest.

The only one I can remember is that if all the people who go to sleep in church were laid end to end they would be a lot more comfortable.”

[Mrs Robert A Taft]

Page 3:

“Data! Data! Data!” he cried impatiently.

“I can’t make bricks without clay.”

[Sherlock Holmes]

Page 4:

Qualitative

a) Nominal data

(dead/alive; blood group O, A, B, AB)

b) Ordered categorical/ranked data

(mild/moderate/severe)

Page 5:

Quantitative

a) Numerical discrete

(no. of deaths in a hospital per year)

b) Numerical continuous

(age, weight, blood pressure)

Page 6:

Presenting data

• Graphs

• Summary statistics

• Tables

Page 7:

Graphical methods

• Pie chart

• Bar chart

• Histogram

• Scattergram

Page 8:

Pie chart: self-reported pain (categories: extreme pain, moderate pain, no pain)

Page 9:

Bar chart: self-reported pain (extreme pain, moderate pain, no pain) on the x-axis; no. of subjects (0 to 5000) on the y-axis

Page 10:

Histogram: age on the x-axis (20.0 to 65.0); y-axis 0 to 50

Page 11:

Boxplot: age (years, 0 to 100) by gender (male, female); N = 3600, 4465

Page 12:

Error bar plot: 95% CI of health status score (0 to 400) by mobility (severe problem, some problems, no problem)

Page 13:

Scattergram of creatinine vs. digoxin: digoxin on the x-axis (0 to 120); creatinine on the y-axis (0 to 140)

Page 14:

Graph Example

Page 15:

Graph: SF36 score (0 to 100) by SF36 sub-scale (General, Pain, Vitality, Mental, Role emo, Role phys, Soc fun, Phys fun) for two groups (Not ill, Long term ill)

Page 16:

Solution: SF36 score (0 to 100) by SF36 sub-scale (General, Pain, Vitality, Mental, Role emo, Role phys, Soc fun, Phys fun) for two groups (Not ill, Long term ill)

Page 17:

Summary statistics

Qualitative data

• Percentages

• Numbers
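
As a small illustration of reporting percentages alongside numbers, the sketch below recomputes two rows of the coronary heart disease table in this deck from their counts. The helper name `percent` is an assumption made for this example, and rounding to whole percentages is assumed to match the table.

```python
def percent(count, total):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)

# Group sizes and counts as quoted in the deck's table.
respondents_total, non_respondents_total = 1343, 578
male = (782, 314)            # (respondents, non-respondents)
large_practice = (630, 235)  # practice size > 10,000

for label, (resp, non_resp) in [("Male", male), ("> 10,000", large_practice)]:
    print(f"{label}: {percent(resp, respondents_total)}% ({resp}) vs "
          f"{percent(non_resp, non_respondents_total)}% ({non_resp})")
```

Showing the count next to each percentage lets a reader judge how much data lies behind the figure.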

Page 18:

Secondary prevention of coronary heart disease

                    Respondents    Non-respondents
                    (n=1343)       (n=578)
Male                58% (782)      54% (314)
Urban practice      54% (720)      57% (331)
Practice size:
  < 5,000           14% (190)      18% (105)
  5,000 – 10,000    39% (523)      41% (238)
  > 10,000          47% (630)      41% (235)

Page 19:

Summarizing data example I

Page 20:

Summary Statistics

Quantitative data

• Non-normal

median

range

inter-quartile range

• Normal

mean

standard deviation

variance
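
A minimal sketch of how both sets of summaries could be computed with Python's standard `statistics` module. The data values are hypothetical (they are not from the slides), and the quartiles use the module's default method, which may differ slightly from other packages.

```python
import statistics

# Hypothetical lengths of stay in days (illustrative only, not from the slides).
stay = [4, 5, 6, 7, 8, 8, 9, 12, 20, 36]

# Summaries usually quoted for non-normal data.
median = statistics.median(stay)
data_range = (min(stay), max(stay))
q1, _, q3 = statistics.quantiles(stay, n=4)  # quartile cut points

# Summaries usually quoted for Normal data.
mean = statistics.mean(stay)
sd = statistics.stdev(stay)          # sample standard deviation
variance = statistics.variance(stay)  # sample variance (sd squared)

print(f"median {median}, range {data_range}, IQR ({q1}, {q3})")
print(f"mean {mean}, SD {sd:.2f}, variance {variance:.2f}")
```

With skewed values like these, the mean is pulled above the median by the few long stays, which is why the median and inter-quartile range are preferred for such data.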

Page 21:

Boxplot: age (years, 0 to 100) by gender (male, female); N = 3600, 4465

Page 22:

Summary Statistics

Normal data

Approximately 95% of observations lie within 2 standard deviations of the mean (between mean − 2 SD and mean + 2 SD)
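
The 95% figure can be checked against the standard Normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library and assumes the data are exactly Normal, which real data never quite are.

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, SD 1

# Proportion of a Normal distribution lying within k SDs of the mean.
within_2sd = z.cdf(2) - z.cdf(-2)
within_196sd = z.cdf(1.96) - z.cdf(-1.96)

print(f"within 2 SD:    {within_2sd:.4f}")
print(f"within 1.96 SD: {within_196sd:.4f}")
```

The multiplier 2 is a convenient rounding of 1.96, the value that gives exactly 95%.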

Page 23:

Histogram: age on the x-axis (20.0 to 65.0); y-axis 0 to 50

Page 24:

Histogram of IgM values: IgM on the x-axis (0.00 to 3.00); y-axis 0 to 140

Page 25:

How to test for Normality

• Mean ≈ median

• (mean − 2 SD, mean + 2 SD) is a plausible range for the data

• -1 < skewness < 1

• -1 < kurtosis < 1

• Histogram shows symmetric bell shape
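
The slides do not define skewness or kurtosis, so the sketch below assumes the simple moment-based versions, with kurtosis reported as excess kurtosis (0 for a Normal distribution). Statistical packages often apply small-sample corrections, so their values may differ a little. The function names and data are hypothetical.

```python
import statistics

def skewness(xs):
    """Moment-based skewness: third central moment divided by SD cubed."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)  # population SD, matching the moment definition
    return sum((x - m) ** 3 for x in xs) / (len(xs) * sd ** 3)

def excess_kurtosis(xs):
    """Moment-based kurtosis minus 3, so a Normal distribution scores 0."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum((x - m) ** 4 for x in xs) / (len(xs) * sd ** 4) - 3

# Hypothetical data: one symmetric set and one with a long right tail.
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_skewed = [1, 1, 1, 2, 2, 3, 20]

for name, xs in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    print(f"{name}: skewness {skewness(xs):.2f}, "
          f"excess kurtosis {excess_kurtosis(xs):.2f}")
```

A symmetric sample gives skewness near 0, while a long right tail pushes skewness above 1 and so fails the rule of thumb above.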

Page 26:

Checking for Normality

            Age     Length of stay   Satisfaction score
Mean        66.2    12.1             5.2
Median      67      8                9
SD          8.2     9.0              4.3
Minimum     49      4                1
Maximum     80      36               10
Skewness    -0.2    1.8              -2.5
Kurtosis    0.5     1.3              4.6
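
As a rough sketch, the rules of thumb from the previous slide can be applied to the summary values in this table. The dictionary layout and the helper `shape_ok` are assumptions made for this example, and the mean ± 2 SD interval is only printed for inspection, since "reasonable" is a judgment call.

```python
# Summary values copied from the table above.
summaries = {
    "Age":                {"mean": 66.2, "sd": 8.2, "skewness": -0.2, "kurtosis": 0.5},
    "Length of stay":     {"mean": 12.1, "sd": 9.0, "skewness": 1.8, "kurtosis": 1.3},
    "Satisfaction score": {"mean": 5.2, "sd": 4.3, "skewness": -2.5, "kurtosis": 4.6},
}

def shape_ok(s):
    """True when skewness and kurtosis both lie strictly between -1 and 1."""
    return -1 < s["skewness"] < 1 and -1 < s["kurtosis"] < 1

for name, s in summaries.items():
    low = s["mean"] - 2 * s["sd"]
    high = s["mean"] + 2 * s["sd"]
    print(f"{name}: mean ± 2 SD = ({low:.1f}, {high:.1f}), shape ok: {shape_ok(s)}")
```

Only age passes. For length of stay and satisfaction score the lower limit of mean − 2 SD is negative, which is impossible for these measurements and is a further sign that the data are not Normal.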

Page 27:

Secondary prevention of coronary heart disease

Mean (sd)
                          Respondents    Non-respondents
                          (n=1343)       (n=578)
Age (years)               66.2 (8.2)     66.6 (8.7)
Time since MI (mths) *    10 (6, 35)     15 (8, 47)
Cholesterol (mmol/l)      6.5 (1.2)      6.6 (1.2)

[* Median (range)]

Page 28:

Summary statistics example II

Page 29:

Natural log transformation

• Can transform positively skewed data to approximately ‘Normal’ data

• Use the transformed data in the analysis

• Back-transform the resulting mean (using e^x) to give the geometric mean

• Present the geometric mean and range
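
The steps above can be sketched in a few lines. The data are hypothetical, and all values must be positive for the log to exist.

```python
import math
import statistics

# Hypothetical positively skewed lengths of stay in days (not from the slides).
stay = [4, 5, 6, 7, 8, 8, 9, 12, 20, 36]

logs = [math.log(x) for x in stay]     # natural log of each value
mean_log = statistics.fmean(logs)      # mean on the log scale
geometric_mean = math.exp(mean_log)    # back-transform with e^x

print(f"arithmetic mean {statistics.mean(stay):.1f}, "
      f"geometric mean {geometric_mean:.1f}, "
      f"range ({min(stay)}, {max(stay)})")
```

The geometric mean sits below the arithmetic mean because the log scale reduces the influence of the few very large values.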

Page 30:

Effect of log_e transformation

            Length of stay   Log_e length of stay
Mean        12.1             2.2
Median      8                2.1
SD          9.0              0.5
Minimum     4                1.4
Maximum     36               3.6
Skewness    1.8              0.4
Kurtosis    1.3              0.7

[Geometric mean = e^2.2 = 9.0]

Page 31:

Secondary prevention of coronary heart disease

Mean (sd)
                          Respondents    Non-respondents
                          (n=1343)       (n=578)
Age (years)               66.2 (8.2)     66.6 (8.7)
Time since MI (mths) *    10 (6, 35)     15 (8, 47)
Cholesterol (mmol/l)      6.5 (1.2)      6.6 (1.2)
Length of stay #          9.0 (4, 36)    11.2 (6, 83)

[* Median (range), # Geometric mean (range)]

Page 32:

Confidence Interval

“The estimated mean difference in systolic blood pressure between 100 diabetic and 100 non-diabetic men was 6.0 mmHg with 95% confidence interval (1.1 mmHg, 10.9 mmHg)”

Page 33:

Confidence Interval

• Contains information about the (im)precision of the estimated effect size

• Presents a range of values, on the basis of the sample data, in which the population value for such an effect size may lie

Page 34:

Confidence Interval

95% CI for mean = mean ± 1.96 SEM

90% CI for mean = mean ± 1.64 SEM

SEM (standard error of the mean) = sd / sqrt(n)
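
A minimal sketch of these formulas, using hypothetical blood pressure readings. The function name `mean_ci` is an assumption for this example. The Normal multipliers are used as on the slide; for a sample this small a t-distribution multiplier would usually be preferred.

```python
import math
import statistics

def mean_ci(xs, z=1.96):
    """Return (lower, upper) limits of mean ± z * SEM."""
    mean = statistics.fmean(xs)
    sem = statistics.stdev(xs) / math.sqrt(len(xs))  # SEM = sd / sqrt(n)
    return mean - z * sem, mean + z * sem

# Hypothetical systolic blood pressures in mmHg (not from the slides).
bp = [118, 125, 130, 121, 135, 128, 122, 140, 119, 127]

ci95 = mean_ci(bp, z=1.96)
ci90 = mean_ci(bp, z=1.64)
print(f"mean {statistics.fmean(bp):.1f}")
print(f"95% CI ({ci95[0]:.1f}, {ci95[1]:.1f})")
print(f"90% CI ({ci90[0]:.1f}, {ci90[1]:.1f})")
```

The 90% interval is narrower than the 95% interval: asking for less confidence buys a tighter range.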

Page 35:

Confidence Interval

• The 95% CI is a range of values which we are 95% confident covers the true population mean

• If the study were repeated many times, about 5% of the 95% CIs calculated this way would not contain the ‘true’ mean
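
The meaning of "95% confident" can be illustrated by simulation: draw many samples from a population whose mean is known, compute a 95% CI from each, and count how often the interval covers the true mean. All constants below are hypothetical choices for the sketch.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN, TRUE_SD = 120.0, 15.0  # hypothetical population values
N, REPEATS = 100, 2000            # sample size and number of simulated studies

covered = 0
for _ in range(REPEATS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / math.sqrt(N)
    if mean - 1.96 * sem <= TRUE_MEAN <= mean + 1.96 * sem:
        covered += 1

coverage = covered / REPEATS
print(f"proportion of intervals covering the true mean: {coverage:.3f}")
```

The proportion should come out close to 0.95, with the remaining intervals missing the true mean.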

Page 36:

Error bar plot: 95% CI of health status score (0 to 400) by mobility (severe problem, some problems, no problem)

Page 37:

Confidence Interval Example

Page 38:

Significance/hypothesis tests

Measure the strength of evidence provided by the data for or against some proposition of interest

E.g. Is the survival rate after X better than after Y?

Page 39:

Significance/hypothesis tests

Null hypothesis:

“Effects of X and Y are the same”

Alternative hypothesis:

“Effects of X and Y are different”

Page 40:

Significance/hypothesis tests

One-sided:

“X is better than Y”

Two-sided:

“X and Y have different effects”

Page 41:

P-value

P is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true. It is not the probability that the null hypothesis is true

Page 42:

P-value

P <= 0.05

• the data provide evidence against the null hypothesis, which is rejected at the 5% level

• the data suggest a difference between X and Y

• result is statistically significant

Page 43:

P-value

P > 0.05

• null hypothesis may be true

• there is insufficient evidence of a difference between X and Y (this is not proof that there is no difference)

• result is not statistically significant

Page 44:

P-value

Power of study

• probability of rejecting null hypothesis when false

• increased by increasing sample size

• increased if true difference between treatments is large
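
Both effects can be seen in a simple approximation. The sketch below assumes a two-sided comparison of two group means with equal group sizes and a known common SD, so that a Normal (z) approximation applies. The function name `approx_power` and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(difference, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test with equal groups."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)      # 1.96 for alpha = 0.05
    se = sd * math.sqrt(2 / n_per_group)   # SE of the difference in means
    shift = abs(difference) / se           # true difference in SE units
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

for n in (20, 80):
    for diff in (5, 10):
        print(f"n per group {n}, true difference {diff}: "
              f"power {approx_power(diff, sd=10, n_per_group=n):.2f}")
```

Power rises with both the sample size and the size of the true difference. When the true difference is zero the function returns alpha, the chance of a false positive.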

Page 45:

P-value

Statistical significance does not imply clinical significance

Page 46:

A statistician is a person whose lifetime ambition is to be wrong 5% of the time

Page 47:

Types of significance tests

Chi-square test:

“28 out of 70 smokers have a cough compared with 5 out of 50 non-smokers - is there a significant difference?”

[28/70 = 40% compared with 5/50 = 10%]

Page 48:

Chi-square test result

“P=0.001”

There is a significant relationship between smoking and cough
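
The smoking and cough comparison (28 of 70 versus 5 of 50) can be sketched as a 2x2 chi-square calculation. This version assumes the plain Pearson statistic without a continuity correction; the slides do not say which variant was used, so the P-value here need not match the quoted figure exactly. The function name is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Rows: smokers, non-smokers. Columns: cough, no cough.
chi2, p = chi_square_2x2(28, 42, 5, 45)
print(f"chi-square = {chi2:.2f}, P = {p:.4f}")
```

The P-value comes out well below 0.05, in line with the slide's conclusion of a significant relationship between smoking and cough.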

Page 49:

Types of significance tests

Two-sample t-test:

“Is there a difference in the 24 hour energy expenditure between groups of lean and obese women?”
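
A minimal sketch of the two-sample t statistic, assuming roughly Normal data and equal variances in the two groups (the pooled-variance form). The energy expenditure values are hypothetical, and the function name `pooled_t` is an assumption. No P-value is computed because the Python standard library has no t distribution; the statistic would be compared with t tables on the stated degrees of freedom.

```python
import math
import statistics

def pooled_t(xs, ys):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t-test."""
    nx, ny = len(xs), len(ys)
    pooled_var = ((nx - 1) * statistics.variance(xs)
                  + (ny - 1) * statistics.variance(ys)) / (nx + ny - 2)
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))  # SE of the difference in means
    t = (statistics.fmean(xs) - statistics.fmean(ys)) / se
    return t, nx + ny - 2

# Hypothetical 24 hour energy expenditure in MJ/day (not from the slides).
lean = [6.1, 7.0, 7.5, 7.9, 8.1, 8.4]
obese = [8.8, 9.2, 9.7, 10.0, 10.9, 11.5]

t, df = pooled_t(lean, obese)
print(f"t = {t:.2f} on {df} degrees of freedom")
```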

Page 50:

Types of significance tests

Mann-Whitney U-test:

“Is there a difference in the nausea score between chemo patients receiving an active anti-emetic treatment and those receiving placebo?”
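
The Mann-Whitney test works on the ordering of the values, not their sizes, which suits ordinal scores. The sketch below computes only the U statistic by direct pairwise comparison. The nausea scores are hypothetical, and turning U into a P-value (from tables or a Normal approximation) is left out.

```python
def mann_whitney_u(xs, ys):
    """U for xs: number of (x, y) pairs with x > y, counting ties as one half."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

# Hypothetical nausea scores (higher = worse), not from the slides.
active = [1, 2, 2, 3]
placebo = [3, 4, 5, 5]

u_active = mann_whitney_u(active, placebo)
u_placebo = mann_whitney_u(placebo, active)
print(f"U (active) = {u_active}, U (placebo) = {u_placebo}")
```

The two U values always add up to the number of pairs. A U near 0 for one group means almost all of its scores lie below the other group's.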

Page 51:

Types of significance tests

Paired t-test:

“Is there a difference in the dietary intake of a group of students in the week before and after Finals?”
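
For paired data the test is applied to the within-person differences. A minimal sketch with hypothetical intake values follows; the function name `paired_t` is an assumption, and the differences are assumed to be roughly Normal. As the standard library has no t distribution, only the statistic and its degrees of freedom are returned.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(n)  # SE of the mean difference
    return statistics.fmean(diffs) / sem, n - 1

# Hypothetical daily intake in MJ for five students (not from the slides).
before = [10, 12, 9, 11, 13]
after = [11, 14, 9, 13, 14]

t, df = paired_t(before, after)
print(f"t = {t:.2f} on {df} degrees of freedom")
```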

Page 52:

Types of significance tests

Wilcoxon matched pairs signed rank test or the Sign test:

“Is there a difference in the units of alcohol consumed by students in the week before and after Finals?”
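
Of the two tests named, the Sign test is the simpler to sketch: it uses only the direction of each paired change and an exact binomial calculation. The data and function name below are hypothetical, and pairs with no change are dropped, which is one common convention.

```python
import math

def sign_test_p(before, after):
    """Two-sided exact Sign test P-value for paired data (zero changes dropped)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    increases = sum(1 for d in diffs if d > 0)
    tail = min(increases, n - increases)
    # Probability of a split at least this uneven if increases and decreases
    # were equally likely (binomial with probability 0.5).
    one_sided = sum(math.comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * one_sided)

# Hypothetical units of alcohol per week for six students (not from the slides).
before = [10, 12, 9, 11, 13, 8]
after = [12, 15, 9, 14, 16, 10]

p = sign_test_p(before, after)
print(f"Sign test P = {p}")
```

Five students increased and none decreased, yet P is 0.0625: with so few pairs the Sign test cannot reach the 5% level, which illustrates its low power compared with the Wilcoxon test.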

Page 53:

Significance test example

Page 54:

Correlation

Measures the strength of the relationship between two variables

Page 55:

Scattergram of creatinine vs. digoxin: digoxin on the x-axis (0 to 120); creatinine on the y-axis (0 to 140)

Page 56:

Correlation

Pearson correlation:

• Used for Normally distributed data

• Measures linear relation between variables

Page 57:

Correlation

• r = 0 no linear relationship

• r = 1 perfect +ve linear relationship

• r = -1 perfect –ve linear relationship
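
These three reference values can be reproduced with a direct implementation of the Pearson coefficient. The data are hypothetical and the function name `pearson_r` is an assumption for this sketch.

```python
import math
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])    # rises in a straight line
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # falls in a straight line
r_curve = pearson_r(x, [4, 1, 0, 1, 4])  # U-shaped: related, but not linearly

print(f"rising {r_up:.2f}, falling {r_down:.2f}, U-shaped {r_curve:.2f}")
```

The U-shaped example gives a coefficient of zero even though y depends entirely on x, so a value near 0 rules out only a linear relationship. Plotting the scattergram first guards against this.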

Page 58:

Scattergram of creatinine vs. digoxin: digoxin on the x-axis (0 to 120); creatinine on the y-axis (0 to 140)

Page 59:

Correlation

Spearman correlation:

• Used for non-Normally distributed data

• Measures monotonic relationship between variables
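
Spearman correlation can be sketched as the Pearson coefficient applied to ranks. The helper names and data below are hypothetical; tied values are given the average of the ranks they span, which is the usual convention.

```python
import math
import statistics

def ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # rises steadily with x, but along a curve
print(f"Pearson {pearson_r(x, y):.2f}, Spearman {spearman_rho(x, y):.2f}")
```

Because y always increases with x, the Spearman coefficient is 1 even though the relationship is curved and the Pearson coefficient falls short of 1.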

Page 60:

Correlation Example

Page 61:

Correlation: scattergram of change in left-ventricular mass (g, -40 to 120) on the y-axis against change in IGF-1 (ng/ml, -200 to 200) on the x-axis, for two groups (placebo, rhGH)

Page 62:

“The government are very keen on amassing statistics. They collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams. But you must never forget that every one of these figures comes in the first instance from the village watchman, who just puts down what he damn pleases.”

[Comment of a judge on the subject of government statistics, 1920]

