Lund 2009

Biostatistical analysis: some basic issues

Jonas Ranstam PhD

Plan

Statistical principles and their consequences when writing manuscripts for publication in scientific journals

How does this information relate to your research?

General discussion

Statistics is not mathematics

Mathematics is about deduction; statistics starts with data.

John Nelder

Litmus test

A simple test for the acidity of a substance.

Student's t-test

A simple test for the significance of a finding.

Differences between litmus tests and statistical tests

Concentrated hydrochloric acid always turns blue litmus paper red.

A clinically significant difference in systolic blood pressure is not always statistically significant.

Whenever a blue litmus paper remains blue it has not been exposed to concentrated hydrochloric acid (evidence of absence).

A statistically insignificant difference in systolic blood pressure may be clinically significant (absence of evidence).

The interpretation of a blue litmus paper turning red is independent of the number of tests performed (no multiplicity issues).

The interpretation of a statistical test showing significance depends on the number of tests performed (multiplicity issues).

A sample is litmus tested to know more about the sample.

A sample is statistically tested to know more about the population from which the sample is drawn.

A population of black and white dots

Population types

Finite (having 100 dots)

Superpopulation (infinite, but symbolized by the 100 dots)

Sampling dots from the population

Sampling uncertainty

What is the sampling uncertainty?

The sample is usually obvious, but what is the population?

What do the sampled items represent: a finite population or a superpopulation?

Sampling uncertainty

The Central Limit Theorem (CLT) states:

The mean of the sampling distribution of means equals the mean of the population from which the samples were drawn.

The variance of the sampling distribution of means equals the variance of the population from which the samples were drawn divided by the size of the samples.

The sampling distribution of means will approximate a Gaussian distribution as sample size increases.

ȳ ~ N(μp; σp/√n)
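The three CLT statements above can be checked empirically. A minimal sketch in Python; the choice of population (exponential, deliberately non-Gaussian) and the sample sizes are illustrative assumptions:

```python
import random
import statistics

random.seed(1)

# Illustrative, clearly non-Gaussian population of values.
population = [random.expovariate(1.0) for _ in range(100_000)]
mu_p = statistics.mean(population)
sigma_p = statistics.pstdev(population)

n = 50  # sample size
# Sampling distribution of means: many repeated samples of size n.
means = [statistics.mean(random.sample(population, n)) for _ in range(2_000)]

print(statistics.mean(means), mu_p)               # nearly equal
print(statistics.stdev(means), sigma_p / n**0.5)  # nearly equal
```

The mean of the sample means tracks μp, and their spread tracks σp/√n, even though the population itself is skewed.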

Measurement uncertainty

A measurement, Yi, has two components: the measured object's true measure, μ0, and a measurement error, ei:

Yi = μ0 + ei

A measurement error is typically unknown, but the population of errors may have a known distribution:

ei ~ N(μe; σe)

If μe ≠ 0, measurements will be biased, and the greater σe, the lower the measurement's precision.

Combining sampling uncertainty and measurement errors

As long as measurements are not biased, measurement errors reduce statistical precision.

This may, however, in turn lead to "dilution bias", also known as "attenuation by errors".
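Attenuation by errors can be illustrated with a small simulation: adding unbiased measurement error to a predictor shrinks the fitted regression slope toward zero. The true slope and error SDs below are arbitrary choices for illustration:

```python
import random
import statistics

random.seed(2)

true_x = [random.gauss(0, 1) for _ in range(10_000)]
y = [2.0 * x + random.gauss(0, 0.5) for x in true_x]  # true slope = 2

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (v - my) for x, v in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Unbiased measurement error (mean 0, SD 1) added to x.
noisy_x = [x + random.gauss(0, 1) for x in true_x]

print(slope(true_x, y))   # close to the true slope, 2
print(slope(noisy_x, y))  # attenuated toward zero, close to 1
```

With error variance equal to the predictor's variance, the expected attenuation factor is 1/2, which is what the simulation shows.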

Alt. 1. Statistical hypothesis testing

Sampling uncertainty described by a probability

H0: π = π0

H1: π ≠ π0

p = Prob(drawing a sample looking like H1 | H0)

If the sample is unlikely under H0, reject H0 ("statistically significant")
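For the dot-sampling example, this can be made concrete with an exact binomial test of H0: π = π0. The observed count below is made up for illustration:

```python
from math import comb

def binom_two_sided_p(k, n, pi0):
    """Exact two-sided binomial test of H0: pi = pi0.
    Sums the probabilities of all outcomes no more likely than the observed one."""
    probs = [comb(n, i) * pi0**i * (1 - pi0) ** (n - i) for i in range(n + 1)]
    p_obs = probs[k]
    return min(1.0, sum(p for p in probs if p <= p_obs + 1e-12))

# Suppose 40 black dots in a sample of 100; H0: pi = 0.5
p = binom_two_sided_p(40, 100, 0.5)
print(round(p, 3))  # about 0.057, so H0 is not rejected at the 5% level
```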

Alt. 2. Confidence interval

Sampling uncertainty described by a range of values likely (usually 95%) to include an estimated parameter

πE (πL, πU)

P-values vs. confidence intervals

[Figure: confidence intervals plotted against a zero-effect line. An interval entirely above zero indicates a statistically significant effect (p < 0.05); an interval crossing zero is inconclusive (n.s.); an interval entirely below zero indicates a statistically significant reversed effect (p < 0.05).]

P-values vs. confidence intervals

[Figure: confidence intervals plotted against both a zero-effect line and a clinically significant effect threshold. An interval entirely above the clinical threshold: statistically and clinically significant effect (p < 0.05). An interval above zero but crossing the clinical threshold: statistically, but not necessarily clinically, significant effect (p < 0.05). An interval entirely between zero and the clinical threshold: statistically significant but clinically insignificant effect (p < 0.05). An interval crossing zero: inconclusive, or neither statistically nor clinically significant (n.s.). An interval entirely below zero: statistically significant reversed effect (p < 0.05).]

ICMJE Uniform requirements for manuscripts...

Results

“When possible, quantify findings and present them with appropriate indicators of measurement error or uncertainty (such as confidence intervals).”

“Avoid relying solely on statistical hypothesis testing, such as the use of P values, which fails to convey important information about effect size.”

Present observed data using

a) central tendency (mean, median or mode), b) variability (SD or range) and c) number of observations (n).

Describe parameter estimates

a) with 95% confidence intervals (2SEM) and b) number of observations in the sample (n).

Many scientists do not understand the difference between sample and population

+/-SD or +/-SEM?

SD is a measure of variability.

SEM is a measure of sampling uncertainty.

Note

+/- 1 SEM corresponds approximately to a 68% confidence interval; +/- 2 SEM corresponds approximately to a 95% confidence interval.
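A sketch of the SD/SEM distinction and the 2*SEM rule of thumb, using hypothetical blood-pressure data:

```python
import random
import statistics

random.seed(3)
# Hypothetical sample of systolic blood pressures.
sample = [random.gauss(120, 15) for _ in range(100)]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)   # variability of the observations
sem = sd / n ** 0.5             # sampling uncertainty of the mean

# The 2*SEM rule of thumb for an approximate 95% confidence interval.
ci_low, ci_high = mean - 2 * sem, mean + 2 * sem
print(f"mean {mean:.1f}, SD {sd:.1f}, SEM {sem:.2f}, "
      f"95% CI ({ci_low:.1f}, {ci_high:.1f})")
```

The SD describes the sample; the SEM (here SD/10, since n = 100) describes how precisely the population mean is estimated.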

ICMJE Uniform requirements for manuscripts...

Methods

Describe statistical methods with enough detail to enable a knowledgeable reader with access to the original data to verify the reported results.

This part of the manuscript could be written prior to the experiment.

Sample size calculation

An essential part of designing a study, or experiment, is planning for uncertainty.

Both too much and too little uncertainty in results are unethical, because research resources are scarce and should be used rationally.

Sample size calculation

Sampling uncertainty increases with the variability of the studied variable.

Sampling uncertainty decreases with the number of independent observations in the sample.

Smaller sampling uncertainty is required to detect small differences than large ones.

Example 1. A vaccine trial

Without protection 30% will fall ill.

Investigating a protective effect with a 5% false positive and a 20% false negative error rate requires:

Protection   Nr of patients
90%          72
80%          94
70%          128
60%          180
50%          268
40%          428
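The table's numbers appear consistent with the continuity-corrected normal approximation (Fleiss) for comparing two proportions. A sketch under that assumption; the function name total_patients is my own:

```python
from math import ceil, sqrt
from statistics import NormalDist

def total_patients(p1, p2, alpha=0.05, power=0.80):
    """Total sample size (both groups combined) for comparing two
    proportions: normal approximation with continuity correction."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    z_b = NormalDist().inv_cdf(power)           # 0.84 for 80% power
    p_bar = (p1 + p2) / 2
    n = (z_a * sqrt(2 * p_bar * (1 - p_bar))
         + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / (p1 - p2) ** 2
    # Continuity correction (Fleiss).
    n_cc = n / 4 * (1 + sqrt(1 + 4 / (n * abs(p1 - p2)))) ** 2
    return 2 * ceil(n_cc)

# 30% fall ill without protection; e.g. 50% protection -> 15% in the vaccine arm.
for protection in (0.9, 0.8, 0.7, 0.6, 0.5, 0.4):
    p2 = 0.30 * (1 - protection)
    print(f"{protection:.0%} protection: {total_patients(0.30, p2)}")
```

Note how the required sample size grows rapidly as the effect to be detected shrinks.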

Example 2. Side effect surveillance

Guillain-Barré syndrome: incidence = 1×10^-5 per person-year.

To investigate the side effect with a 5% false positive and a 20% false negative error rate requires:

Risk increase   Number of patients
100 times       1 098
50 times        2 606
20 times        9 075
10 times        26 366
5 times         92 248
2 times         992 360

Multiplicity

Each tested null hypothesis has, at the 5% significance level, a 5% risk of being a false positive.

Testing n null hypotheses at the 5% significance level leads to an overall probability of at least one false positive test of 1 - 0.95^n:

n   overall p
1   0.050
2   0.098
3   0.143
4   0.185
5   0.226

Some experiments have multiplicity problems severe enough to make conventional statistical testing meaningless.
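The familywise error rate behind this point is straightforward to compute; a minimal check of the 1 - 0.95^n formula:

```python
def familywise_error(n, alpha=0.05):
    """Probability of at least one false positive among n independent
    tests, each run at significance level alpha."""
    return 1 - (1 - alpha) ** n

for n in range(1, 6):
    print(n, round(familywise_error(n), 3))

# With many tests the problem becomes severe:
print(20, round(familywise_error(20), 3))  # about 0.64
```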

Sampling from a population without variation

Without variation there is no sampling uncertainty to evaluate.

Do not test deterministic outcomes, like sex distribution after matching for sex or baseline imbalance after randomisation.

How to report laboratory experiments

Guidelines to promote the transparency of research have been developed in clinical medicine and epidemiology: CONSORT, STROBE, PRISMA, etc.

No such guidelines exist for reporting laboratory experiments.

Would reporting guidelines be useful?

A systematic review (Roberts et al. BMJ 2002;324:474-476) shows that reporting is generally inadequate: only 2 of 44 papers described the allocation of analysis units.

General principles

The experiment should be described in a way that makes it possible for the reader to repeat the experiment.

The statistical analysis should be described with enough detail to allow a reader with access to original data to verify reported results.

Introduction section

What is the purpose of the experiment?

What hypotheses will you test (in general terms)?

Material and methods section

What is the design of your experiment?

What is your sample and population?

What is your analysis unit?

Statistics section

What statistical methods did you use?

Did you check whether the assumptions were fulfilled?

How did you do that?

Were the assumptions fulfilled?

Results section (observation)

What was the mean or median value?

What was the variation (SD or range)?

How many observations did you have?

Results section (inference)

What hypotheses did you test (specifically)?

What was the effect size (parameter estimate)?

What was its 95% confidence interval?

If you present p-values:

Present the actual p-value, e.g. p = 0.45 or p = 0.003, unless p < 0.0001

Do NOT write p > 0.05, n.s., p = 0.0000, or stars.
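A small helper following this reporting convention; the function name and the two-significant-digit rounding rule are my own illustration, not part of the slides:

```python
def format_p(p):
    """Format a p-value as recommended: report the actual value,
    except below the conventional floor of 0.0001."""
    if p < 0.0001:
        return "p < 0.0001"
    return f"p = {p:.2g}"  # two significant digits, e.g. 0.45 or 0.003

print(format_p(0.45))     # p = 0.45
print(format_p(0.003))    # p = 0.003
print(format_p(0.00002))  # p < 0.0001
```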

Discussion section

What is your strategy for multiplicity?

What is your interpretation of the presented p-values?

Can you see any problems with bias?

What is your overall conclusion?

Discussion

1. Describe an experiment you have performed or plan to perform and state its purpose.

2. Present the sample and the population from which the sample is drawn.

3. Suggest how the sampling uncertainty could be presented in a manuscript submitted for publication in a scientific journal using p-values and confidence intervals.

Help

Ranstam J. Sampling uncertainty in medical research. Osteoarthritis Cartilage 2009;17:1416-1419.

Ranstam J. Reporting laboratory experiments. Osteoarthritis Cartilage 2009 (in press) doi:10.1016/j.joca.2009.07.006.