Statistical Writing
Sven Sandin, Dept of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm
http://ki.se/en/people/svesan
What will I talk about
The scientific (statistical) method
Manuscript as part of the work flow …. PLANNING !
Statistical issues
- Statistical significance
- Power
- Statistics section
STROBE
Types of Presentations
Original research papers
- Hypothesis testing
- Descriptive
Meta-analysis
Review
Poster
Conference proceeding
What is important ? Why ?
What is a statistical method ?
What will good procedures achieve
Prevent fraud …..
QA. Increase the value of our own work
Communicating our results with our colleagues
Efficiency and speed
Getting published !
Deduction or Induction
Inductive vs Deductive
Deductive
Top-down
Mathematical logic
A > B and B > C …. so A > C
Induction
Bottom-up
Probabilistic
Induction - Statistical inference
Induction
1. Most women are shorter than 185 cm
2. Maria is a woman
3. Maria is most likely shorter than 185 cm
This involves assumptions and earlier knowledge
(a) Fact ? assumption ? speculation ? Direction ? Interesting ?
(b) Results from the study
(c) Conclusion
Where is the weak spot ?
Lawyer ?
Design
What results can the design give us ?
Cause vs association
Generalizability
Confounding
Precision
Types
Cross-sectional
Cohort, case-control
Longitudinal, Cross-over
What is important ? Why ?
Most important ?
Repeatability and truth. Given these, anything is allowed.
State what was done. Pre-specified !
How can we assure that it is possible to repeat ?
........ YOU WILL be asked to re-calculate …..
Structure of the manuscript
Objective ↔ Conclusion
Objective ← Variables & Methods → Results
Don't be afraid of formulating a hypothesis and objective
- Is “Will explore....” good ?
- Quality grade (strength) ?
Mathematical ...
The type phase 3 trial ....
Causality
What can the design give us
Need to focus and limit the study.
THIS IS THE SCIENTIFIC (WORK) PROCESS !
Structure of the manuscript
Objective ↔ Conclusion
Objective ← Variables & Methods → Results
Results → Conclusion
Keep focus - Online material
What is a result
Separate what is supported by the data and what is not
"A strikingly similar pattern appears for ....."
"... observed a dramatic increase ..."
"... it is clear from the data..."
"... it is borderline significant"
Results arise from the planned process of data collection. Anything else belongs in the discussion.
The discussion section shows your expert judgement
…... sometimes, as a reader, you are not interested in this …..
Is it statistically significant ?
What does it mean ?
Valuable concept - don't trash it !
Be exact and careful
- “statistically significant” or “significant” ?
- year or light year ?
- borderline not statistically significant ??
How much alpha do we have ?
Multiplicity
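How much alpha is left after several tests can be sketched with a few lines of arithmetic. This is a minimal illustration, assuming independent tests and using Python only for convenience (the function names are hypothetical, not from the slides). The Bonferroni division is one common, conservative way to share alpha, not the only one.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Risk of at least one false rejection among n_tests independent tests,
    each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_level(alpha: float, n_tests: int) -> float:
    """Per-test level that keeps the overall risk at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    # Assumed illustration: an overall alpha of 0.05 spread over 1, 5 and 20 tests.
    for k in (1, 5, 20):
        print(k, familywise_error_rate(0.05, k), bonferroni_level(0.05, k))
```

With twenty unadjusted tests the overall risk of a false rejection is well above one half, which is why the number of tests belongs in the methods section.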
Is it statistically significant ?
Four steps
1) Set up null and alternative hypothesis. E.g.
H0: m1 = m2, where m1 and m2 are the means of group 1 and group 2
HA: m1 ≠ m2
2) Define what test to use, e.g. t-test or Wilcoxon rank sum, including significance level
3) Calculate the test statistic and p-value
4) Decide: Reject H0 or not
Two possible errors in the decision
1) Type I (alpha): Conclude difference where there is none. False rejection of H0.
2) Type II (beta): Fail to conclude a true difference. False failure to reject H0.
alpha is the risk associated with our final statement
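The four steps can be sketched in code. This is a minimal illustration, assuming a two-sided Wilcoxon rank sum test with the normal approximation and no tie or continuity correction. Python is used only for illustration, and the function names and data are hypothetical. For real analyses, standard software gives exact and corrected versions.

```python
from statistics import NormalDist


def average_ranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(group1, group2):
    """Step 3: test statistic (z) and two-sided p-value, normal approximation."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    w = sum(ranks[:n1])                        # rank sum of group 1
    mean_w = n1 * (n1 + n2 + 1) / 2            # expected rank sum under H0
    sd_w = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (w - mean_w) / sd_w
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Step 1: H0: no difference between groups, HA: a difference.
    # Step 2: Wilcoxon rank sum, two-sided, significance level fixed in advance.
    alpha = 0.05
    group1 = [1, 2, 3, 4, 5]      # hypothetical measurements
    group2 = [6, 7, 8, 9, 10]     # hypothetical measurements
    z, p_value = rank_sum_test(group1, group2)
    # Step 4: decide whether to reject H0.
    print("z =", z, "p =", p_value, "reject H0:", p_value < alpha)
```

The significance level is set before the p-value is calculated, so the decision in step 4 cannot drift with the result.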
Power
The power has a meaning before the data are collected
Calculating power for the observed data is not meaningful
The information sits in the confidence interval
“Observed 1-power is the probability that the observed result was false. In other words, there is an effect but the sample is too small” ...... WRONG!
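Power used before data collection, as a planning tool, can be sketched as a sample size calculation. This is a minimal illustration, assuming two equally sized groups, a continuous outcome with a common standard deviation, and the normal approximation (a t-based calculation gives slightly larger numbers). The function names and the planning values are hypothetical.

```python
from math import ceil
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Subjects per group needed to detect a mean difference delta
    with a two-sided test at level alpha (normal approximation)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def approx_power(n, delta, sd, alpha=0.05):
    """Approximate power with n subjects per group (normal approximation,
    ignoring the negligible rejection probability in the opposite tail)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(delta / (sd * (2 / n) ** 0.5) - z_alpha)


if __name__ == "__main__":
    # Assumed planning values: smallest difference of interest 5, SD 10.
    n = n_per_group(delta=5, sd=10)
    print("n per group:", n, "approximate power:", approx_power(n, 5, 10))
```

The inputs are the pre-specified difference of interest and an assumed standard deviation, never the estimates observed in the finished study.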
Power
Hoenig JM, Heisey DM. The Abuse of Power. The American Statistician 2001; 55: 19–24.
Levine M, Ensom MH. Post hoc power analysis: an idea whose time has passed? Pharmacotherapy 2001; 21: 405–9.
Choice of methods/technique
Minimize testing
Test for baseline differences ?
Do you need the assumptions ? ..... check them
Don't dichotomize everything … loss of power .... interpretation.
Missing values
- Not an exclusion criterion … part of methods ....
- Maximum likelihood and MAR ....
- Biases ...
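The loss from dichotomizing can be illustrated with a small simulation. This is a minimal sketch, assuming a normally distributed exposure, a linear effect on the outcome and a cut at the median. All names and parameter values are hypothetical, and Python is used only for illustration.

```python
import random
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=5000, slope=0.5, seed=1):
    """Return the correlation of the outcome with the continuous exposure
    and with the same exposure split at its median."""
    rng = random.Random(seed)                # fixed seed for repeatability
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [slope * xi + rng.gauss(0, 1) for xi in x]
    cut = median(x)
    x_binary = [1.0 if xi > cut else 0.0 for xi in x]
    return pearson(x, y), pearson(x_binary, y)


if __name__ == "__main__":
    r_continuous, r_binary = simulate()
    print("continuous:", r_continuous, "dichotomized:", r_binary)
```

The dichotomized exposure carries a weaker association with the outcome, so a larger sample is needed to detect the same effect.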
The "Statistical methods" section ?
Spell out what you are testing
- Hypothesis ?
- Method/technique, e.g. logistic regression, Cox regression
- What assumptions are there ?
- How have you tested the assumptions ?
- Covariates in the model including categories and how chosen
- Significance level
- Software + version
Any supplementary analysis. What is primary vs supplementary. What was pre-specified and what was not ?
Missing values
Software
Use standard software
- SAS
- R
- Stata
Don't use Excel!
Standard methods
Secure your data
- Archive your data, CSV... word ?
- Save log files
- Write versions in the manuscript
Other software
- LibreOffice, www.libreoffice.org
- Zotero, www.zotero.org
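Archiving the data and writing down versions can be partly automated. This is a minimal sketch, assuming the analysis data are in a single file and that a checksum plus the software version is enough to show later which file and which version produced the results. The function names and the record layout are hypothetical, and Python stands in for whatever software is actually used.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone


def file_sha256(path):
    """SHA-256 checksum of a file, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def analysis_record(data_path):
    """Collect what is needed to show which data and software were used."""
    return {
        "data_file": data_path,
        "sha256": file_sha256(data_path),
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }


def save_record(record, log_path):
    """Write the record as JSON next to the other log files."""
    with open(log_path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2)
```

A record like this, saved with the log files, makes it easier to answer a later request to re-calculate from exactly the archived data.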
Manuscript as part of the work flow
When does the manuscript writing start ?
Study plan is crucial !
The results are published independent of the outcome !! Plan for this!
(Statistical) Analysis plan
Statistical Analysis Plan
What is it ?
Communication
(Statistical) Analysis plan
Objective
Clarifies roles - list of authors
Methods and need for competences
Pre-specify: Guide later analyses
Write the introduction and methods before the data are available