Guidelines for statistical analysis of arthroplasty data
Jonas Ranstam PhD
Source: Pubmed
All modern medical science publications
60%
Cohort study of smoking and lung cancer (1954), Bradford Hill & Doll
Case-control study of smoking and lung cancer (1950), Bradford Hill & Doll
Randomized clinical trial of streptomycin and tuberculosis (1948), Bradford Hill & MRC
Evidence-based medicine
Systematic reviews and meta-analyses (The Cochrane Collaboration, 1993)
Evidence levels
1. Strong evidence from at least one systematic review of multiple well-designed randomized controlled trials.
2. Strong evidence from at least one properly designed randomized controlled trial of appropriate size.
3. Evidence from well-designed trials such as pseudo-randomized or non-randomized trials, cohort studies, time series or matched case-controlled studies.
4. Evidence from well-designed non-experimental studies from more than one center or research group or from case reports.
5. Opinions of respected authorities, based on clinical evidence, descriptive studies or reports of expert committees.
Any claim coming from an observational study is most likely to be wrong
12 randomised trials have tested 52 observational claims (about the effects of vitamins B6, B12, C, D and E, beta carotene, hormone replacement therapy, folic acid and selenium).
“They all confirmed no claims in the direction of the observational claim. We repeat that figure: 0 out of 52. To put it in another way, 100% of the observational claims failed to replicate. In fact, five claims (9.6%) are statistically significant in the opposite direction to the observational claim.”
Stanley Young and Allan Karr, Significance, September 2011
Guidelines
Systematic reviews and meta-analyses benefit from standardized, transparent and accurate reporting of studies.
STREGA, STROBE, STARD, SQUIRE, MOOSE, PRISMA, GNOSIS, TREND, ORION, COREQ, QUOROM, REMARK, CONSORT...
Internal validity
Study design: experimental
Internal validity by design (blocking of known risk factors and randomization of unknown ones)
Potential for confounding: none
Study design: observational
Internal validity by statistical analysis (confounding adjustment for known and measured risk factors)
Potential for confounding: massive
Confounder (or case-mix) adjustment
How much of the variation in endpoints can be explained by known factors, and how much has unknown causes?
Unexplained variation (1 − r²)
Arthroplasty revision: 95%-99%
EQ-5D, SF36: 85%-95%
Coronary heart disease risk: 50%-70%
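For orientation, the unexplained fraction can be written out for the familiar case of an ordinary least-squares model with an intercept. The symbols are assumptions introduced here, not taken from the slides: y_i is the observed endpoint, \hat{y}_i the value predicted from the known factors, and \bar{y} the overall mean. Explained-variation measures for revision times are defined differently, so this is only a sketch of the idea.

```latex
1 - r^2 \;=\; \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
```

A value of 0.95 means that 95% of the variation in the endpoint remains after accounting for the known factors.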
Aims and characteristics
Confirmation: pre-specified hypotheses; legislation, regulatory guidelines; uncertainty intolerance
Exploration: hypothesis generation; academic analysis freedom; uncertainty tolerance
Research areas
[Figure: study design (experimental vs. observational) against study scope (aetiology vs. treatment)]
Randomized clinical trials
Patient register studies
Epidemiological studies
Laboratory experiments
Analysis strategies and publication guidelines
[Figure: study design (experimental vs. observational) against study scope (aetiology vs. treatment), with the publication guideline for each research area]
Randomized clinical trials: CONSORT
Patient register studies: ?
Epidemiological studies: STROBE
Laboratory experiments: ARRIVE
NARA
The Nordic Arthroplasty Register Association (NARA) study group decided in September 2009 at a meeting in Lund, Sweden, to develop guidelines for statistical analysis of arthroplasty quality register data.
The guidelines were published in April 2011.
Acta Orthopaedica 2011;82:253-267.
The NARA Guidelines
A collaborative effort by
1. Independent observations (Pulkkinen & Mäkelä)
2. Competing risks (Mehnert & Pedersen)
3. Proportional hazard rates (Espehaug & Furnes)
4. Rankable revision risk estimates (Ranstam & Kärrholm)
The NARA study group
LI Havelin, LB Engesæter, AM Fenstad (NO)
S Overgaard, A Odgaard (DA)
A Eskelinen, V Remes, P Virolainen (FI)
G Garellick, M Sundberg, O Robertsson (SE)
The NARA Guidelines have been developed to
– define a NARA consensus view on statistical analysis
– describe foreseeable problems and recommend solutions
– improve the comparability of reports
– facilitate reading, writing and reviewing of reports
The NARA Guidelines are not intended to
– stifle creativity
– promote uniformity
NARA Guidelines
Structure
1. Review of underlying assumptions
2. Consequences of departure from these assumptions
3. Verifying that the assumptions are fulfilled
4. Possible methodological alternatives
5. Practical recommendations
1 – Independent observations
Pseudoreplication
Two rats are sampled from a population with a mean (μ) of 50 and a standard deviation (σ) of 10, and ten measurements of an arbitrary outcome variable are made on each rat.
- Biological variability.
- Measurement errors.
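The rat example can be simulated to show why the twenty measurements are not twenty independent observations. This is a minimal sketch using only the Python standard library. The measurement error standard deviation (`measurement_sd`) and the seed are assumptions chosen for illustration; μ = 50, σ = 10, two rats and ten measurements come from the slide.

```python
import random
import statistics

def simulate_rats(n_rats=2, n_measurements=10, mu=50.0, sigma=10.0,
                  measurement_sd=1.0, seed=1):
    """Simulate repeated measurements on a few rats.

    Each rat has its own true value drawn from N(mu, sigma); every
    measurement adds independent error with an assumed measurement_sd.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_rats):
        true_value = rng.gauss(mu, sigma)               # biological variability
        data.append([true_value + rng.gauss(0.0, measurement_sd)  # measurement error
                     for _ in range(n_measurements)])
    return data

def standard_error(values):
    """Standard error of the mean, valid only for independent values."""
    return statistics.stdev(values) / len(values) ** 0.5

data = simulate_rats()
pooled = [y for rat in data for y in rat]            # pseudoreplication: n = 20
rat_means = [statistics.mean(rat) for rat in data]   # independent units: n = 2

print("Treated as independent: n =", len(pooled),
      "SE =", round(standard_error(pooled), 2))
print("One value per rat:      n =", len(rat_means),
      "SE =", round(standard_error(rat_means), 2))
```

Pooling all measurements mixes biological variability with measurement error and uses a sample size the design does not support, so the pooled standard error is typically far too small.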
Independent observations
Ripatti S and Palmgren J. Estimation of multivariate frailty models using penalized partial likelihood. Biometrics 2000, 56:1016-1022.
Schwarzer G, Schumacher M, Maurer TB and Ochsner PE. Statistical analysis of failure times in total joint replacement. J Clin Epidemiol 2001, 54:997-1003.
Visuri T, Turula KB, Pulkkinen P and Nevalainen J. Survivorship of hip prosthesis in primary arthrosis: influence of bilaterality and interoperative time in 45,000 hip prostheses from the Finnish endoprosthesis register. Acta Orthop Scand 2002, 73:287-290.
Robertsson O and Ranstam J. No bias of ignored bilaterality when analysing the revision risk of knee prostheses: Analysis of a population based sample of 44,590 patients with 55,298 knee prostheses from the national Swedish Knee Arthroplasty Register. BMC Musculoskeletal Disorders 2003, 4:1.
Lie SA, Engesaeter LB, Havelin LI, Gjessing HK and Vollset SE. Dependency issues in survival analyses of 55,782 primary hip replacements from 47,355 patients. Stat Med 2004, 23:3227-3240.
Independent observations
Recommendations
The inclusion of bilateral observations in analyses of knee and hip prosthesis survival does not seem to affect the reliability of the results, but this need not be the case with other types of prostheses.
The number of bilateral observations should always be presented. Sensitivity analyses can be useful for assessing the robustness of the results against departures from the assumption of independence.
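One possible sensitivity analysis, sketched here under assumptions, is to repeat the main analysis on a data set restricted to each patient's first implant and compare the estimates. The record layout (`patient_id`, `operation_year`, `side`) is hypothetical and the data are made up; the guidelines do not prescribe this particular approach.

```python
def first_implant_per_patient(records):
    """Keep each patient's earliest implant and count bilateral patients.

    `records` is a list of dicts with the hypothetical keys
    'patient_id' and 'operation_year'.
    """
    first = {}
    counts = {}
    for rec in sorted(records, key=lambda r: r["operation_year"]):
        pid = rec["patient_id"]
        counts[pid] = counts.get(pid, 0) + 1
        first.setdefault(pid, rec)       # only the earliest record is kept
    n_bilateral = sum(1 for c in counts.values() if c > 1)
    return list(first.values()), n_bilateral

# Made-up records for illustration only.
records = [
    {"patient_id": 1, "operation_year": 2001, "side": "L"},
    {"patient_id": 1, "operation_year": 2004, "side": "R"},
    {"patient_id": 2, "operation_year": 2003, "side": "R"},
]
subset, n_bilateral = first_implant_per_patient(records)
print(len(records), "implants,", len(subset), "patients,",
      n_bilateral, "with more than one implant")
```

If the revision risk estimates from the full data and from the restricted subset agree closely, the results are unlikely to depend on the independence assumption.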
2 – Competing risks
Competing risks
Kaplan-Meier analysis
The statistical analysis of arthroplasty failure is primarily about the length of time from primary operation to revision.
Not all patients are revised during follow-up. The length of follow-up usually differs between patients, and some patients are withdrawn before the end of follow-up; these observations are “censored”.
In Kaplan-Meier analysis, censored observations are included in the analysis up to the time of their censoring.
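The way censored observations contribute until their censoring can be seen in a small hand-rolled Kaplan-Meier estimator. This is a sketch for illustration, not a replacement for a tested statistics package; the function name and the follow-up data are made up.

```python
def kaplan_meier(times, revised):
    """Return [(t, S(t))] at each distinct revision time.

    times   -- follow-up time from primary operation for each implant
    revised -- True if follow-up ended with revision, False if censored
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, r in zip(times, revised) if r}):
        # Censored implants count as at risk up to their censoring time.
        at_risk = sum(1 for u in times if u >= t)
        events = sum(1 for u, r in zip(times, revised) if u == t and r)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical follow-up times in years (made-up data).
times = [2, 3, 3, 5, 6, 8, 9, 10]
revised = [True, False, True, True, False, False, True, False]

for t, s in kaplan_meier(times, revised):
    print(f"t = {t:>2}  S(t) = {s:.3f}")
```

Each censored implant stays in the denominator (`at_risk`) for every revision time up to its own censoring time and then drops out without counting as a failure.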
Competing risks
Kaplan-Meier assumption
The time at which a patient gets a revision is assumed to be independent of the censoring mechanism. Events other than the one studied are competing risk events if they alter the risk of being revised.
[Figure: event diagram with the states primary operation, revision, re-revision and death]
Competing risks
Alternative method: Cumulative incidence
The probability that a particular event, such as revision or a competing risk event, has occurred before a given time.
The cumulative incidence function for an event of interest can be calculated by appropriately accounting for the presence of competing risk events.
Censored observations can be included in the analysis.
Competing risks
With competing risk events, Kaplan-Meier estimates will overestimate the real failure risk.
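The overestimation can be checked numerically by computing both quantities on the same data. The sketch below treats death as the competing risk event and uses the standard nonparametric cumulative incidence estimator; the status coding and the data are assumptions made up for illustration.

```python
def failure_estimates(times, status):
    """Compare 1 - Kaplan-Meier with the cumulative incidence of revision.

    status: 0 = censored, 1 = revision, 2 = death (competing risk event).
    Returns (one_minus_km, cumulative_incidence) at the end of follow-up.
    """
    km_revision = 1.0   # Kaplan-Meier with deaths treated as censored
    event_free = 1.0    # probability of being alive and unrevised
    cum_inc = 0.0       # cumulative incidence of revision
    for t in sorted({t for t, s in zip(times, status) if s != 0}):
        at_risk = sum(1 for u in times if u >= t)
        revisions = sum(1 for u, s in zip(times, status) if u == t and s == 1)
        deaths = sum(1 for u, s in zip(times, status) if u == t and s == 2)
        cum_inc += event_free * revisions / at_risk
        event_free *= 1.0 - (revisions + deaths) / at_risk
        km_revision *= 1.0 - revisions / at_risk
    return 1.0 - km_revision, cum_inc

# Made-up follow-up times (years) and outcomes.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
status = [2, 1, 2, 1, 0, 2, 1, 2, 0, 1]
one_minus_km, cum_inc = failure_estimates(times, status)
print(f"1 - Kaplan-Meier:     {one_minus_km:.3f}")
print(f"Cumulative incidence: {cum_inc:.3f}")
```

The Kaplan-Meier failure function treats patients who died as if they could still be revised later, so it can never be smaller than the cumulative incidence; without competing events the two estimates coincide.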
Competing risks
Recommendations
With competing risks, the Kaplan-Meier failure function overestimates the revision risk.
An alternative method can be to calculate the cumulative incidence of revisions. However, from the patient's perspective this may be less relevant.
The presence of competing risks should always be presented and both the number and types of censored observations should be described.
Competing risks
- do not condition on the future;
- do not regard individuals at risk after they have died; and
- stick to this world.
3 – Proportional hazard rates
Proportional hazard rates
Adjustment for case-mix effects
Risk estimates can be adjusted for the confounding effect of an imbalance of known and measured risk factors using statistical modeling.
This is usually achieved using a Cox model.
Proportional hazard rates
Cox model
The Cox model is a regression model for revision times (or more specifically, hazard rates).
The purpose of the model is to explore the simultaneous effects of different factors on the revision risk.
Proportional hazard rates
The Cox model is based on the assumption of proportional hazards (conditional revision risks). It is also known as the “proportional hazards model”.
The assumption of proportional hazards is not always fulfilled.
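In standard notation, the model and its central assumption can be written as follows. The symbols are assumptions introduced here and do not appear on the slides: h_0(t) is the baseline hazard, x the vector of prognostic factors, and β the regression coefficients.

```latex
h(t \mid x) = h_0(t)\,\exp(\beta_1 x_1 + \dots + \beta_p x_p),
\qquad
\frac{h(t \mid x_a)}{h(t \mid x_b)} = \exp\bigl(\beta^\top (x_a - x_b)\bigr)
```

The hazard ratio between two implants with covariates x_a and x_b does not depend on t. That constancy over follow-up time is what the proportional hazards assumption asserts, and it is what fails when the effect of a factor changes over time.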
Proportional hazard rates
Schoenfeld residual
The covariate value for the implant that failed minus its expected value.
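The definition can be made concrete for a single covariate. In the sketch below the expected value is the risk-weighted mean of the covariate among implants still at risk at the revision time. The coefficient `beta` would normally be the fitted Cox estimate; here it is passed in as an argument, and the data are made up for illustration.

```python
import math

def schoenfeld_residuals(times, revised, x, beta):
    """Schoenfeld residuals for one covariate at a given coefficient beta.

    For each revised implant: its covariate value minus the risk-weighted
    mean of the covariate among implants still at risk at that time.
    """
    residuals = []
    for t_i, r_i, x_i in zip(times, revised, x):
        if not r_i:
            continue                       # residuals exist only at revision times
        risk_set = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
        weights = [math.exp(beta * x_j) for x_j in risk_set]
        expected = sum(w * x_j for w, x_j in zip(weights, risk_set)) / sum(weights)
        residuals.append((t_i, x_i - expected))
    return residuals

# Made-up data: x = 1 for one implant type, 0 for the other.
times = [1, 2, 3, 4]
revised = [True, True, False, True]
x = [1, 0, 1, 0]

for t, r in schoenfeld_residuals(times, revised, x, beta=0.0):
    print(f"t = {t}  residual = {r:+.3f}")
```

Under proportional hazards the residuals should scatter around zero with no trend over time; a systematic trend suggests that the effect of the covariate changes during follow-up.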
Proportional hazard rates
Consequences
When the effect of one or more of the prognostic factors in a Cox regression model changes over time, the average hazard ratio for such a prognostic factor is under- or overestimated.
Weighted estimation in Cox regression (Schemper's method) is a parsimonious alternative without additional parameters.
Proportional hazard rates
Recommendations
Non-proportional hazards may be an interesting finding in itself.
In register studies with large sample sizes, analyses can usually be performed by partitioning follow-up time, by stratification, or by including time-dependent covariates.
If the average relative risk is of interest, Schemper's method can be an alternative.
Whether the assumption of proportional hazards is fulfilled should always be evaluated. Testing the Schoenfeld residuals may be a solution.
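Partitioning follow-up time amounts to splitting each implant's record at a chosen cut point so that a separate hazard ratio can be estimated within each period. A minimal sketch follows; the function name, the period labels and the two-year cut point are hypothetical choices, not recommendations from the guidelines.

```python
def split_follow_up(time, revised, cut):
    """Split one implant's follow-up at `cut` into per-period records.

    Returns a list of (period, start, stop, revised_in_period) tuples.
    """
    if time <= cut:
        return [("early", 0.0, time, revised)]
    return [("early", 0.0, cut, False),     # survived the early period unrevised
            ("late", cut, time, revised)]

# An implant revised after 7 years, split at a hypothetical 2-year cut point.
for record in split_follow_up(7.0, True, cut=2.0):
    print(record)
```

Fitting the Cox model separately to the "early" and "late" records gives period-specific hazard ratios, each of which only needs proportionality to hold within its own period.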
4 – Rankable revision risk estimates
Rankable revision risk estimates
Performance monitoring
Swedish Knee Arthroplasty Register Annual Report 2011
Rankable revision risk estimates
Ranking is a problematic method for comparisons. If ranking is performed, the uncertainty in the ranks should be clearly indicated, preferably with confidence intervals.
Consequences of misclassification (registration errors) should be evaluated and case-mix effects considered as far as possible.
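One way to indicate the uncertainty in ranks is a simulation-based interval for each unit's rank. The sketch below uses a simple parametric bootstrap of crude revision proportions, which is an assumption made for illustration: it ignores follow-up time, censoring and case-mix, and it is not necessarily the method used in the guidelines. The units and counts are made up.

```python
import random

def rank_intervals(revisions, operations, n_sim=1000, level=0.95, seed=1):
    """Simulation-based intervals for the rank of each unit's revision rate.

    Each simulation redraws every unit's number of revisions from a binomial
    distribution with the observed rate, then ranks the units
    (1 = lowest rate; ties are broken by unit order).
    """
    rng = random.Random(seed)
    k = len(revisions)
    sim_ranks = [[] for _ in range(k)]
    for _ in range(n_sim):
        rates = []
        for r, n in zip(revisions, operations):
            p = r / n
            rates.append(sum(rng.random() < p for _ in range(n)) / n)
        order = sorted(range(k), key=lambda i: rates[i])
        for rank, i in enumerate(order, start=1):
            sim_ranks[i].append(rank)
    lo = int((1 - level) / 2 * n_sim)
    hi = int((1 + level) / 2 * n_sim) - 1
    intervals = []
    for ranks in sim_ranks:
        ranks.sort()
        intervals.append((ranks[lo], ranks[hi]))
    return intervals

# Hypothetical units with made-up revision counts.
units = ["A", "B", "C", "D"]
revisions = [4, 5, 6, 30]
operations = [200, 200, 200, 200]
for name, (low, high) in zip(units, rank_intervals(revisions, operations, n_sim=500)):
    print(f"Unit {name}: rank interval {low}-{high}")
```

Units with similar revision rates typically get wide, overlapping rank intervals, which is why a league table without such intervals can be misleading.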
Finally
Revisions and updates
The guidelines should be open for revision and updating.
They have been developed as a consensus and should evolve as a consensus.
Experience and feedback are essential.
Forward your suggestions to the NARA study group!
Thank you for your attention!
Guidelines
Guidelines are particularly prevalent in clinical trials.
CONSORT
ICH E9 - Statistical Principles for Clinical Trials
EMA Points to Consider, on multiplicity, baseline covariates, superiority and non-inferiority, etc. and similar documents from the FDA
Etc.