RATING QUALITY OF EVIDENCE AND STRENGTH OF RECOMMENDATIONS IN
HEPATOLOGY USING THE GRADE FRAMEWORK
AASLD Practice Guidelines Committee Meeting, Chicago
1 May 2009
Yngve Falck-Ytter, M.D.
Case Western Reserve University
Disclosure
In the past 5 years, Dr. Falck-Ytter received no
personal payments for services from industry. His
research group received research grants from
Three Rivers, Valeant and Roche that were
deposited into non-profit research accounts. He is a
member of the GRADE working group which has
received funding from various governmental
entities in the US and Europe. Some of the GRADE
work he has done is supported in part by grant # 1
R13 HS016880-01 from the Agency for Healthcare
Research and Quality (AHRQ).
Content
Part 1
  Background and rationale for revisiting guideline methodology
  GRADE approach
    Quality of evidence
    Strength of recommendations
Content (continued)
Part 2 – practical consideration
  Ideal vs. practical ad hoc approaches
  Funding guideline work
  Creating GRADE evidence profiles with GRADEpro
  GRADE and diagnostic tests
Reassessment of clinical practice guidelines
Editorial by Shaneyfelt and Centor (JAMA 2009)
“Too many current guidelines have become marketing and opinion-based pieces…”
“AHA CPG: 48% of recommendations are based on level C = expert opinion…”
“…clinicians do not use CPG […] greater concern […] some CPG are turned into performance measures…”
“Time has come for CPG development to again be centralized, e.g., AHRQ…”
Evidence based clinical decisions
Research evidence
Patient values and preferences
Clinical state and circumstances
Expertise
Equal for all
Haynes et al. 2002
Confidence in evidence
There always is evidence: “When there is a question there is evidence”
Evidence alone is never sufficient to make a clinical decision
Better research means greater confidence in the evidence and decisions
Hierarchy of evidence
STUDY DESIGN
  Randomized Controlled Trials
  Cohort Studies and Case Control Studies
  Case Reports and Case Series, Non-systematic observations
BIAS
Expert Opinion
Reasons for grading evidence?
People draw conclusions about the quality of evidence and strength of
recommendations
Systematic and explicit approaches can help to
  protect against errors, resolve disagreements
  communicate information and fulfill needs
  be transparent about the process
Change practitioner behavior
However, wide variation in approaches
GRADE working group. BMJ. 2004 & 2008
Which grading system?
P: In patients with acute hepatitis C …
I: Should anti-viral treatment be used …
C: Compared to no treatment …
O: To achieve viral clearance?

Organization    Evidence    Recommendation
AASLD (2009)    B           Class I
VA (2006)       II-1        -/-
SIGN (2006)     1+          A
AGA (2006)      -/-         “Most authorities…”
Scenario (2)
Should patients with risk factors for viral hepatitis be screened with a hepatitis C antibody (ELISA) test to identify patients with past hepatitis C exposure?
Level of evidence in GI CPGs
AASLD
  A  Multiple RCTs or meta-analysis
  B  Single randomized trial, or non-randomized studies
  C  Only consensus opinion of experts, case studies, or standard-of-care
AGA
  Good  Consistent, well-designed, well conducted studies […]
  Fair  Limited by the number, quality or consistency of individual studies […]
  Poor  … important flaws, gaps in chain of evidence…
ACG
  1. Multiple published, well-controlled (?) randomized trials or a well designed systemic (?) meta-analysis
  2. One quality-published (?) RCT, published well-designed cohort/case-control studies
  3. Consensus of authoritative (?) expert opinions based on clinical evidence or from well designed, but uncontrolled or non-rand. clin. trials
ASGE
  A. RCTs
  B. RCT with important limitations
  C. Observational studies
  D. Expert opinion
Limitations of existing systems
Confuse quality of evidence with strength of recommendations
Lack well-articulated conceptual framework
Criteria not comprehensive or transparent
GRADE unique
  breadth, intensity of development process
  wide endorsement and use
  conceptual framework
  comprehensive, transparent criteria
Focus on all important outcomes related to a specific question and overall quality
GRADE Working Group
David Atkins, chief medical officer (a)
Dana Best, assistant professor (b)
Martin Eccles, professor (d)
Francoise Cluzeau, lecturer (x)
Yngve Falck-Ytter, associate director (e)
Signe Flottorp, researcher (f)
Gordon H Guyatt, professor (g)
Robin T Harbour, quality and information director (h)
Margaret C Haugh, methodologist (i)
David Henry, professor (j)
Suzanne Hill, senior lecturer (j)
Roman Jaeschke, clinical professor (k)
Regina Kunz, associate professor
Gillian Leng, guidelines programme director (l)
Alessandro Liberati, professor (m)
Nicola Magrini, director (n)
James Mason, professor (d)
Philippa Middleton, honorary research fellow (o)
Jacek Mrukowicz, executive director (p)
Dianne O’Connell, senior epidemiologist (q)
Andrew D Oxman, director (f)
Bob Phillips, associate fellow (r)
Holger J Schünemann, professor (g, s)
Tessa Tan-Torres Edejer, medical officer (t)
David Tovey, editor (y)
Jane Thomas, lecturer, UK
Helena Varonen, associate editor (u)
Gunn E Vist, researcher (f)
John W Williams Jr, professor (v)
Stephanie Zaza, project director (w)

a) Agency for Healthcare Research and Quality, USA
b) Children's National Medical Center, USA
c) Centers for Disease Control and Prevention, USA
d) University of Newcastle upon Tyne, UK
e) German Cochrane Centre, Germany
f) Norwegian Centre for Health Services, Norway
g) McMaster University, Canada
h) Scottish Intercollegiate Guidelines Network, UK
i) Fédération Nationale des Centres de Lutte Contre le Cancer, France
j) University of Newcastle, Australia
k) McMaster University, Canada
l) National Institute for Clinical Excellence, UK
m) Università di Modena e Reggio Emilia, Italy
n) Centro per la Valutazione della Efficacia della Assistenza Sanitaria, Italy
o) Australasian Cochrane Centre, Australia
p) Polish Institute for Evidence Based Medicine, Poland
q) The Cancer Council, Australia
r) Centre for Evidence-based Medicine, UK
s) National Cancer Institute, Italy
t) World Health Organisation, Switzerland
u) Finnish Medical Society Duodecim, Finland
v) Duke University Medical Center, USA
w) Centers for Disease Control and Prevention, USA
x) University of London, UK
y) BMJ Clinical Evidence, UK
Where GRADE fits in
Prioritize problems, establish panel
Systematic review
Searches, selection of studies, data collection and analysis
Assess the relative importance of outcomes
Prepare evidence profile: Quality of evidence for each outcome and summary of findings
Assess overall quality of evidence
Decide direction and strength of recommendation
Draft guideline
Consult with stakeholders and / or external peer reviewer
Disseminate guideline
Implement the guideline and evaluate
GRADE: Quality of evidence
The extent to which our confidence in an estimate of the treatment effect is adequate to support a particular recommendation.
Although the degree of confidence is a continuum, we suggest using four categories:
High Moderate Low Very low
Quality of evidence across studies
Outcome #1  Quality: High
Outcome #2  Quality: Moderate
Outcome #3  Quality: Low
Determinants of quality
RCTs start high
Observational studies start low
What lowers quality of evidence? 5 factors:
  Detailed design and execution
  Inconsistency of results
  Indirectness of evidence
  Imprecision
  Publication bias
Types of studies
Did investigator assign exposure?
  Yes: Experimental study. Random allocation?
    Yes: RCT
    No: CCT
  No: Observational study. Comparison group?
    Yes: Analytical study. Direction?
      Cohort study: exposure to outcome
      Case-control study: outcome back to exposure
      Cross-sectional study: exposure and outcome at the same time
    No: Case-series
Before and after study
  Variations: cBAS, ITS
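The decision tree on this slide can be written down as a small function. This is only an illustrative sketch: the function name, argument names and the three direction labels are assumptions made here, and the direction labels follow the usual textbook reading (cohort studies look from exposure to outcome, case-control studies from outcome back to exposure).

```python
def classify_study(assigned_exposure, random_allocation=False,
                   comparison_group=False, direction=None):
    """Walk the study-type decision tree and return the design label."""
    if assigned_exposure:
        # Experimental branch: the only remaining question is randomization.
        return "RCT" if random_allocation else "CCT"
    if not comparison_group:
        # Observational study without a comparison group.
        return "Case-series"
    # Analytical observational study: the design depends on the direction.
    # These keys are hypothetical labels chosen for this sketch.
    directions = {
        "exposure_to_outcome": "Cohort study",
        "outcome_to_exposure": "Case-control study",
        "same_time": "Cross-sectional study",
    }
    if direction not in directions:
        raise ValueError(f"unknown direction: {direction!r}")
    return directions[direction]
```

For example, `classify_study(False, comparison_group=True, direction="same_time")` follows the "No, Yes, same time" path through the tree.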
1. Design and execution
Study limitations (risk of bias)
For RCTs:
  Lack of allocation concealment
  No true intention to treat principle
  Inadequate blinding
  Loss to follow-up
  Early stopping for benefit
For observational studies:
  Selection
  Comparability
  Exposure/outcome
Avoid critical appraisal scoring tools!
Jadad AR et al. Control Clin Trials 1996
Tools: scales and checklists
Example: Jadad score
  Was the study described as randomized?        1
  Adequate description of randomization?        1
  Double blind?                                 1
  Method of double blinding described?          1
  Description of withdrawals and dropouts?      1
Max 5 points for quality
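To make concrete what such a scale computes, here is a minimal sketch of the five-item tally exactly as listed on the slide. The item keys and the function name are invented for this example, and the published scale has further scoring rules that are not shown on the slide and are not modeled here. The sketch illustrates the arithmetic only; the talk itself advises against relying on summary scores.

```python
# Hypothetical keys, one per question on the slide.
JADAD_ITEMS = (
    "described_as_randomized",
    "randomization_adequately_described",
    "double_blind",
    "double_blinding_method_described",
    "withdrawals_and_dropouts_described",
)

def jadad_score(answers):
    """Sum one point per 'yes' answer; the slide's maximum is 5."""
    unknown = set(answers) - set(JADAD_ITEMS)
    if unknown:
        raise ValueError(f"unknown items: {sorted(unknown)}")
    # Missing items count as 'no' (0 points).
    return sum(1 for item in JADAD_ITEMS if answers.get(item, False))
```

A single number like this hides which safeguard was missing, which is one reason to assess each limitation separately.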
Schulz KF et al. JAMA 1995
Allocation concealment
250 RCTs out of 33 meta-analyses

Allocation concealment    Effect (Ratio of OR)
adequate                  1.00 (Ref.)
unclear                   0.67 [0.60 – 0.75]
not adequate              0.59 [0.48 – 0.73] *

* significant
2. Consistency of results
Look for explanation for inconsistency
  patients, intervention, comparator, outcome, methods
Judgment
  variation in size of effect
  overlap in confidence intervals
  statistical significance of heterogeneity
  I²
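As a hedged illustration of the last item, the sketch below computes Cochran's Q and I² from per-study effect estimates and their variances, using inverse-variance fixed-effect weights. The function name and the input numbers in the usage note are hypothetical. The formula used is the common definition I² = max(0, (Q − df) / Q) × 100%.

```python
def i_squared(effects, variances):
    """Return (Q, I2 in percent) using inverse-variance fixed-effect weights."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    # Fixed-effect pooled estimate.
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I2 is truncated at zero when Q is smaller than its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2
```

Called with made-up log odds ratios such as `i_squared([0.0, 1.0], [0.25, 0.25])`, it returns Q and the share of variation attributed to heterogeneity rather than chance. I² remains one input to the judgment, alongside the other items listed.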
3. Directness of Evidence
Indirect comparisons
  Interested in head-to-head comparison
  Drug A versus drug B
  Tenofovir versus entecavir in hepatitis B treatment
Differences in
  patients (early cirrhosis vs end-stage cirrhosis)
  interventions (CRC screening: flex. sig. vs colonoscopy)
  comparator (e.g., differences in dose)
  outcomes (non-steroidal safety: ulcer on endoscopy vs symptomatic ulcer complications)
4. Imprecision
Small sample size
  small number of events
  wide confidence intervals
  uncertainty about magnitude of effect
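A short sketch of how few events translate into a wide confidence interval. It computes a risk ratio with a 95% interval on the log scale using the standard large-sample standard error. The function name and all trial counts in the usage note are hypothetical and serve only to show the effect of sample size.

```python
import math

def risk_ratio_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Return (RR, lower, upper) with a 95% CI computed on the log scale."""
    if min(events_t, events_c) <= 0:
        raise ValueError("both groups need at least one event")
    rr = (events_t / n_t) / (events_c / n_c)
    # Large-sample standard error of log(RR).
    se = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper
```

With invented counts, `risk_ratio_ci(10, 100, 20, 100)` and `risk_ratio_ci(100, 1000, 200, 1000)` give the same point estimate, but only the larger trial has an interval that excludes no effect.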
[Figure: total number of events required (y-axis) against control group event rate (x-axis, 0.0 to 1.0), with curves for RRR=20%, RRR=25% and RRR=30%; 300 events marked]
5. Reporting Bias (Publication Bias)
Reporting of studies
  publication bias
  number of small studies
Reporting of outcomes
Egger M, Smith DS. BMJ 1995;310:752-54
I.V. Mg in acute myocardial infarction
Publication bias
Meta-analysis: Yusuf S. Circulation 1993
ISIS-4: Lancet 1995
Egger M, Cochrane Colloquium Lyon 2001
Funnel plot
[Figure: funnel plot of standard error (y-axis) against odds ratio (x-axis)]
Symmetrical: No reporting bias
Egger M, Cochrane Colloquium Lyon 2001
Funnel plot
[Figure: funnel plot of standard error (y-axis) against odds ratio (x-axis)]
Asymmetrical: Reporting bias?
Egger M, Smith DS. BMJ 1995;310:752-54
I.V. Mg in acute myocardial infarction
Reporting bias
Meta-analysis: Yusuf S. Circulation 1993
ISIS-4: Lancet 1995
Quality assessment criteria
Study design
  Randomized trial (starts high)
  Observational study (starts low)
Quality of evidence
  High (4)
  Moderate (3)
  Low (2)
  Very low (1)
Lower if…
  Study limitations (design and execution)
  Inconsistency
  Indirectness
  Imprecision
  Publication bias
Higher if…
  What can raise the quality of evidence?
Quality assessment criteria
Study design
  Randomized trial (starts high)
  Observational study (starts low)
Quality of evidence
  High (4)
  Moderate (3)
  Low (2)
  Very low (1)
Lower if…
  Study limitations
  Inconsistency
  Indirectness
  Imprecision
  Publication bias
Higher if…
  Large effect (e.g., RR 0.5)
  Very large effect (e.g., RR 0.2)
  Evidence of dose-response gradient
  All plausible confounding would reduce a demonstrated effect
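The criteria above amount to simple bookkeeping on the 4-to-1 scale, sketched below. The function and factor names are chosen for this example. The number of levels moved per factor (typically 1, or 2 for a very serious limitation or a very large effect) is an assumption supplied by the caller and is not spelled out on the slide. Clamping the result to the range 1 to 4 is also an assumption of this sketch.

```python
LEVELS = {4: "High", 3: "Moderate", 2: "Low", 1: "Very low"}
START = {"randomized trial": 4, "observational study": 2}
LOWER_IF = {"study limitations", "inconsistency", "indirectness",
            "imprecision", "publication bias"}
HIGHER_IF = {"large effect", "dose-response gradient", "plausible confounding"}

def rate_quality(study_design, downgrades=None, upgrades=None):
    """Return the quality category for one outcome.

    downgrades / upgrades map a factor name to the number of levels moved.
    """
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    if set(downgrades) - LOWER_IF or set(upgrades) - HIGHER_IF:
        raise ValueError("unknown rating factor")
    score = START[study_design] - sum(downgrades.values()) + sum(upgrades.values())
    # Keep the result on the four-category scale.
    return LEVELS[max(1, min(4, score))]
```

For instance, `rate_quality("randomized trial", downgrades={"imprecision": 1})` moves a body of randomized trials from High to Moderate. In practice each move is a judgment that should be explained.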
Categories of quality
High: Further research is very unlikely to change our confidence in the estimate of effect
Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate
Low: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate
Very low: Any estimate of effect is very uncertain
Judgments about the overall quality of evidence
Most systems not explicit
Options:
  Benefits
  Primary outcome
  Highest
  Lowest
Beyond the scope of a systematic review
GRADE: Based on lowest of all the critical outcomes
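The GRADE rule on this slide can be sketched as taking the minimum over the critical outcomes only. The tuple layout, the importance labels and the example outcomes in the usage note are assumptions made for illustration.

```python
# Categories ordered from lowest to highest quality.
ORDER = ["Very low", "Low", "Moderate", "High"]

def overall_quality(outcomes):
    """outcomes: iterable of (name, importance, quality) tuples.

    Returns the lowest quality among outcomes rated 'critical'.
    """
    critical = [quality for _, importance, quality in outcomes
                if importance == "critical"]
    if not critical:
        raise ValueError("no critical outcomes to rate")
    return min(critical, key=ORDER.index)
```

With hypothetical outcomes such as `[("mortality", "critical", "High"), ("viral clearance", "critical", "Moderate"), ("nausea", "important", "Low")]`, the non-critical outcome is ignored and the overall rating follows the weakest critical one.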
Going from evidence to recommendations
Deliberate separation of quality of evidence from strength of recommendation
No automatic one-to-one connection as in other grading systems
Example: What if there is high quality evidence, but benefits and risks are finely balanced?
Strength of recommendation
“The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects.”
Although the strength of recommendation is a continuum, we suggest using two categories:
“Strong” and “Weak”
Desirable and undesirable effects
Desirable effects
  Mortality reduction
  Improvement in quality of life, fewer hospitalizations/infections
  Reduction in the burden of treatment
  Reduced resource expenditure
Undesirable effects
  Deleterious impact on morbidity, mortality or quality of life, increased resource expenditure
4 determinants of the strength of recommendation
Factors that can weaken the strength of a recommendation, with explanation:
Lower quality evidence
  The higher the quality of evidence, the more likely a strong recommendation is.
Uncertainty about the balance of benefits versus harms and burdens
  The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted.
Uncertainty or differences in values
  The greater the variability in values and preferences, or uncertainty in values and preferences, the more likely a weak recommendation is warranted.
Uncertainty about whether the net benefits are worth the costs
  The higher the costs of an intervention (that is, the more resources consumed), the less likely a strong recommendation is warranted.
Implications of a strong recommendation
Patients: Most people in this situation would want the recommended course of action and only a small proportion would not
Clinicians: Most patients should receive the recommended course of action
Policy makers: The recommendation can be adopted as a policy in most situations
Implications of a weak recommendation
Patients: The majority of people in this situation would want the recommended course of action, but many would not
Clinicians: Be prepared to help patients to make a decision that is consistent with their own values; use decision aids and shared decision making
Policy makers: There is a need for substantial debate and involvement of stakeholders
6 main misconceptions
1. Isn’t GRADE expensive to realize?
2. Isn’t GRADE more complicated, takes longer and requires more resources?
3. Isn’t GRADE eliminating the expert?
4. But what about prevalence/burden of disease, diagnosis, cost?
5. But GRADE does not have an “insufficient evidence to make recommendation” category! (or: the “optional” category), no?
6. But we only “recommend” – we can’t possibly give weak recommendations!
Systematic review
  Formulate question (PICO)
  Select outcomes
  Rate importance of each outcome: Critical, Important, Critical, Not important
  Create evidence profile with GRADEpro
  Rate quality of evidence for each outcome, across studies: High, Moderate, Low, Very low
    RCT start high, obs. data start low
    Grade down: 1. Risk of bias 2. Inconsistency 3. Indirectness 4. Imprecision 5. Publication bias
    Grade up: 1. Large effect 2. Dose response 3. Confounders
  Summary of findings & estimate of effect for each outcome
Guideline development (Panel)
  Rate overall quality of evidence across outcomes based on lowest quality of critical outcomes
  Formulate recommendations:
  • For or against (direction)
  • Strong or weak (strength)
  By considering: Quality of evidence, Balance benefits/harms, Values and preferences
  Revise if necessary by considering: Resource use (cost)
  • “We recommend using…”
  • “We suggest using…”
  • “We recommend against using…”
  • “We suggest against using…”
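The four standard phrasings follow from the two decisions the panel makes, direction and strength. The sketch below maps one to the other. Pairing "recommend" with strong and "suggest" with weak is this example's reading of the slide, and the function and argument names are hypothetical.

```python
def recommendation_wording(direction, strength):
    """Map (direction, strength) to the opening phrase of a recommendation."""
    verbs = {"strong": "We recommend", "weak": "We suggest"}
    tails = {"for": "using…", "against": "against using…"}
    if strength not in verbs or direction not in tails:
        raise ValueError("direction must be for/against, strength strong/weak")
    return f"{verbs[strength]} {tails[direction]}"
```

Keeping the wording tied to the strength in this way lets readers infer the grade from the sentence itself.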
Conclusions
1. GRADE is gaining acceptance as international standard
2. GRADE has criteria for evidence assessment across questions (e.g., public health interventions) and outcomes
3. Criteria for moving from evidence to recommendations
4. Simple, transparent, systematic
5. Balance between simplicity and methodological rigor