Session 4: Assessing a Document on Diagnosis


Peter Tarczy-Hornoch MD

Head and Professor, Division of BHI

Professor, Division of Neonatology

Adjunct Professor, Computer Science and Engineering

faculty.washington.edu/pth

October 29, 2008

Using Questionmark Software

See e-mail “[MIDM] Important - testing your Questionmark login id/web browser before MIDM final exam” for more details

First: get your login id/password from MyGrade

Second: test your login id/password and your computer’s/browser’s ability to save and retrieve your exam: https://primula.dme.washington.edu/q4/perception.dll

2 question “Test” exam up until 5 PM Monday 11/3

Assessing a Document on Diagnosis

Context for Assessing a Diagnosis Document

Diagnosis Statistics

Applying to a Scenario

Diagnostic vs. Therapeutic Studies

Patient Data & Information

General Information & Knowledge

Case-specific decision making

Diagnostic Testing (What is it?) (Session 4)

Therapy/Treatment (What do I do for it?) (Session 3)

Steps to Finding & Assessing Information

1. Translate your clinical situation into a formal framework for a searchable question (Session 1)

2. Choose source(s) to search (Session 2)

3. Search your source(s) (Session 2)

4. Assess the resulting articles (documents)
   Therapy documents (Session 3)
   Diagnosis documents (Session 4)
   Systematic reviews/comparing documents (Session 5)

5. Decide if you have enough information to make a decision, repeat 1-4 as needed (ICM, clinical rotations, internship, residency)

PubMed – Finding a Diagnostic Article

Assessing a Document

Search | Result | Comparison
Problem at hand | Problem studied | Are they really the same?
Patient characteristics | Population characteristics | Is patient similar enough to population studied?
Intervention most relevant to patient/provider | Intervention studied (primary one) | Are they the same?
Comparison: other alternatives considered | Comparison: alternatives studied | Are alternatives studied those of interest to you?
Outcomes: those important to pat/prov | Outcomes: those looked at by study | Are outcomes studied those of interest to you?
- | Number of subjects | Does study have enough subjects to trust results?
Study design hoped for | Statistics: study design and statistical results | Is study design good? What do results mean?
- | Sponsor: who paid for study | Is there potential bias?

Many Different Kinds of Tests

Tests predict presence of disease

Types of tests:

Screening test: before symptoms appear, look for disease. Example: screening mammograms

Diagnostic test: given symptoms/suggestion of a disease, help rule in (confirm) or rule out (reject) a diagnosis. Example: ultrasound of appendix in face of abdominal pain

Gold Standard: a “perfect” test that “definitively” categorizes a patient as having one disease. Example: surgery to remove appendix and then pathologic exam

Can’t always use Gold Standard => Use diagnostic tests. E.g. high risk/cost, only rules in/out one disease vs. multiple, etc.

The 2x2 Table: Diagnostic Test vs. Gold Standard

Columns come from the “Gold Standard Test”; rows come from the “Diagnostic Test”.

Test result | Disease Present (+) | Disease Absent (-)
Test Positive (+) | TP: True Positive | FP: False Positive
Test Negative (-) | FN: False Negative | TN: True Negative

Non-intuitive labels:

Disease Present = Disease “Positive” (+) = Dz(+)

Test Positive = Test predicting disease present

From patient/provider point of view neither Disease Positive (+) nor Test Positive (+) are good things!

Sensitivity (Sn)

Test result | Disease Present (+) | Disease Absent (-)
Test Positive (+) | TP: True Positive | FP: False Positive
Test Negative (-) | FN: False Negative | TN: True Negative

Sensitivity is the proportion of all people with disease who have a positive test

Sensitivity = TP/(TP+FN)

SnNOut: a sensitive test, if negative, rules out disease

Sensitivity is useful to pick a test: sensitivity is key for a screening test

Specificity (Sp)

Test result | Disease Present (+) | Disease Absent (-)
Test Positive (+) | TP: True Positive | FP: False Positive
Test Negative (-) | FN: False Negative | TN: True Negative

Specificity is the proportion of all people without disease who have a negative test

Specificity = TN/(FP+TN)

SpPIn: a specific test, if positive, rules in disease

Specificity is useful to pick a test: specificity is key for a diagnostic test
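The two definitions above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function names are made up here, and the counts are borrowed from the 5% prevalence example that appears later in this session.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of people WITH disease who test positive: TP/(TP+FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of people WITHOUT disease who test negative: TN/(FP+TN)."""
    return tn / (fp + tn)


# Counts from the 5% prevalence example later in this session.
tp, fp, fn, tn = 45, 38, 5, 912

sn = sensitivity(tp, fn)
sp = specificity(tn, fp)
print(f"Sn = {sn:.1%}, Sp = {sp:.1%}")
```

Note that both functions read down a single column of the 2x2 table (disease present for Sn, disease absent for Sp), which is why neither depends on how common the disease is.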

“Cut Off Values” Impact Sn/Sp

Example: blood sugar to predict diabetes

Blood Sugar | Sensitivity | Specificity
70 | 98.6% | 8.8%
100 | 88.6% | 69.8%
130 | 64.3% | 96.9%
160 | 47.1% | 99.8%
200 | 27.1% | 100%

Sensitivity key for screening test

Specificity key for diagnostic test
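One way to read the cut-off table is to turn each row into its two error rates: the share of diabetics missed (1 - Sn) and the share of non-diabetics flagged (1 - Sp). The sketch below does only that. The numbers are copied from the blood sugar table; the variable and function names are hypothetical.

```python
# (cut-off, sensitivity, specificity) rows copied from the blood sugar table.
CUTOFFS = [
    (70, 0.986, 0.088),
    (100, 0.886, 0.698),
    (130, 0.643, 0.969),
    (160, 0.471, 0.998),
    (200, 0.271, 1.000),
]


def error_rates(sn: float, sp: float) -> tuple[float, float]:
    """Return (missed cases, false alarms) = (1 - Sn, 1 - Sp)."""
    return 1 - sn, 1 - sp


rows = [(cutoff, *error_rates(sn, sp)) for cutoff, sn, sp in CUTOFFS]

for cutoff, missed, false_alarm in rows:
    print(f"cut-off {cutoff:>3}: misses {missed:6.1%} of diseased, "
          f"flags {false_alarm:6.1%} of non-diseased")
```

Raising the cut-off trades false alarms for missed cases, which is why a low cut-off suits screening and a high cut-off suits confirming a diagnosis.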

Positive Predictive Value (PPV)

Test result | Disease Present (+) | Disease Absent (-)
Test Positive (+) | TP: True Positive | FP: False Positive
Test Negative (-) | FN: False Negative | TN: True Negative

PPV is the proportion of all people with a positive test who have a disease

PPV = TP/(TP+FP)

PPV is useful to use a test: if you have a positive result for your patient, what % of people with positive results actually have the disease

Negative Predictive Value (NPV)

Test result | Disease Present (+) | Disease Absent (-)
Test Positive (+) | TP: True Positive | FP: False Positive
Test Negative (-) | FN: False Negative | TN: True Negative

NPV is the proportion of all people with a negative test who don’t have a disease

NPV = TN/(FN+TN)

NPV is useful to use a test: if you have a negative result for your patient, what % of people with negative results actually don’t have the disease
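PPV and NPV read across a row of the 2x2 table rather than down a column. A minimal sketch, again with made-up function names and with counts borrowed from the 5% prevalence example in this session:

```python
def ppv(tp: int, fp: int) -> float:
    """Proportion of positive tests that are true positives: TP/(TP+FP)."""
    return tp / (tp + fp)


def npv(tn: int, fn: int) -> float:
    """Proportion of negative tests that are true negatives: TN/(FN+TN)."""
    return tn / (fn + tn)


# Counts from the 5% prevalence example in this session.
tp, fp, fn, tn = 45, 38, 5, 912

positive_value = ppv(tp, fp)
negative_value = npv(tn, fn)
print(f"PPV = {positive_value:.1%}, NPV = {negative_value:.1%}")
```

Because each row mixes people with and without disease, these two values shift when the balance between the columns (the prevalence) shifts.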

Prevalence, pre-test & post-test probabilities

Prevalence: total cases of disease in the population at a given time

2x2 table: [disease (+)]/[disease (+) + disease (-)]

Pre-test probability: estimate of the probability/likelihood that your patient has a disease before you order your test

Often an estimate based on experience or prevalence

Screening test: pre-test probability = prevalence

Post-test probability: the probability/likelihood that your patient has a disease after you get the results of the test back

PPV/NPV Dependence on Disease Prevalence: PPV Example

Prevalence = 50%

Test result | Dz (+) | Dz (-)
Test (+) | TP = 450 | FP = 20
Test (-) | FN = 50 | TN = 480
Total | 500 | 500

Sn = TP/(TP+FN) = 450/(450+50) = 90%

Sp = TN/(FP+TN) = 480/(20+480) = 96%

PPV = TP/(TP+FP) = 450/(450+20) = 95.7% of those with T(+) have Dz(+)

Prevalence = 5%

Test result | Dz (+) | Dz (-)
Test (+) | TP = 45 | FP = 38
Test (-) | FN = 5 | TN = 912
Total | 50 | 950

Sn = TP/(TP+FN) = 45/(45+5) = 90%

Sp = TN/(FP+TN) = 912/(912+38) = 96%

PPV = TP/(TP+FP) = 45/(45+38) = 54.2% of those with T(+) have Dz(+)
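The two tables in this example can be regenerated from just Sn, Sp, and prevalence. The sketch below assumes a cohort of 1000 people (the column totals on the slide add up to 1000); the function name `two_by_two` is hypothetical.

```python
def two_by_two(sn: float, sp: float, prevalence: float, n: int = 1000):
    """Fill in expected TP, FP, FN, TN counts for a cohort of n people."""
    diseased = n * prevalence
    healthy = n - diseased
    tp = sn * diseased          # diseased people the test catches
    fn = diseased - tp          # diseased people the test misses
    tn = sp * healthy           # healthy people the test clears
    fp = healthy - tn           # healthy people the test flags
    return tp, fp, fn, tn


results = {}
for prevalence in (0.50, 0.05):
    tp, fp, fn, tn = two_by_two(sn=0.90, sp=0.96, prevalence=prevalence)
    results[prevalence] = tp / (tp + fp)
    print(f"prevalence {prevalence:.0%}: TP={tp:.0f} FP={fp:.0f} "
          f"FN={fn:.0f} TN={tn:.0f} PPV={results[prevalence]:.1%}")
```

The test itself is identical in both runs; only the mix of diseased and healthy people changes, and that alone moves the PPV from about 96% to about 54%.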

Pros/Cons Sn/Sp/PPV/NPV

Relative Pros:

PPV/NPV useful for diagnosis - probability of disease after (+) or (-) test

Sn/Sp useful for choosing a test (screening/diagnosis)

Relative Cons:

PPV/NPV vary with prevalence of disease

Prevalence of disease in general population may not be the same as that of patients you see in clinic/ER

Your estimation of probability of disease (pre-test probability) may not match prevalence in a population

Current tendency therefore => use likelihood ratios

Note: Sn/Sp/PPV/NPV on boards

Bayes Theorem

Note: this slide is here for completeness; likelihood ratios are better, so this slide is not on the exam

Bayes Theorem: how to update or revise beliefs in light of new evidence

http://plato.stanford.edu/entries/bayes-theorem/

Related to Bayes is an alternate form of PPV/NPV as f(Sn, Sp, pre-test) that “pulls out” pre-test probability or prevalence

P(Dz) = probability of disease (e.g. prevalence, pre-test)

PPV = Sn*P(Dz)/[Sn*P(Dz) + (1-Sp)*(1-P(Dz))]

NPV = Sp*(1-P(Dz))/[Sp*(1-P(Dz)) + (1-Sn)*P(Dz)]
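The alternate PPV/NPV forms on this slide translate directly into code. The sketch below uses the Sn = 90% and Sp = 96% test from the prevalence example; the function names and the list of pre-test probabilities are illustrative choices, not part of the slides.

```python
def ppv_from_rates(sn: float, sp: float, p_dz: float) -> float:
    """PPV = Sn*P(Dz) / [Sn*P(Dz) + (1-Sp)*(1-P(Dz))]."""
    return sn * p_dz / (sn * p_dz + (1 - sp) * (1 - p_dz))


def npv_from_rates(sn: float, sp: float, p_dz: float) -> float:
    """NPV = Sp*(1-P(Dz)) / [Sp*(1-P(Dz)) + (1-Sn)*P(Dz)]."""
    return sp * (1 - p_dz) / (sp * (1 - p_dz) + (1 - sn) * p_dz)


SN, SP = 0.90, 0.96  # test from the prevalence example

# Illustrative pre-test probabilities; 5% and 50% match the slide example.
for p_dz in (0.01, 0.05, 0.20, 0.50):
    print(f"P(Dz) = {p_dz:4.0%}: "
          f"PPV = {ppv_from_rates(SN, SP, p_dz):.1%}, "
          f"NPV = {npv_from_rates(SN, SP, p_dz):.1%}")
```

At P(Dz) of 5% and 50% these formulas give the same PPV as counting cells in the 2x2 tables, without having to pick a cohort size.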

Likelihood Ratios

Likelihood Ratio does NOT vary with prevalence

Likelihood Ratio (LR)

LR+ = Sn/(1-Sp): likelihood ratio for a positive test

LR- = (1-Sn)/Sp: likelihood ratio for a negative test

Applying LR given a pre-test disease probability:

Pre = Pre-test probability (can be prevalence)

Post = Post-test probability

Post = Pre/(Pre+(1-Pre)/LR)

Same as Bayes & PPV/NPV but cleanly separates test characteristics (LR) from disease prevalence/pre-test probabilities
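A short sketch of the likelihood ratio formulas follows. It also shows the odds route (post-test odds = pre-test odds x LR), which is the standard identity behind the slide's one-line formula and gives the same answer. Function names are hypothetical, and the pre-test probability is assumed to lie strictly between 0 and 1.

```python
def lr_positive(sn: float, sp: float) -> float:
    """LR+ = Sn/(1-Sp)."""
    return sn / (1 - sp)


def lr_negative(sn: float, sp: float) -> float:
    """LR- = (1-Sn)/Sp."""
    return (1 - sn) / sp


def post_test(pre: float, lr: float) -> float:
    """Slide formula: Post = Pre/(Pre+(1-Pre)/LR)."""
    return pre / (pre + (1 - pre) / lr)


def post_test_via_odds(pre: float, lr: float) -> float:
    """Same result via odds: post-test odds = pre-test odds * LR."""
    pre_odds = pre / (1 - pre)          # assumes 0 < pre < 1
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)


lr_pos = lr_positive(0.90, 0.96)   # test from the prevalence example
lr_neg = lr_negative(0.90, 0.96)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
for pre in (0.05, 0.50):
    print(f"pre-test {pre:.0%}: "
          f"post-test {post_test(pre, lr_pos):.1%} "
          f"(odds route: {post_test_via_odds(pre, lr_pos):.1%})")
```

The LR is computed once from Sn and Sp and then reused for any pre-test probability, which is the separation of test characteristics from prevalence that the slide describes.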

Interpreting Likelihood Ratios (I)

LR = 1.0: Post-test probability = the pre-test probability (useless)

LR > 1.0: Post-test probability > pre-test probability (helps rule in). Test result increases the probability of having the disorder

LR < 1.0: Post-test probability < pre-test probability (helps rule out). Test result decreases the probability of having the disorder

LR+ (Likelihood Ratio for a Positive Test) vs. LR- (Likelihood Ratio for a Negative Test) => See Appendicitis Slide

Interpreting Likelihood Ratios (II)

Likelihood ratios >10 or <0.1: test generates large changes in pre- to post-test probability. Test provides strong evidence to rule in/rule out a diagnosis

Likelihood ratios of 5-10 and 0.1-0.2: test generates moderate changes in pre- to post-test probability. Test provides moderate evidence to rule in/rule out a diagnosis

Likelihood ratios of 2-5 and 0.2-0.5: test generates small changes in pre- to post-test probability. Test provides minimal evidence to rule in/rule out a diagnosis

Likelihood ratios 0.5-2: test generates almost no changes in pre- to post-test probability. Test provides almost no evidence to rule in/rule out a diagnosis
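The bands above can be written as a small lookup. The sketch relies on the bands being reciprocal (0.1 = 1/10, 0.2 = 1/5, 0.5 = 1/2), so an LR below 1 is folded onto the same scale by taking 1/LR. The slide does not say which band an LR of exactly 2, 5, or 10 belongs to; assigning 10 to "moderate" and 2 and 5 to the stronger band is an assumption made here, as is the function name.

```python
def evidence_strength(lr: float) -> str:
    """Map a likelihood ratio to the evidence bands from the slide."""
    if lr <= 0:
        raise ValueError("likelihood ratio must be positive")
    # Fold LR < 1 onto the LR > 1 scale (0.1 -> 10, 0.2 -> 5, 0.5 -> 2).
    magnitude = lr if lr >= 1 else 1 / lr
    if magnitude > 10:
        return "strong"
    if magnitude >= 5:      # boundary handling is an assumption
        return "moderate"
    if magnitude >= 2:      # boundary handling is an assumption
        return "minimal"
    return "almost none"


for lr in (22.5, 7.0, 3.39, 1.0, 0.37, 0.05):
    direction = "rule in" if lr > 1 else "rule out" if lr < 1 else "neither"
    print(f"LR = {lr:>5}: {evidence_strength(lr)} evidence ({direction})")
```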

Interpreting Likelihood Ratios (III)

From slide on impact of prevalence: Sn=90%, Sp=96%

LR+ = 0.90/(1-0.96) = 22.5

Post = Pre/(Pre+(1-Pre)/LR)

If prevalence (pre-test) is 50% => post-test 95.7%

If prevalence (pre-test) is 5% => post-test 54.2%

LR Nomogram =>

Likelihood Ratios for Physical Exam for Appendicitis

Present=“Moderate evidence” for appendicitis

Present=“Moderate evidence” for appendicitis BUT 95% CI of LR includes <2 thus includes “Minimal Evidence”

Present=“Almost no evidence” for appendicitis

Learning to Diagnose Pneumonia

Medical School

Preclinical: anatomy, histology, pathology, microbiology, pharmacology, physiology, …

Clinical: medicine, pediatrics, family medicine, surgery, …

Residency: Outpatient, inpatient, specialty rotations, general rotations, emergency room, …

Fellowship: More of the same

Result: a number of items on history, physical exam, laboratory studies that suggest pneumonia with chest X-ray as gold standard

Literature on Diagnosis of Pneumonia

Clinical query for “pneumonia” “diagnosis” (1478)

Change to “community acquired pneumonia” (181)

Add in “likelihood ratio” (27)

Find “Derivation of a triage algorithm for chest radiography of community-acquired pneumonia patients in the emergency department.” Acad Emerg Med. 2008 Jan;15(1):40-4.

Paper: Background/Objectives

BACKGROUND: Community-acquired pneumonia (CAP) accounts for 1.5 million emergency department (ED) patient visits in the United States each year.

OBJECTIVES: To derive an algorithm for the ED triage setting that facilitates rapid and accurate ordering of chest radiography (CXR) for CAP.

Paper: Methods

METHODS: The authors conducted an ED-based retrospective matched case-control study using 100 radiographic confirmed CAP cases and 100 radiographic confirmed influenzalike illness (ILI) controls. Sensitivities and specificities of characteristics assessed in the triage setting were measured to discriminate CAP from ILI. The authors then used classification tree analysis to derive an algorithm that maximizes sensitivity and specificity for detecting patients with CAP in the ED triage setting.

Paper: Results (I)

RESULTS: Temperature greater than 100.4 degrees F (likelihood ratio = 4.39, 95% confidence interval [CI] = 2.04 to 9.45), heart rate greater than 110 beats/minute (likelihood ratio = 3.59, 95% CI = 1.82 to 7.10), and pulse oximetry less than 96% (likelihood ratio = 2.36, 95% CI = 1.32 to 4.20) were the strongest predictors of CAP. However, no single characteristic was adequately sensitive and specific to accurately discriminate CAP from ILI.

Evidence: LR >10 strong, LR 5-10 moderate, LR 2-5 minimal, LR 1-2 scant evidence
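To see what these likelihood ratios and their confidence intervals mean for a single patient, the sketch below pushes the point estimate and both CI bounds through the post-test formula from the likelihood ratio slides. The LR values are the ones reported in the abstract; the 20% pre-test probability is a hypothetical number chosen only for illustration, and the variable names are made up here.

```python
# finding -> (LR, lower 95% CI bound, upper 95% CI bound), from the abstract.
FINDINGS = {
    "Temperature > 100.4 F": (4.39, 2.04, 9.45),
    "Heart rate > 110 beats/minute": (3.59, 1.82, 7.10),
    "Pulse oximetry < 96%": (2.36, 1.32, 4.20),
}

PRE_TEST = 0.20  # hypothetical pre-test probability, not from the paper


def post_test(pre: float, lr: float) -> float:
    """Post = Pre/(Pre+(1-Pre)/LR)."""
    return pre / (pre + (1 - pre) / lr)


post_by_finding = {}
for name, (lr, low, high) in FINDINGS.items():
    post_by_finding[name] = tuple(post_test(PRE_TEST, x) for x in (lr, low, high))
    point, lo_post, hi_post = post_by_finding[name]
    print(f"{name}: post-test {point:.0%} "
          f"(range {lo_post:.0%} to {hi_post:.0%} across the 95% CI of the LR)")
```

The spread across each CI is wide, which fits the abstract's statement that no single characteristic discriminates CAP from ILI adequately.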

Paper: Results (II)

RESULTS (continued): A three-step algorithm (using optimum cut points for elevated temperature, tachycardia, and hypoxemia on room air pulse oximetry) was derived that is 70.8% sensitive (95% CI = 60.7% to 79.7%) and 79.1% specific (95% CI = 69.3% to 86.9%).

LR+ = Sn/(1-Sp) = 0.708/(1-0.791) = 3.39 (minimal)

LR- = (1-Sn)/Sp = (1-0.708)/0.791 = 0.37 (minimal)

Post = Pre/(Pre+(1-Pre)/LR)

Post if Pre 1/10 = 0.1/(0.1+(1-0.1)/3.39) = 0.27

Post if Pre 1/2 = 0.5/(0.5+(1-0.5)/3.39) = 0.77
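The arithmetic on this slide covers a positive algorithm result only. The sketch below reproduces it and, as an extension not shown on the slide, applies LR- to the same two pre-test probabilities to show what a negative result would imply. Names are illustrative.

```python
SN, SP = 0.708, 0.791  # algorithm sensitivity and specificity from the abstract


def post_test(pre: float, lr: float) -> float:
    """Post = Pre/(Pre+(1-Pre)/LR)."""
    return pre / (pre + (1 - pre) / lr)


lr_pos = SN / (1 - SP)      # likelihood ratio for a positive result
lr_neg = (1 - SN) / SP      # likelihood ratio for a negative result
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")

table = {}
for pre in (0.10, 0.50):    # the two pre-test probabilities used on the slide
    table[pre] = (post_test(pre, lr_pos), post_test(pre, lr_neg))
    after_pos, after_neg = table[pre]
    print(f"pre-test {pre:.0%}: {after_pos:.0%} after a positive result, "
          f"{after_neg:.0%} after a negative result")
```

With a pre-test probability of 1/2, a negative result still leaves a substantial chance of CAP, which is what "minimal" evidence from an LR- of 0.37 looks like in practice.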

Paper: Conclusions

CONCLUSIONS: No single characteristic adequately discriminates CAP from ILI, but a derived clinical algorithm may detect most radiographic confirmed CAP patients in the triage setting. Prospective assessment of this algorithm will be needed to determine its effects on the care of ED patients with suspected pneumonia.

Note: all of these characteristics are among the tried and true findings taught in medical school, residency, fellowship but typically not quantitatively taught

Small Group Monday November 3rd

Students to complete assignment for Small Group Session #5 by Mon 11/3 2-2:50

Small group leads to give examples of recent clinical situations where they had to evaluate one or more documents related to making a diagnosis

Group to review and discuss from assignment short examples related to diagnosis focusing on:

Sensitivity, Specificity, Positive Predictive Value (PPV), Negative Predictive Value (NPV)

Likelihood Ratios (Positive/Negative): LR+, LR-

AND/OR: group to search for treatment article(s) on a topic of interest and assess results

QUESTIONS?

Context for Assessing a Diagnosis Document

Diagnosis Statistics

Applying to a Scenario

Small Group Portion
