R2 ImageChecker CT CAD PMA: Clinical Results

Nicholas Petrick, Ph.D.

Office of Science and Technology

Center for Devices and Radiological Health

U.S. Food and Drug Administration

Outline

• Applicability of Az in analysis
  • Az is same as area under the curve (AUC)
• Pool of CT cases for clinical study
• Defining actionable nodules by panel of experts
• Clinical studies
  • Primary analysis: analysis using fixed expert panel
  • Secondary analysis: analysis using random panels of experts
• Measurement of CAD standalone performance
  • Algorithm’s performance with no reader involvement

Applicability of Az in analysis

• Average reader ROC curves (pre/post-CAD)

[Figure: average reader ROC curves, Pre-CAD ROC and Post-CAD ROC, plotted on axes running from 0.0 to 1.0]

Applicability of Az in analysis

• Pre- and post-CAD curves do not cross
  • No substantial pre/post-CAD crossing in either averaged or individual ROC curves
• Az is an appropriate performance measure
  • Az used as figure of merit in all analyses
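As an illustration of the figure of merit, the sketch below computes an area under the ROC curve and the pre/post difference. It is a sketch under stated assumptions, not the sponsor's method: it swaps in the nonparametric (Mann-Whitney) AUC estimate, whereas Az in ROC observer studies usually denotes the area under a fitted curve, and all ratings shown are hypothetical.

```python
def empirical_auc(neg_scores, pos_scores):
    """Nonparametric AUC: P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive case rated higher than negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical confidence ratings (not taken from the study).
neg_ratings = [1, 2, 2, 3]        # quadrants without an actionable nodule
pre_pos_ratings = [2, 3, 4, 4]    # actionable quadrants, read without CAD
post_pos_ratings = [3, 3, 4, 5]   # same quadrants, read with CAD

auc_pre = empirical_auc(neg_ratings, pre_pos_ratings)
auc_post = empirical_auc(neg_ratings, post_pos_ratings)
delta_auc = auc_post - auc_pre
print(auc_pre, auc_post, delta_auc)
```

A single area is a fair summary only when the two curves do not cross, which is the condition stated above.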

Pool of CT Cases

• Nodule cases
  • Documented cancers
    • Primary neoplasm or extrathoracic neoplasm with presumptive spread to lungs
  • Cases were allowed to contain non-nodule pathologic processes (e.g., pneumonia, emphysema)
• Non-nodule cases
  • Normal cases
    • No nodule deemed present by site P.I.
    • Primarily relied upon original radiology report
  • History of cancer, radiation therapy, or even previous thoracotomy allowed

Defining Actionable Nodules by Panel of Experts

• ‘Actionable’ nodules are objects of interest
• Panel of expert radiologists identifies actionable nodules
• Nodules defined using a 2-pass process
  • 1st reading of CT cases
    • Cases read independently & blinded by 3 expert radiologists
    • Radiologists were given the subject’s age, gender, and indication for exam
    • Marked all findings deemed lung nodules
    • Radiologist provided a rating
      • Intervention – Actionable, further workup advised
      • Surveillance – Actionable, monitor with follow-up studies
      • Probably Benign, calcified – no action required
      • Probably Benign, non-calcified – no action required
  • 2nd pass
    • Findings that lacked 100% consensus after 1st pass were reviewed unblinded by all 3 radiologists
    • Locations that 2/3 or 1/3 radiologists called a nodule are reevaluated
    • Radiologists rated (or re-rated) the actionability of the nodule candidates
  • Thresholds applied to all findings
    • >4 mm diameter
    • > -100 HU maximum density
• Each lung quadrant categorized by the highest actionable finding within quadrant
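The thresholding and per-quadrant categorization described above can be sketched as follows. This is a minimal illustration under assumptions: the ranking of ratings (Intervention above Surveillance above Probably Benign), the quadrant names, and the `Finding` fields are hypothetical choices made for the example, not details given in the slides.

```python
from dataclasses import dataclass

# Assumed ordering, most to least actionable (hypothetical ranking).
RATING_RANK = {
    "intervention": 3,
    "surveillance": 2,
    "probably_benign_calcified": 1,
    "probably_benign_noncalcified": 1,
}


@dataclass
class Finding:
    quadrant: str          # hypothetical quadrant label
    rating: str            # one of the RATING_RANK keys
    diameter_mm: float
    max_density_hu: float


def passes_thresholds(f):
    """Apply the size and density thresholds listed on the slide."""
    return f.diameter_mm > 4.0 and f.max_density_hu > -100.0


def categorize_quadrants(findings, quadrants):
    """Label each quadrant with its highest-ranked surviving finding."""
    result = {q: None for q in quadrants}
    for f in findings:
        if not passes_thresholds(f):
            continue
        current = result[f.quadrant]
        if current is None or RATING_RANK[f.rating] > RATING_RANK[current]:
            result[f.quadrant] = f.rating
    return result


quadrants = ["upper_left", "upper_right", "lower_left", "lower_right"]
findings = [
    Finding("upper_left", "surveillance", 6.0, 20.0),
    Finding("upper_left", "intervention", 9.5, 35.0),
    Finding("lower_right", "intervention", 3.0, 40.0),   # too small, dropped
]
labels = categorize_quadrants(findings, quadrants)
print(labels)
```

With four quadrants per case, a categorization of this kind yields one label per quadrant, which is the unit used in the quadrant-based analysis.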

Disposition    Unanimous Actionable    Majority Actionable    Minority Actionable
Sample Size    142                     168                    149

• 3 experts per panel

Clinical Studies

• ROC observer study
  • Az is test statistic
• Analysis of a 90-case dataset (360 quadrants)
• Confidence intervals and significance testing
  • ANOVA-after-jackknife
  • Bootstrap analysis

Clinical Studies Analysis Flowchart

[Flowchart boxes: Pool of Cases; Pool of Experts; Pool of Readers; Resampling Scheme (Jackknife or Bootstrap); Definition of Nodules; MRMC ROC Observer; Az Estimates]

ANOVA-after-Jackknife Analysis

• Parametric analysis
• Leave-one-case-out (all 4 quadrants, quadrant-based analysis)
• Analysis assumes modality as a fixed effect and readers, cases and all interactions as random effects
• Example
  • Set: [1 2 3], Partitions: [1 2], [1 3], [2 3]
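The leave-one-out example above can be reproduced with a short sketch that also forms jackknife pseudovalues, the quantities an ANOVA-after-jackknife analysis feeds into the ANOVA. The statistic used here is a plain mean standing in for Az, and the function names are hypothetical; in the study, leaving out one case removes all 4 of its quadrants.

```python
from itertools import combinations
from statistics import mean


def leave_one_out(items):
    """All subsets obtained by removing exactly one item."""
    return [list(c) for c in combinations(items, len(items) - 1)]


def jackknife_pseudovalues(items, statistic):
    """Pseudovalue i = n * stat(all) - (n - 1) * stat(all without item i)."""
    n = len(items)
    full = statistic(items)
    pseudo = []
    for i in range(n):
        subset = items[:i] + items[i + 1:]
        pseudo.append(n * full - (n - 1) * statistic(subset))
    return pseudo


partitions = leave_one_out([1, 2, 3])
pseudo = jackknife_pseudovalues([1, 2, 3], mean)
print(partitions)
print(pseudo)
```

For a mean, each pseudovalue equals the left-out observation itself, which makes the construction easy to check by hand; for Az the pseudovalues are not this simple.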

Bootstrap Analysis

• Nonparametric analysis
• Randomly generated datasets, drawn from the original data with replacement
• Example
  • Set: [1 2 3], Partitions: [3 2 3], [3 1 2], [1 1 2], …
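A minimal sketch of the resampling step above, assuming a simple percentile interval for illustration; the slides do not state how the sponsor derived confidence limits from the bootstrap samples, and the mean again stands in for Az. Function names and the seed are hypothetical.

```python
import random


def bootstrap_resamples(items, n_resamples, seed=0):
    """Draw datasets of the original size, sampling with replacement."""
    rng = random.Random(seed)
    return [rng.choices(items, k=len(items)) for _ in range(n_resamples)]


def percentile_interval(values, lower=2.5, upper=97.5):
    """Simple percentile interval (an assumed choice, for illustration)."""
    ordered = sorted(values)

    def pick(pct):
        idx = int(round(pct / 100.0 * (len(ordered) - 1)))
        return ordered[idx]

    return pick(lower), pick(upper)


resamples = bootstrap_resamples([1, 2, 3], n_resamples=1000)
means = [sum(r) / len(r) for r in resamples]
low, high = percentile_interval(means)
print(resamples[:3], low, high)
```

Unlike the jackknife partitions, a bootstrap dataset can repeat an item and omit another, as in [3 2 3] or [1 1 2].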

Clinical Studies Primary Analysis

[Flowchart boxes: Pool of Cases; Pool of Experts; Pool of Readers; Resampling Scheme (Jackknife or Bootstrap); Definition of Nodules; MRMC ROC Observer; Az Estimates]

• Fixed 3-member nodule definition panels (unanimous consensus)
• ANOVA-after-jackknife and bootstrap analysis

Clinical Studies Primary Analysis

• Fixed 3-member nodule definition panels

Variance Analysis    Pre-CAD Az    Post-CAD Az    ΔAz      p-value    Lower C.L.    Upper C.L.
Jackknife            0.881         0.905          0.024    0.003      0.008         0.040
Bootstrap            0.879         0.903          0.025    <0.001     0.009         0.045

Clinical Studies Primary Analysis

• Statistically significant improvement in Az pre- to post-CAD
  • ΔAz ~ 0.025
• ANOVA-after-jackknife and bootstrap analyses are consistent
• Analysis limited because it did not take into account any variation in the expert panel
  • Variability of panel would add uncertainty to performance estimates
  • How would performance change with a different panel makeup?
    • Different number of panel members
    • Different set of experts

Clinical Studies Secondary Analysis

[Flowchart boxes: Pool of Cases; Pool of Experts; Pool of Readers; Resampling Scheme (Bootstrap); Definition of Nodules; MRMC ROC Observer; Az Estimates]

• Random 3, 2, 1-member nodule definition panels (unanimous consensus)
• Only bootstrap analysis possible

Clinical Studies Secondary Analysis

• Bootstrap analysis
• Random 3-member nodule definition panels

Random Panel Size    Pre-CAD Az    Post-CAD Az    ΔAz      p-value    Lower C.L.    Upper C.L.
3-members            0.845         0.868          0.022    <0.001     0.008         0.040
2-members            0.832         0.854          0.022    0.002      0.008         0.039
1-member             0.817         0.838          0.021    <0.001     0.008         0.037

Clinical Studies Secondary Analysis

• Sponsor's analysis takes into account random nature of expert panel for defining ‘actionable’ nodules
  • Different number of panel members: 3, 2, 1-member panels
  • Different panel makeup: bootstrap selection of panel
• All variations of panel makeup confirm a statistically significant improvement in Az from pre- to post-CAD
  • ΔAz ~ 0.02

• Likely to be a more appropriate analysis for assessment of devices when only panel truth is available
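One way to picture the random-panel idea is the sketch below: a panel of a given size is drawn from a pool of experts, and a location counts as actionable only if every drawn expert marked it (unanimous consensus). Everything specific here is an assumption made for illustration: the expert labels, the marked locations, the pool size, and drawing with replacement are hypothetical, and the slides do not describe the sponsor's sampling procedure at this level of detail.

```python
import random


def draw_panel(expert_ids, panel_size, rng):
    """Bootstrap-style draw of panel members (with replacement, assumed)."""
    return rng.choices(expert_ids, k=panel_size)


def unanimous_truth(marks, panel):
    """Locations marked actionable by every member of the drawn panel."""
    locations = set().union(*marks.values())
    return {loc for loc in locations if all(loc in marks[e] for e in panel)}


# Hypothetical marks: expert -> set of locations called actionable.
marks = {
    "A": {"q1", "q2", "q3"},
    "B": {"q1", "q2"},
    "C": {"q1", "q4"},
}

rng = random.Random(0)
for size in (3, 2, 1):
    panel = draw_panel(sorted(marks), size, rng)
    truth = unanimous_truth(marks, panel)
    print(size, panel, sorted(truth))
```

Because the reference truth changes with each drawn panel, Az estimates computed against it carry the panel variability that the fixed-panel analysis leaves out.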

CAD Standalone Performance

• Performance of the CAD algorithm alone
  • Algorithm sensitivity and specificity (no reader involvement)
• Standalone CAD performance is important
  • Radiologist needs this information to appropriately weight their confidence in the CAD markings
  • Benchmark for future revisions to the algorithm
• What is an appropriate performance measure for this device?

• Many of 142 findings (fixed 3-member panel) did not meet criteria as a solid, discrete, spherical density
• Second panel reevaluated nodules for appearance
  • 5 independent radiologists
  • 2 categories
    • Classic nodule: discrete, solid, spherical or ovoid
    • Non-classic:
      • Not discrete
      • Hyperdense
      • Irregularly shaped
      • Normal structure
      • Not a nodule

No. Panelists defining as classic    No. of Findings    CAD TPF (%)    CAD False Marker Rate    TP Median Diameter
<3/5                                 65                 32.3           ~3 per case              7.6-9.0
3/5                                  13                 69.2                                    7.4
4/5                                  11                 81.8                                    11.2
5/5                                  53                 83.0                                    6.9
All                                  142                58.5                                    7.9

No. Panelists defining as classic    No. of Findings    CAD TPF (%)    CAD False Marker Rate    TP Median Diameter
<3/5                                 65                 32.3           ~3 per case              7.6-9.0
≥3/5                                 77                 80.5                                    6.9-11.2

• Large variation in performance of the CAD based on physicians' assessment of nodule appearance as “classic”
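The pooled rows of the table follow from the per-group rows, which the sketch below checks by arithmetic. It assumes each reported TPF corresponds to a whole number of detected findings, so detected counts are recovered by rounding; the variable and function names are hypothetical.

```python
# (group, number of findings, CAD TPF in percent), copied from the table.
rows = [
    ("<3/5", 65, 32.3),
    ("3/5", 13, 69.2),
    ("4/5", 11, 81.8),
    ("5/5", 53, 83.0),
]


def pooled_tpf(selected):
    """Pool groups by summing findings and (rounded) detected counts."""
    total = sum(n for _, n, _ in selected)
    detected = sum(round(n * tpf / 100.0) for _, n, tpf in selected)
    return total, 100.0 * detected / total


classic = [r for r in rows if r[0] != "<3/5"]   # groups with >= 3/5 agreement
n_classic, tpf_classic = pooled_tpf(classic)
n_all, tpf_all = pooled_tpf(rows)
print(n_classic, round(tpf_classic, 1))
print(n_all, round(tpf_all, 1))
```

Under that assumption the pooled values come out to 77 findings at about 80.5% for the ≥3/5 group and 142 findings at about 58.5% overall, matching the table.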

Summary

• Az appropriate test statistic for clinical analysis
  • No substantial crossing of pre/post-CAD ROC curves
• Primary Analysis
  • Nodule definition panel
    • Fixed 3-member expert panel
  • Shows statistically significant Az improvement in detection with CAD
  • ANOVA-after-jackknife and bootstrap are comparable

Summary

• Secondary Analysis
  • Nodule definition panel
    • Varied number of panel members
    • Varied the panel makeup (bootstrap selection of panel members)
  • Confirmed statistically significant Az improvement in detection with CAD
• Standalone performance
  • Large variation in CAD performance based on reassessment of nodule appearance
  • Necessary for appropriate utilization of the device by clinicians in the field and assessment of future algorithm revisions
