Date post: 03-Jun-2020

What I have learned from the RSNA Bone Age Challenge

Alexandre Cadrin-Chênevert, MD, B.Ing, FRCPC

In collaboration with :
Alexander Bilbily, MD, PGY5
Mark Cicero, MD, BESc, FRCPC

Disclosures

Dr. Alexandre Cadrin-Chênevert : No financial conflict of interest to disclose
Dr. Alexander Bilbily : Co-founder and CEO of 16 Bit
Dr. Mark Cicero : Co-founder and COO of 16 Bit

Learning Objectives

1. To identify a new research model : public machine learning challenges applied to medical imaging

2. To describe state-of-the-art results from the RSNA bone age challenge

3. To list educational resources and learning tools to participate in future competitions

Key drivers of deep learning

[Figure: GPU, algorithms, data, and expertise driving performance/accuracy]

Deep learning scalability

[Figure: performance versus amount of data for traditional machine learning and for shallow, medium, and deep neural networks]

Adapted from blog.easysol.net/building-ai-applications/ with permission

AI Research gap in healthcare/radiology

Data : HEALTHCARE FACILITY
Expertise : ACADEMIA, INDUSTRY

GAP
• TECHNICAL : Labeling, Security, Standard
• ETHICAL : Patient consent, Deidentification, Data sharing

Machine learning competitions
• Expanding research paradigm
• Goal : Finding the best performing algorithm on a specific machine learning problem
• Tool : Publicly available dataset
• Open values : collaboration, education, communication, algorithm sharing

[Figure: expertise, data, algorithms, and GPU driving performance/accuracy]

Imagenet competition

Source : http://cs231n.github.io/understanding-cnn/

Imagenet – CNN architectures

Source : Eugenio Culurciello, medium.com/towards-data-science/neural-network-architectures-156e5bad51ba

RSNA bone age challenge
• Goal : Develop an algorithm which can most accurately determine skeletal age on a validation set of pediatric hand radiographs
• 260 participants registered
• Datasets : From 2 children's hospitals
– Lucile Packard Children’s Hospital at Stanford University
– Children’s Hospital Colorado

Larson DB et al. Radiology 2018; 287(1)313-322.

Bone age
• Degree of maturation of a child’s bones, used to evaluate potentially advanced or delayed growth compared to chronological age.
• Most frequent evaluation method uses a left hand x-ray
– By a radiologist : Greulich-Pyle atlas (2nd edition, 1959)
– Automated : e.g. CE approved BoneXpert

Phases/datasets

Larson DB et al. Radiology 2018; 287(1)313-322.

PHASE                   TRAINING   LEADERBOARD   TEST
DATASET SIZE            12,611     1,425         200
NO. HOSPITALS           2          2             1
GROUND-TRUTH            REPORT     REPORT        REPORT + 5 REVIEWS
MEAN BONE AGE (years)   10.6       10.6          11.0
SD BONE AGE (years)     3.4        3.5           3.6
GENDER RATIO (M:F)      1.18 : 1   1.19 : 1      1 : 1

Dataset division


Dataset      Learning   Performance
TRAINING     YES        During training
VALIDATION   NO         During training
TEST         NO         After training
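
The division above can be sketched in a few lines of Python. This is only an illustration, not the procedure used by the challenge organizers: the function name `split_dataset`, the fractions, and the seed are assumptions chosen for the example.

```python
import random


def split_dataset(image_ids, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle image ids and split them into training, validation, and test sets.

    The fractions and the seed are illustrative assumptions, not values
    taken from the challenge.
    """
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # reproducible shuffle

    n_test = int(len(ids) * test_fraction)
    n_val = int(len(ids) * val_fraction)

    test = ids[:n_test]                # no learning; scored once, after training
    val = ids[n_test:n_test + n_val]   # no learning; scored during training
    train = ids[n_test + n_val:]       # the only subset the model learns from
    return train, val, test


train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))
```

The key property is that the three subsets never overlap, so performance measured on the test set is not inflated by images the model has already seen.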

Age distribution

Larson DB et al. Radiology 2018; 287(1)313-322.

Competition metrics
1st : Mean Absolute Distance (MAD) in months
2nd : Concordance Correlation Coefficient (CCC)

Absolute distance = |Ground truth bone age − Predicted bone age|

Larson DB et al. Radiology 2018; 287(1)313-322 : Example of Bland-Altman plot comparison between model and reviewer
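
Both competition metrics are short enough to write out. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the bone ages are made-up numbers, and the CCC follows Lin's standard definition with population (1/n) moments, which may differ in detail from the organizers' scoring script.

```python
from statistics import fmean


def mean_absolute_distance(truth, pred):
    """Mean of |ground truth - prediction|, in the units of the inputs (months here)."""
    return fmean(abs(t - p) for t, p in zip(truth, pred))


def concordance_correlation(truth, pred):
    """Lin's concordance correlation coefficient, using population (1/n) moments."""
    mean_t, mean_p = fmean(truth), fmean(pred)
    var_t = fmean((t - mean_t) ** 2 for t in truth)
    var_p = fmean((p - mean_p) ** 2 for p in pred)
    cov = fmean((t - mean_t) * (p - mean_p) for t, p in zip(truth, pred))
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


# Made-up bone ages in months, for illustration only.
truth = [120.0, 96.0, 150.0, 60.0]
pred = [126.0, 90.0, 147.0, 63.0]

print(mean_absolute_distance(truth, pred))
print(concordance_correlation(truth, pred))
```

Unlike a plain correlation coefficient, the CCC penalizes a systematic offset between predictions and ground truth through the squared difference of the means in the denominator.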

Results

PHASE                 LEADERBOARD   TEST
No. of images         1425          200
MEAN AD (BEST)        5.8           4.3
CCC (BEST)            0.979         0.991
MEAN AD (TOP 10)      5.8 – 6.4     4.3 – 4.9

MEAN AD (HUMAN)       6.1*
MEAN AD (PUBLISHED)   5.2*

* Larson DB et al. Radiology 2018; 287(1)313-322.

• Best mean absolute distance of 4.3 months compared to ground-truth
• No confidence intervals reported during the competition
• Compared to 6.1 months for radiologists and 5.2 months for the best previously published automated model
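
Since no confidence intervals were reported, it may help to see how one could be estimated from per-image errors. The sketch below uses a percentile bootstrap, which is a technique swapped in here for illustration and not something the competition did. The function name `bootstrap_mad_interval` is hypothetical and the errors are simulated, not challenge data.

```python
import random
from statistics import fmean


def bootstrap_mad_interval(abs_errors, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean absolute distance."""
    rng = random.Random(seed)
    n = len(abs_errors)
    means = sorted(
        fmean(rng.choices(abs_errors, k=n))  # resample with replacement
        for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Simulated absolute errors (months) for a 200-image test set; not challenge data.
rng = random.Random(1)
abs_errors = [abs(rng.gauss(0, 5.5)) for _ in range(200)]

low, high = bootstrap_mad_interval(abs_errors)
print(f"MAD = {fmean(abs_errors):.2f} months, 95% interval = [{low:.2f}, {high:.2f}]")
```

With only 200 test images, an interval of this kind gives a sense of whether differences of a few tenths of a month between models are meaningful.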

Technical approaches
• Mostly : deep convolutional neural networks
• Variable : network architecture, preprocessing, pretraining, data augmentation, image resolution, gender input, classification vs regression

Winning solution : 16bit.ai

Mark Cicero, Alex Bilbily : https://16bit.ai/blog/ml-and-future-of-radiology

Hyperparameter           Value
Input                    500 x 500 image
Weights initialization   Random
Optimizer                Adam
Batch size               16
Data augmentation        Rotation, translation, H flip
Inference                5 best models x 10 crops

Demo available (not approved for clinical use): https://16bit.ai/bone-age
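
The "5 best models x 10 crops" inference step can be read as averaging 50 predictions per image. The sketch below illustrates that idea only; it is not the 16bit.ai code. It assumes the common ten-crop scheme (four corners and the center, each with its horizontal mirror), which the slide does not specify, and it uses toy stand-ins for the image and the models.

```python
from statistics import fmean


def crop(image, top, left, size):
    """Return a size x size window of a 2D image stored as a list of rows."""
    return [row[left:left + size] for row in image[top:top + size]]


def ten_crops(image, size):
    """Four corner crops and a center crop, each with its horizontal mirror."""
    height, width = len(image), len(image[0])
    offsets = [
        (0, 0), (0, width - size),
        (height - size, 0), (height - size, width - size),
        ((height - size) // 2, (width - size) // 2),
    ]
    crops = [crop(image, top, left, size) for top, left in offsets]
    mirrored = [[row[::-1] for row in c] for c in crops]
    return crops + mirrored


def ensemble_predict(models, image, size):
    """Average the predictions of every model over every crop."""
    crops = ten_crops(image, size)
    return fmean(model(c) for model in models for c in crops)


# Toy stand-ins: a 6x6 "image" and five "models" that return the mean pixel
# value plus a fixed offset. Real models would be trained networks.
image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
models = [
    (lambda c, bias=b: fmean(v for row in c for v in row) + bias)
    for b in (-2.0, -1.0, 0.0, 1.0, 2.0)
]

print(len(ten_crops(image, 4)))
print(ensemble_predict(models, image, 4))
```

Averaging over crops and models reduces the variance of individual predictions, which is one plausible reason ensembles were common among the top entries.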

Top 5 solutions

Rank           1                          2                                         3                      4                         5
Team Name      16bit.ai                   Ian Pan                                   F. Kitamura            Visiana                   Md.ai
MAD (months)   4.27                       4.35                                      4.38                   4.51                      4.53

Parameters
Model          Inception V3               Resnet50                                  9 layers CNN           PCA + Linear Regression   Multiple deep CNNs
Input          500x500                    49x(224x224)                              550x500                Hand-crafted              299x299
Gender         Model Input                2 Models                                  Model Input            Model Input               Model Input
Optimizer      Adam                       Adam                                      NS                     NS                        Adam
Augmentation   Yes                        Yes                                       Yes                    NS                        Yes
Batch size     16                         NS                                        NS                     NS                        32
Inference      5 best models x 10 crops   Xth percentile of 49 patches x 9 models   Ensemble of 4 models   Single model              Weighted ensemble of 6 models

Data augmentation

Horizontal flip

Crop

Original

Rotation
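
The three augmentations shown (horizontal flip, crop, rotation) can be sketched without any imaging library. This is a minimal illustration under assumptions: images are plain lists of rows, rotation uses nearest-neighbour sampling, and the function names, the 20 degree limit, and the 50% flip probability are choices made for the example rather than settings from any competing team.

```python
import math
import random


def horizontal_flip(image):
    """Mirror each row of a 2D image stored as a list of rows."""
    return [row[::-1] for row in image]


def random_crop(image, size, rng):
    """Cut a size x size window at a random position."""
    top = rng.randint(0, len(image) - size)
    left = rng.randint(0, len(image[0]) - size)
    return [row[left:left + size] for row in image[top:top + size]]


def rotate(image, degrees, fill=0):
    """Rotate about the image center with nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2, (width - 1) / 2
    cos_a = math.cos(math.radians(degrees))
    sin_a = math.sin(math.radians(degrees))
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Map each output pixel back to its source position.
            src_x = round(cos_a * (x - cx) + sin_a * (y - cy) + cx)
            src_y = round(-sin_a * (x - cx) + cos_a * (y - cy) + cy)
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = image[src_y][src_x]
    return out


def augment(image, crop_size, rng, max_degrees=20):
    """Apply a random rotation, a random crop, and a coin-flip mirror."""
    image = rotate(image, rng.uniform(-max_degrees, max_degrees))
    image = random_crop(image, crop_size, rng)
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    return image


rng = random.Random(0)
original = [[r * 8 + c for c in range(8)] for r in range(8)]
augmented = augment(original, crop_size=6, rng=rng)
print(len(augmented), len(augmented[0]))
```

Each call to `augment` yields a slightly different view of the same radiograph, so the network sees more variation than the raw training set contains.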

Visualization – Activation maps

Larson DB et al. Radiology 2018; 287(1)313-322.

Lee, H., Tajmir, S., Lee, J. et al. J Digit Imaging (2017) 30: 427. https://doi.org/10.1007/s10278-017-9955-8

Discussion
• Deep learning models matching human performance in research conditions for bone age estimation

• Reference gold-standard for bone age ?
– Human interpretation based on Greulich-Pyle atlas
– Chronological age from normal subjects
– Double-reading : radiologist + model

• Bone age : low-hanging fruit for deep learning
– Ground-truth labels easily extracted from written reports
– Single 2D image
– Single numerical output value
– Relatively simple pattern recognition

What we have learned

• Machine learning requires extensive experimentation : model architecture, data augmentation, image resolution, ensemble of models

• Large public labeled datasets likely have a high impact on research and future clinical applications

• Radiologists should be involved in machine learning research/challenges:
– To define clinically significant use case scenarios
– To help create large datasets with high quality ground truth labels

Deep learning = statistical learning

• Imaging
– Machine vendor
– Protocol
– Contrast, Noise

• Population
– Age, Gender
– Genetic
– Lifestyle habits

• Diagnosis
– Pretest probability
– Prevalence
– Ground-truth

Performance optimized for specific statistical research conditions
Research performance ≠ Clinical performance

[Figure: training, validation, and test sets for research data and for clinical data]

From research to clinical applications

[Figure: training, validation, and test sets for research data and for local data]

Optimal : Retrain
Minimal : Retest

Educational tools

Field                           Online Resources                                         Conferences
Deep learning in radiology      ACR Data Institute                                       SIIM, MICCAI, EuSOMII, C-MIMI, MIDL
Convolutional neural networks   Coursera, Udemy, Stanford CS231n online
Computer vision                 Stanford CS231n online                                   CVPR, ECCV, ICCV
Deep learning                   Coursera, Udemy, Fast.ai                                 ICLR, NIPS
Machine learning                Coursera, Udemy                                          ICML, KDD
Computer programming – Python   Coursera, Udemy
Machine learning challenges     Kaggle, Grand challenges, Driven data, Dream challenges

Take home messages

• Machine learning challenge = new significant research paradigm using public data

• Bone age : deep learning models matching human performance in research conditions

• Research performance ≠ clinical performance

• Online educational resources available

Alexandre Cadrin-Chênevert, MD, B.Ing
Email : [email protected]
@alexandrecadrin

Bone age automated model demo : https://16bit.ai/bone-age
