A Data Driven Approach to Tackling Big Data Connectomics

Date post: 08-Oct-2020
A Data Driven Approach to Tackling Big Data Connectomics Greg Kiar McGill University Healthy Brains for Healthy Lives Fellow, McGill Centre for Integrative Neuroscience, Montreal Neurological Institute, Ph.D. student McGill University, M.S.E. Johns Hopkins University, B.Eng Carleton University 2018-02-05
Transcript
A Data Driven Approach to Tackling Big Data Connectomics

Greg Kiar

McGill University Healthy Brains for Healthy Lives Fellow, McGill Centre for Integrative Neuroscience, Montreal Neurological Institute, Ph.D. student McGill University, M.S.E. Johns Hopkins University, B.Eng Carleton University

2018-02-05

Outline

• Context

• The common approach

• An approach based on accessibility, robustness, and scalability

Publicly available MRI datasets

• ADNI
• ABCD
• ABIDE
• ADHD-200
• Age-ility
• AIBL
• BRAINS
• CamCAN
• CMI-HBN
• COBRE
• CoRR/FCP-INDI
• DLBS
• fBIRN
• GSP
• HCP
• IXI
• Kirby21
• MASSIVE
• MindBoggle-101
• MIRIAD
• MPI-LMBB
• MSC
• NACC
• NCANDA
• NKIRS
• OASIS-CS
• OASIS-Long
• OpenfMRI
• PING
• PNC
• PTBP
• SALD
• SchizConnect
• StudyForrest
• UK-Biobank

Source: https://github.com/cMadan/openMorph

Publicly supported BIDS apps

• AFNI
• ANTS Cortical Thickness
• Baracus
• Brainiak-srm
• BROCCOLI
• CPAC
• DPARSF
• Fibre Density and Cross-section
• fMRIprep
• Freesurfer
• FSL Tools
• HCP Pipelines
• Hyper Alignment
• MAGeTbrain
• MindBoggle
• MRIQC
• MRtrix3 Connectome
• ndmg
• NIAK
• OPPNI
• SRM
• SPM
• Tracula
• QAP

Source: http://bids-apps.neuroimaging.io/apps/

Common approach

1. Pose a hypothesis
2. Collect + curate dataset
3. Manually perform QC on dataset
4. Pick processing pipeline and parameters
5. Process random subset with pipeline in 4.
6. Manually perform QC on derivatives
7. Redo from 4. if not happy with 6.
8. Process all data with pipeline in 4.
9. Answer statistical question
10. Publish claim
11. Get tenure

Honest approach

1. Post hoc hypothesis
2. Collect + curate dataset
3. Undergrads perform QC on dataset
4. Pick processing pipeline and parameters
5. Process first subject with pipeline in 4.
6. Grad students perform QC on derivatives
7. Don’t redo from 4. if not happy with 6.
8. Process all data with pipeline in 4.
9. Answer statistical question (see updated 1.)
10. Publish claim
11. Get tenure

Problems with this approach

1. Manual QC is subjective; we need LOTS to be reliable

2. Pipelines and parameters aren’t “optimized” objectively

3. Datasets aren’t homogeneous

4. Incentive to publish/graduate rather than redo experiments

5. Computer infrastructures are expensive, and not always equal

Proposed solution

1. Optimize pipeline for stability and robustness; remove bias

2. Automate QC where possible

3. Evaluate on public data with known “truths” (i.e. TestReTest)

4. Automate pipeline deployment

5. Automate data discovery*

ndmg: one-click connectomes

Registration to MNI152

[Flowchart. Nodes: DWI, corr. DWI, T1w, int. DWI, Template, xfm, reg. DWI, QA. Legend: External Dep., Subject, Intermediate, Output.]

Compute Tensor Field

[Flowchart. Nodes: bvals, bvecs, grad. table, reg. DWI, Mask, Tensors, QA. Legend: External Dep., Subject, Intermediate, Output.]

Estimate Streamlines

[Flowchart. Nodes: FA thresh., Mask, Tensors, Fibers, QA. Legend: External Dep., Subject, Intermediate, Output.]
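
The "FA thresh." input in this step refers to fractional anisotropy (FA). As a minimal sketch, assuming the standard FA definition computed from the three eigenvalues of a diffusion tensor, a thresholding check could look like the following. The function names and the default threshold of 0.2 are illustrative assumptions, not values taken from ndmg.

```python
import math


def fractional_anisotropy(l1, l2, l3):
    """Standard FA of a diffusion tensor, given its three eigenvalues."""
    denom = l1 * l1 + l2 * l2 + l3 * l3
    if denom == 0:
        return 0.0  # empty voxel: treat FA as 0
    num = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    return math.sqrt(0.5 * num / denom)


def passes_threshold(eigenvalues, fa_thresh=0.2):
    # fa_thresh=0.2 is an illustrative assumption, not ndmg's setting
    return fractional_anisotropy(*eigenvalues) >= fa_thresh
```

FA is 0 for a perfectly isotropic tensor and 1 when diffusion is confined to a single direction, so a threshold keeps tracking inside voxels with a clear dominant orientation.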

Graph Generation

[Flowchart. Nodes: Parcellation, Fibers, Connectome, QA. Legend: External Dep., Subject, Intermediate, Output.]
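
Graph generation combines fibers with a parcellation to produce a connectome. A minimal sketch of one way to do this is below: each streamline adds one count to the edge between every pair of regions it touches. The slides do not say whether ndmg counts all traversed regions or only endpoints, so that choice, the function name, and the data layout are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_connectome(streamlines, parcellation):
    """Count streamlines connecting each pair of regions.

    streamlines: iterable of lists of (x, y, z) integer voxel coordinates
    parcellation: dict mapping voxel coordinate -> region label
                  (label 0 or a missing voxel means background)
    """
    edges = defaultdict(int)
    for streamline in streamlines:
        # Regions touched by this streamline, ignoring background
        regions = {parcellation.get(voxel, 0) for voxel in streamline}
        regions.discard(0)
        # One count per unordered pair of touched regions
        for a, b in combinations(sorted(regions), 2):
            edges[(a, b)] += 1
    return dict(edges)
```

The result is a sparse, undirected, weighted edge list keyed by region pairs, which converts directly to an adjacency matrix.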

Optimizing for discriminability

g: connectome
i: class label (i.e. subject)
j: observation label (i.e. session)

“My brain looks more like my brain than my brain looks like your brain”

or

“My brain looks more like brains of the same {dataset, sex, age, handedness, etc.} than brains of another {“, “, “,”}”
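
The two statements above can be turned into a number. A minimal sketch, assuming discriminability is estimated as the fraction of comparisons in which a within-class distance is smaller than a between-class distance, is shown below. The function names, the use of Euclidean distance, and the strict handling of ties are assumptions for illustration; the formula on the original slide was not recoverable.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def discriminability(observations, labels, distance=euclidean):
    """Mean fraction of between-class distances that exceed each
    within-class distance (1.0 means perfectly discriminable)."""
    n = len(observations)
    scores = []
    for i in range(n):
        # Distances from observation i to every observation of another class
        between = [distance(observations[i], observations[k])
                   for k in range(n) if labels[k] != labels[i]]
        if not between:
            continue
        for j in range(n):
            if i == j or labels[i] != labels[j]:
                continue
            within = distance(observations[i], observations[j])
            scores.append(sum(d > within for d in between) / len(between))
    return sum(scores) / len(scores) if scores else float("nan")
```

Here each observation would be a vectorized connectome g, the label would be the subject i, and repeated sessions j supply the within-class pairs. Swapping the label for dataset, sex, or age gives the second reading of the statement.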

Reliable structural connectivity

easy to use/install

$ # Installable on Python2.7 if you have FSL installed…
$ pip install ndmg
$
$ # Run on your dataset
$ ndmg_bids /data /outs session
$ ndmg_bids /data /outs group
$
$ # Or install and run through Docker
$ docker run -ti -v /data:/data -v /outs:/outs bids/ndmg /data /outs session
$ docker run -ti -v /data:/data -v /outs:/outs bids/ndmg /data /outs group

Proposed solution

1. Optimize pipeline for stability and robustness; remove bias

2. Automate QC where possible

3. Evaluate on public data with known “truths” (i.e. TestReTest)

4. Automate pipeline deployment

5. Automate data discovery*

Boutiques

{
  "name": "echo",
  "tool-version": "1.0",
  "description": "A simple script to test output files",
  "command-line": "echo [PARAM] > output.txt",
  "schema-version": "0.5",
  "inputs": [{ "id": "param",
               "name": "Parameter",
               "value-key": "[PARAM]",
               "type": "Number" }],
  "output-files": [{ "id": "output_file",
                     "name": "Output file",
                     "path-template": "output.txt" }]
}
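
In the descriptor above, each input's "value-key" marks where its value goes in the "command-line" template. The sketch below illustrates that substitution using only the standard library. It is not the Boutiques implementation, which also validates types and handles flags and output files; the function name render_command and the invocation format (input id mapped to value) are assumptions for illustration.

```python
import json

# A trimmed copy of the descriptor from the slide
DESCRIPTOR = json.loads("""
{
  "name": "echo",
  "command-line": "echo [PARAM] > output.txt",
  "inputs": [{ "id": "param",
               "name": "Parameter",
               "value-key": "[PARAM]",
               "type": "Number" }]
}
""")


def render_command(descriptor, invocation):
    """Fill the command-line template with values from an invocation."""
    command = descriptor["command-line"]
    for inp in descriptor["inputs"]:
        value = invocation.get(inp["id"])
        # Assumption: an input missing from the invocation becomes empty
        text = "" if value is None else str(value)
        command = command.replace(inp["value-key"], text)
    return command


print(render_command(DESCRIPTOR, {"param": 42}))  # echo 42 > output.txt
```

Keeping the command template and its inputs in one JSON file is what lets the same tool be validated, simulated, and launched without tool-specific wrapper code.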

Clowdr

Boutiques on pip, Clowdr soon

$ # Installable on Python2 or 3…
$ pip install boutiques
$
$ # Describe, validate, launch your tool, or more!
$ bosh validate descriptor.json
$ bosh exec simulate descriptor.json -r
$ bosh exec launch descriptor.json invocation.json
$
$ # Soon… (currently the API is a bit uglier than this)
$ clowdr deploy descriptor.json invocation.json s3://dataset

Proposed solution

1. Optimize pipeline for stability and robustness; remove bias

2. Automate QC where possible

3. Evaluate on public data with known “truths” (i.e. TestReTest)

4. Automate pipeline deployment

5. Automate data discovery*

Apine

So, why are we doing this again?

• If tools and platforms are made to be reproducible and robust…

SCIENTISTS CAN FOCUS ON SCIENCE!

• Free validation and summary of the quality of work being done
• Enables scaling to datasets beyond reach of manual curation

The dream

• Go to a website
• Pick a dataset
• Pick an analysis
• Design a hypothesis
• Launch it
• Go outside & run around
• Come back to your answer
• Share the results, form new hypotheses, and collect new data

Acknowledgements

• McGill Centre for Integrative Neuroscience (Alan Evans, et al.)
• Big-Data for Neuroinformatics Lab (Tristan Glatard, et al.)
• Jean-Baptiste Poline, Pierre Bellec, Christine Tardif
• Montreal Neurological Institute/The Neuro
• Healthy Brains for Healthy Lives
• Lab-mates, Family, Friends, Universe

All code demonstrated in this presentation is publicly available on GitHub.

Thanks! Find me @

gkiar

g_kiar

[email protected]
