Post on 16-Apr-2017
Precision Medicine in Oncology – Cancer Informatics
October 20th, 2015
Warren Kibbe, PhD
NCI Center for Biomedical Informatics
1. Precision Medicine
2. Pre-clinical models
3. TCGA / TCIA
4. Genomic Data Commons
5. Cloud Pilots
Slides are from many sources, but special thanks to Drs. Harold Varmus, Doug Lowy, Jim Doroshow, Lou Staudt
Photo by F. Collins
President Obama Announces the Precision Medicine Initiative
The East Room, January 30, 2015
TOWARDS PRECISION MEDICINE
(IoM REPORT, NOVEMBER 2011)
Definition of Precision Oncology
Interventions to prevent, diagnose, or treat cancer, based on a molecular and/or mechanistic understanding of the causes, pathogenesis, and/or pathology of the disease. Where the individual characteristics of the patient are sufficiently distinct, interventions can be concentrated on those who will benefit, sparing expense and side effects for those who will not.
Modified by D. Lowy, M.D., from IoM’s Toward Precision Medicine report, 2011
Understanding Cancer
Precision medicine will lead to fundamental understanding of the complex interplay between genetics, epigenetics, nutrition, environment and clinical presentation and direct effective, evidence-based prevention and treatment.
What is Cancer?
Cancer is a disease where cells ‘lose’ the normal controls that enable them to participate as productive, stable members of tissues, organs and organ systems. The process of developing cancer is usually viewed as having three phases:
• Initiation, where some of the normal control processes that define the cell type, inter-cellular communication with other cells, and normal responses to cell-cell signaling are altered
• Proliferation, where cancerous cells ‘take over’ normal processes and begin to proliferate
• Metastasis, where the cancerous cells leave their normal location and invade other tissues
Cancer Statistics
In 2015 there will be an estimated 1,700,000 new cancer cases and 600,000 cancer deaths
- American Cancer Society 2015
Cancer remains the second most common cause of death in the U.S.
- Centers for Disease Control and Prevention 2015
Survivorship is on the rise
Today there are 14 million Americans alive with a history of cancer.
It is estimated that by 2024, the population of cancer survivors will increase to almost 19 million: 9.3 million males and 9.6 million females
- American Cancer Society 2015
The hope
We need to do what the HIV/AIDS research community did in the last 30 years: turn a certain death sentence into a managed, chronic disease.
As with the course of HIV infection, we believe that cancer has multiple pathways that are altered during the course of the disease, and that targeted therapies hitting multiple involved signaling pathways, metabolic pathways, and receptors can block and prevent disease progression
Unlike HIV, the course of cancer is more individualized and involves different molecular actors depending on the cancer and the individual
The frustration
We already know how to remove >50% of the cancer burden in the U.S.
Consistent nutrition, exercise & sleep – estimated 10-30%
Tobacco control (smoking cessation) – ~25%
HPV and HCV immunization – ~20%
Behavior change is hard. Impacting the behavior of a broad population is harder.
Precision Medicine is not new: An early example
First disease for which molecular defect identified
Single substitution at position 6 of β-globin chain
Abnormal Hb polymerization upon deoxygenation
“I believe medicine is just now entering into a new era when progress will be much more rapid than before, when scientists will have discovered the molecular basis of diseases, and will have discovered why molecules of certain drugs are effective in treatment, and others are not effective.”
- Linus Pauling, 1952
2015: still no good treatment
The NCI Mission & the PMI for Oncology
• NCI Mission: To help people live longer, healthier lives by supporting research to reduce the incidence of cancer and to improve the outlook for patients who develop cancer
• Precision Medicine Initiative for Oncology: Significantly expand our efforts to improve cancer treatment through genomics
What Problems are We Trying to Solve?
Precision Medicine Initiative in Oncology
• For most of its 70-year history, systemic cancer treatment has relied on therapies that are marginally more toxic to malignant cells than to normal tissues
• Molecular markers that predict benefit, response, or resistance in the clinic have been lacking for many cancers
Proposed Solution to these Problems
Use genomics and other high-throughput technologies, including imaging, to identify molecular vulnerabilities of individual cancers, create predictive signatures, and target those vulnerabilities
Precision Medicine Initiative in Oncology
Precision Oncology in Practice
Nature Rev. Clin. Oncol. 11:649-662 (2014)
Precision Oncology Trials Launched
2014: MPACT, Lung MAP, ALCHEMIST, Exceptional Responders
2015: NCI-MATCH, ALK Inhibitor, MET Inhibitor
NCI-MATCH: Features [Molecular Analysis for Therapy Choice]
• Foundational treatment/discovery trial; assigns therapy based on molecular abnormalities, not site of tumor origin, for patients without available standard therapy
• Regulatory umbrella for phase II drugs/studies from > 20 companies; single agents or combinations
• Available nationwide (2400 sites)
• Accrual began mid-August 2015
NCI MATCH
• Conduct across 2400 NCI-supported sites
• Pay for on-study and at-progression biopsies
• Initial estimate: screen 3000 patients to complete 20 phase II trials
1 CR, PR, SD, and PD as defined by RECIST
2 Stable disease is assessed relative to tumor status at re-initiation of study agent
3 Rebiopsy; if additional mutations, offer new targeted therapy
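The response categories named in the footnotes (CR, PR, SD, PD) can be illustrated with a minimal sketch. It assumes the RECIST 1.1 target-lesion thresholds (the slide does not say which RECIST version the trial uses) and ignores new lesions, non-target lesions, and the re-initiation baseline mentioned in footnote 2. The function name and arguments are hypothetical.

```python
def recist_response(baseline_mm, nadir_mm, current_mm):
    """Classify target-lesion response from sums of lesion diameters (mm).

    Simplified RECIST 1.1 thresholds (an assumption, not taken from the
    slides). New lesions and non-target lesions are not considered.
    """
    if baseline_mm <= 0:
        raise ValueError("baseline sum of diameters must be positive")
    if current_mm == 0:
        return "CR"  # complete response: all target lesions have disappeared
    growth = current_mm - nadir_mm
    # Progressive disease: at least a 20% increase over the smallest sum
    # seen so far (nadir) AND at least 5 mm of absolute growth.
    if 5 * growth >= nadir_mm and growth >= 5:
        return "PD"
    # Partial response: at least a 30% decrease relative to baseline.
    if 10 * current_mm <= 7 * baseline_mm:
        return "PR"
    return "SD"  # stable disease: neither PR nor PD criteria met


if __name__ == "__main__":
    print(recist_response(baseline_mm=100, nadir_mm=60, current_mm=72))
```

Integer comparisons are used for the percentage thresholds so that boundary cases (exactly 20% or 30%) are not affected by floating-point rounding.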
NCI-MATCH: Initial Ten Studies
Agents and targets below grey line are pending final regulatory review; economies of scale—larger number of agents/genes, fewer overall patients to screen
Agent(s) | Molecular Target(s) | Estimated Prevalence
Crizotinib | ALK Rearrangement (non-lung adenocarcinoma) | 4%
Crizotinib | ROS1 Translocations (non-lung adenocarcinoma) | 5%
Dabrafenib and Trametinib | BRAF V600E or V600K Mutations (non-melanoma) | 7%
Trametinib | BRAF Fusions, or Non-V600E, Non-V600K BRAF Mutations (non-melanoma) | 2.8%
Afatinib | EGFR Activating Mutations (non-lung adenocarcinoma) | 1 – 4%
Afatinib | HER2 Activating Mutations (non-lung adenocarcinoma) | 2 – 5%
AZD9291 | EGFR T790M Mutations and Rare EGFR Activating Mutations (non-lung adenocarcinoma) | 1 – 2%
TDM1 | HER2 Amplification (non breast cancer) | 5%
VS6063 | NF2 Loss | 2%
Sunitinib | cKIT Mutations (non GIST) | 4%
≈ 35%
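As a rough check on the numbers above, the sketch below sums the listed prevalence figures and applies them to the 3000-patient screening estimate. It assumes the alterations do not overlap in the same patient, which the slides do not state, so the result is only an upper-level approximation. Values are stored in tenths of a percent to keep the arithmetic exact; the dictionary keys are shorthand labels, not official arm names.

```python
# Estimated prevalence per study in tenths of a percent, as (low, high).
# Figures are copied from the table; single values use low == high.
PREVALENCE = {
    "Crizotinib / ALK rearrangement": (40, 40),
    "Crizotinib / ROS1 translocations": (50, 50),
    "Dabrafenib + Trametinib / BRAF V600E or V600K": (70, 70),
    "Trametinib / other BRAF alterations": (28, 28),
    "Afatinib / EGFR activating": (10, 40),
    "Afatinib / HER2 activating": (20, 50),
    "AZD9291 / EGFR T790M and rare EGFR": (10, 20),
    "TDM1 / HER2 amplification": (50, 50),
    "VS6063 / NF2 loss": (20, 20),
    "Sunitinib / cKIT": (40, 40),
}


def total_prevalence_range(table):
    """Return (low, high) summed prevalence in tenths of a percent."""
    low = sum(lo for lo, _ in table.values())
    high = sum(hi for _, hi in table.values())
    return low, high


def expected_matches(screened, tenths_of_percent):
    """Patients expected to match, assuming non-overlapping alterations."""
    return screened * tenths_of_percent // 1000


if __name__ == "__main__":
    low, high = total_prevalence_range(PREVALENCE)
    print(f"combined prevalence: {low / 10}% to {high / 10}%")
    print("matches per 3000 screened:",
          expected_matches(3000, low), "to", expected_matches(3000, high))
```

Under these assumptions the summed range brackets the ≈ 35% figure shown at the bottom of the table.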
MATCH Assay: Workflow for 10-12 Day Turnaround
[Workflow figure. Steps shown: Biopsy Received at Quality Control Center; Tissue Fixation, Path Review; Nucleic Acid Extraction; Library/Template Prep; Sequencing, QC Checks; Centralized Data Analysis; Clinical Laboratory aMOI Verification; aMOIs Identified; Rules Engine Treatment Selection]
[QC checkpoints shown: Tumor content >70%; DNA/RNA yields >20 ng; Library yield >20 pM; Test fragments, Total reads, Reads per BC, Coverage; NTC, Positive, Negative Controls]
[Step durations shown: 1 day, 1 day, 1 day, 1 day, 3 days, 3-5 days; 10-12 days overall]
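The last step of the workflow, a rules engine that turns identified aMOIs into a treatment selection, can be sketched as follows. This is an illustration only, not the NCI-MATCH implementation: the `Rule` class, the rule list, and both functions are hypothetical, the rules are a subset drawn from the agent/target table, and treating the QC figures on the slide as pass thresholds is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    alteration: str          # actionable mutation of interest (aMOI)
    agent: str               # study agent assigned when the rule fires
    excluded_histology: str  # tumor type the arm excludes, "" if none


# Hypothetical rules based on a few rows of the agent/target table.
RULES = [
    Rule("ALK rearrangement", "Crizotinib", "lung adenocarcinoma"),
    Rule("ROS1 translocation", "Crizotinib", "lung adenocarcinoma"),
    Rule("BRAF V600E", "Dabrafenib and Trametinib", "melanoma"),
    Rule("HER2 amplification", "TDM1", "breast cancer"),
    Rule("NF2 loss", "VS6063", ""),
]


def passes_qc(tumor_content_pct, nucleic_acid_ng, library_pm):
    """Apply the QC figures from the slide as minimum thresholds (assumed)."""
    return tumor_content_pct > 70 and nucleic_acid_ng > 20 and library_pm > 20


def select_treatments(amois, histology):
    """Return agents whose rule matches an aMOI and allows the histology."""
    return [
        rule.agent
        for rule in RULES
        if rule.alteration in amois and rule.excluded_histology != histology
    ]


if __name__ == "__main__":
    if passes_qc(tumor_content_pct=80, nucleic_acid_ng=25, library_pm=30):
        print(select_treatments({"ALK rearrangement"}, "colorectal cancer"))
```

Keeping the rules as data rather than code mirrors the umbrella design described earlier, where new agents and genes are added over time.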
The Components of the Precision Medicine Initiative for Oncology
• Dramatically expand the NCI-MATCH umbrella to include new trials, new agents, new genes, and new drug combinations
• Understand and overcome resistance to therapy through molecular analysis and development of new cancer models
• Increase genomics-based preclinical studies, especially in the area of immunotherapy, through the creation of patient-derived pre-clinical models and non-invasive tumor profiling
• Establish the first national cancer database integrating genomic information with clinical response and outcome: to accelerate understanding of cancer and improve its treatment
PMI-O: Expanding Genomically-Based Cancer Trials (FY16-FY20)
• Accelerate Launch of NCI-Pediatric MATCH
• Broaden the NCI-MATCH Umbrella:
  - Expand/add new Phase II trials to explore novel clinical signals: mutation/disease context
  - Add new agents for new trials, and add new genes to panel based on evolving evidence
  - Add combination targeted agent studies
  - Perform Whole Exome Sequencing, RNAseq, and proteomic studies on quality-controlled biopsy specimens; extent of research based on resource availability
  - Add broader range of hematologic malignancies
• Perform randomized Phase II studies or hand-off to NCTN where appropriate signals observed
• Apply genomics resources to define new predictive markers in novel immunotherapy trials
• Expand approach to ‘exceptional responders’: focus on mechanisms of response/resistance in pilot studies
PMI-O: Understanding and overcoming resistance to therapy (FY16-FY20)
• Create a repository of molecularly analyzed samples of resistant disease
• Expand the use of tumor profiling methods such as circulating tumor cells (CTCs) and fragments of tumor DNA in blood to understand and monitor disease progression
• Develop new cancer models to identify the heterogeneity of resistance mechanisms
• Use preclinical modeling to determine the effectiveness of new combinations of novel molecularly targeted investigational agents
We need sophisticated computational models to
understand patient response, methods of
resistance, and to integrate pre-clinical model data
Drivers for High Performance Computing and
Computational Modeling
Develop a Cancer Knowledge System. Establish a national database that integrates genomic information with clinical response and outcomes as a resource.
PMI-O: Informatics Goal
Develop molecular, imaging, pathology, and clinical signatures that predict therapeutic response, outcomes, and tumor resistance
PMI-O: Informatics Goal
Build multi-scale, predictive computational biology models for understanding cancer biology and informing therapy. Develop detailed cancer pathway models to create targeted combination therapies in cancer. This approach has transformed HIV therapy and has the potential to do the same in cancer
PMI-O: Informatics Goal
The Cancer Genomic Data Commons (GDC) is an existing effort to standardize and simplify submission of genomic data to NCI, and it follows the FAIR principles: Findable, Accessible, Interoperable, Reusable. The GDC is part of the NIH Big Data to Knowledge (BD2K) initiative and an example of the NIH Data Commons
Genomic Data Commons
NCI Cancer Genomic Data Commons (GDC)
Genomic Data Commons (GDC) – Rationale
• TCGA and many other NCI funded cancer genomics projects each currently have their own Data Coordinating Centers (DCCs)
• BAM data and results stored in many different repositories; confusing to users, inefficient, barrier to research
• GDC will be a single repository for all NCI cancer genomics data
• Will include new, upcoming NCI cancer genomics efforts
• Store all data including BAMs
• Harmonize the data as appropriate
• Realignment to newest human genome standard
• Recall all variants using a standard calling method
• Define data sharing standards and common data elements
• Will be the authoritative reference data set
• Will need to scale to 200+ petabytes
Genomic Data Commons (GDC)
• First step towards development of a knowledge system for cancer
• Foundation for a genomic precision medicine platform
• Consolidate all genomic and clinical data from: TCGA, TARGET, CGCI, Genomic NCTN trials, future projects
• Project initiated Spring of 2014
• Contract awarded to University of Chicago
• PI: Dr. Robert Grossman
• Go live date: Mid 2016
• Not a commercial cloud
• Data will be freely available for download subject to data access requirements
Integration with Imaging - TCIA
The Cancer Imaging Archive (TCIA) has imaging data, including some imaging series, that cover a number of TCGA disease types:
• Breast invasive carcinoma (BRCA)
• Glioblastoma (GBM), lower grade glioma (LGG)
• Head & Neck squamous cell carcinoma (HNSC)
• Kidney renal clear cell carcinoma (KIRC)
• Ovarian serous cystadenocarcinoma (OV)
Integration through standardized RESTful APIs
Driven by the need to have better computational models
The NCI Cancer Genomics Cloud Pilots
Understanding how to meet the research community’s need to analyze large-scale cancer
genomic and clinical data
Slide courtesy of Deniz Kural, Seven Bridges Genomics
NCI GDC and the Cloud Pilots
• Working together to build common APIs
• Working with the Global Alliance for Genomics and Health (GA4GH) to define the next generation of secure, flexible, meaningful, interoperable, lightweight interfaces
• Competing on the implementation, collaborating on the interface
• Aligned with BD2K and serving as a part of the NIH Commons and working toward shared goals of FAIR (Findable, Accessible, Interoperable, Reusable)
• Exploring and defining sustainable precision medicine information infrastructure
Information problem(s) we intend to solve with the Precision Medicine Initiative for Oncology
• Establish a sustainable infrastructure for cancer genomic data – through the GDC
• Provide a data integration platform to allow multiple data types, multi-scalar data, temporal data from cancer models and patients
• Under evaluation, but it is likely to include the GDC, TCIA, Cloud Pilots, tools from the ITCR program, and activities underway at the Global Alliance for Genomics and Health
• Support precision medicine-focused clinical research
NCI Precision Medicine Informatics Activities
As we receive additional funding for Precision Medicine, we plan to:
• Expand the GDC to handle additional data types
• Incorporate the lessons learned from the Cloud Pilots into the GDC
• Scale the GDC from 10 PB to hundreds of petabytes
• Include imaging by interoperating between the GDC and the Quantitative Imaging Network TCIA repository
• Expand clinical trials tooling from NCI-MATCH to NCI-MATCH Plus
• Strengthen the ITCR grant program to explicitly include precision medicine-relevant proposals
Project Schedule and Deliverables
[Timeline figure. Phases shown: Selection; Design/Build I; Design/Build II; Evaluation]
• 6 Months: Initial Design and Development
• 9 Months: Completion of Design, Development and Implementation
• 9 Months: Provide cloud to researchers; NCI evaluations; Community evaluations
Timeline markers: Jan 14, Sep 14, Apr 15, Jan 16, Sep 16
Three Cancer Genomics Cloud Pilots
Broad Institute
• PI: Gad Getz
• Google Cloud
• Firehose in the cloud
• http://firecloud.org
Institute for Systems Biology
• PI: Ilya Shmulevich
• Google Cloud
• Interactive visualization and analysis
• http://cgc.systemsbiology.net/
Seven Bridges Genomics
• PI: Deniz Kural
• Amazon Web Services
• > 30 public pipelines
• http://www.cancergenomicscloud.org
FireCloud is modeled on Firehose, the cancer genome analysis platform built by the Getz lab at the Broad Institute, which supports both small groups and major projects (e.g. TCGA, GTEx). FireCloud significantly expands on Firehose’s capabilities. Firehose is used by both production managers for large-scale analysis and analysts for interactive analysis, curation and manual review of data for publication.
Free trial workspaces are available
for all new users!
Firehose re-born in the cloud
FireCloud is a collaboration of the Broad Institute, University of California at Santa Cruz and University of California at Berkeley.
This project has been funded by the National Cancer Institute and National Institutes of Health.
Pre-loaded FireCloud workspaces
FireCloud will be populated with pre-loaded workspaces, which, when cloned, will allow users to replicate analyses from curated and published works
A sampling of pre-loaded workspaces:
Curated TCGA Tumor Type Analysis Working Group workspaces
TCGA GDAC Analysis Working Group workspaces
TCGA PanCanAtlas analysis workspace
Tutorial workspaces containing paired tumor and normal cell lines
Benchmarking data to enable users to test developing tools and methods
Synthetic BAMs for testing contamination
Synthetic BAMs for testing mutation calling
A sampling of pipelines:
TCGA Production Analysis
PCAWG Pipeline
TCGA GDAC Pipeline
ISB Cancer Genomics Cloud @ isb-cgc.org
TCGA Data in the Cloud
Available Now§: over 400 GB of curated data in open-access BigQuery tables
Before the end of 2015: over 1 PB of DNAseq, RNAseq, and SNP6 controlled-access* data in Google Cloud Storage
Interactive Exploration
Interactively explore all metadata and open-access data
Customize, save, and share data visualizations with colleagues
Define, export, and share custom cohorts – based on: clinical information, molecular characteristics, and/or data type availability
3rd party tools: IGV and NG-CHM
§ email info@isb-cgc.org for details
* requires dbGaP authorization, with authentication via NIH Login
Programmatic Access backed by the Power of Google
Easy Access to the Data
ISB-CGC Cloud Endpoint APIs expose a REST interface for search and retrieval, with responses returned as JSON objects
Google APIs provide direct access to BigQuery & Cloud Storage
RStudio and IPython tutorial examples available on GitHub
Use RStudio or IPython Notebooks to script your own analyses
Your own Google Cloud Platform Project
Start with $300 to spend on compute & storage – request additional credits as needed
Invite your students, post-docs & collaborators to join your project
Upload your own data into Cloud Storage and BigQuery
Run your own algorithms and pipelines on Google VMs
Use and share methods via Docker containers, Galaxy Workflows, Common Workflow Language
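The access pattern described above, a REST interface for search and retrieval with responses returned as JSON objects, can be sketched with the standard library alone. Everything specific here is hypothetical: the base URL, the filter names, and the `items`/`sample_id` fields are placeholders for illustration and are not the actual ISB-CGC endpoint or schema. The response is a canned string, so nothing is sent over the network.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only to show how a search URL is assembled.
BASE_URL = "https://example.org/api/v1/samples"


def build_search_url(**filters):
    """Encode search filters as a query string (sorted for stable output)."""
    return f"{BASE_URL}?{urlencode(sorted(filters.items()))}"


def parse_sample_ids(response_text):
    """Extract sample identifiers from a JSON response body."""
    payload = json.loads(response_text)
    return [item["sample_id"] for item in payload.get("items", [])]


if __name__ == "__main__":
    print(build_search_url(study="BRCA", data_type="RNAseq"))
    # Canned response standing in for what a server might return.
    canned = '{"items": [{"sample_id": "S1"}, {"sample_id": "S2"}], "count": 2}'
    print(parse_sample_ids(canned))
```

The same two steps (build a query, decode the JSON) apply in an RStudio or IPython notebook; only the real endpoint and field names would differ.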
Finding and exploring TCGA data
Data browser enables complex queries of 100+ metadata properties (can also be queried programmatically)
“Find data from cases with (RNAseq data from BOTH Primary tumor and Adjacent Normal) AND (WGS on Blood Derived Normal samples).”
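The quoted query combines per-case conditions with AND. A minimal sketch of the same logic over in-memory metadata follows; the record layout (`case`, `data_type`, `sample_type`) and the sample records are invented for illustration and do not reflect the data browser's actual schema or API.

```python
# Hypothetical metadata records: one row per available data file.
RECORDS = [
    {"case": "C1", "data_type": "RNAseq", "sample_type": "Primary Tumor"},
    {"case": "C1", "data_type": "RNAseq", "sample_type": "Adjacent Normal"},
    {"case": "C1", "data_type": "WGS", "sample_type": "Blood Derived Normal"},
    {"case": "C2", "data_type": "RNAseq", "sample_type": "Primary Tumor"},
    {"case": "C2", "data_type": "RNAseq", "sample_type": "Adjacent Normal"},
    {"case": "C3", "data_type": "RNAseq", "sample_type": "Primary Tumor"},
    {"case": "C3", "data_type": "WGS", "sample_type": "Blood Derived Normal"},
]

# (RNAseq from BOTH Primary Tumor and Adjacent Normal)
# AND (WGS on Blood Derived Normal)
REQUIRED = {
    ("RNAseq", "Primary Tumor"),
    ("RNAseq", "Adjacent Normal"),
    ("WGS", "Blood Derived Normal"),
}


def matching_cases(records, required=REQUIRED):
    """Return sorted case IDs that have every required data/sample pair."""
    available = {}
    for rec in records:
        pair = (rec["data_type"], rec["sample_type"])
        available.setdefault(rec["case"], set()).add(pair)
    return sorted(case for case, have in available.items() if required <= have)


if __name__ == "__main__":
    print(matching_cases(RECORDS))
```

Grouping the records by case first turns the compound query into a single subset test per case.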
Release and Evaluation
Cloud Pilots: Coming your way!
Broad
• Version 1.0 – 1/20/2016
• Version 1.1 – 4/2/2016
ISB
• Pre-release – 11/15/2015 (open-access data)
• Version 1.0 – 12/20/2015
• Version 2.0 – 3/20/2016
SBG
• Early access – 11/15/2015
• Version 1 – 12/28/2015
• Version 2 – 3/28/2016
NCI-Sponsored Evaluation Activities
Independent Testing & Evaluation
• Highly structured test of functionality, security, load capability
• Assessment of the key strengths, weaknesses, issues, and risks
Administrative Supplements
• Active NCI grants: R01, R21, U01, U24, P30
• Proposals due October 18
DREAM Challenge
• Launching early 2016
• Use TCGA RNA-Seq data to identify somatic mutations
Hands-On Workshop
• May 24, 2016 @ NCI Shady Grove
• Half day for each Cloud Pilot
NCI Intramural Evaluation
• Dedicated resources to support intramural investigators’ research
CGC Pilot Team Principal Investigators
• Gad Getz, Ph.D. - Broad Institute - http://firecloud.org
• Ilya Shmulevich, Ph.D. - ISB - http://cgc.systemsbiology.net/
• Deniz Kural, Ph.D. - Seven Bridges - http://www.cancergenomicscloud.org
NCI Project Officer & CORs
• Anthony Kerlavage, Ph.D. – Project Officer
• Juli Klemm, Ph.D. – COR, Broad Institute
• Tanja Davidsen, Ph.D. – COR, Institute for Systems Biology
• Ishwar Chandramouliswaran, MS, MBA – COR, Seven Bridges Genomics
GDC Principal Investigator
• Robert Grossman, Ph.D. - University of Chicago
Cancer Genomics Project Teams
NCI Leadership Team
• Doug Lowy, M.D.
• Lou Staudt, M.D., Ph.D.
• Stephen Chanock, M.D.
• George Komatsoulis, Ph.D.
• Warren Kibbe, Ph.D.
Center for Cancer Genomics Partners
• JC Zenklusen, Ph.D.
• Daniela Gerhard, Ph.D.
• Zhining Wang, Ph.D.
• Liming Yang, Ph.D.
• Martin Ferguson, Ph.D.
National Strategic Computing Initiative
Executive Order announced July 29, 2015
• Create a cohesive, multi-agency strategic vision and Federal investment strategy in high-performance computing (HPC)
• Lead agencies: DOE, DoD and NSF
• Deployment agencies: NASA, NIH, DHS, and NOAA
• Participate in shaping future HPC systems to meet aims of respective missions and support workforce development needs
Implications for NCI
• Work cross agency with DOE and others to expand use of HPC to advance research and clinical applications impacting cancer
Possible NCI-DOE initiative aligned with NSCI
Three candidate pilot projects identified:
• Pre-clinical Model Development and Therapeutic Evaluation (Doroshow)
• Improving Outcomes for RAS Related Cancers (McCormick)
• Information Integration for Evidence-based Cancer Precision Medicine (Penberthy)
Collaboratively developing project plans with DOE computational scientists
Pilot Project 1: Pre-clinical Models
Pre-clinical Model Development and Therapeutic Evaluation
Scientific lead: Dr. James Doroshow
Key points:
• Rapid evaluation of large arrays of small compounds for impact on cancer
• Deep understanding of cancer biology
• Development of in silico models of biology and predictive models capable of evaluating therapeutic potential of billions of compounds
Pilot Project 2: RAS Related Cancers
Improving Outcomes for RAS Related Cancers
Scientific lead: Dr. Frank McCormick
Key points:
• Mutated RAS is found in nearly one-third of cancers, yet remains untargeted with known drugs
• Advanced multi-modality data integration is required for model development
• Simulation and predictive models for RAS related molecular species and key interactions
• Provide insight into potential drugs and assays
Pilot Project 3: Evidence-based Precision Medicine
Information Integration for Evidence-based Cancer Precision Medicine
Scientific lead: Dr. Lynne Penberthy
Key points:
• Integrates population and citizen science into improving understanding of cancer and patient response
• Gather key population-wide data on treatment, response and outcomes
• Leverages existing SEER and tumor registry resources
• Novel avenues for patient consent, data sharing and participation
Take homes
Genomic Data Sharing policy is an opportunity for standardized, meaningful data exchange
Open APIs and RESTful interfaces enable ‘purpose focused’ data sharing
Genomic Data Commons will be the repository for NCI-funded genomic data
New architectures including discoverable identifiers, baked-in provenance, ad hoc levels of versioning and micro-publication (e.g. DOIs) enable very different incentives for data sharing
Computational models for prediction are critical to the future
Questions?
Warren A. Kibbe
warren.kibbe@nih.gov
Thank you
www.cancer.gov www.cancer.gov/espanol