www.elixir-europe.org
@ELIXIREurope
ELIXIR-EXCELERATE is funded by the European Commission within the Research Infrastructures programme of Horizon 2020, grant agreement number 676559.
Integrating genomics into personalised healthcare: a science-for-policy perspective
Session II: Genomics - opportunities and challenges
12th February 2019
Serena Scollen, Head of Genomics and Translational Data
Overview
• Introduce ELIXIR
• Challenges for Human Genomics - data to translation into healthcare
• ELIXIR Human Data Communities - structure and examples of work ongoing
• Minimum recommended requirements when considering ‘access to 1M genomes across Europe’
medicine
agriculture
bioindustries
environment
ELIXIR connects national bioinformatics centres and EMBL-EBI into a sustainable European infrastructure for biological research data
ELIXIR underpins life science research – across academia and industry
/company/elixir-europe
A distributed infrastructure of data-related services
Bioinformatics tools: bio.tools, software development
Databases: Deposition, knowledge-bases, data management support
Compute: Secure data transfer, cloud computing, AAI
Interoperability: Standards, Identifiers, FAIR, Ontologies
Training: Training registry, face-to-face courses, eLearning
Industry: Staff exchange, Innovation and SME Forum, Bioinformatics Suppliers Forum
ELIXIR’s Communities
• Bring together ELIXIR’s experts in a particular domain, data type or technology
• Ensure that the Platforms develop services that are fit for purpose
• Connect ELIXIR experts with external users including industry
https://www.elixir-europe.org/communities
Using human data to unleash the possibilities for genomics and health
• Understand disease
• Develop and test novel pharmacological hypotheses
• Patient stratification and PM
Human data
Patient, Clinician, Researcher
Data: Genotype, Phenotype, Health care, Longitudinal
Scope: Biospecimens, Patient recall
Size
Access/regulations
Informatics
Human Data - Mission
• To construct and operate a sustainable infrastructure for Human Genomics and Translational data in Europe to support life science research and its translation to medicine
• To facilitate discoverability, access, sharing and analysis of genomics data, including rare disease, linked to other data types, at scale (4-5M participants)
• To demonstrate how use of infrastructure can impact translation of genomics research into medicine
Human Data - Vision
Partnerships and community formation
ELIXIR Human Data Communities
• Federated Human Data
• Rare Diseases
• human Copy Number Variation
Federation of human genome data
• Many national datasets from human research participants need to be stored locally
• ELIXIR is developing a federation with shared metadata (FAIR) and local data stores (secure)
• Linking local EGA to
  • national clouds
  • international access (ELIXIR-AAI - Authentication and Authorisation Infrastructure)
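The federation idea above (shared metadata, local secure data) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Node` and `Variant` classes, the `federated_query` helper and the example data are hypothetical, and this is not the ELIXIR Beacon, EGA or ELIXIR-AAI interface. It only shows the principle that a question travels to each node and a yes/no answer travels back, while genotypes stay in the local store.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Variant:
    """A simple variant key (hypothetical, for illustration only)."""
    chrom: str
    pos: int
    ref: str
    alt: str


@dataclass
class Node:
    """Hypothetical national node: shareable metadata plus a local, private store."""
    name: str
    metadata: dict  # shared, FAIR-style description of the dataset
    _genotypes: dict = field(default_factory=dict, repr=False)  # sample id -> set of Variant

    def has_variant(self, variant: Variant) -> bool:
        # Only a boolean leaves the node; sample-level data stays local.
        return any(variant in variants for variants in self._genotypes.values())


def federated_query(nodes, variant):
    """Ask every node the same question and collect the yes/no answers."""
    return {node.name: node.has_variant(variant) for node in nodes}


# Made-up example data (not real genotypes).
v = Variant("1", 12345, "A", "G")
nodes = [
    Node("node-a", {"participants": 2}, {"s1": {v}, "s2": set()}),
    Node("node-b", {"participants": 1}, {"s3": {Variant("2", 500, "C", "T")}}),
]
answers = federated_query(nodes, v)
```

In a real deployment each node would sit behind authentication and authorisation, and the query would go over a network API rather than a direct method call.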
ELIXIR RD community main achievements:
• Global Infrastructure for RD research:
  • Registry of Rare Disease data resources and analysis tools (https://rare-diseases.bio.tools)
  • Integration of ELIXIR resources into RD activities
  • Interconnection of secure RD data repositories and resources
  • Data sharing and data discovery
• Benchmarking activities:
  • Datasets
  • Gold standards
  • Quality parameters
• Interoperability of RD resources:
  • Quality in terms of FAIR principles
  • Standards and ontologies
  • FAIR data services
• Training:
  • BYOD workshops
  • Training focused on RD researchers
human Copy Number Variation (approved Dec 18)
ELIXIR Human Genomics & Translational Data
Data Discoverability: Federating lightweight discoverability of data and datasets across ELIXIR
Data Archival: Utilising the ELIXIR Deposition Databases to ensure secure, long-term, efficient archival of data
Federated Data Access: Coordinating a collection of interoperable EGA-like resources to ensure secure management of sensitive data across the ELIXIR Nodes
Data Analysis: Bringing ‘analysis to data’ via common workflow languages, workflows, containers, and tools
ELIXIR Beacon - GA4GH Driver Project
ELIXIR Federated Human Data Community - htsget/htsref
bio.tools
Aligning with international initiatives
Simplify the way people search for and request access to potentially identifiable data in international and national genomic data resources
8/8 GA4GH Workstreams
15/23 ELIXIR Nodes
Mapping ELIXIR::GA4GH Interactions
Clinical & Phenotypic Data Capture
Large Scale Genomics*
Genomic Knowledge Standards*
Discovery*
Cloud
Data Use & Researcher Identities (DURI)*
Regulatory & Ethics
Data Security
TBC
15/23 Nodes connected; 61 connections
ELIXIR as a route for GA4GH into Europe
European funded projects that are of relevance to Human Data
● Key Use Cases in Human Data, and Rare Diseases
● Building coordinated infrastructure and Communities of people working together
● Pillar 2 to build a platform for RD data discovery and access
● Bridging biomedical sciences research infrastructures
● EOSC-Life to implement in different fields
● CINECA - Adoption of GA4GH standards in collaboration with Canada
● Linked to technology development of EOSC-Life (WP5, WP7)
● Federated network of aligned and interoperable infrastructures
● IMI project to develop tools and guidelines for making life science data FAIR (Findable, Accessible, Interoperable, Reusable)
IMI FAIRplus project aims to
• Establish a value-based process for prioritisation and selection of IMI project databases
• Develop a FAIRification toolkit, e.g. guidelines, tools and metrics (FAIR Cookbook)
• Apply this toolkit to FAIRify datasets from selected IMI projects (>20 selected using a value based selection process) and EFPIA companies
• Deliver training for data handlers (academia, SMEs and pharmaceuticals) to change and sustain the data management culture e.g. Fellowship scheme
• Foster an innovation ecosystem on FAIR open data to power future reuse, knowledge generation and societal benefit e.g. FAIR innovation and SME events
Human Genomics - changing environment
Percentage of human genomes and exomes that are funded solely by healthcare systems
CHALLENGES
• Data still geographically distributed
• Dynamics of how we access data will change
• Clinical data are not interoperable
• Healthcare is not used to this type and amount of data: terabyte to exabyte
• Technical knowhow is in the research community
• Attitudes and action towards open data need to progress
• Secure access and governance
1: Birney E, Vamathevan J, Goodhand P. Genomics in healthcare: GA4GH looks to 2022. bioRxiv. January 2017
Genomics-based National Initiative projects across ELIXIR Members
Sharing genomic data across borders
Currently signed but not ELIXIR members: Austria, Bulgaria, Croatia, Latvia, Lithuania, Malta
“Leveraging European infrastructures to access one million human genomes by 2022”
• Coordinated, secure, federated environment will enable population scale genomic, phenotypic, and biomolecular data to be accessible across international borders
• Lessons learned & solutions developed should be taken from existing infrastructures, and ongoing data sharing efforts in cancer, population genetics & rare disease areas
• The EU must take a lead on policy-framing and technical standards-setting on a global stage in collaboration with organizations such as GA4GH to enable responsible genomic data sharing
& this will rely on a suite of interoperable standards...
Saunders G et al., pre-submission acceptance to Nature Reviews Genetics
Minimum recommendations for EU-wide infrastructure to access and analyse genomic data
• Genomics data and clinical information standards, geared towards specific disease communities
• Common Application Programming Interfaces (APIs) to enable remote data discovery and access
• Computational resources, including secure, federated cloud computing environments that offer secure access across national boundaries to raw data and interoperable results
• A repository of tools and services, including workflows used to analyse deposited data while enabling these analysis workflows to cover data across national borders
• Joint access rules and procedures that comply with the legal and regulatory frameworks for sharing and processing of genomic data across borders, including the management of transnational user access and compliance in all countries
• A training and capacity building programme to develop the skills and workforce required for genomics and big data in healthcare, as well as shift the culture towards openness and integration of research data across national boundaries
WHAT DO WE NEED: OPPORTUNITIES / SOLUTIONS
• Federated Data Management System
• Bring analysis to data (not aggregate data to each researcher)
• Develop and maintain standards
• Implement FAIR
• Incentivise adoption
• Disseminate and train
Standards
Networks of trust
Reference archives