A European Infrastructure for Biological Information
[Figure labels: Genome, Embryo, Cell, Fruitfly, Protein, Mouse, Human; Development, Ageing, Disease]
www.elixir-europe.org
ELIXIR Mission
To construct and operate a sustainable infrastructure for biological information in Europe, to support life science research and its translation to medicine and the environment, the bio-industries and society.
• Partners: 32 partners, 13 member states
• Funding: 4.5 M€ from EU FP7
• Deliverable: Consortium agreement to define the scope of the infrastructure and how it will be constructed
Goals for ELIXIR
• Optimal Data Management
• Coordinated Data Resources with improved access
• Integration and interoperability of diverse heterogeneous data
• Good Value for Money
• Forge Links to data in other related domains
• A single European voice in international collaborations to influence global decisions and maintain open access to data
• Enhance European competitiveness in bioscience industries
• Address need for Increased Funding & its Coordination
ELIXIR: a sustainable infrastructure for biological information in Europe. 4
Stakeholders
• Funders of Infrastructure
– National Government Funding Bodies; EMBL; EU; Charities; Industry
• Data Resource Providers
– Core Resources
– Specialist (Many investigators - distributed)
• Data Providers
– Experimentalists
• Tool Providers
– Bioinformatics Groups
• Users
Why do we need ELIXIR?
• Data Growth
• Global context
• Very large user community:
– 3.3 million web hits/day
– 20,000 unique users per day
• Need to preserve data and make accessible to all
• Impact on medicine & agriculture
• Impact on society & bioindustries
• Need for increased funding for biodata resources
[Figure: Server Storage in TB (axis 0 to 1400), 2006 to now; labels: Europe, USA, Japan]
The Preparatory Phase project
Two phases:
– Committee meetings of stakeholders to achieve consensus and make recommendations
• Jan 2008 – July 2009 NOW!
• Define scope and remit of ELIXIR
– Documentation and negotiation phase
• July 2009 – Dec 2010
• Develop ‘International Consortium Agreement’
• Define funding and legal model
The Preparatory Phase project
Elixir is organised into 14 work packages, each with an associated committee of (mainly) European experts:
1. Project management
2. Data providers
3. User communities
4. Organisation and Legal
5. Funding
6. Physical infrastructure
7. Data interoperability
8. Literature
9. Healthcare
10. Chemistry, Plants, Agriculture & Environment
11. Training
12. Tools integration
13. Feasibility studies
14. Reporting and negotiation
WP4: ELIXIR Legal and Governance
[Diagram: ELIXIR Scientific and Technical Structure / ELIXIR Governance. Elements shown: ELIXIR Council; ELIXIR Member States; EMBL; Scientific Advisory & Grant Committee; ELIXIR Executive Management (at EMBL-EBI); ELIXIR Secretariat; Heads of ELIXIR Nodes Committee; ELIXIR Node 1 (EMBL-EBI); ELIXIR Node 2; ELIXIR Node 3; ELIXIR Node 4]
ELIXIR’s tasks
What will ELIXIR provide?
• Core and Specialist data resources
• Compute Centres
• Services for the user community
What are the responsibilities of the ELIXIR hub?
• Coordinate and support ELIXIR activities in Europe
• Quality control and organisation of selection process (peer review) for new nodes?
• Provide core data resources
• Host main data centre
• Ensure backup of core data resources
• Training and dissemination
ELIXIR Nodes – basic description
• Prerequisites:
– National or international legal entity or represented by legal entity
– Capable of entering into contract with ELIXIR Hub
– Provide additional funding sources (matching funds)
– Capable of supporting one or more components of ELIXIR Infrastructure, e.g. of scientific or technical nature
– ELIXIR components must have European service dimension
– Other requirements to be determined (e.g. size, quality)
• Selection and application process to be decided (e.g. calls for proposals)
What are the attributes of ELIXIR nodes?
- scientific excellence
- fit of thematic area
- reliability
- service-mentality
- coordinator of national activities?
- national contact point for ELIXIR
- availability of funding (50-100%)
- multi-year commitment (min. 5 years)
- supported by host country government
- subject to regular reviews
- participation in software and standards development
Resources needed for ELIXIR nodes
- Staff (incl. technical support staff)
- Buildings
- Equipment: computers
- Payment of subscriptions (e.g. for databases) and other fees for specialised services
- Participation in software development projects
- Participation in standards development activities
WP5 Activities: Funding
• Engaging European research funders
– Initial identification of funders’ aims and priorities – 2008 Survey
– 3 meetings of WP5 Committee
– Ongoing process of dialogue (& advocacy)
• Working with Work Package 4 to develop ELIXIR model
– Identify feasibility and acceptability for funders
• UK Large Facilities Capital Fund support
– Develop case for additional data storage facilities for EBI and supporting UK HTP sequencing activities
WP2: Data Providers Recommendations
PRINCIPLES:
• Data sharing is the norm in the biomolecular domain
– Public domain principles (can this be consistent with industry funding?)
– Data should be downloadable in their entirety
• Collaboration and avoidance of duplication in core data archives essential
– Databases produced by research with primary responsibilities only to their research group are not Elixir
• Creative competition on services desirable
• Global context
• Standardisation
Databases: molecules to systems
• Genomes: Ensembl, Ensembl Genomes, EGA
• Nucleotide sequence: EMBL-Bank
• Gene expression: ArrayExpress
• Proteomes: UniProt, PRIDE
• Protein families, motifs and domains: InterPro
• Protein structure: PDBe
• Protein interactions: IntAct
• Chemical entities: ChEBI, ChEMBL
• Pathways: Reactome
• Systems: BioModels
• Literature and ontologies: CitExplore, GO
531 Databases surveyed
• Response: 208 responded, 323 did not
• Status: Alive, 390; Dead, 63; Unclear, 78
• Dead = no update since 2005
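The survey counts above can be cross-checked with a few lines of Python. This is only an illustrative sketch: the variable names and the `share` helper are assumptions introduced here, and the only data used are the counts reported on the slide.

```python
# Counts as reported in the database survey slide.
surveyed = 531
responded = 208
not_responded = 323
status = {"alive": 390, "dead": 63, "unclear": 78}


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Both breakdowns should account for every surveyed database.
assert responded + not_responded == surveyed
assert sum(status.values()) == surveyed

response_rate = share(responded, surveyed)
non_response_rate = share(not_responded, surveyed)
status_shares = {name: share(count, surveyed) for name, count in status.items()}

print(f"Responded: {response_rate}%  Did not respond: {non_response_rate}%")
for name, pct in status_shares.items():
    print(f"{name}: {pct}%")
```

Both breakdowns sum to the 531 databases surveyed, and the non-response share works out to roughly 61%.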
Total European effort
• 200 Databases
• 700 People
• 100 Institutions
• 60 million web hits per month
• Total investment to date €308 million
• Annual cost €35 million
BUT
• About 1/3 of responders report NO costs
• 60% of polled databases didn’t respond
– It is almost certain that the non-responders are smaller on average
RECOMMENDATION
Coordination and prioritisation, as well as stable funding, are needed for many of these resources
Security of the databases (out of 208)
• Not assured, 60
• Assured for at least 1 year, 51
• Assured for at least 3 years, 43
• Assured for at least 5 years, 4
• Assured for more than 5 years, 6
• We are considering commercialisation, 1
WP3: User Communities
• User Survey: 800 responses
– Long term support essential
– Top 3 challenges: Data integration; Format compatibility; Website usability
– Concerns: Data quality and measures; Quality of tools; Training
• Need to consider different needs in different countries
• Need for a plan for long-term maintenance of computational tools
– Create mechanisms for long-term maintenance of bioinformatics tools
– User-friendly & machine-friendly interfaces
• Need for standards for formats and integration
– Increased integration of databases, tools and between infrastructure domains
• Need to provide mechanisms for prioritisation of need for resources
ELIXIR WP12 Committee meeting, November 2008, Søren Brunak
WP12 Tool integration priorities
• Short term:
– Access: Programmatic access (traditional clickable interfaces, downloads)
– Basic infrastructure should work from the start in order to take off
– Not new webservice technologies – winning ones largely identified*
– Focus on accessibility, uptime, replication, testing and user scope
– Well-maintained catalogs
– Benchmarking frameworks (including maintenance)
• Medium term:
– User category-sensitive aspects
– Recommended hardware for WSs
– Advanced benchmarking and tool comparison
– Tool termination policies
• Long term:
– End-user data and tool integration
– Commercial tools adopt standards defined by public effort (long term goals hard to obtain from industry, in particular big pharma)
Elixir Tool integration marketing bait
“Parsed on arrival”
WP8: Literature
Europe will greatly benefit from a pan-European open access repository for scientific articles and related relevant text resources, interlinked with the other core data resources and embedded in text processing environments.
ELIXIR should enable and maintain an infrastructure for the biomedical research community that:
1) brings together textual resources from different origins,
2) integrates the text resources into the biological databases,
3) allows efficient exploitation of the resources with automatic and interactive means and
4) supports formatting and semantics standards,
to support scientific progress for the whole bio-med-chemical research community.
WP9: Healthcare Recommendations
• Elixir should be a valuable resource for Healthcare
– Contribute to translational, clinical research
– Close collaboration with related infrastructures (e.g., biobanks, imaging, population-based registries/databases/trials)
• Elixir must
– Address the challenge of heterogeneous, often poorly structured data
– Develop and maintain nomenclatures and ontologies
– Support development of processing of different data (e.g. text & images)
– Include requirements from related domains (e.g., privacy protection)
– Facilitate communication with related domains (e.g., import and export of data, storing of meta data)
– Explicitly assign the responsibility to continuously and actively seek collaboration to certain partners or nodes to ensure accountability and transparency
WP10: Chemistry, Plants, Agriculture & Environment
• Support / extend current core resources for
– Nucleotide/protein sequence, genomes, structures, interactions etc.
• Selected specialist resources migrated to Elixir infrastructure
– Reduce complexity of informatics landscape, maintain functionality
– Integration allows mining of combined data
• Adopt key data standards and work for common infrastructure
– Link to other ESFRI, non ESFRI European projects
– Link to non European initiatives (NSF/iPlant, DOE/Camera)
• Free access to Elixir data and core analysis tools
– Web based queries, programmatic access, download
WP11: Training
Identified training issues in Europe:
• Little or no coordination
• Rapid evolution of bioinformatics resources
• Lack of a centralised body for guidance
• Lack of recognition of the importance of bioinformatics user training, even within the bioinformatics community
Elixir recommendations:
• Link the development of data resources to the provision of training materials
• Create a training support unit that will:
a) provide a centralised training registry
b) provide support for trainers throughout Europe
c) develop benchmarking and evaluation systems
d) provide mechanisms for developing new training programmes
e) act as a single point of contact for national and pan-European training
ESFRI Biology Research Infrastructure proposals.
• BBMRI (Biobanking)
• INSTRUCT (Structural biology)
• ELIXIR (Biological Information)
• Infrafrontier (Mouse)
• ECRIN (Clinical Trials)
• EATRIS (Translational Research)
[Diagram: the proposals are placed along the Research, Discovery and Development pipeline: Target ID, Target Val, Hit, Lead, Lead Opt, Preclinical, Phase I, Phase II, Phase III]
Many meetings!!
• Versailles, 9 December 2008: ECRI 2008, European Conference on Research Infrastructure
• Nancy, 21 November 2008: L’Institut de l’Information Scientifique et Technique. Launch of Standards-based Infrastructure with Distributed Resources (SIDR)
• Paris, 21-22 October: e-IRG open workshop organized by GIP RENATER under the French EU presidency. Information exchange between the ESFRI PPPs & the e-Infrastructure Community
• Bergen, 9-10 October: 1) Norwegian Bioinformatics Forum 2) Meeting with Stakeholders & Research Council. Make case for Norwegian involvement in ELIXIR
• Innsbruck, 1-4 October 2008: 3rd ESF Functional Genomics & Disease Conference. Presentation at BIOSAPIENS symposium
• Duesseldorf, Germany, 11-12 September 2008: ERASysBio 2nd Meeting of European Systems Biology Centres. Presentation on ELIXIR SB Feasibility Study
• Edinburgh, 8-11 September: e-Science All Hands Meeting. Crossing Boundaries: Computational Science, E-Science and Global E-Infrastructures
• Munich, 20 August: PARADE partners planning meeting. Presentation on EBI contribution to PARADE
• Helsinki, 11 August: Finnish Stakeholders Meeting. Make case for Finnish involvement in ELIXIR
• Lisbon, 9 July: First meeting of national network of bioinformatics. Presentation and meeting with government representatives
• Rome, 4 July: Meeting of the BITS (Bioinformatics Italian Society). Perspectives of Bioinformatics in Europe (e.g. Elixir project, etc.)
ELIXIR evolution
[Timeline: ELIXIR phases (Prep. Phase, Interim Phase, Permanent Phase) shown alongside the EMBL Indicative Scheme and EU Framework Programme 7 / EU FP8; date markers: Nov 2007, Dec 2010, Dec 2011, Dec 2013, Dec 2016, Dec 2020, Dec 2021]
Next Steps
• On course to produce reports from all surveys and WPs by mid July
• Many visits to member countries are ‘in progress’ – to identify unique contributions and requirements
• Further discussions needed with international colleagues
• THEN – discussions on International Consortium Agreement and Funding
Benefits of ELIXIR
This infrastructure will contribute to European science by:
• Optimising access to and exploitation of life-science data.
• Ensuring longevity of the data and protecting investments already made in the research which collected the data.
• Increasing the competence and size of the already-large user community by strengthening national efforts in training and outreach.
• Enhancing the global success and influence of Europe in life-science research and industry.