Post on 09-Jun-2020
© 2014 The MITRE Corporation. ALL RIGHTS RESERVED.
Text Mining for Metagenomics: A New BioCreative Task
Lynette Hirschman, MITRE
BioCreative Workshop
BioCuration 2014, Toronto, April 6-9, 2014
Approved for Public Release; Case No. 14-1214
Metagenomics and Metadata
Metagenomics approach
- Handles environmental (heterogeneous) samples
- Enables exploration of
  - Microbial communities that can’t be cultured
  - Biodiversity in multiple environments, e.g., soil, ocean, toxic waste sites…
  - Human microbiome studies
Metagenomics data sets must preserve metadata
- Context is everything, including naming the environment, e.g., toxic sludge, whalefall, “a blade of grass from Raritan River, NJ, USA”
Text Mining and Metadata Capture
The Genomic Standards Consortium¹ (GSC) has been a leader in standards and Minimum Information checklists for Metagenomics
To capture computable metadata we need
- Minimal data standards (e.g., GSC’s Minimum Information about any sequence, or MIxS)²
- Controlled structured vocabulary or ontologies: EnvO (Environmental Ontology)³
- Tools to extract metadata from free text and map into structured vocabulary
  - Prospective for new meta(genomics) data
  - Retrospective from published literature
¹ http://gensc.org
² www.nature.com/nbt/journal/v29/n5/full/nbt.1823.html
³ Buttigieg et al. Journal of Biomedical Semantics 2013, 4:43. http://www.jbiomedsem.com/content/4/1/43
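The third need on this slide, mapping free text into a structured vocabulary, can be illustrated with a very small keyword lookup. This is only a sketch under stated assumptions: the class labels and trigger words in `LEXICON` are hypothetical placeholders rather than the actual EnvO or EnvO-Lite term list, and `map_to_vocabulary` is an illustrative name, not a tool described in the talk.

```python
import re

# Hypothetical lexicon: labels and trigger words are illustrative
# placeholders, NOT the real EnvO / EnvO-Lite vocabulary.
LEXICON = {
    "marine habitat": ["ocean", "sea", "marine"],
    "soil": ["soil"],
    "freshwater habitat": ["river", "lake", "stream"],
    "waste": ["sludge", "waste"],
}


def map_to_vocabulary(free_text):
    """Return the sorted lexicon classes whose trigger words occur in free_text."""
    # Lowercase and split into alphabetic tokens so matching is on whole words.
    tokens = set(re.findall(r"[a-z]+", free_text.lower()))
    return sorted(
        label
        for label, triggers in LEXICON.items()
        if tokens.intersection(triggers)
    )


if __name__ == "__main__":
    for description in [
        "a blade of grass from Raritan River NJ USA",
        "toxic sludge",
        "whalefall",
    ]:
        print(description, "->", map_to_vocabulary(description))
```

A lookup like this returns nothing for "whalefall" and tags the blade of grass by the river it was found near, which is the kind of gap a shared text mining task would be meant to expose.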
Why BioCreative?
BioCreative does Critical Assessment of Information Extraction for Biology
- Metagenomics is an important new area of research
Focus of BioCreative is to
- Drive research towards real applications
- Supply real(istic) challenge tasks, including
  - Applications from biological database curators
  - Community & standards-based metrics
  - Reusable data and resources
  - Interoperability (e.g., BioC)
BioCreative V (2015) will include a task on Text Mining for Metagenomics
What Metadata to Capture?
Candidate metadata types
- Species
- Sample location
  - Environment
  - Geospatial location
- Phenotype
  - Morphological characteristics
  - Antibiotic resistance
BioCreative Metagenomics Advisory Group identified capture of sample environment (isolation source) as a critical text mining task
Constructing a BioCreative Task
Define the task
- Capture of environmental metadata
- Interactive (prospective) capture at data entry time
Define a target output or vocabulary
- EnvO (Environmental Ontology)¹, EnvO-Lite²
Identify sources of training data
- megX³ data annotated with EnvO-Lite terms
- ENVIRONMENTS⁴ project mapping Encyclopedia of Life to EnvO
Identify test data and interested end users
- GOLD, MG-RAST, BioProject, [your data here]
Recruit text mining teams to participate
¹ Buttigieg PL et al. J Biomed Semantics. 2013 Dec 11;4(1):43. doi: 10.1186/2041-1480-4-43
² Hirschman L et al. (2008) Habitat-Lite: a GSC case study based on free text terms for environmental metadata. OMICS 12: 129–136
³ http://www.megx.net/habitats/habitats.html
⁴ http://environments.hcmr.gr
Defining a Terminology: EnvO and EnvO-Lite (aka Habitat-Lite)
GSC participants’ need
- Light-weight structured terminology to capture high level environment metadata
EnvO-Lite: a “slim” from EnvO¹ (Environmental Ontology; Ashburner, Morrison et al.)
Extended by Glöckner’s group at MPI Bremen (Buttigieg)
In use at
- megX, GOLD, MG-RAST, BioProject, Genomic Standards Consortium
¹ Buttigieg PL et al. J Biomed Semantics. 2013 Dec 11;4(1):43. doi: 10.1186/2041-1480-4-43
Training Data: megX Experiment Data Tagged with EnvO-Lite Classes
http://www.megx.net/habitats/habitats.html
Crocosphaera watsonii WH0002 was isolated from the subtropical Pacific Ocean waters taken at a depth of 50 meters
Marine Habitat
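Labelled examples like the one above (a free-text description paired with the class "Marine Habitat") are what make it possible to score a tagger. The sketch below assumes micro-averaged precision, recall and F1 as the measures; the slides do not state which metrics the task would use. Only the first gold label comes from the slide, while the second gold entry and both predictions are invented for illustration, and `precision_recall_f1` is a hypothetical helper name.

```python
def precision_recall_f1(gold, predicted):
    """Micro-averaged precision, recall and F1 over parallel lists of label sets."""
    # True positives: labels that appear in both the gold and predicted set.
    true_pos = sum(len(g & p) for g, p in zip(gold, predicted))
    n_predicted = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = true_pos / n_predicted if n_predicted else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# First gold label is from the slide; everything else here is made up.
gold = [{"marine habitat"}, {"soil"}]
predicted = [{"marine habitat"}, {"marine habitat"}]

if __name__ == "__main__":
    p, r, f = precision_recall_f1(gold, predicted)
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Using sets of labels per sample allows a description to carry more than one class, which seems likely for isolation sources that mention several environments.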
Training Data: ENVIRONMENTS Project (Pafilis, Hellenic Centre for Marine Research)
http://environments.hcmr.gr
Environmental data from Encyclopedia of Life tagged with EnvO terms
Genomes On Line Database (GOLD): An Example Use Case
Environmental Metadata in GOLD
Isolation Site: “a Blade of grass from Raritan River, NJ, USA”
Mapping into EnvO-Lite
From: Genomes On Line Database (GOLD), Nucleic Acids Research, 2010, Vol. 38, Database issue, D346–D354. Published online 13 November 2009.
The Plan
BioCreative IV (Bethesda, October 2013)
- Discussion of text mining needs of metagenomics community
Metagenomics Advisory Group (ongoing teleconfs)
- Soliciting datasets and organizers for a metagenomics task for BioCreative V (2015)
Biocuration Conference (Toronto, April 2014)
- Update on Metagenomics task for BioCreative
BioCreative V (Spain, 2015)
- Task for text mining for metagenomics
BioCreative Metagenomics Advisory Group
- Jim Cole, Michigan State
- George Garrity, Names for Life and Michigan State
- Folker Meyer, Argonne National Lab
- Nikos Kyrpides, Joint Genome Institute
- Evangelos Pafilis, Hellenic Centre for Marine Research
- Lynn Schriml, U Maryland Medical School
- Tatiana Tatusova, NCBI
Acknowledgements
- National Science Foundation for support of BioCreative¹
- National Science Foundation for support of MITRE’s earlier activities on Mining Metadata for Metagenomics²
- Department of Energy for conference grant for the metagenomics and text mining work³
¹ NSF Grant DBI-0850319
² NSF Grants IIS-0746650 and IIS-0844419
³ Office of Science (BER) of the US Dept. of Energy. This material is based upon work supported by the Department of Energy under Award Number DE-SC0010838.
Disclaimer: This presentation was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
Back Up
MegX (Marine Ecological GenomiX) (Pre-MIGS/MIMS/MIENS)
© 2007 The MITRE Corporation. ALL RIGHTS RESERVED.
Information in Full Text
Full text article, Methods section¹
“Further details for all methods used in this study are provided in Supplementary Information. O. algarvensis specimens were collected off Capo di Sant’ Andrea, Elba, Italy.”
Supplementary Material (pdf)²
“Juvenile and adult Olavius algarvensis specimens were collected in May and September 2004 from 56 m water depth in silicate sediments around sea grass beds of Posidonia oceanica in a bay off Capo di Sant’ Andrea, Elba, Italy (42°48′26″N, 010°08′28″E).”
¹ Symbiosis insights through metagenomic analysis of a microbial consortium. Woyke T et al. Nature 443, 950-955 (26 October 2006). doi:10.1038/nature05192
² http://www.nature.com/nature/journal/v443/n7114/extref/nature05192-s1.pdf
Metadata in Reference
Faiz O, Colak A, Saglam N, Canakçi S, Beldüz AO.
J Biochem Mol Biol. 2007 Jul 31;40(4):588-94.
Information scattered throughout article, probably in reference
More specific information given in the conclusion
Critical Assessment of Information Extraction in Biology
BioCreative: 2004-05, 27 teams
- Organizers: MITRE, CNB (now CNIO¹), NCBI
- Curators: GOA (Camon, Lee, Apweiler)
BioCreative II: 2006-07, 44 teams
- Organizers: CNIO, MITRE, NCBI
- Curators: MINT, IntAct
BioCreative II.5: 2008-2009, 15 teams
- Organizers: CNIO, MINT, Elsevier, MITRE
BioCreative III: 2009-2010, 23 teams
- Organizers: U Delaware, NCBI, CTD², CNIO, MITRE, Colorado
BioCreative IV: 2013, 24 teams
- Organizers: U Delaware, NCBI, CNIO, MITRE, CTD, Colorado
BioCreative V: 2015, planning underway
BioCreative III/IV is funded by NSF grant DBI-0850319
¹ Spanish National Cancer Center
² Comparative Toxicogenomics Database
Minimum Information Checklists from Genomic Standards Consortium
Yilmaz et al. Nat Biotechnol. 2011 May;29(5):415-20. doi: 10.1038/nbt.1823
megX EnvO-Lite Annotation
http://www.megx.net/habitats/habitats.html
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Metagenomics and Metadata
Metagenomics approach - Handles environmental (heterogeneous) samples - Enables exploration of Microbial communities that canrsquot be cultured Biodiversity in multiple environments eg soil
ocean toxic waste siteshellip Human microbiome studies
Metagenomics data sets must preserve metadata - Context is everything - including naming the
environment eg toxic sludge whalefall ldquoa blade of grass from Raritan River NJ USArdquo
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
The Genomic Standards Consortium1 (GSC) has been a leader in standards and Minimum Information checklists for Metagenomics
To capture computable metadata we need - Minimal data standards (eg GSCrsquos Minimum
Information about any sequence or MIxS)2
- Controlled structured vocabulary or ontologies EnvO (Environmental Ontology)3
- Tools to extract metadata from free text and map into structured vocabulary Prospective for new meta(genomics) data Retrospective from published literature
Text Mining and Metadata Capture
1httpgenscorg wwwnaturecomnbtjournalv29n5fullnbt1823html 3Buttigieg et al Journal of Biomedical Semantics 2013 443 httpwwwjbiomedsemcomcontent4143
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Why BioCreative
BioCreative does Critical Assessment of Information Extraction for Biology
- Metagenomics is an important new area of research Focus of BioCreative is to
- Drive research towards real applications - Supply real(istic) challenge tasks including Applications from biological database curators Community amp standards-based metrics Reusable data and resources Interoperability (eg BioC)
BioCreative V (2015) will include a task on Text Mining for Metagenomics
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
What Metadata to Capture Candidate metadata types
- Species - Sample location Environment Geospatial location
- Phenotype Morphological characteristics Antibiotic resistance
BioCreative Metagenomics Advisory Group identified capture of sample environment
(isolation source) as critical text mining task
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Constructing a BioCreative Task Define the task
- Capture of environmental metadata - Interactive (prospective) capture at data entry time
Define a target output or vocabulary - EnvO (Environmental Ontology)1 EnvO-Lite2
Identify sources of training data - megX3 data annotated with EnvO-Lite terms - ENVIRONMENTS4 project mapping Encyclopedia
of Life to EnvO Identify test data and interested end users
- GOLD MG-RAST BioProject [your data here] Recruit text mining teams to participate
1Buttigieg PL et al J Biomed Semantics 2013 Dec 114(1)43 doi 1011862041-1480-4-43 2HirschmanL et al (2008) Habitat-Lite a GSC case study based on free text terms for environmental metadata OMICS 12 129ndash136 3httpwwwmegxnethabitatshabitatshtml 4httpenvironmentshcmrgr
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Defining a Terminology EnvO and EnvO-Lite (aka Habitat-Lite)
GSC participantsrsquo need - Light-weight structured
terminology to capture high level environment metadata
EnvO-Lite a ldquoslimrdquo from EnvO1 (Environmental Ontology Ashburner Morrison et al)
Extended by Gloumlcknerrsquos group at MPI Bremen (Buttigieg)
In use at - megX GOLD MG-RAST
BioProject Genomic Standards Consortium
1Buttigieg PL et al J Biomed Semantics 2013 Dec 114(1)43 doi 1011862041-1480-4-43
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Training Data megx Experiment Data Tagged with EnvO-Lite Classes
httpwwwmegxnethabitatshabitatshtml
Crocosphaera watsonii WH0002 was isolated from the subtropical Pacific Ocean waters taken at a depth of 50 meters
Marine Habitat
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Training Data ENVIRONMENTS Project (Pafilis Hellenic Centre for Marine Research
httpenvironmentshcmrgr
Environmental data from Encyclopedia of Life tagged
with EnvO terms
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Genomes On Line Database (GOLD) An Example Use Case
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Environmental Metadata in GOLD
Isolation Site a Blade of grass from Raritan River NJ USA Mapping into EnvO-Lite
From Genomes On Line Database (GOLD) D346ndashD354 Nucleic Acids Research 2010 Vol 38 Database issue Published online 13 November 2009
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
The Plan
BioCreative IV (Bethesda October 2014) - Discussion of text mining needs of metagenomics
community Metagenomics Advisory Group (ongoing teleconfs)
- Soliciting datasets and organizers for a metagenomics task for BioCreative V (2015)
Biocuration Conference (Toronto April 2014) - Update on Metagenomics task for BioCreative
BioCreative V (Spain 2015) - Task for text mining for metagenomics
12
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
BioCreative Metagenomics Advisory Group
Jim Cole Michigan State George Garrity Names for Life and Michigan State Folker Meyer Argonne National Lab Nikos Kyrpides Joint Genome Institute Evangelos Pafilis Hellenic Centre for Marine
Research Lynn Schriml U Maryland Medical School Tatiana Tatusova NCBI
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Acknowledgements National Science Foundation for support of
BioCreative1 National Science Foundation for support of MITRErsquos
earlier activities on Mining Metadata for Metagenomics2
Department of Energy for conference grant for the metagenomics and text mining work3
14
1NSF Grant DBI-0850319 2NSF Grants IIS-0746650 and IIS-0844419 3Office of Science (BER) of the US Dept of Energy This material is based upon work supported by the Department of Energy under Award Number DE-SC0010838 Disclaimer This presentation was prepared as an account of work sponsored by an agency of the United States Government Neither the United States Government nor any agency thereof nor any of their employees makes any warranty express of implied or assumes any legal liability or responsibility for the accuracy completeness or usefulness of any information apparatus product or process disclosed or represents that its use would not infringe privately owned rights Reference herein to any specific commercial product process or service by trade name trademark manufacturer or otherwise does not necessarily constitute or imply its endorsement recommendation or favoring by the United States Government or any agency thereof The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Back Up
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
MegX (Marine Ecological GenomiX) (Pre-MIGSMIMSMIENS)
copy 2007 The MITRE Corporation ALL RIGHTS RESERVED
Information in Full Text Full text article Methods section1 Further details for all methods used in this study are provided in Supplementary Information O algarvensis specimens were collected off Capo di Sant Andrea Elba Italy Supplementary Material (pdf) Juvenile and adult Olavius algarvensis specimens were collected in May and September 2004 from 56 m water depth in silicate sediments around sea grass beds of Posidonia oceanica in a bay off Capo di Santrsquo Andrea Elba Italy (42deg4826N 010deg0828E)
1Symbiosis insights through metagenomic analysis of a microbial consortium Woyke T et al Nature 443 950-955 (26 October 2006) doi101038nature05192 2httpwwwnaturecomnaturejournalv443n7114extrefnature05192-s1pdf
copy 2007 The MITRE Corporation ALL RIGHTS RESERVED
Metadata in Reference
Faiz O Colak A Saglam N Canakccedili S Belduumlz AO
J Biochem Mol Biol 2007 Jul 3140(4)588-94
Information scattered throughout article probably in reference
More specific information given in the conclusion
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Critical Assessment of Information Extraction in Biology BioCreative 2004-05 27 teams - Organizers MITRE CNB (now CNIO1) NCBI - Curators GOA (Camon Lee Apweiler) BioCreative II 2006-07 44 teams - Organizers CNIO MITRE NCBI - Curators MINT IntAct BioCreative II5 2008-2009 15 teams - Organizers CNIO MINT Elsevier MITRE BioCreative III 2009-2010 23 teams - Organizers U Delaware NCBI CTD2 CNIO MITRE Colorado BioCreative IV 2013 24 teams - Organizers U Delaware NCBI CNIO MITRE CTD Colorado BioCreative V 2015 planning underway BioCreative IIIIV is funded by NSF grant DBI-0850319
1Spanish National Cancer Center 2Comparative Toxicogenomics Database
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Minimum Information Checklists from Genomic Standards Consortium
Yilmaz et al Nat Biotechnol 2011 May29(5)415-20 doi 101038nbt1823
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
megX EnvO-Lite Annotation
httpwwwmegxnethabitatshabitatshtml
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
The Genomic Standards Consortium1 (GSC) has been a leader in standards and Minimum Information checklists for Metagenomics
To capture computable metadata we need - Minimal data standards (eg GSCrsquos Minimum
Information about any sequence or MIxS)2
- Controlled structured vocabulary or ontologies EnvO (Environmental Ontology)3
- Tools to extract metadata from free text and map into structured vocabulary Prospective for new meta(genomics) data Retrospective from published literature
Text Mining and Metadata Capture
1httpgenscorg wwwnaturecomnbtjournalv29n5fullnbt1823html 3Buttigieg et al Journal of Biomedical Semantics 2013 443 httpwwwjbiomedsemcomcontent4143
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Why BioCreative
BioCreative does Critical Assessment of Information Extraction for Biology
- Metagenomics is an important new area of research Focus of BioCreative is to
- Drive research towards real applications - Supply real(istic) challenge tasks including Applications from biological database curators Community amp standards-based metrics Reusable data and resources Interoperability (eg BioC)
BioCreative V (2015) will include a task on Text Mining for Metagenomics
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
What Metadata to Capture Candidate metadata types
- Species - Sample location Environment Geospatial location
- Phenotype Morphological characteristics Antibiotic resistance
BioCreative Metagenomics Advisory Group identified capture of sample environment
(isolation source) as critical text mining task
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Constructing a BioCreative Task Define the task
- Capture of environmental metadata - Interactive (prospective) capture at data entry time
Define a target output or vocabulary - EnvO (Environmental Ontology)1 EnvO-Lite2
Identify sources of training data - megX3 data annotated with EnvO-Lite terms - ENVIRONMENTS4 project mapping Encyclopedia
of Life to EnvO Identify test data and interested end users
- GOLD MG-RAST BioProject [your data here] Recruit text mining teams to participate
1Buttigieg PL et al J Biomed Semantics 2013 Dec 114(1)43 doi 1011862041-1480-4-43 2HirschmanL et al (2008) Habitat-Lite a GSC case study based on free text terms for environmental metadata OMICS 12 129ndash136 3httpwwwmegxnethabitatshabitatshtml 4httpenvironmentshcmrgr
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Defining a Terminology EnvO and EnvO-Lite (aka Habitat-Lite)
GSC participantsrsquo need - Light-weight structured
terminology to capture high level environment metadata
EnvO-Lite a ldquoslimrdquo from EnvO1 (Environmental Ontology Ashburner Morrison et al)
Extended by Gloumlcknerrsquos group at MPI Bremen (Buttigieg)
In use at - megX GOLD MG-RAST
BioProject Genomic Standards Consortium
1Buttigieg PL et al J Biomed Semantics 2013 Dec 114(1)43 doi 1011862041-1480-4-43
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Training Data megx Experiment Data Tagged with EnvO-Lite Classes
httpwwwmegxnethabitatshabitatshtml
Crocosphaera watsonii WH0002 was isolated from the subtropical Pacific Ocean waters taken at a depth of 50 meters
Marine Habitat
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Training Data ENVIRONMENTS Project (Pafilis Hellenic Centre for Marine Research
httpenvironmentshcmrgr
Environmental data from Encyclopedia of Life tagged
with EnvO terms
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Genomes On Line Database (GOLD) An Example Use Case
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Environmental Metadata in GOLD
Isolation Site a Blade of grass from Raritan River NJ USA Mapping into EnvO-Lite
From Genomes On Line Database (GOLD) D346ndashD354 Nucleic Acids Research 2010 Vol 38 Database issue Published online 13 November 2009
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
The Plan
BioCreative IV (Bethesda October 2014) - Discussion of text mining needs of metagenomics
community Metagenomics Advisory Group (ongoing teleconfs)
- Soliciting datasets and organizers for a metagenomics task for BioCreative V (2015)
Biocuration Conference (Toronto April 2014) - Update on Metagenomics task for BioCreative
BioCreative V (Spain 2015) - Task for text mining for metagenomics
12
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
BioCreative Metagenomics Advisory Group
Jim Cole Michigan State George Garrity Names for Life and Michigan State Folker Meyer Argonne National Lab Nikos Kyrpides Joint Genome Institute Evangelos Pafilis Hellenic Centre for Marine
Research Lynn Schriml U Maryland Medical School Tatiana Tatusova NCBI
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Acknowledgements National Science Foundation for support of
BioCreative1 National Science Foundation for support of MITRErsquos
earlier activities on Mining Metadata for Metagenomics2
Department of Energy for conference grant for the metagenomics and text mining work3
14
1NSF Grant DBI-0850319 2NSF Grants IIS-0746650 and IIS-0844419 3Office of Science (BER) of the US Dept of Energy This material is based upon work supported by the Department of Energy under Award Number DE-SC0010838 Disclaimer This presentation was prepared as an account of work sponsored by an agency of the United States Government Neither the United States Government nor any agency thereof nor any of their employees makes any warranty express of implied or assumes any legal liability or responsibility for the accuracy completeness or usefulness of any information apparatus product or process disclosed or represents that its use would not infringe privately owned rights Reference herein to any specific commercial product process or service by trade name trademark manufacturer or otherwise does not necessarily constitute or imply its endorsement recommendation or favoring by the United States Government or any agency thereof The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Back Up
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
MegX (Marine Ecological GenomiX) (Pre-MIGSMIMSMIENS)
copy 2007 The MITRE Corporation ALL RIGHTS RESERVED
Information in Full Text Full text article Methods section1 Further details for all methods used in this study are provided in Supplementary Information O algarvensis specimens were collected off Capo di Sant Andrea Elba Italy Supplementary Material (pdf) Juvenile and adult Olavius algarvensis specimens were collected in May and September 2004 from 56 m water depth in silicate sediments around sea grass beds of Posidonia oceanica in a bay off Capo di Santrsquo Andrea Elba Italy (42deg4826N 010deg0828E)
1Symbiosis insights through metagenomic analysis of a microbial consortium Woyke T et al Nature 443 950-955 (26 October 2006) doi101038nature05192 2httpwwwnaturecomnaturejournalv443n7114extrefnature05192-s1pdf
copy 2007 The MITRE Corporation ALL RIGHTS RESERVED
Metadata in Reference
Faiz O Colak A Saglam N Canakccedili S Belduumlz AO
J Biochem Mol Biol 2007 Jul 3140(4)588-94
Information scattered throughout article probably in reference
More specific information given in the conclusion
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Critical Assessment of Information Extraction in Biology BioCreative 2004-05 27 teams - Organizers MITRE CNB (now CNIO1) NCBI - Curators GOA (Camon Lee Apweiler) BioCreative II 2006-07 44 teams - Organizers CNIO MITRE NCBI - Curators MINT IntAct BioCreative II5 2008-2009 15 teams - Organizers CNIO MINT Elsevier MITRE BioCreative III 2009-2010 23 teams - Organizers U Delaware NCBI CTD2 CNIO MITRE Colorado BioCreative IV 2013 24 teams - Organizers U Delaware NCBI CNIO MITRE CTD Colorado BioCreative V 2015 planning underway BioCreative IIIIV is funded by NSF grant DBI-0850319
1Spanish National Cancer Center 2Comparative Toxicogenomics Database
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Minimum Information Checklists from Genomic Standards Consortium
Yilmaz et al Nat Biotechnol 2011 May29(5)415-20 doi 101038nbt1823
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
megX EnvO-Lite Annotation
httpwwwmegxnethabitatshabitatshtml
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Why BioCreative
BioCreative does Critical Assessment of Information Extraction for Biology
- Metagenomics is an important new area of research Focus of BioCreative is to
- Drive research towards real applications - Supply real(istic) challenge tasks including Applications from biological database curators Community amp standards-based metrics Reusable data and resources Interoperability (eg BioC)
BioCreative V (2015) will include a task on Text Mining for Metagenomics
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
What Metadata to Capture Candidate metadata types
- Species - Sample location Environment Geospatial location
- Phenotype Morphological characteristics Antibiotic resistance
BioCreative Metagenomics Advisory Group identified capture of sample environment
(isolation source) as critical text mining task
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Constructing a BioCreative Task Define the task
- Capture of environmental metadata - Interactive (prospective) capture at data entry time
Define a target output or vocabulary - EnvO (Environmental Ontology)1 EnvO-Lite2
Identify sources of training data - megX3 data annotated with EnvO-Lite terms - ENVIRONMENTS4 project mapping Encyclopedia
of Life to EnvO Identify test data and interested end users
- GOLD MG-RAST BioProject [your data here] Recruit text mining teams to participate
1Buttigieg PL et al J Biomed Semantics 2013 Dec 114(1)43 doi 1011862041-1480-4-43 2HirschmanL et al (2008) Habitat-Lite a GSC case study based on free text terms for environmental metadata OMICS 12 129ndash136 3httpwwwmegxnethabitatshabitatshtml 4httpenvironmentshcmrgr
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Defining a Terminology EnvO and EnvO-Lite (aka Habitat-Lite)
GSC participantsrsquo need - Light-weight structured
terminology to capture high level environment metadata
EnvO-Lite a ldquoslimrdquo from EnvO1 (Environmental Ontology Ashburner Morrison et al)
Extended by Gloumlcknerrsquos group at MPI Bremen (Buttigieg)
In use at - megX GOLD MG-RAST
BioProject Genomic Standards Consortium
1Buttigieg PL et al J Biomed Semantics 2013 Dec 114(1)43 doi 1011862041-1480-4-43
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Training Data megx Experiment Data Tagged with EnvO-Lite Classes
httpwwwmegxnethabitatshabitatshtml
Crocosphaera watsonii WH0002 was isolated from the subtropical Pacific Ocean waters taken at a depth of 50 meters
Marine Habitat
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Training Data ENVIRONMENTS Project (Pafilis Hellenic Centre for Marine Research
httpenvironmentshcmrgr
Environmental data from Encyclopedia of Life tagged
with EnvO terms
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Genomes On Line Database (GOLD) An Example Use Case
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Environmental Metadata in GOLD
Isolation Site a Blade of grass from Raritan River NJ USA Mapping into EnvO-Lite
From Genomes On Line Database (GOLD) D346ndashD354 Nucleic Acids Research 2010 Vol 38 Database issue Published online 13 November 2009
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
The Plan
BioCreative IV (Bethesda October 2014) - Discussion of text mining needs of metagenomics
community Metagenomics Advisory Group (ongoing teleconfs)
- Soliciting datasets and organizers for a metagenomics task for BioCreative V (2015)
Biocuration Conference (Toronto April 2014) - Update on Metagenomics task for BioCreative
BioCreative V (Spain 2015) - Task for text mining for metagenomics
12
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
BioCreative Metagenomics Advisory Group
Jim Cole Michigan State George Garrity Names for Life and Michigan State Folker Meyer Argonne National Lab Nikos Kyrpides Joint Genome Institute Evangelos Pafilis Hellenic Centre for Marine
Research Lynn Schriml U Maryland Medical School Tatiana Tatusova NCBI
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Acknowledgements National Science Foundation for support of
BioCreative1 National Science Foundation for support of MITRErsquos
earlier activities on Mining Metadata for Metagenomics2
Department of Energy for conference grant for the metagenomics and text mining work3
14
1NSF Grant DBI-0850319 2NSF Grants IIS-0746650 and IIS-0844419 3Office of Science (BER) of the US Dept of Energy This material is based upon work supported by the Department of Energy under Award Number DE-SC0010838 Disclaimer This presentation was prepared as an account of work sponsored by an agency of the United States Government Neither the United States Government nor any agency thereof nor any of their employees makes any warranty express of implied or assumes any legal liability or responsibility for the accuracy completeness or usefulness of any information apparatus product or process disclosed or represents that its use would not infringe privately owned rights Reference herein to any specific commercial product process or service by trade name trademark manufacturer or otherwise does not necessarily constitute or imply its endorsement recommendation or favoring by the United States Government or any agency thereof The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Back Up
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
MegX (Marine Ecological GenomiX) (Pre-MIGSMIMSMIENS)
copy 2007 The MITRE Corporation ALL RIGHTS RESERVED
Information in Full Text
Full text article, Methods section¹
- Further details for all methods used in this study are provided in Supplementary Information
- O. algarvensis specimens were collected off Capo di Sant' Andrea, Elba, Italy
Supplementary Material (pdf)²
- Juvenile and adult Olavius algarvensis specimens were collected in May and September 2004 from 56 m water depth in silicate sediments around sea grass beds of Posidonia oceanica in a bay off Capo di Sant' Andrea, Elba, Italy (42°48'26"N, 010°08'28"E)
¹Symbiosis insights through metagenomic analysis of a microbial consortium. Woyke T et al. Nature 443, 950-955 (26 October 2006). doi:10.1038/nature05192
²http://www.nature.com/nature/journal/v443/n7114/extref/nature05192-s1.pdf
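The geospatial detail in the supplementary text above is the kind of metadata a text mining tool would need to turn into a computable value. A minimal sketch, assuming coordinates are written in degrees-minutes-seconds form with a hemisphere letter; the pattern and the function names `dms_to_decimal` and `extract_coordinates` are illustrative assumptions, not part of any tool named in this talk:

```python
import re

# Matches coordinates written like 42°48'26"N (degrees, minutes, seconds, hemisphere).
# This pattern is an assumption about the notation; real articles vary widely.
DMS_PATTERN = re.compile(r"""(\d{1,3})°(\d{1,2})'(\d{1,2})"([NSEW])""")


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative by convention.
    return -value if hemisphere in "SW" else value


def extract_coordinates(text):
    """Return every DMS coordinate found in the text as decimal degrees."""
    return [
        dms_to_decimal(int(d), int(m), int(s), h)
        for d, m, s, h in DMS_PATTERN.findall(text)
    ]


sentence = "in a bay off Capo di Sant' Andrea, Elba, Italy (42°48'26\"N, 010°08'28\"E)"
latitude, longitude = extract_coordinates(sentence)
print(round(latitude, 4), round(longitude, 4))  # prints 42.8072 10.1411
```

A pattern this strict would miss decimal degrees or coordinates given only in a figure or table, which is one reason extraction from full text is harder than this sketch suggests.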
Metadata in Reference
Faiz O, Colak A, Saglam N, Canakçi S, Beldüz AO. J Biochem Mol Biol. 2007 Jul 31;40(4):588-94
- Information scattered throughout article, probably in reference
- More specific information given in the conclusion
Critical Assessment of Information Extraction in Biology
- BioCreative, 2004-05: 27 teams
  - Organizers: MITRE, CNB (now CNIO¹), NCBI
  - Curators: GOA (Camon, Lee, Apweiler)
- BioCreative II, 2006-07: 44 teams
  - Organizers: CNIO, MITRE, NCBI
  - Curators: MINT, IntAct
- BioCreative II.5, 2008-2009: 15 teams
  - Organizers: CNIO, MINT, Elsevier, MITRE
- BioCreative III, 2009-2010: 23 teams
  - Organizers: U Delaware, NCBI, CTD², CNIO, MITRE, Colorado
- BioCreative IV, 2013: 24 teams
  - Organizers: U Delaware, NCBI, CNIO, MITRE, CTD, Colorado
- BioCreative V, 2015: planning underway
BioCreative III/IV is funded by NSF grant DBI-0850319
¹Spanish National Cancer Center ²Comparative Toxicogenomics Database
Minimum Information Checklists from Genomic Standards Consortium
Yilmaz et al. Nat Biotechnol. 2011 May;29(5):415-20. doi: 10.1038/nbt.1823
megX EnvO-Lite Annotation
http://www.megx.net/habitats/habitats.html
What Metadata to Capture
Candidate metadata types
- Species
- Sample location
  - Environment
  - Geospatial location
- Phenotype
  - Morphological characteristics
  - Antibiotic resistance
BioCreative Metagenomics Advisory Group identified capture of sample environment (isolation source) as critical text mining task
Constructing a BioCreative Task
Define the task
- Capture of environmental metadata
- Interactive (prospective) capture at data entry time
Define a target output or vocabulary
- EnvO (Environmental Ontology)¹, EnvO-Lite²
Identify sources of training data
- megX³ data annotated with EnvO-Lite terms
- ENVIRONMENTS⁴ project mapping Encyclopedia of Life to EnvO
Identify test data and interested end users
- GOLD, MG-RAST, BioProject, [your data here]
Recruit text mining teams to participate
¹Buttigieg PL et al. J Biomed Semantics. 2013 Dec 11;4(1):43. doi: 10.1186/2041-1480-4-43
²Hirschman L et al. (2008) Habitat-Lite: a GSC case study based on free text terms for environmental metadata. OMICS 12, 129–136
³http://www.megx.net/habitats/habitats.html
⁴http://environments.hcmr.gr
Defining a Terminology: EnvO and EnvO-Lite (aka Habitat-Lite)
GSC participants' need
- Light-weight structured terminology to capture high level environment metadata
EnvO-Lite: a "slim" from EnvO¹ (Environmental Ontology; Ashburner, Morrison et al.)
Extended by Glöckner's group at MPI Bremen (Buttigieg)
In use at
- megX, GOLD, MG-RAST, BioProject, Genomic Standards Consortium
¹Buttigieg PL et al. J Biomed Semantics. 2013 Dec 11;4(1):43. doi: 10.1186/2041-1480-4-43
Training Data: megX Experiment Data Tagged with EnvO-Lite Classes
http://www.megx.net/habitats/habitats.html
- Example text: Crocosphaera watsonii WH0002 was isolated from the subtropical Pacific Ocean waters taken at a depth of 50 meters
- EnvO-Lite class: Marine Habitat
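The tagging shown above (free-text description in, environment class out) can be sketched as a simple keyword baseline. This is only an illustration under stated assumptions: the class labels and cue words in `ENVO_LITE_KEYWORDS` are hypothetical and are not the actual EnvO-Lite vocabulary, and `tag_environment` is a made-up helper, not a megX or BioCreative tool.

```python
import re

# Hypothetical cue words per class; labels and keywords are illustrative
# assumptions, NOT the real EnvO-Lite term list.
ENVO_LITE_KEYWORDS = {
    "marine habitat": ["ocean", "sea", "marine", "seawater"],
    "freshwater habitat": ["river", "lake", "pond", "stream"],
    "soil": ["soil", "rhizosphere"],
}


def tag_environment(description):
    """Return the sorted class labels whose cue words occur in the description."""
    tokens = set(re.findall(r"[a-z]+", description.lower()))
    return sorted(
        label
        for label, cues in ENVO_LITE_KEYWORDS.items()
        if tokens & set(cues)
    )


example = (
    "Crocosphaera watsonii WH0002 was isolated from the subtropical "
    "Pacific Ocean waters taken at a depth of 50 meters"
)
print(tag_environment(example))  # prints ['marine habitat']
```

A baseline like this also shows why the task is not trivial: for an isolation site such as "a blade of grass from Raritan River NJ USA", the cue word "river" fires even though the sample itself is plant material.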
Training Data: ENVIRONMENTS Project (Pafilis, Hellenic Centre for Marine Research)
http://environments.hcmr.gr
Environmental data from Encyclopedia of Life tagged with EnvO terms
Genomes On Line Database (GOLD): An Example Use Case
Environmental Metadata in GOLD
Isolation Site: "a Blade of grass from Raritan River NJ USA"
Mapping into EnvO-Lite
From Genomes On Line Database (GOLD): Nucleic Acids Research, 2010, Vol. 38, Database issue, D346–D354. Published online 13 November 2009
The Plan
BioCreative IV (Bethesda, October 2013)
- Discussion of text mining needs of metagenomics community
Metagenomics Advisory Group (ongoing teleconfs)
- Soliciting datasets and organizers for a metagenomics task for BioCreative V (2015)
Biocuration Conference (Toronto, April 2014)
- Update on Metagenomics task for BioCreative
BioCreative V (Spain, 2015)
- Task for text mining for metagenomics
BioCreative Metagenomics Advisory Group
- Jim Cole, Michigan State
- George Garrity, Names for Life and Michigan State
- Folker Meyer, Argonne National Lab
- Nikos Kyrpides, Joint Genome Institute
- Evangelos Pafilis, Hellenic Centre for Marine Research
- Lynn Schriml, U Maryland Medical School
- Tatiana Tatusova, NCBI
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Back Up
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
MegX (Marine Ecological GenomiX) (Pre-MIGSMIMSMIENS)
copy 2007 The MITRE Corporation ALL RIGHTS RESERVED
Information in Full Text Full text article Methods section1 Further details for all methods used in this study are provided in Supplementary Information O algarvensis specimens were collected off Capo di Sant Andrea Elba Italy Supplementary Material (pdf) Juvenile and adult Olavius algarvensis specimens were collected in May and September 2004 from 56 m water depth in silicate sediments around sea grass beds of Posidonia oceanica in a bay off Capo di Santrsquo Andrea Elba Italy (42deg4826N 010deg0828E)
1Symbiosis insights through metagenomic analysis of a microbial consortium Woyke T et al Nature 443 950-955 (26 October 2006) doi101038nature05192 2httpwwwnaturecomnaturejournalv443n7114extrefnature05192-s1pdf
copy 2007 The MITRE Corporation ALL RIGHTS RESERVED
Metadata in Reference
Faiz O Colak A Saglam N Canakccedili S Belduumlz AO
J Biochem Mol Biol 2007 Jul 3140(4)588-94
Information scattered throughout article probably in reference
More specific information given in the conclusion
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Critical Assessment of Information Extraction in Biology BioCreative 2004-05 27 teams - Organizers MITRE CNB (now CNIO1) NCBI - Curators GOA (Camon Lee Apweiler) BioCreative II 2006-07 44 teams - Organizers CNIO MITRE NCBI - Curators MINT IntAct BioCreative II5 2008-2009 15 teams - Organizers CNIO MINT Elsevier MITRE BioCreative III 2009-2010 23 teams - Organizers U Delaware NCBI CTD2 CNIO MITRE Colorado BioCreative IV 2013 24 teams - Organizers U Delaware NCBI CNIO MITRE CTD Colorado BioCreative V 2015 planning underway BioCreative IIIIV is funded by NSF grant DBI-0850319
1Spanish National Cancer Center 2Comparative Toxicogenomics Database
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Minimum Information Checklists from Genomic Standards Consortium
Yilmaz et al Nat Biotechnol 2011 May29(5)415-20 doi 101038nbt1823
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
megX EnvO-Lite Annotation
httpwwwmegxnethabitatshabitatshtml
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Genomes On Line Database (GOLD) An Example Use Case
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
Environmental Metadata in GOLD
Isolation Site a Blade of grass from Raritan River NJ USA Mapping into EnvO-Lite
From Genomes On Line Database (GOLD) D346ndashD354 Nucleic Acids Research 2010 Vol 38 Database issue Published online 13 November 2009
copy 2014 The MITRE Corporation ALL RIGHTS RESERVED
The Plan
BioCreative IV (Bethesda, October 2013)
- Discussion of text mining needs of metagenomics community
- Metagenomics Advisory Group (ongoing teleconfs)
- Soliciting datasets and organizers for a metagenomics task for BioCreative V (2015)
Biocuration Conference (Toronto, April 2014)
- Update on Metagenomics task for BioCreative
BioCreative V (Spain, 2015)
- Task for text mining for metagenomics
BioCreative Metagenomics Advisory Group
Jim Cole, Michigan State
George Garrity, Names for Life and Michigan State
Folker Meyer, Argonne National Lab
Nikos Kyrpides, Joint Genome Institute
Evangelos Pafilis, Hellenic Centre for Marine Research
Lynn Schriml, U Maryland Medical School
Tatiana Tatusova, NCBI
Acknowledgements
National Science Foundation for support of BioCreative1
National Science Foundation for support of MITRE's earlier activities on Mining Metadata for Metagenomics2
Department of Energy for conference grant for the metagenomics and text mining work3
1 NSF Grant DBI-0850319
2 NSF Grants IIS-0746650 and IIS-0844419
3 Office of Science (BER) of the US Dept. of Energy. This material is based upon work supported by the Department of Energy under Award Number DE-SC0010838.
Disclaimer: This presentation was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
Back Up
MegX (Marine Ecological GenomiX) (Pre-MIGS/MIMS/MIENS)
Information in Full Text
Full text article, Methods section1: Further details for all methods used in this study are provided in Supplementary Information. O. algarvensis specimens were collected off Capo di Sant'Andrea, Elba, Italy.
Supplementary Material (pdf): Juvenile and adult Olavius algarvensis specimens were collected in May and September 2004 from 56 m water depth in silicate sediments around sea grass beds of Posidonia oceanica in a bay off Capo di Sant'Andrea, Elba, Italy (42°48'26"N, 010°08'28"E).
1 Symbiosis insights through metagenomic analysis of a microbial consortium. Woyke T et al. Nature 443, 950-955 (26 October 2006). doi:10.1038/nature05192
2 http://www.nature.com/nature/journal/v443/n7114/extref/nature05192-s1.pdf
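Collection-site coordinates reported in running text, like those in the supplementary material quoted above, usually have to be converted to decimal degrees before they can be stored as computable metadata. A minimal sketch, assuming the coordinates are written in degree-minute-second notation with a hemisphere letter; the pattern and the function name `dms_to_decimal` are hypothetical and not part of any BioCreative or GSC tool.

```python
import re

# Matches e.g. 42°48'26"N : degrees, minutes, seconds, hemisphere letter.
DMS_PATTERN = re.compile(
    r"""(\d{1,3})°\s*(\d{1,2})'\s*(\d{1,2}(?:\.\d+)?)"\s*([NSEW])"""
)

def dms_to_decimal(text):
    """Return decimal-degree values for every DMS coordinate found in text."""
    results = []
    for deg, minutes, seconds, hemisphere in DMS_PATTERN.findall(text):
        value = int(deg) + int(minutes) / 60 + float(seconds) / 3600
        # Southern and western hemispheres are negative by convention.
        if hemisphere in "SW":
            value = -value
        results.append(round(value, 6))
    return results

sentence = """in a bay off Capo di Sant'Andrea, Elba, Italy (42°48'26"N, 010°08'28"E)"""
lat, lon = dms_to_decimal(sentence)
print(lat, lon)
```

Free text uses many other coordinate styles (decimal minutes, missing symbols, spelled-out directions), so a single pattern like this would only cover part of what a metadata extraction tool encounters.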