
PAG XXII 2014 – The Crop Ontology: A resource for enabling access to breeders’ data – E Arnaud

Date post: 08-May-2015
Upload: cgiar-generation-challenge-programme
Transcript
Page 1

Generation Challenge Programme Workshop, 13th January 2014 In Plant and Animal Genomics Conference, San Diego, USA, 11-15th January 2014

Elizabeth Arnaud1*, Luca Matteis1, Marie Angelique Laporte1, Herlin Espinosa2, Glenn Hyman2, Rosemary Shrestha3, Arlett Portugal4, Pierre Yves Chibon5, Medha Devare6, Akinnola Akintunde7, Jeffrey W. White8, Mark Wilkinson9, Caterina Caracciolo10, Fabrizio Celli10, Graham McLaren4

1 Bioversity International, France; 2 International Center for Tropical Agriculture (CIAT), Colombia; 3 Genetic Resources Program (GRP), Centro Internacional de Mejoramiento de Maíz y Trigo (CIMMYT), Mexico; 4 Generation Challenge Programme (GCP) c/o CIMMYT; 5 UR Plant Breeding, Univ. of Wageningen, The Netherlands; 6 International Maize and Wheat Improvement Center - South Asia Regional Office (CIMMYT-SARO), Nepal; 7 International Black Sea University (IBSU), Georgia; 9 Centro de Biotecnología y Genómica de Plantas UPM-INIA, Spain; 10 Food and Agriculture Organization (FAO) of the United Nations, Office for Partnership, Italy

The Crop Ontology: a resource for enabling access to breeders’ data

http://www.cropontology.org

Page 2

CGIAR Crop Lead Centers

Since 2008

Page 3

The scientific context

Page 4

Understand the relationships between plant genotype and environment, develop adaptive traits that respond to biotic and abiotic stress, promote adequate agronomic practices for cultivating the crop, and understand the heritability of adaptive traits

The Knowledge domain: plant breeding

Page 5

Dimensions of a phenotype

[Figure labels: Environmental Conditions (Light, Water, Nutrients, Temperature, Soil); Molecular; Physiological; Chemical; Developmental; Agronomic; Socio-Economic; Cultural; Time]

Understanding the GxE interaction and the heritability of adaptive traits

Page 6

High Throughput Data Generation needs standardized trait concepts

• Next Generation Sequencing (NGS) platforms for detailed analysis of the largest plant genomes

• Phenotyping platforms measure a wide range of structural and functional plant traits while collecting meticulous metadata on the environment and experimental setup [Fiorani and Schurr, 2013]

• GWAS typically focus on associations between single-nucleotide polymorphisms (SNPs) and traits

Page 7

Developing the Crop Ontology content as a Community of Practice

Page 8

Harmonization and access to data

Maize Kernel Colour

Rice grain or caryopsis colour

Bean pod color

• Breeders’ data are often unstructured: complex free text is used to describe phenotypes

• No semantic coherence:

  • The same trait is given different names by scientists

  • One trait is named the same way for various species but refers to different plant structures

• Data and metadata are NOT interoperable and often not online

‘Fruit colour’

Page 9

Integrated Breeding Platform www.integratedbreeding.net

• One-stop shop for services to design and carry out breeding projects – Integrated breeding workflow

• Breeders’ databases share a common schema and are being published online

• IB Fieldbook is available with a standard list of traits per crop

Page 10

Phenotype

It is a composite of an entity (e.g. fruit) and an attribute (e.g. shape) with a value (e.g. round):

Entity + Attribute = Trait

Entity + (Attribute + Value) = Phenotype (observed)

fruit + (shape + round) = fruit shape round

-> round fruit is the phenotype
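
The composition above can be sketched in a few lines of Python. This is only an illustration of the Entity + Attribute + Value idea from the slide; the class and method names (`Trait`, `Phenotype`, `composed`, `short`) are assumptions made for the example and are not part of the Crop Ontology itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trait:
    """Entity + Attribute, e.g. fruit + shape."""
    entity: str
    attribute: str

    def name(self) -> str:
        return f"{self.entity} {self.attribute}"


@dataclass(frozen=True)
class Phenotype:
    """A trait together with an observed value, e.g. fruit shape + round."""
    trait: Trait
    value: str

    def composed(self) -> str:
        # Entity + (Attribute + Value), in the order written on the slide
        return f"{self.trait.name()} {self.value}"

    def short(self) -> str:
        # Value-first reading of the same observation
        return f"{self.value} {self.trait.entity}"


fruit_shape = Trait(entity="fruit", attribute="shape")
observed = Phenotype(trait=fruit_shape, value="round")
print(observed.composed())  # fruit shape round
print(observed.short())     # round fruit
```

Keeping the trait separate from the value means the same trait can be reused for every observation, which is the property a shared trait vocabulary relies on.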

Page 11

A range of controlled vocabularies

From the controlled vocabularies, build valid semantic ontologies consumable through Web 2.0 best practices

Web 2.0

Page 12

Crop Ontology

• Crop Ontology is primarily an application ontology for fieldbooks

• A visualization tool supporting community-based development of trait dictionaries and crop-specific ontologies

• Compare and validate terms in common

Rosemary Shrestha, CIMMYT, CO coordinator until 2012

Page 13

Community based development process

• Domain experts (breeders, pathologists, agronomists, etc.) and data managers identify the list of concepts

• For a variety evaluation project, data managers and breeders produce the IB Fieldbook template with the traits and submit new terms

• Crop Ontology curators in the Crop Lead Centers curate, validate and compile the list, then upload it to the site

• The Global Crop Ontology Curator curates the Crop Ontology with the Crop Lead Centers’ curators

• A web development expert maintains the site

Page 14

Crop curators and associated scientists

Page 15

Crop Ontology themes

General germplasm information

Phenotype and traits

Plant anatomy and development

Location and environment

Trial management and experimental design

Structural and functional genomics

Page 16

Traits and Phenotypes

Page 17

Crop Ontology www.cropontology.org

14 GCP crops:

• Banana • Cassava • Chickpea • Common beans • Cowpea • Groundnut • Maize • Pearl millet • Pigeon Pea • Potato • Rice • Sorghum • Wheat • Yam

For 2014, adding: Barley, Lentil, Soybean, Sweet Potato

Page 18

Ontology Engineering

• With OBO-Edit - http://oboedit.org/

• Creating multi-relationships between concepts

• Cross-referencing with Plant Ontology and Trait Ontology

Page 19

Trait Description

Page 20

Crop Trait Dictionary Template simple to share with breeders

Submission: Name of submitting scientist; Institution; Language of submission; Date of submission; Bibliographic Reference; Comments

Trait: Crop Name; Name of Trait; Abbreviated name; Synonyms (separated by commas); Trait ID for modification, blank for new; Description of Trait; How is this trait routinely used?; Trait Class

Method: Method ID; Name of Method; Describe how measured (method); Growth Stage; Field, greenhouse

Scale: Scale ID; Type of Measure (Continuous, Discrete or Categorical); for Continuous: units of measurement, reporting units, minimum, maximum; for Discrete: name of scale or units of measurement; for Categorical: name of rating scale, Class # - value = meaning

[Diagram: relationship cardinalities between the blocks, labelled 1, n, n, 1]
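
As a rough illustration, the Trait, Method and Scale blocks of the template could be represented as linked records. The sketch below is hypothetical: the class names, the choice to attach one scale to each method, and the sample method, scale and synonym values are assumptions made for the example. Only the field names and the sorghum "Flag leaf weight" (FLGWT) trait come from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Scale:
    scale_id: str
    measure_type: str                 # "Continuous", "Discrete" or "Categorical"
    unit: Optional[str] = None        # Continuous / Discrete: unit of measurement
    minimum: Optional[float] = None   # Continuous only
    maximum: Optional[float] = None   # Continuous only
    classes: Dict[int, str] = field(default_factory=dict)  # Categorical: value -> meaning

    def accepts(self, value) -> bool:
        """Check a recorded value against the scale definition."""
        if self.measure_type == "Categorical":
            return value in self.classes
        if self.measure_type == "Continuous":
            low_ok = self.minimum is None or value >= self.minimum
            high_ok = self.maximum is None or value <= self.maximum
            return low_ok and high_ok
        return True  # Discrete: no range check in this sketch


@dataclass
class Method:
    method_id: str
    name: str
    description: str
    scale: Scale  # assumption: each method reports on one scale


@dataclass
class TraitEntry:
    crop: str
    name: str
    abbreviation: str
    synonyms: List[str]
    methods: List[Method] = field(default_factory=list)


def parse_synonyms(cell: str) -> List[str]:
    """Split the comma-separated 'Synonyms' cell of the template."""
    return [part.strip() for part in cell.split(",") if part.strip()]


# Illustrative values only: IDs, method text and synonyms are made up.
weight_scale = Scale(scale_id="S1", measure_type="Continuous", unit="g", minimum=0.0)
rating_scale = Scale(scale_id="S2", measure_type="Categorical",
                     classes={1: "low", 2: "medium", 3: "high"})
weighing = Method(method_id="M1", name="Weighing",
                  description="Weigh the flag leaf", scale=weight_scale)
entry = TraitEntry(crop="Sorghum", name="Flag leaf weight", abbreviation="FLGWT",
                   synonyms=parse_synonyms("flag leaf mass, FLW"), methods=[weighing])

print(entry.synonyms)
print(weight_scale.accepts(2.5), weight_scale.accepts(-1.0))
print(rating_scale.accepts(2), rating_scale.accepts(9))
```

Validating values against the scale at entry time is one way a fieldbook could use such a dictionary; whether the IB Fieldbook does this is not stated on the slide.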

Page 21

Online visualization of Trait dictionaries

Page 22

Methods & Scales for annotations

• Precomposed relationships between Trait, Methods and Scales required for annotations in phenotype databases

• Ongoing discussion on revising the structure so that Trait, Method and Scale are separated into 3 namespaces

Page 23

Methods & scales for the standard lists of the Breeders’ fieldbook

Visualization & download In Crop database and Fieldbook template

Page 24

Easy to use the site - Partners published their Trait ontologies

Grape

Soybean

Solanaceae

France

Barley

Page 25

Multilingual versions of the crop ontologies

Multiple languages

Page 26

Experimental design ontology

• CROP - PLANTING

• SEED TREATMENT

• IRRIGATION

• FERTILIZER

• PESTICIDE

• SOIL

• BIOTIC STRESS

• ABIOTIC STRESS

• HARVEST-YIELD

Trial management tasks

Akinnola Akintunde, International Black Sea Univ. (IBSU), Georgia Development of the ontology and fieldbook

Medha Devare CSISA-Nepal Coordinator, CIMMYT –SARO Design of the Fieldbook and coordination

Page 27

Dictionary for Trial Management Concepts

From Medha Devare, CSISA-Nepal Coordinator CIMMYT -SARO

Page 28

Environmental Ontology

Jeffrey W. White Research Plant Physiologist & Research Leader Arid-Land Agricultural Research Center USDA-ARS, Arizona, USA

Sheryl Porter Coordinator, Computer Research Applications University of Florida, Gainesville, FL, USA

Page 29

Environment Ontology and Trial management Ontology

Page 30

Environmental Ontology

• Improve the current list of concepts

• International Consortium for Agricultural System Applications (ICASA)

• Integration of a Master list of 600 variables for describing crop management and recording plant responses

• ICASA promotes the use of standards in relation to crop field research and for ecophysiological models

• One objective is the application of ICASA variables by the Agricultural Model Intercomparison and Improvement Project (AgMIP) (http://www.agmip.org/)

Page 31

Synchronization with the Crop databases and IBWS

Page 32

Synchronization of Crop Ontology with Integrated Breeding Workflow

Graham McLaren, Generation Challenge Programme

Rebecca Berrigan, Efficio Technology Service

Luca Matteis, CO Web Site developer, Bioversity International

Arlett Portugal, IBP Data Management Leader

Harold Durufle, CO curator, Bioversity International

Page 33

Application Programming Interface (API)

• Developed by Luca Matteis

• Provides access services to 3rd party web sites or software

• Supports open collaboration and use of the Crop Ontology

Page 34

Local Databases Breeders & Data Managers

Crop Database Data Manager

Fieldbook Template

Breeders’ Trait Dictionaries

Curation of the Crop Ontology

Cross-referencing terms with Plant Ontology & Trait Ontology

Submission of new traits through the term tracker

Data Annotation & new terms addition

Page 35

IBWS - Key elements of the Logical Data Model to store phenotypic data

Page 36

Annotation for storing phenotypic data in the IBWS

Property (Trait) - CO_ID

Method - CO_ID

Scale - CO_ID (continuous, discrete, categorical)

Class1-value - CO_ID

Class2-value - CO_ID

Class3-value - CO_ID

A unique combination of IDs for P+M+S+C = A Standard Variable

Is_a_valid_value_of

Data

Term ID

Controlled vocabulary

Requires 3 namespaces
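
The rule "a unique combination of IDs for P+M+S+C = a Standard Variable" can be sketched as a lookup keyed on that combination. This is a minimal illustration, not the IBWS data model: the class name, the variable names and all IDs below are made-up placeholders, and treating C as the set of class-value IDs of a categorical scale is an assumption.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

# (property ID, method ID, scale ID, class-value IDs)
VariableKey = Tuple[str, str, str, FrozenSet[str]]


class StandardVariableRegistry:
    """Maps each unique P+M+S+C ID combination to exactly one variable name."""

    def __init__(self) -> None:
        self._variables: Dict[VariableKey, str] = {}

    @staticmethod
    def make_key(property_id: str, method_id: str, scale_id: str,
                 class_ids: Iterable[str] = ()) -> VariableKey:
        # frozenset: the order in which class values are listed does not matter
        return (property_id, method_id, scale_id, frozenset(class_ids))

    def register(self, name: str, property_id: str, method_id: str,
                 scale_id: str, class_ids: Iterable[str] = ()) -> VariableKey:
        key = self.make_key(property_id, method_id, scale_id, class_ids)
        if key in self._variables:
            raise ValueError(
                f"combination already registered as {self._variables[key]!r}")
        self._variables[key] = name
        return key

    def lookup(self, property_id: str, method_id: str, scale_id: str,
               class_ids: Iterable[str] = ()) -> Optional[str]:
        key = self.make_key(property_id, method_id, scale_id, class_ids)
        return self._variables.get(key)


# Placeholder IDs in the CO_xxx:nnnnnnn style; none of them are real terms.
registry = StandardVariableRegistry()
registry.register("leaf_weight_g", "CO_000:0000001", "CO_000:0000002", "CO_000:0000003")
registry.register("leaf_colour_score", "CO_000:0000004", "CO_000:0000005",
                  "CO_000:0000006", class_ids=["CO_000:0000007", "CO_000:0000008"])

print(registry.lookup("CO_000:0000001", "CO_000:0000002", "CO_000:0000003"))
```

Changing any one of the IDs, for example measuring the same property with a different method, yields a different key and therefore a different standard variable.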

Page 37

Synchronization flow

The IBWS accepts updates sent by Crop ontologies

Schema from Rebecca Berrigan, Efficio LLC

Page 38

Synchronization flow

Schema from Rebecca Berrigan, Efficio LLC

Crop Ontology accepts new additions from local ontologies
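
The two directions of the synchronization flow can be illustrated with plain dictionaries. The slides do not describe the actual mechanism, so everything here is an assumption made for the sketch: the function names, the rule that a new term receives the next free number, and the "Plant height" label. Only the `CO_324:0000002` "Flag leaf weight" term appears in the slides.

```python
from typing import Dict, List


def next_term_id(central: Dict[str, str], prefix: str) -> str:
    """Return the next free ID for a prefix, e.g. CO_324:0000003."""
    numbers = [int(term_id.split(":")[1])
               for term_id in central if term_id.startswith(prefix + ":")]
    return f"{prefix}:{max(numbers, default=0) + 1:07d}"


def accept_local_additions(central: Dict[str, str], new_labels: List[str],
                           prefix: str) -> List[str]:
    """Local ontology -> Crop Ontology: assign IDs to labels not yet present."""
    existing = {label.lower() for label in central.values()}
    assigned = []
    for label in new_labels:
        if label.lower() in existing:
            continue  # already known, nothing to add
        term_id = next_term_id(central, prefix)
        central[term_id] = label
        existing.add(label.lower())
        assigned.append(term_id)
    return assigned


def push_updates(central: Dict[str, str], local_copy: Dict[str, str]) -> int:
    """Crop Ontology -> local database: copy new or changed terms."""
    changed = 0
    for term_id, label in central.items():
        if local_copy.get(term_id) != label:
            local_copy[term_id] = label
            changed += 1
    return changed


central = {"CO_324:0000002": "Flag leaf weight"}
local_copy: Dict[str, str] = {}

new_ids = accept_local_additions(central, ["Flag leaf weight", "Plant height"], "CO_324")
print(new_ids)                             # ['CO_324:0000003']
print(push_updates(central, local_copy))   # 2
```

A real implementation would also need curation before a submitted term is accepted, as the community process on the earlier slides describes.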

Page 39

The Crop Ontology web site: a concept name server on the Cloud

Luca Matteis, Web developer, Bioversity International

Page 40

Crop Ontology

Page 41

API access by 3rd Party Web sites

API

Genotype Data MS

EU-SOL Solanaceae Breeding DB

Wageningen.

IB Fieldbook

Agtrials -CCAFS International cassava DB

Phenomics Ontology Driven DB (PODD)

IBP Crop Databases

Page 42

Global Agricultural Trial Repository and database www.agtrials.org

Glenn Hyman, geographer, CIAT

Herlin R. Espinosa G., web developer, CIAT

Luca Matteis, Web developer, Bioversity International

Page 43

Global Agricultural Trial Repository http://www.agtrials.org/

1,029 trials for Cassava

• To store evaluation data files described with metadata

• To produce an Atlas of the trials

Page 44

1. Annotating Evaluation data files

Page 45

2. Searching evaluation data files

Agtrials uses the Crop Ontology trait terms

Page 46

3. Display the Trial Information

Access to the definition of the Trait in the Crop Ontology

Page 47

Luca Matteis, CO Web developer, Bioversity International

Fred Okono, IBP Project Administrator

Brandon Tooke, IBP web developer

Integration of Crop Ontology in IBP

Page 48

Integration of Crop Ontology in IBP

Page 49

CO Semantic Web Compliance

Luca Matteis, CO Web developer, Bioversity International

Marie Angelique Laporte, Ontology development, RDF & SKOS conversion, Bioversity International

Mark Wilkinson, Centro de Biotecnología y Genómica de Plantas UPM-INIA, Spain

Page 50

Linked Open Data Cloud

• A term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information and knowledge

• It builds upon standard Web technologies such as HTTP, RDF and URIs

• Rather than using them to serve web pages for human readers, it extends them to share information in a way that can be read automatically by computers

• This enables data from different sources to be connected and queried

Source: Wikipedia

Page 51

Crop Ontology in the Linked Open Data recommended format

• Conversion from OBO to RDF/SKOS with resolvable HTTP URIs

• A conversion into Simple Knowledge Organization System (SKOS) is under way

<http://www.cropontology.org/rdf/CO_324:0000002> a skos:Concept ;
    rdfs:label "Flag leaf weight"@en ;
    dc:creator _:b1 ;
    skos:definition "Weight of the flag leaf (the one just below the panicle)." ;
    skos:inScheme co:sorghum ;
    skosxl:prefLabel [ a skosxl:Label ;
        co:acronym [ a skosxl:Label ; skosxl:literalForm "FLGWT" ] ;
        skosxl:literalForm "Flag leaf weight"@en ] .
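
To show how a term record might be turned into this kind of SKOS description, here is a minimal Python sketch using only the standard library. It is not the converter used by the project: the function names are assumptions, only a subset of the properties on the slide is emitted, and the output presumes that the `skos:`, `rdfs:` and `co:` prefixes are declared elsewhere in the Turtle file.

```python
def term_uri(term_id: str) -> str:
    """Build the resolvable HTTP URI for a term, following the slide's pattern."""
    return f"http://www.cropontology.org/rdf/{term_id}"


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def term_to_turtle(term_id: str, label: str, definition: str,
                   scheme: str, lang: str = "en") -> str:
    """Render one term as a skos:Concept in Turtle (prefixes declared elsewhere)."""
    lines = [
        f"<{term_uri(term_id)}> a skos:Concept ;",
        f'    rdfs:label "{escape_literal(label)}"@{lang} ;',
        f'    skos:definition "{escape_literal(definition)}" ;',
        f"    skos:inScheme {scheme} .",
    ]
    return "\n".join(lines)


turtle = term_to_turtle(
    term_id="CO_324:0000002",
    label="Flag leaf weight",
    definition="Weight of the flag leaf (the one just below the panicle).",
    scheme="co:sorghum",
)
print(turtle)
```

For production use, an RDF library that handles prefixes, escaping and serialization would be a safer choice than string formatting.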

Page 52

Linked Open Data publishing and aligning Crop Ontology with AGROVOC

Caterina Caracciolo, Food and Agriculture Organization (FAO), AIMES, Italy

Fabrizio Celli, Food and Agriculture Organization (FAO), AIMES, Italy

Luca Matteis, Bioversity International

Marie Angelique Laporte, Bioversity International

Page 53

Agrovoc - Agricultural Thesaurus

• 32,000 concepts organized in a hierarchy

• Each concept may have labels in up to 22 languages

• Now published as a linked data set, aligned (linked) with several vocabularies

Page 54

Release of Agris 2.0 agris.fao.org

• AGRIS bibliographic records contain rich metadata and are largely indexed by AGROVOC, FAO’s multilingual thesaurus

Page 55

AGRIS 2.0 and Phenotypic Data

• AGRIS 2.0 uses the Linked Open Data methodology to link various sources of data in the mash-up site

• 3 steps:

1. The AGRIS datasets were converted to RDF, creating some 200 million triples. AGROVOC was aligned to other thesauri.

2. SPARQL endpoints, web services and APIs were discovered.

3. AGRIS RDF was interlinked, using AGROVOC LOD as a backbone, to external datasets.

• Proof of concept done with the Collecting mission database of Bioversity International

• Align Crop Ontology with AGROVOC in SKOS/RDF

• Promote the publishing of phenotypic data into RDF

• Objective: retrieve bibliographic references and data from phenotypic databases in the mash-up site

Page 56

Partners collaborating on the informatics and integration formats

• IB Fieldbook and IBWS teams and Efficio LLC

• Plant Breeding dept. of Wageningen for the Resource Description Framework (RDF)

• CIAT-DAPA, for the synchronization of the Global Repository of Evaluation Trials (Agtrials) of CCAFS

• FAO-AIMES for the use of Linked Open Data with AGRIS 2.0

Page 57

Partners collaborating on content engineering and looking forward to a Reference Ontology for plants

• Plant Ontology, Jaiswal Lab., Oregon State University, USA

• Soybase, USDA-ARS, USA

• Solanaceae Genomic Network (SGN), USA

• Cornell University, USA

• Institut National de la Recherche Agronomique (INRA), France

• Centro de Biotecnología y Genómica de Plantas UPM-INIA, Spain

• POLAPGEN, Poland

• Australian Plant Phenomics Data Repository

