Challenge of Semantics for the Encyclopedia of Life

Post on 01-Sep-2014


description

An introduction to EOL (http://www.eol.org) and some of the challenges and possible applications for structured, semantic information about biological organisms. Presented at the kick-off meeting of the NSF-funded Phenotype Ontology Research Coordination Network.

transcript

Challenges for semantics in EOL

Cynthia Parr
National Museum of Natural History
Smithsonian Institution

Phenotype Ontology RCN
NESCent
25 February 2011

http://www.eol.org
• All species known to science
• Summary descriptions across biology domains
• Freely accessible
• Available from a single portal in a common format
• Quality
• Always growing

Catalogue of Life

IUCN

GBIF

Biodiversity Heritage Library

Content providers
Databases
Journals
LifeDesks
Public contribution

Curating

Commenting

Tagging

EOL is a Content Curation Community

Typical species page

Objects can come from many partners
Objects are sorted by topic and by taxon
Each partner gets credit

http://www.eol.org/content_partner

Curation, Comments, Tags

Not

Statistics

2.8 million pages – one (or more) per taxon

2 million data objects

500 thousand pages with objects

100+ partner databases

700 curators/1000s contributors/~46,000 members

http://NodeXL.codeplex.com

Schema

Very coarsely structured
33 subjects (TDWG Species Profile Model)

No numeric data
Minimal controlled vocabularies
API

Corvidae

We have an infrastructure . . .
Aggregation mechanisms
Names resolution
Curation mechanisms
Public and machine interfaces

Version 2 (August) vastly improved support for community interaction

Version 3 (???)

Rich page calculations

Taxon
Key 1: Value
Key 2: Value
Key 3: Value

Key 1: Unit, Label, URI
Key 2: Unit, Label, URI
Key 3: Unit, Label, URI
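
The key/value layout above, where each key also carries a unit, a label and a URI, could be modelled roughly as follows. This is a minimal sketch, not EOL's actual schema: the class names `KeyValue` and `TaxonRecord`, the field names, the `example.org` URIs and all values are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class KeyValue:
    """One key/value pair; unit, label and URI describe the key (hypothetical layout)."""
    key: str
    value: str
    unit: Optional[str] = None
    label: Optional[str] = None
    uri: Optional[str] = None


@dataclass
class TaxonRecord:
    """All key/value pairs attached to a single taxon."""
    taxon: str
    pairs: List[KeyValue] = field(default_factory=list)

    def values_for(self, key: str) -> List[KeyValue]:
        """Return every pair stored under the given key."""
        return [p for p in self.pairs if p.key == key]


# Illustrative placeholder data only; not real EOL content.
record = TaxonRecord(
    taxon="Corvidae",
    pairs=[
        KeyValue(key="key_1", value="42", unit="g",
                 label="Key 1", uri="http://example.org/terms/key_1"),
        KeyValue(key="key_2", value="7", unit="cm",
                 label="Key 2", uri="http://example.org/terms/key_2"),
    ],
)

for pair in record.values_for("key_1"):
    print(record.taxon, pair.label, pair.value, pair.unit, pair.uri)
```

Keeping the URI next to the human-readable label is the design point here: the label is for display, while the URI is what a machine would use to decide that two partners mean the same thing.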

Possible path to semantics

What could we do?

Organize info on EOL pages

Index by taxon
Sort into one of the 33 SPM subjects
Improve discoverability

Serve data by API or query interface

“Give me all the information you have about the elbow joint and life histories in rodents”
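
A query like the one quoted above might, under heavy simplification, reduce to filtering data objects by clade and then by subject or keyword. The sketch below is an assumption-laden illustration, not the EOL API: the `query` function, the record fields (`lineage`, `subject`, `text`), the subject labels and the sample objects are all hypothetical.

```python
from typing import Dict, Iterable, List

# Hypothetical data objects; "lineage" lists the higher taxa each object's taxon falls under.
DATA_OBJECTS: List[Dict] = [
    {"id": 1, "taxon": "Mus musculus", "lineage": {"Rodentia", "Mammalia"},
     "subject": "life history", "text": "Placeholder text about breeding."},
    {"id": 2, "taxon": "Rattus rattus", "lineage": {"Rodentia", "Mammalia"},
     "subject": "morphology", "text": "Placeholder text mentioning the elbow joint."},
    {"id": 3, "taxon": "Rattus rattus", "lineage": {"Rodentia", "Mammalia"},
     "subject": "distribution", "text": "Placeholder text about range."},
    {"id": 4, "taxon": "Corvus corax", "lineage": {"Corvidae", "Aves"},
     "subject": "life history", "text": "Placeholder text about nesting."},
]


def query(objects: Iterable[Dict], clade: str,
          subjects: Iterable[str] = (), keywords: Iterable[str] = ()) -> List[Dict]:
    """Return objects inside `clade` whose subject is wanted or whose text has a keyword."""
    wanted_subjects = set(subjects)
    wanted_keywords = [k.lower() for k in keywords]
    hits = []
    for obj in objects:
        if clade not in obj["lineage"]:
            continue  # outside the requested group of organisms
        text = obj["text"].lower()
        if obj["subject"] in wanted_subjects or any(k in text for k in wanted_keywords):
            hits.append(obj)
    return hits


results = query(DATA_OBJECTS, clade="Rodentia",
                subjects=["life history"], keywords=["elbow joint"])
print([obj["id"] for obj in results])  # prints [1, 2]
```

The keyword match on free text stands in for what an anatomy ontology would do properly; plain string matching would miss synonyms and related terms.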

Make the whole page semantically browsable (LOD: linked open data)
Taxon
Text blobs
Character data
Metadata
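
To make the linked open data idea concrete, the pieces of a page (taxon, text blobs, character data, metadata) can be written as subject/predicate/object statements. The sketch below serializes a few such statements in N-Triples syntax using only the standard library. Every URI under `http://example.org/` and every predicate name is a placeholder assumption, not a vocabulary EOL is known to use.

```python
from typing import Iterable, Tuple

# (subject URI, predicate URI, object, object_is_uri)
Triple = Tuple[str, str, str, bool]

BASE = "http://example.org/"  # placeholder namespace, purely illustrative


def escape_literal(text: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples string literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Render each triple as one N-Triples line."""
    lines = []
    for subject, predicate, obj, is_uri in triples:
        rendered = f"<{obj}>" if is_uri else f'"{escape_literal(obj)}"'
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


# Hypothetical page: a taxon, one text blob, and one piece of metadata about the blob.
page = BASE + "pages/corvidae"
blob = BASE + "objects/1"
triples = [
    (page, BASE + "terms/scientificName", "Corvidae", False),
    (page, BASE + "terms/hasTextBlob", blob, True),
    (blob, BASE + "terms/subject", "placeholder subject", False),
]

print(to_ntriples(triples))
```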

Consistency checks

Curators
Crowd-sourcing
Reasoning…

… inferring summaries
… mining for patterns?
… hypothesis testing?
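
One of the simplest consistency checks that structured data would allow is flagging the same key reported for the same taxon in different units by different partners, so that a curator can take a look. This is a rough sketch under assumed field names (`taxon`, `key`, `value`, `unit`, `source`); the function `find_unit_conflicts` and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_unit_conflicts(records: Iterable[Dict]) -> Dict[Tuple[str, str], List[str]]:
    """Map (taxon, key) to its sorted units wherever more than one unit is reported."""
    units = defaultdict(set)
    for rec in records:
        units[(rec["taxon"], rec["key"])].add(rec["unit"])
    return {pair: sorted(seen) for pair, seen in units.items() if len(seen) > 1}


# Placeholder records from two made-up partners; values are not real measurements.
records = [
    {"taxon": "Taxon A", "key": "key_1", "value": "1200", "unit": "g", "source": "partner_1"},
    {"taxon": "Taxon A", "key": "key_1", "value": "1.2", "unit": "kg", "source": "partner_2"},
    {"taxon": "Taxon A", "key": "key_2", "value": "60", "unit": "cm", "source": "partner_1"},
    {"taxon": "Taxon B", "key": "key_1", "value": "20", "unit": "g", "source": "partner_2"},
]

conflicts = find_unit_conflicts(records)
for (taxon, key), unit_list in conflicts.items():
    print(f"{taxon} / {key}: units disagree: {unit_list}")
```

A flagged pair is not necessarily an error (1200 g and 1.2 kg agree), which is why the output is a list for curators or a reasoner to review, not an automatic correction.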

ievobio.org

Image credits

Michal Koupý
Lorraine Phelan
David J Patterson
Dmitry Mozzherin