Date posted: 01-Sep-2014
Category: Technology
Uploaded by: cyndy-parr
@cydparr @eol
Cynthia Parr
Semantic reasoning workshop
Washington, DC, 6-7 September 2012
Introduction to eol.org
Whirlwind tour
• What kind of information we have
• How we assemble that information
• How machines and people interact with EOL
• Next steps
More than 1.1 million taxon pages with content from more than 200 providers and thousands of individuals
5 million content objects
Total of 1,344,711 images, 9,586 videos, and 28,569 sounds
Maps
Literature
EOL has Global Partners and is internationalized
China
Australia
Dutch
South Africa
Costa Rica
Mexico
Egypt
India
Colombia
Peru
Taiwan
Norway
USA
EOL summarizes knowledge
Erosaria caputserpentis
Serpent's Head Cowrie
Depth range based on 51 specimens in 2 taxa.
Water temperature and chemistry ranges based on 40 samples.
Environmental ranges
Depth range (m): -5 - 67
Temperature range (°C): 23.011 - 28.496
Nitrate (umol/l): 0.048 - 0.923
Salinity (PPS): 33.821 - 35.837
Oxygen (ml/l): 4.349 - 4.825
Phosphate (umol/l): 0.088 - 0.228
Silicate (umol/l): 0.983 - 4.026
From Moorea Biocode
From GBIF
From OBIS
Erosaria caputserpentis
Serpent's Head Cowrie
Salinity envelope (n=40)
From OBIS
Cynthia Parr
Species Pages Group
Global Content Summit
17-19 Jan 2011
Richness scores
http://eol.org/pages/704102
Whirlwind tour
• What kind of information we have
• How we assemble that information
  – Big picture
  – Subject semantics
  – Names infrastructure
  – Curation
  – Richness score
• How machines and people interact with EOL
• Next steps
EOL aggregates and curates
Scientific Databases, including BHL, GBIF, ALA, INBio, COL, Scratchpads, LifeDesks
Scientific Journals
Aggregate
eol.org
Curate, Comment, Rate, Collect
Quality control
EOL v2
Plinian Core
DwC description
SPM infoitem
using Darwin Core Archive flat files as transport mechanism
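As a rough illustration of the transport format: a Darwin Core Archive data file is a delimited text table whose columns are Darwin Core terms. The sketch below parses a tiny tab-delimited sample with Python's standard `csv` module. The sample rows and the choice of columns are made up for illustration, and a real archive also carries a `meta.xml` descriptor that is not handled here.

```python
import csv
import io

# Hypothetical two-row taxon file. The column names are Darwin Core terms;
# the rows themselves are invented for this example.
SAMPLE_TAXA = (
    "taxonID\tscientificName\ttaxonRank\n"
    "t1\tPhyseter macrocephalus Linnaeus, 1758\tspecies\n"
    "t2\tErosaria caputserpentis\tspecies\n"
)


def read_taxa(text):
    """Parse tab-delimited taxon rows into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


taxa = read_taxa(SAMPLE_TAXA)
for row in taxa:
    print(row["taxonID"], row["scientificName"])
```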
Sharing process adds semantics to content objects
[Bar chart: number of text objects (x-axis, 0 to 800,000) by subject of text object (y-axis). Subjects shown:]
Distribution
MolecularBiology
Multiple topics
TypeInformation
Habitat
ConservationStatus
Threats
Morphology
Conservation
Management
Trends
Size
Associations
Uses
TrophicStrategy
Cyclicity & Life Cycle
PopulationBiology
Reproduction
Migration
Taxonomy
LifeExpectancy
Identification
Behaviour
Ecology
Diseases
Content objects are associated with taxon names
Wikimedia Commons: Physeter macrocephalus
(note we actually have over 3.3 million named pages)
Names from different providers are matched
Animal Diversity Web .... Physeter catodon Linnaeus, 1758
ARKive .................. Physeter macrocephalus Linné
BioPix .................. Physeter macrocephalus L.
INBio ................... Physeter catodon
IUCN .................... Physeter Macrocephalus
ITIS .................... Physeter macrocephalus Linnaeus, 1758
MarLIN .................. Physeter macrocephalus Linné
NCBI .................... Physeter Catodon
Species 2000 ............ Physeter macrocephalus Linnaeus, 1758
Taxon Concept ........... Physeter australasianus Desmoulins, 1822
Wikimedia Commons ....... Physeter macrocephalus
WORMS ................... Physeter macrocephalus Linnaeus 1758
Physeter macrocephalus
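To make the matching problem concrete, here is a minimal sketch that groups the provider name strings listed on this slide by a crude canonical form (genus plus epithet, authorship dropped, case normalized). This is not EOL's matching algorithm; `canonical_form` and `group_by_canonical` are hypothetical helpers, and the two-token rule is an assumption that only holds for simple binomials.

```python
from collections import defaultdict

# Name strings as they appear on the slide, keyed by provider.
PROVIDER_NAMES = {
    "Animal Diversity Web": "Physeter catodon Linnaeus, 1758",
    "ARKive": "Physeter macrocephalus Linné",
    "BioPix": "Physeter macrocephalus L.",
    "INBio": "Physeter catodon",
    "IUCN": "Physeter Macrocephalus",
    "ITIS": "Physeter macrocephalus Linnaeus, 1758",
    "MarLIN": "Physeter macrocephalus Linné",
    "NCBI": "Physeter Catodon",
    "Species 2000": "Physeter macrocephalus Linnaeus, 1758",
    "Taxon Concept": "Physeter australasianus Desmoulins, 1822",
    "Wikimedia Commons": "Physeter macrocephalus",
    "WORMS": "Physeter macrocephalus Linnaeus 1758",
}


def canonical_form(name_string):
    """Reduce a binomial name string to 'Genus epithet' (assumes two leading name tokens)."""
    tokens = name_string.split()
    genus, epithet = tokens[0], tokens[1]
    return f"{genus.capitalize()} {epithet.lower()}"


def group_by_canonical(names):
    """Map each canonical form to the list of providers that use it."""
    groups = defaultdict(list)
    for provider, name in names.items():
        groups[canonical_form(name)].append(provider)
    return dict(groups)


groups = group_by_canonical(PROVIDER_NAMES)
for name, providers in sorted(groups.items()):
    print(name, "<-", ", ".join(providers))
```

String normalization alone still leaves three groups (macrocephalus, catodon, australasianus). Bringing them onto one taxon page needs synonymy information from the providers' hierarchies, which this sketch does not attempt.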
Taxon concept pages: multiple hierarchies on Names tab
Problem: one taxon may have several names
Animal Diversity Web .... Physeter catodon Linnaeus, 1758
ARKive .................. Physeter macrocephalus Linné
BioPix .................. Physeter macrocephalus L.
INBio ................... Physeter catodon
IUCN .................... Physeter Macrocephalus
ITIS .................... Physeter macrocephalus Linnaeus, 1758
MarLIN .................. Physeter macrocephalus Linné
NCBI .................... Physeter Catodon
Species 2000 ............ Physeter macrocephalus Linnaeus, 1758
Taxon Concept ........... Physeter australasianus Desmoulins, 1822
Wikimedia Commons ....... Physeter macrocephalus
WORMS ................... Physeter macrocephalus Linnaeus 1758
Problem: the same name may apply to more than one taxon
EOL curation
• Trust or untrust taxon associations
• Add new taxon association
• Set preferred hierarchies
• Set preferred common names
• Leave comments
Coming: Taxonomic concept curation
EOL is not Wikipedia
…though we have more than 212,000 Wikipedia articles and 115,000 Wikimedia images
Users can’t currently edit within text objects
Whirlwind tour
• What kind of information we have
• How we assemble that information
• How machines and people interact with EOL
  – API
  – Third party apps
  – Collections and communities
• Next steps
EOL enables machine interaction
Aggregate
eol.org
Curate, Comment, Rate, Collect
API
Third party apps
Third party applications eol.org/api
People interact with EOL content & each other
Collections
Communities
Studies currently underway with University of Maryland
• Cross-cultural study on motivation to engage in citizen science – Dana Rotman
• Interaction among scientists and non-scientists on EOL’s social network – Jae-wook Ahn
• Website traffic analysis to aid conservation communication – Yurong He and Bill Fagan
Whirlwind tour
• What kind of information we have
• How we assemble that information
• How machines and people interact with EOL
• Next steps
Using EOL collections to get computable data
Step 1: Search on EOL for organisms with characteristics of interest. Add each one to an EOL collection.
Step 2: Write a program using EOL API methods to retrieve the external database identifiers for the species in that collection.
Step 3: Add to your program code to retrieve data using external database APIs.
Step 4: Analyze, rinse, repeat.
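Step 2 might look roughly like the sketch below. Everything specific in it is an assumption to check against the documentation at eol.org/api: the endpoint paths (`collections/1.0`, `pages/1.0`), the JSON field names (`collection_items`, `object_type`, `object_id`, `taxonConcepts`, `nameAccordingTo`, `sourceIdentifier`), and the helper names, which are hypothetical. The network call is injected so the logic can be tried offline with a stand-in.

```python
import json
from urllib.request import urlopen

API_BASE = "http://eol.org/api"  # assumption: base URL taken from the "eol.org/api" slide


def fetch_json(url):
    """Fetch a URL and decode its JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def taxon_page_ids(collection_id, fetch=fetch_json):
    """Return the taxon page ids in a collection (assumed endpoint and field names)."""
    data = fetch(f"{API_BASE}/collections/1.0/{collection_id}.json")
    return [
        item["object_id"]
        for item in data.get("collection_items", [])
        if item.get("object_type") == "TaxonConcept"
    ]


def external_identifiers(page_id, fetch=fetch_json):
    """Return (provider, provider's identifier) pairs for one page (assumed field names)."""
    data = fetch(f"{API_BASE}/pages/1.0/{page_id}.json")
    return [
        (concept.get("nameAccordingTo"), concept.get("sourceIdentifier"))
        for concept in data.get("taxonConcepts", [])
    ]


def identifiers_for_collection(collection_id, fetch=fetch_json):
    """Map each taxon page id in the collection to its external identifiers."""
    return {
        page_id: external_identifiers(page_id, fetch)
        for page_id in taxon_page_ids(collection_id, fetch)
    }


def fake_fetch(url):
    """Offline stand-in returning made-up responses in the assumed shape."""
    if "/collections/" in url:
        return {
            "collection_items": [
                {"object_type": "TaxonConcept", "object_id": 704102},
                {"object_type": "Image", "object_id": 1},
            ]
        }
    return {
        "taxonConcepts": [
            {"nameAccordingTo": "Hypothetical provider", "sourceIdentifier": "abc-123"}
        ]
    }


result = identifiers_for_collection(42, fetch=fake_fetch)
print(result)
```

Step 3 would then pass each provider identifier to that provider's own API. Calling `identifiers_for_collection` without the `fetch` argument uses the real network fetcher.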
From Arthur Chapman
Crowd-sourcing for computable data
Lovell and Libby Langstroth, Calphotos Foodwebs.org
Efforts underway
Phylogenetic trees: Collaboration with Open Tree of Life project for draft tree
Computable data challenge
http://eol.org/info/data_challenge
Rod Page’s Bionames project
Alexandria Archive Institute
Devries and Thessen using DBpedia Spotlight to extract associations among taxa and add to Linked Open Data cloud
Sloan 2 project: Marine computable data
TraitBank ABI proposal
Research wishes
• Collecting nominations for research ideas where EOL can help:
http://eol.org/info/wishes_for_research
DUE 15 SEPTEMBER
• Will follow with Rubenstein Fellows call for proposals
Thanks to
Our funders
John D. and Catherine T. MacArthur Foundation
Alfred P. Sloan Foundation
Smithsonian Institution
Marine Biological Laboratory
Harvard University
David Rubenstein and other funders and donors
All our content providers and global partners
Volunteer curators and individual contributors via Flickr, Wikimedia, and members of EOL
Summary of EOL page richness
Overall
• 950,000 have content
• 2% are rich
• ~22% have only links to literature
Hot List
• 30% of 75K are rich
• Average richness = ~30
Red Hot List
• 56% of 3K are rich
• Average richness = 43
Long Tail in databases contributing to EOL
[Chart: number of taxa for which content is contributed to EOL (y-axis, 0 to 600,000), by partners in order of # taxa contributed to EOL (x-axis)]
… viewed on log scale
[Chart: the same data with a logarithmic y-axis (1 to 1,000,000)]
Taxon page richness algorithm
a (Breadth) + b (Depth) + c (Diversity)
Breadth: Images, topics of text objects, references, maps, videos, sounds, conservation status
Depth: # words per text object, # words total
Diversity: Sources (partners)
Weights: 60% 30% 10%
Score range: 0 – 100, Threshold 40
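The weighted sum can be sketched as follows. The slide does not say how each component is computed or normalized, so this sketch assumes each of breadth, depth, and diversity has already been scaled to the range 0 to 1. It also assumes the 60/30/10 weights apply to breadth, depth, and diversity in that order, and that a page counts as rich when its score is at or above the threshold of 40. `richness_score` and `is_rich` are hypothetical names.

```python
# Assumed mapping of the slide's 60% / 30% / 10% onto the three components.
WEIGHT_BREADTH = 60
WEIGHT_DEPTH = 30
WEIGHT_DIVERSITY = 10
RICH_THRESHOLD = 40  # from the slide: "Threshold 40"


def richness_score(breadth, depth, diversity):
    """Weighted sum on a 0-100 scale; each input is assumed to be normalized to 0..1."""
    for value in (breadth, depth, diversity):
        if not 0 <= value <= 1:
            raise ValueError("component scores must be between 0 and 1")
    return (
        WEIGHT_BREADTH * breadth
        + WEIGHT_DEPTH * depth
        + WEIGHT_DIVERSITY * diversity
    )


def is_rich(breadth, depth, diversity):
    """Assumption: 'rich' means the score meets or exceeds the threshold."""
    return richness_score(breadth, depth, diversity) >= RICH_THRESHOLD


print(richness_score(0.5, 0.5, 0.5))
print(is_rich(0.5, 0.0, 0.0))
```

Under these assumptions, a page with full breadth but no depth or diversity would score 60, so breadth alone can carry a page past the threshold, while depth and diversity together cannot.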