universal Biological indexer and organizer 1
New Dimensions in Managing Biological Information @ the MBLWHOI LIBRARY
David Remsen
June 27, 2006
All accumulated information of a species is tied to a scientific name, a name that serves as a link between what has been learned in the past and what we today add to the body of knowledge.
- Grimaldi & Engel, 2005, Evolution of the Insects
a name that serves as a link between what has been learned in the past
From T.E. Glover, The Fishes of Southwestern Japan, c.1870
Universal Biological Indexer and Organizer
Research Funded by the Andrew W. Mellon Foundation
MBL / WHOI LIBRARY
…and what we today add to the body of knowledge.
The challenge of names as keywords
Finding this…
Type keyword…
With this…
Names – the only universal metadata for Biology
Names offer a logical way to search for and index content
• Names annotate data objects
• All names annotate all data objects
• A compilation of all names ever used is the foundation of a universal index for biology
• or for a semantic web for biology
The Taxonomic Names Problem in Biology
• Many names refer to one concept
– Vernacular concept
– Lexical or Nominal synonym
– Nomenclatural synonym
– Taxonomic Synonym
• Single name refers to many concepts
– Homonyms
– Taxonomic concepts
– Vernacular concepts
– Taxonomic Groups/Classifications
Many to One: Vernacular Concepts
• Equivalence implicit through co-occurrence
Many to One: Lexical Synonyms
Many to One: Nomenclatural Synonyms
Retention of lexical & nomenclatural variation
Loligo pealeii
Loligo pealii
Loligo pealei
Doryteuthis pealei
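Retaining lexical and nomenclatural variation can be pictured as an index in which every recorded name string is kept and points at a shared group, rather than being overwritten by one "correct" spelling. The sketch below is a minimal illustration in Python; the function names and the group identifier `1` are assumptions made for this example and do not describe NameBank's actual schema.

```python
from collections import defaultdict

# Name strings copied from the slide; the group id 1 is a made-up identifier.
RECORDED_NAMES = [
    ("Loligo pealeii", 1),
    ("Loligo pealii", 1),
    ("Loligo pealei", 1),
    ("Doryteuthis pealei", 1),
]

def build_index(records):
    """Return (lowercased name -> group id, group id -> every name string)."""
    name_to_group = {}
    group_to_names = defaultdict(list)
    for name, group_id in records:
        name_to_group[name.lower()] = group_id
        group_to_names[group_id].append(name)  # nothing is discarded
    return name_to_group, dict(group_to_names)

def variants_of(name, name_to_group, group_to_names):
    """List every recorded string that shares a group with `name`."""
    group_id = name_to_group.get(name.lower())
    if group_id is None:
        return []
    return group_to_names[group_id]
```

Looking up any one spelling then returns all four strings, so content indexed under an older or misspelled name stays reachable.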
One to Many: Homonyms
Peranema – the fern
Peranema – the euglenid
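A homonym means a name lookup cannot return a single answer without extra context. The following sketch shows one way to model that: a name maps to a list of candidate concepts, and an optional context narrows the list. The concept identifiers `"A"` and `"B"` and the `group` labels are hypothetical and chosen only for illustration.

```python
# One name string, two unrelated organisms (labels taken from the slide).
HOMONYMS = {
    "Peranema": [
        {"concept_id": "A", "group": "fern"},
        {"concept_id": "B", "group": "euglenid"},
    ],
}

def resolve(name, context_group=None, table=HOMONYMS):
    """Return all candidate concepts, optionally filtered by a context label."""
    candidates = table.get(name, [])
    if context_group is None:
        return candidates  # the caller must deal with the ambiguity
    return [c for c in candidates if c["group"] == context_group]
```

Without context, `resolve("Peranema")` returns both candidates; with `context_group="fern"` it returns only the first.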
Taxonomic Concept
Name IR impediments in current systems: NLM, JSTOR
Libraries
Publishers
Museums
Federal Agencies
Name IR impediments in current systems: OBIS
One organism
4 scientific names
4 maps
We want one map
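The "four names, four maps" problem can be sketched as a grouping step: if occurrence records are keyed by the raw name string, each synonym gets its own map, whereas keying them by a shared identifier yields one. In the sketch below the four names are borrowed from the earlier squid slide purely as an illustration, the key `"squid-1"` is hypothetical, and the coordinates are invented placeholders, not OBIS data.

```python
from collections import defaultdict

# Hypothetical synonym table: every name string points at one shared key.
SYNONYM_KEY = {
    "Loligo pealeii": "squid-1",
    "Loligo pealii": "squid-1",
    "Loligo pealei": "squid-1",
    "Doryteuthis pealei": "squid-1",
}

# Placeholder occurrence records (name, latitude, longitude); values are invented.
OCCURRENCES = [
    ("Loligo pealeii", 41.5, -70.7),
    ("Loligo pealii", 40.1, -73.2),
    ("Loligo pealei", 38.9, -74.6),
    ("Doryteuthis pealei", 35.2, -75.4),
]

def points_by_organism(occurrences, synonym_key):
    """Group points by shared key so that synonyms land on a single map."""
    merged = defaultdict(list)
    for name, lat, lon in occurrences:
        key = synonym_key.get(name, name)  # unknown names keep their own map
        merged[key].append((lat, lon))
    return dict(merged)
```

With the table applied, the four records collapse into one point set; without it, they would produce four.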
Division of Concepts
• Basis for Relationships: Facts
– Vernacular concept
– Lexical or Nominal synonym
– Nomenclatural synonym
– Homonyms
• Basis for Relationships: Opinion
– Taxonomic Synonym
– Vernacular concepts
– Taxonomic Groups/Classifications
Primary Components of uBio
Indexes to content
• Lexical Synonyms
• Nomenclatural Synonyms
• Vernacular Names
Indexes to taxonomic views
• Taxonomic Hierarchies
• Taxonomic Synonyms
NameBank: An index of names and sources
ClassificationBank: An index of taxon concepts
Fitting In
Fitting In: A datacentric perspective
Network Service: Attribution
• Every datum sent out via service is logged
– nameBankID
– datestamp
– Client IP
– Calling method
– requestorIP
• <client optional>
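The logged fields listed above can be pictured as one record per datum served. This is a minimal sketch, assuming a simple in-memory list as the log; the class name `ServiceLogEntry`, the function `log_datum`, the field types, and the reading of "client optional" as an optional client label are all assumptions for illustration and not the service's actual implementation.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional

@dataclass(frozen=True)
class ServiceLogEntry:
    name_bank_id: int              # nameBankID
    datestamp: datetime            # when the datum was sent
    client_ip: str                 # Client IP
    calling_method: str            # Calling method
    requestor_ip: str              # requestorIP
    client: Optional[str] = None   # assumed meaning of "client optional"

def log_datum(log, name_bank_id, client_ip, calling_method,
              requestor_ip, client=None):
    """Append one attribution record to `log` and return it."""
    entry = ServiceLogEntry(
        name_bank_id=name_bank_id,
        datestamp=datetime.now(timezone.utc),
        client_ip=client_ip,
        calling_method=calling_method,
        requestor_ip=requestor_ip,
        client=client,
    )
    log.append(entry)
    return entry
```

Keeping one immutable record per datum makes it possible to report later which names were served to whom and by which method.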
Tools and Applications: FindIT
• Is trainable
• Locates names & authorities
• Finds names it doesn’t know
• Finds names mangled by OCR
Tools and Applications: LinkIT
Applications
Taxonomic intelligence applied to search
Synonymies expand the scope of queries
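Query expansion through synonymy can be sketched as rewriting a single search term into a disjunction of every name in its synonym group. The example below is illustrative only: the group is taken from the squid names shown earlier in the deck, and the quoted `OR` syntax is a generic assumption rather than the query language of any particular search service.

```python
# One synonym group, using name strings that appear earlier in the deck.
SYNONYM_GROUPS = [
    ["Loligo pealeii", "Loligo pealii", "Loligo pealei", "Doryteuthis pealei"],
]

def expand_query(term, groups=SYNONYM_GROUPS):
    """Rewrite `term` as an OR query over all names in its synonym group."""
    wanted = term.strip().lower()
    for group in groups:
        if wanted in {name.lower() for name in group}:
            return " OR ".join(f'"{name}"' for name in group)
    return f'"{term.strip()}"'  # no synonyms known: search the term as given
```

A user who types only one of the names would then retrieve documents that use any of the four.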
uBioRSS: Embedding taxonomies into literature retrieval
Embedding uBio into remote services: uBioRSS
Taxonomic hierarchies enhance data browsing
• Birds of the Belgian Congo
• 4500 pages
• One page has a species of dipteran
• How would someone interested find it?
• 50,000+ Diptera species to choose from
Both enhancements apply to all name-annotated content
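The dipteran example can be sketched as a walk up a classification: a reader asks for "Diptera", and any page annotated with a name that sits below Diptera in the hierarchy is returned, without the reader having to know the species. The hierarchy fragment below is deliberately simplified (intermediate ranks are omitted), and the species and page numbers are placeholders, not taken from the book.

```python
# Simplified child -> parent links; intermediate ranks are left out.
PARENT = {
    "Musca domestica": "Muscidae",
    "Muscidae": "Diptera",
    "Diptera": "Insecta",
    "Struthio camelus": "Struthionidae",
    "Struthionidae": "Aves",
}

# Placeholder page annotations: page number -> names found on that page.
PAGE_NAMES = {
    12: ["Struthio camelus"],
    2310: ["Musca domestica"],
}

def is_within(name, higher_taxon, parent=PARENT):
    """True if `name` equals `higher_taxon` or descends from it."""
    current = name
    while current is not None:
        if current == higher_taxon:
            return True
        current = parent.get(current)
    return False

def pages_for_taxon(higher_taxon, page_names=PAGE_NAMES):
    """Pages carrying at least one name that falls under `higher_taxon`."""
    return sorted(
        page for page, names in page_names.items()
        if any(is_within(n, higher_taxon) for n in names)
    )
```

Browsing by "Diptera" then surfaces the single relevant page among thousands of bird pages.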
uBio Portal: Building communities, enabling connections
Elements of the Portal
Indexing power from NameBank
Alternative names
Vernacular names
Expert view
More or less specific
Suggestions & corrections
Indexing power from NameBank
Results from an array of resources
Additional information from specific projects
content certified linkouts to authoritative resources
XML source
Additional information from specific projects
Text source
Additional information from specific projects
Specimen distribution data from remote sources
• data from various sources may be merged
• red dots on the map link back to the website that provided the geographical co-ordinates
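Merging distribution data while keeping each point linked to its origin can be sketched as tagging every coordinate with the URL of the source that supplied it. The source dictionaries, the `example.org` URLs, and the coordinates below are placeholders invented for this illustration.

```python
# Placeholder sources; URLs and coordinates are invented for illustration.
SOURCE_A = {"url": "https://example.org/source-a", "points": [(41.5, -70.7)]}
SOURCE_B = {"url": "https://example.org/source-b",
            "points": [(40.1, -73.2), (38.9, -74.6)]}

def merge_sources(*sources):
    """Combine points from all sources, keeping a link back to each provider."""
    merged = []
    for source in sources:
        for lat, lon in source["points"]:
            merged.append({"lat": lat, "lon": lon, "link": source["url"]})
    return merged
```

Each merged record carries the information a map would need to draw one dot and point it back to the providing website.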