Date post: | 18-Jan-2016 |
Upload: | shanon-ferguson |
Australia’s Virtual Herbarium:
Medium to long-term benefits
from distributed biodiversity
information systems
Australia’s Virtual Herbarium
• Is an idea
• Is a tool for data access
• Is not the answer
The AVH as a framework
• Will dominate herbarium activity and priorities for the next 5 years
• Data management
• Data exchange
• Curation priorities
• Specimen management
• Loans and exchanges
The AVH as a framework
• Will involve all major Australian herbaria
• Common information standards
• Specimen data exchange
• Common national census
• Division of labour
• New visualization tools
• New analysis tools
• New botanical products and services
The AVH
• a prototype
• not terribly sophisticated technically
• replicated query engine (portal)
• interrogating distributed data providers (URLs)
• implementing common schema through a limited set of access points (gen./sp.)
The AVH
• Illustrates how federated systems might evolve in heterogeneous environments:
• the development and application of community standards
– HISPID, XML
• the adoption of open source solutions – Mapserver, Perl, PHP etc.
• Similar solutions are being used to federate ENHSIN, SpeciesAnalyst, DiGIR, etc.
Collecting specimens
The work of herbaria
Herbarium Specimens
Botanical literature
Specimen Data Capture
Public Reference Herbarium
What is a Virtual Herbarium?
• The physical resources and biological information of a herbarium represented digitally
• On-line access to herbaria and to botanical information managed by herbaria
• Integrated access to botanical information from various sources in a herbarium and other on-line botanical information
What is the AVH?
• A collaborative project of the Australian Herbarium community, providing:
• Partnership and shared access to data
• Real-time access to current working data
• Shared access to common authority files
• A shared development environment
• Opportunity to share data hosting, archiving and off-site backup
• Co-ownership of the final product
Where is the AVH?
• Spread across Australian herbaria
• Data distributed; resides with custodians
• Each herbarium has a portal to receive requests and deliver data
• A common single query AVH interface in each herbarium polls all herbaria
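The polling arrangement described on this slide can be sketched roughly as follows. This is a minimal illustration only, not the AVH implementation: the provider codes, the `make_provider` and `poll_all` helpers and the record fields are all hypothetical, and in-memory functions stand in for the requests a real portal would send to each herbarium's URL.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Record = Dict[str, str]
Provider = Callable[[str, str], List[Record]]


def make_provider(code: str, holdings: List[Record]) -> Provider:
    """Build a stand-in data provider for one herbarium (hypothetical)."""
    def query(genus: str, species: str) -> List[Record]:
        # Genus and species are the only access points, as in the prototype.
        return [
            dict(rec, source=code)
            for rec in holdings
            if rec["genus"] == genus and rec["species"] == species
        ]
    return query


def poll_all(providers: Dict[str, Provider], genus: str, species: str) -> List[Record]:
    """Send the same query to every provider and merge the answers."""
    results: List[Record] = []
    with ThreadPoolExecutor(max_workers=max(1, len(providers))) as pool:
        futures = {code: pool.submit(fn, genus, species) for code, fn in providers.items()}
        for code in sorted(futures):
            try:
                results.extend(futures[code].result(timeout=10))
            except Exception:
                # One unreachable herbarium should not sink the whole query.
                continue
    return results


# Hypothetical holdings; the data stays with each custodian.
providers = {
    "HERB_A": make_provider("HERB_A", [
        {"genus": "Acacia", "species": "aneura", "id": "A1"},
    ]),
    "HERB_B": make_provider("HERB_B", [
        {"genus": "Acacia", "species": "aneura", "id": "B7"},
        {"genus": "Cyathea", "species": "exilis", "id": "B9"},
    ]),
}
hits = poll_all(providers, "Acacia", "aneura")
print([(h["source"], h["id"]) for h in hits])
```

Because every herbarium runs the same query interface, the merge step is identical wherever the query starts; only the list of providers has to be shared.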
Major Australian Herbaria
Who are the participants?
State Herbarium of South Australia
Queensland Herbarium
Australian National Herbarium
Northern Territory Herbarium
Tasmanian Herbarium
Industry Partner: KE Software
National Herbarium of Victoria
National Herbarium of New South Wales
Western Australian Herbarium
Australian Biological Resources Study
Why is there an AVH?
• Pressure on Herbaria to work more efficiently
• Demand for access to larger amounts of data
• Demand to access data more quickly
• Demand to view data in different ways
• Pressure on herbaria to be and appear more responsive to community needs
What is the Problem?
• > 20,000 species of higher plants
• > 64,000 available names
• Extensive synonymy (3 - 4 names per species)
• 8 major government-funded herbaria
• Similar number of university herbaria
• > 6,500,000 specimens in Aust. herbaria
• 50-100 data elements per specimen
• Several KB per specimen (excl. images)
Holdings of Aust. Herbaria
National Herbarium Collection database status
‘Us’
Where is the data?
• In each herbarium (largest 1.3 million specimens)
• Pooling data centrally not acceptable for operational, political and emotional reasons.
• We need a distributed data management and access solution, maintaining and ensuring custodial responsibility
Where is the data?
• Images compound the problem
• Several KB and up for live plant images (possibly 100,000 available)
• Specimen images need high resolution, up to 20 MB or more
• Need to be sub-sampled for web display
• At least 100,000 type specimens
• Ideally all 6.5 million specimens should be done
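To give a feel for the scale implied by these figures, here is a rough back-of-envelope calculation. The specimen counts and the 20 MB image size come from the slides; the 4 KB record size is an assumption standing in for "several KB", and decimal units are used throughout.

```python
KB = 1_000
MB = 1_000_000
GB = 1_000_000_000
TB = 1_000_000_000_000

SPECIMENS = 6_500_000        # from the slides
TYPE_SPECIMENS = 100_000     # "at least", from the slides
RECORD_BYTES = 4 * KB        # assumption: "several KB" taken as 4 KB
IMAGE_BYTES = 20 * MB        # from the slides: up to 20 MB per specimen image

text_total = SPECIMENS * RECORD_BYTES
type_images_total = TYPE_SPECIMENS * IMAGE_BYTES
all_images_total = SPECIMENS * IMAGE_BYTES

print(f"Specimen records: {text_total / GB:.0f} GB")
print(f"Type images:      {type_images_total / TB:.0f} TB")
print(f"All images:       {all_images_total / TB:.0f} TB")
```

Under these assumptions the text records are small next to the images: tens of gigabytes for all label data, against terabytes for the type specimens alone.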
Who runs the AVH?
• The Council of Heads of Australian Herbaria (CHAH).
• The Herbarium Information Systems Committee (HISCOM)
– IT staff at herbaria (technology)
– Botanical staff at herbaria (content)
– Data entry staff at herbaria (content)
– Scientific staff at herbaria (validation)
Aust. & NZ Environment & Conservation Council (ANZECC)
• Government committee of Commonwealth and State/Territory Environment Ministers
• Accepted that the community wanted the product
• Funding options and regional support
• Working group
• AVH Board and Trust
• (management through Environment Australia)
“The Agreement”
• $10 million project over five years
• Capture new data and validate old
• State/Territory to contribute amount relative to specimens to be databased/validated
• $4 million Commonwealth + $4 million State/Territory + $2 million private
• Sharing data critical to cost (cf. $16 million to do each specimen)
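One way to read "contribute amount relative to specimens" is a simple proportional split of the State/Territory pool. The sketch below assumes that reading; the herbarium names and backlog counts are hypothetical placeholders, not figures from the agreement.

```python
COMMONWEALTH = 4_000_000   # from the slide
STATE_POOL = 4_000_000     # from the slide
PRIVATE = 2_000_000        # from the slide

# Hypothetical counts of specimens still to be databased or validated.
backlog = {"Herbarium A": 600_000, "Herbarium B": 300_000, "Herbarium C": 100_000}


def contributions(pool: int, backlog: dict) -> dict:
    """Split the pool in proportion to each herbarium's backlog (assumed rule)."""
    total = sum(backlog.values())
    return {name: pool * count / total for name, count in backlog.items()}


shares = contributions(STATE_POOL, backlog)
for name, amount in shares.items():
    print(f"{name}: ${amount:,.0f}")
print(f"Project total: ${COMMONWEALTH + STATE_POOL + PRIVATE:,}")
```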
How does the AVH work?
• On a number of different levels:
• Politically
• Administratively
• Technically
• Scientifically
• Emotionally
Evolution of the AVH
[Diagram: Race to database; Need for semantic standard recognized; HISPID; Standard syntax; Exchange; Distributed query; Need for common semantic schema recognized; Botanical ontology?]
How does the AVH work?
The technology
• Currently very simple architecture and technology
• Increase in complexity and ‘bulk’ is inevitable
• Cannot avoid engaging computer scientists and the computer industry
• Optimize data storage
• Optimize data access and delivery
• Optimize analysis and visualization
• Optimize knowledge discovery
AVH General Architecture
The pilot: distribution of Acacia aneura, mulga
Acacia aneura: Distribution of specimens from each herbarium
Overlays
Geocode accuracy
Survey data
Example HISPID data export in XML
A Herbarium Database Structure
Who uses the AVH?
• The participating herbaria get access to all the data at the highest precision.
• Custodians retain rights on data release
• General agreement to minimize restriction
• Public access filter restricts access to work in progress, sensitive locality data, etc.
• Password controlled locally
• Simple httpd access control
• No encryption
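The public access filter could work along the lines sketched below. This is an assumption-laden illustration, not the AVH code: the field names (`work_in_progress`, `sensitive_locality`, `lat`, `lon`, `locality`), the sample records and the choice to generalize coordinates to one decimal place are all hypothetical.

```python
from typing import Dict, List, Optional

Record = Dict[str, object]


def public_view(record: Record) -> Optional[Record]:
    """Return the publicly releasable form of a record, or None to withhold it."""
    if record.get("work_in_progress"):
        return None  # work in progress is not released to the public
    released = dict(record)
    if record.get("sensitive_locality"):
        # Generalize the position and hide the locality text (assumed policy).
        released["lat"] = round(float(record["lat"]), 1)
        released["lon"] = round(float(record["lon"]), 1)
        released["locality"] = "withheld"
    return released


def filter_for(records: List[Record], participating_herbarium: bool) -> List[Record]:
    """Participating herbaria see everything; the public sees the filtered view."""
    if participating_herbarium:
        return list(records)
    views = (public_view(rec) for rec in records)
    return [view for view in views if view is not None]


# Hypothetical records.
records = [
    {"id": "X1", "lat": -23.6981, "lon": 133.8807, "locality": "gully 2 km N of camp",
     "sensitive_locality": True},
    {"id": "X2", "lat": -35.2754, "lon": 149.1133, "locality": "roadside"},
    {"id": "X3", "lat": -31.95, "lon": 115.86, "locality": "unsorted",
     "work_in_progress": True},
]
public = filter_for(records, participating_herbarium=False)
print([(r["id"], r["lat"], r["lon"], r["locality"]) for r in public])
```

Keeping the filter at each custodian's portal means the decision about what to release stays with the herbarium that owns the record.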
Who uses the AVH?
• Basic public access available to:
• Conservation agencies, environmental decision makers, etc.
• Research and education
• Public general interest
• Detailed access to large chunks of data
• One stop shop
• Application through project proposal to CHAH
• Applications to individual herbaria discouraged
– Respecting data custodianship
“Greening the Grainbelt” Uses
Uses
ROTAP ferns and fern allies
Insufficiently known
Rare
Vulnerable
Endangered
Presumed extinct
ROTAP ferns and fern allies
Cyathea exilis
Tectaria devexa
Cyathea exilis
Whence the AVH?
• A new era of integrated access to botanical information
• New ways of visualizing data from different sources
• New ways of managing and validating data across remote databases
• More automation, more speed, higher throughput
Added extras - the real AVH
• Stage 1: databasing (dots on maps)
• Plus map overlays, precision flags, spatial queries, pretty interfaces, etc.
• Conflicting taxonomies - towards a National Census – the “Consensus Census”
• Stage 2+: images, descriptions, identification tools
• Multiple resources and options (cf. library)
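The role of a shared census in reconciling conflicting taxonomies can be sketched as a lookup from every available name to one accepted name. The table below is a hypothetical stand-in for a shared authority file; the names are placeholders, not real taxa, and `group_by_accepted` is an illustrative helper rather than part of the AVH.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical census: each available name points to the accepted name.
CENSUS: Dict[str, str] = {
    "Genus alpha": "Genus alpha",
    "Genus beta": "Genus alpha",        # synonym
    "Othergenus alpha": "Genus alpha",  # synonym
    "Genus gamma": "Genus gamma",
}


def group_by_accepted(
    records: List[Dict[str, str]],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group specimen ids under accepted names; collect ids with unknown names."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    unresolved: List[str] = []
    for rec in records:
        accepted = CENSUS.get(rec["name"].strip())
        if accepted is None:
            unresolved.append(rec["id"])
        else:
            grouped[accepted].append(rec["id"])
    return dict(grouped), unresolved


# Records as three herbaria might label the same species (hypothetical).
records = [
    {"id": "A1", "name": "Genus alpha"},
    {"id": "B2", "name": "Genus beta"},
    {"id": "C3", "name": "Othergenus alpha"},
    {"id": "D4", "name": "Genus delta"},
]
grouped, unresolved = group_by_accepted(records)
print(grouped)
print(unresolved)
```

Without an agreed table of this kind, the same species would appear as several separate sets of dots on the map, one per name in use.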
Plus
Botanical illustrations
Plus
High resolution image of type specimen of Austrobaileya downloaded over the Internet from the Herbarium of the New York Botanical Garden
Type Images on demand
But...
Tackling fungal biodiversity
• Problem: 250,000 spp., 5% known, few herbarium collections
• A solution: Fungimap
• Community mapping of 100 common species by 600 volunteers
• Distribution and habitat data leads to better conservation and systematics
BIG But...
Australian eFloras and other digital products
Some challenges
• Identifications patchy
• Inadequate specimens
• Work in progress / Curation lag
• Lack of a national “Consensus Census”
• Interstate differences
• “Problem” families and genera
• > 35% herbarium unsuitable / unusable
• Unidentifiable / qualified identifications
• Vague / imprecise locality data
• Records represent presence only data
CPBR projects benefiting
• Basically anything spatial needing defensible dots or blobs on maps
• Rare plants; Conservation
• Australian flora distributions
• General biogeography; Weed biogeography
• Remnant vegetation; Revegetation
• Phylogeography of Australian plants
• Outreach
• On-line Floras
• Interactive Keys
Why it will work
• Communication - CHAH, few herbaria
• Collaboration - long-standing, data and specimen sharing, overcoming Australia’s Federal/State system
• Champions – government, management, staff, public
• Lobbying and profile of herbaria
• Relevance and utility of product
• And now…we need to maintain commitment to project
Current Developments
• need to join communities into larger “federations”
• ultimately part of GBIF
• distributed generic portals (DiGIR)
• utilizing discovery (UDDI) of published web services
– for specimens, taxonomy, coverages, etc.…
• exchanging complex queries and result sets encapsulated as XML (SOAP/XMLP)
Current Developments
• rely on the existence of an extended community schema
– ABCD, a common subset (Darwin Core) of elements
– simple thesauri
• Incorporation and discovery of ontologies and semantic networks will have to wait a while…
Acknowledgements
State Herbarium of South Australia
Queensland Herbarium
Australian National Herbarium
Northern Territory Herbarium
Tasmanian Herbarium
Industry Partner: KE Software
National Herbarium of Victoria
National Herbarium of New South Wales
Western Australian Herbarium
Australian Biological Resources Study