Managing Ecological and Biodiversity
Data Using Ecoinformatics:
Taiwan Experience
Chau Chin Lin
Taiwan Forestry Research Institute
Persons to Thank First for the Following Presentation
Dr. Bill Chang (US NSF)
Dr. Hen-biau King (TFRI Director 2003-2007)
Ecology: Information of Biocomplexity
Biotic Abiotic Temporal Spatial
Vernacular (FR): Pyrale du maïs
Vernacular (ES): Piral del maíz
Vernacular (DE): Maiszünsler
Diagnosis: Wingspan 26-30 mm;
sexually dimorphic; male:
forewings ochreous to dark
brown; female: forewings pale
yellow; …
Foodplant: Zea mays L. 1753
Species: Ostrinia nubilalis (Hübner, 1796)
Family: Pyralidae
Order: Lepidoptera
Class: Insecta
Genus: Ostrinia Hübner, 1825
Vernacular (EN): European Corn-borer
Family: Gramineae
Taxonomic Names
Collection: DGH Lepidoptera
Record id: DGHEUR_003217
Country: France
Coordinates: 03.047˚E 48.730˚N
Date: 28 June 2003
Collector: Donald Hobern
Individuals: 3
Richness:
Spatial /Temporal
Observations
Biotic Interactions
Locus: AAL35331
Definition: acyl-CoA Z/E11 desaturase
1 mvpyattadg hpekdecfed...
Sequence Data
Average Rainfall
Location: 48.82°N 2.29°E
Jan    Feb    Mar    Apr    ...
182.3  120.6  158.1  204.9  ...
Abiotic
Taxonomic Descriptions
Pheromones of Ostrinia
http://www.nysaes.cornell.edu/fst/faculty/acree/pheronet/phlist/ostrinia.html
Digital Literature
and Web Resources
Synonym: Pyralis nubilalis Hübner, 1796
Biodiversity: Information of Life
All Based on Data
Why Is Data Management Important in
Ecological Research?
http://siliconangle.com/blog/2012/
Data Informs Impacts of Biodiversity Loss on
Ocean Ecosystem Services
Annual
Cumulative
Worm et al., Science 2006
Data Enhances Understanding of the Real World
Understanding this disease requires
knowledge of epidemiology,
genetics, and transmission modes,
along with their ecological contexts.
Integrating ecologically pertinent
data into the chain of information
from the gene to the biosphere will
significantly enhance our
understanding of the natural world.
Whitfield J. 2003 Ape populations decimated by hunting and
Ebola virus. Nature 422:551
However,
Observations/experiments
the real world
Data/Raw data/Dataset
information
Data Collection Is Hard Work
Planning
Problem
Analysis
and
modeling
Traditional Way of Research
Doesn’t Care About Data
Data Collection
Raw Data
(Michener et al. 1997)
Data Entropy Occurs Without Management
Information Content
Time
Time of publication
Specific details
General details
Accident
Retirement or
career change
Death
What Data We Have Collected
Slide from Dr. John Porter
For Example: Forest Dynamics Plot Data
Forest Dynamics Plots in Taiwan
16 Plots Around the Island
For Example:
Biodiversity Data
For Example:
Carbon Flux Towers
How Did We Do?
Planning Problem Definition
(Research Objectives)
Analysis
and
modeling
Collection
Original
Observations
Planning
Selection and
extraction
Secondary
Observations
Used
data
What Techniques Do We Need?
• A framework that enables scientists to
generate new knowledge through
innovative tools and approaches
• For the management, archiving, curation,
discovery, retrieval, integration, analysis,
and visualization of biodiversity and
ecological data
• It is called “Ecoinformatics”
• Ecological Metadata Language, EML
• Morpho – metadata and data management software
• Metacat – distributed data system
– registries: KNB, UCNRS, OBFS, NCEAS, PISCO, LTER
• EcoGrid and Tool Kit – integrating distinct data systems and networks
• Kepler – grid-enabled scientific workflows
<EML>
Search for and Adapt Existing Tools
Assembling Tools As
An Information Management System
EML Driven IMS
Information
Synthesis
Sensor Network
Information Management
QA/QC
EcoGrid
Dealing with Data Flow Change
Slide from US LTER
Interpret a pattern
1,000 x daily
Interpret a
number
10 x daily
Dealing with Data Collecting Change
Dealing with Data Deluge
Providing Good Quality Data
Available Online
Capacity Building and Training
Help from US LTER
International Collaborations
2006
U.S.
LTER
Taiwan TFRI
Malaysia (FRIM)
Kasetsart
University
Thailand
Help Each Other
within EAP!
Apart from software products, there has also been a series of publications in both Asian and Western journals,
including TREE, BioScience, and Ecological Informatics.
Management, Archiving (Creating Metadata)
Metadata?
Standard for Ecology/Biodiversity: EML
EML Modules
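As a rough illustration of what creating metadata involves, the sketch below builds a small EML-style record in Python for the specimen record shown earlier in the slides. The element names follow the general shape of EML (dataset, title, creator, coverage), but the record is simplified and is not validated against the EML schema. The function name build_metadata is hypothetical and exists only for this example.

```python
import xml.etree.ElementTree as ET


def build_metadata(title, given_name, surname, lon, lat, date):
    """Return a simplified, EML-style metadata record as an XML string.

    Illustrative only: not validated against the EML schema.
    """
    root = ET.Element("eml")
    dataset = ET.SubElement(root, "dataset")
    ET.SubElement(dataset, "title").text = title

    # Who collected the data
    creator = ET.SubElement(dataset, "creator")
    name = ET.SubElement(creator, "individualName")
    ET.SubElement(name, "givenName").text = given_name
    ET.SubElement(name, "surName").text = surname

    # Where the data were collected (a single point, so the box collapses)
    coverage = ET.SubElement(dataset, "coverage")
    geo = ET.SubElement(coverage, "geographicCoverage")
    box = ET.SubElement(geo, "boundingCoordinates")
    for tag, value in (
        ("westBoundingCoordinate", lon),
        ("eastBoundingCoordinate", lon),
        ("northBoundingCoordinate", lat),
        ("southBoundingCoordinate", lat),
    ):
        ET.SubElement(box, tag).text = str(value)

    # When the data were collected
    temporal = ET.SubElement(coverage, "temporalCoverage")
    single = ET.SubElement(temporal, "singleDateTime")
    ET.SubElement(single, "calendarDate").text = date

    return ET.tostring(root, encoding="unicode")


record = build_metadata(
    "DGH Lepidoptera", "Donald", "Hobern", 3.047, 48.730, "2003-06-28"
)
print(record)
```

In practice a tool such as Morpho would produce the full, schema-valid document; the point here is only that the who, where, and when of a dataset are captured in a structured, machine-readable form.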
Metadata/Data Depository System
Data Curation Network
Metacat Catalog
Morpho clients
Key
Site metadata system
Web clients
AND
CAP
OBFS
XML output filter
TFRI
ECNU
LNO
SEV SEV?
PISCO
Replication
Harvester
Internet
User-1
User-2 User-3
National Science
Council
Database
Server
Forming A Decentralized National System
Authentication
National GIS
National Park
Agriculture
Forestry
Joining the Data Observation Network for Earth (DataONE)
National Center for Ecological Analysis and Synthesis (NCEAS), U.S.A.; http://www.dataone.org/what-dataone
DataONE is a data repository for sharing and
preserving data.
DataONE gives researchers access to
globally distributed, networked
data from a single point of discovery.
DataONE is a collaboration among many partner
organizations and is funded by the
US NSF.
[Integrating information through knowledge and infrastructure]
Data Integration
Data integration refers to linking research & monitoring
data to the modeling community & vice versa.
Data integration also refers to archiving data from
monitoring, research, & modeling efforts, as well as
making the data easily available for others to access &
use.
http://www.clear.lsu.edu/data_integration/
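One way to picture linking monitoring data to research data is a simple join on shared keys. The Python sketch below is a minimal illustration under stated assumptions: the site names, the integrate function, and the observation counts are hypothetical placeholders, not real measurements; the rainfall figures reuse the sample values from the earlier rainfall slide.

```python
# Monitoring data: monthly rainfall keyed by (site, month).
rainfall = {
    ("site_1", "Jan"): 182.3,
    ("site_1", "Feb"): 120.6,
    ("site_1", "Mar"): 158.1,
}

# Research data: hypothetical observation records (placeholder counts).
observations = [
    {"site": "site_1", "month": "Jan", "individuals": 3},
    {"site": "site_1", "month": "Mar", "individuals": 5},
    {"site": "site_2", "month": "Jan", "individuals": 2},
]


def integrate(observations, rainfall):
    """Attach rainfall to each observation; None where no match exists."""
    merged = []
    for obs in observations:
        key = (obs["site"], obs["month"])
        merged.append({**obs, "rainfall": rainfall.get(key)})
    return merged


integrated = integrate(observations, rainfall)
for row in integrated:
    print(row)
```

Records with no matching rainfall value are kept and marked with None rather than dropped, so gaps between the two sources stay visible to whoever uses the integrated data.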
Return
URL
Data Depository
Metadata
Shared
Data
Registry Data
Site 1
Data
Site 2
Query Data Grid to find
data
Get
Data
Compute grid
Simulation
Model
Algorithm
Service
Broker
(UDDI)
Web
Service
WSDL
Query Service broker to find
services Get Component
Return URL & call functions
Archive output data to
Depository
Archive workflow
Workflow
archive
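The flow in this diagram (find data, find a service, run it, archive the output and the workflow) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries below are in-memory stand-ins for the data registry, the service broker, and the depository, and every name and value in the code is hypothetical. A real system would use tools such as Metacat, EcoGrid, and Kepler instead.

```python
# In-memory stand-ins for the data registry, service broker, and depository.
# All names and values are hypothetical placeholders.
DATA_REGISTRY = {"site_1_counts": [3, 5, 2], "site_2_counts": [4, 1]}
SERVICE_BROKER = {"mean": lambda values: sum(values) / len(values)}
DEPOSITORY = {"outputs": {}, "workflows": []}


def run_workflow(dataset_name, service_name):
    """Find data, find a service, run it, then archive result and workflow."""
    data = DATA_REGISTRY[dataset_name]        # query the data grid and get data
    service = SERVICE_BROKER[service_name]    # query the broker and get a component
    result = service(data)                    # call the function
    # Archive the output data to the depository
    DEPOSITORY["outputs"][(dataset_name, service_name)] = result
    # Archive the workflow itself so the analysis can be repeated
    DEPOSITORY["workflows"].append(
        {"dataset": dataset_name, "service": service_name}
    )
    return result


result = run_workflow("site_1_counts", "mean")
print(result)
```

Archiving the workflow description alongside the output is what makes the analysis repeatable: anyone with access to the depository can see which data and which service produced a given result.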
Toward Automation of Data Processing
Scientific Workflow Approach to Analysis
C RESULTS:
Tables
Maps
Graphs
ASCII
Application: A Case
Luquillo, Puerto Rico
Pasoh, Malaysia
Lienhuachih, Taiwan
Ogawa, Japan
(LDAP)
WebServer
(Apache+PHP)
Data Retrieval
(SQL)
Raw
data
Upload
EML
Document
Metadata
Catalog EcoGrid
Scientific
Workflow
EML + Raw data
Download
Morpho
Metadata
CTFS
Data
Model
Other
Data
Models
Action Items for Individual Ecologists
• Organize, document, and preserve data
for posterity
• Share data
• Collaborate with networks of colleagues to
bring together heterogeneous datasets to
address larger scale questions
• Address data management issues with
students and peers
Data Sharing
• 1. Data Policy
What are fair policies for providing access to data?
• 2. Agreement Specification
What controls, embargoes, usage constraints, or other
limitations are needed to assure fairness of access and use?
• 3. Policy Administration
What data publication models are appropriate?
Kaohsiung, Taiwan 2007
Lesson Learned: Many hands truly do make "light work"!