Cynthia Parr
US Department of Agriculture, National Agricultural Library
21 October 2015
Ag Data Commons: Adding value to open agricultural research data
Credit: Phenocam USDA-ARS Hawbecker Farm, PA
Federal directives: Public access to open, machine-readable data
The challenge of agricultural data
• Broad subject areas
• Journals not integrated with repositories like Dryad
• Too many existing databases & web distribution points
• Lack of infrastructure for long-tail data
• Lack of a neutral, sustainable solution for long-term multi-institutional projects
• Supports Public Access mandates
• Holds agricultural research data
• Primary audience: researchers
• Holds metadata for data held elsewhere
• Starting with USDA data but will broaden
• Both human and machine access
• Can include unpublished data that is ready for release
Ag Data Commons Prototyping FY 2015
A proposed solution
[Diagram: Ag Data Commons architecture, showing ingestion and dissemination. Components: Dataset Submission; Grant management systems; PubAg; Distributed repositories; Forest Service Geospatial; Ag Data Commons Repository; Ag Data Commons Catalog; Organization & Curation; Thesaurus & Indexing; Search & Knowledge Discovery; Analytics & Tools; Data.gov. Legend: Building, Adapting, Existing.]
Adding value
Metadata + data package
DOI
Links
Thesaurus tags
Idiosyncratic data dictionary
Search, services, compliance checking
DKAN http://nucivic.com/dkan/
PRO
• Open source community
• Drupal modules for basic CMS functions
• Integrated CKAN catalog
• Feeds Data.gov
• Basic metadata already supported
CON
• Not designed for scientific data or scientists
• No links to literature
• No Digital Object Identifiers
• Doesn’t handle dataset relationships
• Metadata inadequate for compliance checking & re-use
• Lacks preservation
Metadata Standards
Core Metadata Schema
POD 1.1 (Project Open Data) https://project-open-data.cio.gov/
Related Scientific Metadata & Data Standards (e.g.)
ISO 19115 (GIS Data, FGDC) https://www.iso.org
EML (Ecological Metadata Language) https://knb.ecoinformatics.org/#tools/eml
MiXS GSC (Genomic Standards Consortium) http://gensc.org/projects/mixs-gsc-project/
Darwin Core (Biodiversity standards) http://rs.tdwg.org/dwc/
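As a rough illustration of what compliance checking against the core schema could look like, the sketch below tests a dataset record for the dataset-level fields that POD 1.1 marks as always required. The function name `check_pod_record` and the example record are hypothetical and invented for illustration; this is not the Ag Data Commons implementation, and it ignores the "required if applicable" fields and the federal-only `bureauCode` and `programCode`.

```python
from __future__ import annotations

# Dataset-level fields that POD 1.1 marks as always required.
# (Federal agencies must additionally supply bureauCode and programCode.)
POD_REQUIRED = (
    "title", "description", "keyword", "modified",
    "publisher", "contactPoint", "identifier", "accessLevel",
)
# Allowed values for accessLevel in POD 1.1.
ACCESS_LEVELS = {"public", "restricted public", "non-public"}


def check_pod_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = [
        f"missing required field: {name}"
        for name in POD_REQUIRED
        if not record.get(name)
    ]
    level = record.get("accessLevel")
    if level and level not in ACCESS_LEVELS:
        problems.append(f"invalid accessLevel: {level!r}")
    return problems


# Hypothetical record, invented for illustration only.
example = {
    "title": "Example soil moisture observations",
    "description": "Illustrative record; not a real dataset.",
    "keyword": ["soil moisture"],
    "modified": "2015-10-21",
    "publisher": {"name": "Example Agency"},
    "contactPoint": {"fn": "Example Curator",
                     "hasEmail": "mailto:curator@example.org"},
    "identifier": "example-0001",
    "accessLevel": "public",
}

print(check_pod_record(example))
```

A fuller checker would validate against the published POD JSON Schema rather than a hand-written field list; the short list here only keeps the idea visible.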
Controlled Vocabularies
• NALT – National Agricultural Library Thesaurus http://agclass.nal.usda.gov
• GACS – Global Agricultural Concept Scheme
• Biological Taxonomy
• Gene Ontology (GO) http://geneontology.org/
• Environment Ontology (ENVO), etc.
Relevant for Agriculture
• Help create a semantic web
• SKOS (Simple Knowledge Organization System): W3C recommendation, or RDF
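To make the SKOS point concrete, here is a minimal sketch that writes one thesaurus concept as Turtle using standard SKOS properties (`skos:prefLabel`, `skos:broader`, `skos:exactMatch`). The helper `concept_to_turtle` and the `example.org` URIs are assumptions made for illustration; they are not actual NALT or GACS identifiers.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def concept_to_turtle(uri, pref_label, broader=(), exact_match=(), lang="en"):
    """Serialize a single SKOS concept as a Turtle snippet."""
    # Escape backslashes and double quotes so the label is a valid literal.
    label = pref_label.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        f"<{uri}> a skos:Concept ;",
        f'    skos:prefLabel "{label}"@{lang}',
    ]
    # Hierarchical links to broader concepts in the same scheme.
    for target in broader:
        lines[-1] += " ;"
        lines.append(f"    skos:broader <{target}>")
    # Mapping links to equivalent concepts in other schemes.
    for target in exact_match:
        lines[-1] += " ;"
        lines.append(f"    skos:exactMatch <{target}>")
    lines[-1] += " ."
    return f"@prefix skos: <{SKOS_NS}> .\n\n" + "\n".join(lines)


# Hypothetical URIs, for illustration only.
print(concept_to_turtle(
    "http://example.org/thesaurus/maize",
    "maize",
    broader=["http://example.org/thesaurus/cereals"],
))
```

Mapping properties such as `skos:exactMatch` are what allow concepts from separate vocabularies to be linked to one another.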
Credit: AIMS--FAO
https://data.nal.usda.gov/
Launched this week
Adding even more value
Structured methods metadata
Shared data dictionary
Semantic data dictionary
Adding even more value
Assist application launch
Find related data
Integrate/link related data
= help build the knowledge graph
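One simple way to surface related data is to compare the controlled-vocabulary tags attached to each dataset. The slides do not say how relatedness would be computed; the sketch below uses Jaccard similarity over tag sets purely as an assumed stand-in, and the dataset identifiers, tags, and `min_score` threshold are all hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_datasets(target_id, tags_by_dataset, min_score=0.2):
    """Rank other datasets by tag overlap with the target dataset."""
    target = tags_by_dataset[target_id]
    scored = [
        (other, jaccard(target, tags))
        for other, tags in tags_by_dataset.items()
        if other != target_id
    ]
    kept = [pair for pair in scored if pair[1] >= min_score]
    # Highest score first; ties broken by identifier for stable output.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical catalog entries, invented for illustration only.
catalog = {
    "ds-a": {"maize", "soil water", "phenology"},
    "ds-b": {"maize", "phenology", "yield"},
    "ds-c": {"forest inventory"},
}

print(related_datasets("ds-a", catalog))
```

Each pair that clears the threshold can be treated as a candidate link between two datasets, which is the kind of edge a knowledge graph is built from.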
ISO 16363
Trusted repository requirements
with Adam Kriesberg and Ricky Punzalan, University of Maryland
Acknowledgements
Susan McCarthy, NAL – KSD
Ursula Pieper, NAL – ISD
Qing Qu, NAL – KSD contractor
Jeff Campbell, NAL – KSD
Jaylen Nathwani, NAL – student intern
NüCivic, Angry Cactus Team
Jocelyn McNamara, NAL – KSD contractor
Kerry Huller, UMD graduate fellow
Erin Antognoli, UMD graduate fellow
Adam Kriesberg, UMD postdoctoral fellow