RDAP14: DataNet Federation Consortium Update

Posted on 11-May-2015

description

Research Data Access and Preservation Summit 2014, San Diego, CA, March 26-28, 2014. Mary Whitton, Project Manager, DataNet Federation Consortium

transcript

National Science Foundation Cooperative Agreement: OCI-0940841

Update, March 2014

Mary Whitton, Project Manager

Reagan Moore, PI

Who is DFC?

What does DFC do?

• Federate to enable collaboration
– Federation of iRODS-based data grids
– Interoperability for federation with other systems

• Enable reproducible science
– Workflows as first-class data objects; provenance

• Build on policy-based data system (iRODS)
– Best practices in curation, archiving
– Automated data grid administration functions

Diagram labels: Production Ready; Data Producers; Data Users; Curators, Archivists; Data Center Managers; Sustainability; Interoperability

SW Quality & User Community

Version 4.0 release March 31

Sustainability

iRODS User Meeting June 18-19, 2014 Boston

Federating across systems

Interoperability

A DataONE member node looks like another iRODS grid to an iRODS user

Capability has been demonstrated

Diagram labels: DataONE Member Node; iRODS Federated Grid; iRODS Data Grids; Interface (via APIs) to DataONE; Cloud Storage

Diagram labels: DataONE Member Node; iRODS Grid; DataONE Coordinating Node; Interface (via APIs) to DataONE

iRODS grid looks like a DataONE member node to a DataONE user

Work is underway

Diagram labels: iRODS Grid; DataVerse Network; DataVerse; DataVerse

Work is underway

What our users want

• Data discovery
• Data access from a workflow
• Data manipulation (parsing of a data format)
• Data transformation
– Converting to a new coordinate system
– Creating new physical variables by combining other variables
– Converting to new physical units
• Data subsetting (extracting a sub-region)
• Data registration (GIS co-registration)
• Data visualization
• Creation of derived data products

Data Users
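Three of the requests above (unit conversion, deriving a new variable from existing ones, and sub-region extraction) can be illustrated with a minimal Python sketch. Everything here is hypothetical: the record layout, the sample values, the bounding box, and the function names are assumptions for illustration, not DFC or iRODS tooling.

```python
import math

# Hypothetical records: (latitude, longitude, temperature in kelvin,
# eastward wind u in m/s, northward wind v in m/s). Values are made up.
records = [
    (32.7, -117.2, 293.15, 3.0, 4.0),
    (35.9, -79.0, 288.15, 6.0, 8.0),
    (42.4, -71.1, 280.15, 0.0, 5.0),
]


def kelvin_to_celsius(t_k):
    """Convert to new physical units."""
    return t_k - 273.15


def wind_speed(u, v):
    """Create a new physical variable by combining two existing ones."""
    return math.hypot(u, v)


def in_box(lat, lon, south, north, west, east):
    """Return True when the point lies inside the bounding box."""
    return south <= lat <= north and west <= lon <= east


def transform(rows, box):
    """Subset rows to a sub-region, then convert units and derive wind speed."""
    out = []
    for lat, lon, t_k, u, v in rows:
        if in_box(lat, lon, *box):
            out.append({
                "lat": lat,
                "lon": lon,
                "temp_c": round(kelvin_to_celsius(t_k), 2),
                "wind_speed": wind_speed(u, v),
            })
    return out


# Assumed bounding box: (south, north, west, east) in degrees.
subset = transform(records, box=(30.0, 40.0, -120.0, -75.0))
for row in subset:
    print(row)
```

In a workflow setting, each of these functions would typically be a separate step so that the provenance of a derived product records which transformations were applied.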

Current work: client side tools

• Ingest: MediaWiki, iDropWeb
– Metadata templating, bulk uploads
– Database and indexing: plug-in in V. 4.0

• Access control
– Access for user-defined "group" (my team)

• Integrated access to analysis tools

• Interfaces: Jargon, message-passing interface framework

Data Producers & Users

Standards and Policies

Curators, Archivists

• Community practices and policies
– Unwritten, non-existent

• Developing international standards

• Implementation in iRODS server

• Future: tools to make writing rules easier

Repository management tools

• Best practices embodied in iRODS rules and policies
– Trustworthy repository

• Automatic execution
– Copy, backup, checksum
– Triggers: time, event

• Tools for grid administrators

Data Center Managers
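The idea of automatically executed policies (copy, backup, checksum, fired by an event or on a schedule) can be sketched in plain Python. This is not the iRODS rule language or any iRODS API. The `PolicyEngine` class, the event names `ingest` and `nightly`, and the in-memory catalog are all hypothetical, and the time trigger is fired by hand here where a real deployment would use a scheduler.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


class PolicyEngine:
    """Hypothetical dispatcher mapping trigger names to lists of actions."""

    def __init__(self):
        self._rules = {}

    def on(self, event, action):
        self._rules.setdefault(event, []).append(action)

    def fire(self, event, path):
        return [action(path) for action in self._rules.get(event, [])]


def make_replicate_action(backup_dir, catalog):
    """Event-triggered policy: checksum, copy to backup, verify the copy."""
    def replicate(path):
        path = Path(path)
        checksum = sha256_of(path)
        copy = Path(backup_dir) / path.name
        shutil.copy2(path, copy)
        if sha256_of(copy) != checksum:
            raise IOError(f"backup of {path.name} failed verification")
        catalog[path.name] = checksum
        return checksum
    return replicate


def make_verify_action(backup_dir, catalog):
    """Time-triggered policy: re-check the backup against the catalog."""
    def verify(path):
        name = Path(path).name
        return sha256_of(Path(backup_dir) / name) == catalog[name]
    return verify


# Demonstration in a temporary directory with a made-up file.
workdir = Path(tempfile.mkdtemp())
backup_dir = workdir / "backup"
backup_dir.mkdir()
source = workdir / "sample.dat"
source.write_bytes(b"example payload")

catalog = {}
engine = PolicyEngine()
engine.on("ingest", make_replicate_action(backup_dir, catalog))
engine.on("nightly", make_verify_action(backup_dir, catalog))

ingest_results = engine.fire("ingest", source)    # event trigger
nightly_results = engine.fire("nightly", source)  # stands in for a time trigger
print(ingest_results, nightly_results)
```

Keeping the policies as data registered against triggers, as opposed to hard-coding them into the ingest path, is what lets an administrator change curation behavior without changing client tools.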

www.datafed.org

www.irods.org