Monica Omodei RDAP11 Policy-based Data Management

Post on 27-Jan-2015

description

Monica Omodei, Australian National Data Service; Policy-based Data Management; RDAP11 Summit. The 2nd Research Data Access and Preservation (RDAP) Summit, an ASIS&T Summit, March 31-April 1, 2011, Denver, CO. In cooperation with the Coalition for Networked Information. http://asist.org/Conferences/RDAP11/index.html

transcript

Making Connections

Research Data Access and Preservation Summit 2011

Monica Omodei, Senior Research Analyst

monica.omodei@ands.org.au

What is the Australian National Data Service?

Mission: ‘More researchers sharing and re-using data more often’
Term and funding:
  Initially July 2008 - June 2011, with funding from the National Collaborative Research Infrastructure Strategy (NCRIS)
  Additional funding from the Education Investment Fund (EIF) Super Science Initiative (infrastructure only)
  Term extended to June 2013
A collaboration between Monash University, the Australian National University and CSIRO

What we are not

We are not a data repository/archive (in spite of our name)
We do not include publications in our registry (but we do support linking to publications)
We do not duplicate the rich services provided by discipline-specific registries
We are not an ongoing or legal entity

ANDS enables the transformation of:

Purpose of ANDS Services

To support the Australian Research Data Commons
To assist researchers
  discover and re-use data relevant to their activities
  make their data available for re-use
To assist research institutions and agencies manage and publish their research data
To assist government agencies to publish their operational data for use by researchers
To assist cultural and memory institutions publish their collections for use by researchers

Strategy – Internal Projects

Core Infrastructure
  ANDS Registry
  Research Data Australia Discovery Service
  DOI Minting Service
  Vocabulary Service
Policy work
Capability building
Outreach activities

External Projects

120+ projects with 30+ Australian universities – to enhance current systems and feed the ANDS registry
Projects to develop institutional repository/registry solutions for Research Data
Software tools for data integration, synthesis and analysis, to demonstrate how different data sets can be combined to enable new research
With national agencies, to build persistent identification services (name authorities) for researchers, research projects, locations, vocabularies

Interoperability Strategy (schema level)

Develop an information model for an inter-disciplinary registry of data collections (not lossless)
Develop and promote the exchange schema to our constituency
Develop cross-walks for common schemas used for datasets, e.g. ISO 19115, DDI
Have a sandbox for testing and QA of first data feeds
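A cross-walk of the kind listed above can be pictured as a field mapping from a richer source schema into a smaller, shared collection record. The sketch below is illustrative only: the source field names, the target keys, the sample record and the `crosswalk` helper are all hypothetical, not actual ISO 19115, DDI or RIF-CS element names. It also hints at why such a mapping is "not lossless": source fields with no counterpart are dropped.

```python
# Hypothetical mapping from source-schema field names to registry keys.
# These names are invented for illustration, not real ISO 19115 or DDI elements.
FIELD_MAP = {
    "dataset_title": "name",
    "abstract": "description",
    "bounding_box": "spatial_coverage",
}


def crosswalk(source_record, field_map=FIELD_MAP):
    """Return (mapped, dropped): the registry record and the unmapped source keys."""
    mapped = {
        target: source_record[source]
        for source, target in field_map.items()
        if source in source_record
    }
    dropped = sorted(key for key in source_record if key not in field_map)
    return mapped, dropped


# Made-up source record used only to exercise the mapping.
source = {
    "dataset_title": "Reef survey 2010",
    "abstract": "Transect counts of coral cover.",
    "instrument_calibration": "see appendix B",
}
record, lost = crosswalk(source)
print(record)
print(lost)
```

Reporting the dropped keys alongside the mapped record is one way a sandbox could support QA of first data feeds, since contributors can see what did not carry over.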

[Skip]

Plumbing

Information Model - subset of ISO 2146:2010, Registry Services for Libraries and Related Organisations
Schema Implementation – RIF-CS (Registry Interchange Format – Collections and Services)
ANDS Registry – developed in-house
Research Data Australia Portal – discovery application currently being redeveloped (Solr/Lucene)

ISO 2146 is:

A framework for building registry services
A very abstract model
An object-oriented, relational model
Describes not just collections but also the researchers and research activities that surround and are linked to a research data collection – the ‘mesh’
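The ‘mesh’ idea can be sketched as a small graph of typed registry objects joined by named relations. This is a minimal sketch, not the ISO 2146 model or the RIF-CS schema itself: the `RegistryObject` class, the keys, the sample names and the relation labels are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryObject:
    key: str            # unique key within the registry (hypothetical values below)
    kind: str           # "collection", "party" or "activity"
    name: str
    related: list = field(default_factory=list)  # (relation, other_key) pairs


def link(a, relation, b, inverse):
    """Record a relation in both directions so the mesh can be walked either way."""
    a.related.append((relation, b.key))
    b.related.append((inverse, a.key))


def neighbours(registry, key):
    """Return the names of objects directly linked to the object with this key."""
    return sorted(registry[other].name for _, other in registry[key].related)


# Made-up objects: one collection, one researcher, one research activity.
data = RegistryObject("c1", "collection", "Reef survey data")
person = RegistryObject("p1", "party", "A. Researcher")
project = RegistryObject("a1", "activity", "Reef monitoring project")
registry = {obj.key: obj for obj in (data, person, project)}

# Illustrative relation labels only.
link(data, "hasCollector", person, "isCollectorOf")
link(data, "isOutputOf", project, "hasOutput")
link(person, "isParticipantIn", project, "hasParticipant")

print(neighbours(registry, "c1"))
```

Storing each relation in both directions means a reader who starts at a researcher can reach their collections just as easily as the reverse.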

ISO2146

Interoperability Strategy (services)

Develop a harvesting framework which provides data contributors with flexibility and control
OAI-PMH preferred but not required
Open ANDS registry content using standard harvest, query and syndicate protocols and schema
Push our content to Google
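To make the OAI-PMH option concrete, the sketch below builds the request URLs a harvester would issue for the protocol's `ListRecords` verb. The endpoint is a placeholder, and the metadata prefix `rif` is an assumption about how a provider might label RIF-CS records; the `list_records_url` helper is hypothetical. No network call is made.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the provider's real OAI-PMH base URL.
BASE_URL = "https://repository.example.edu/oai"


def list_records_url(base_url, metadata_prefix=None, from_date=None,
                     set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    Per the protocol, resumptionToken is an exclusive argument: when it is
    present, no other argument besides the verb may be sent.
    """
    params = {"verb": "ListRecords"}
    if resumption_token is not None:
        params["resumptionToken"] = resumption_token
    else:
        if metadata_prefix is None:
            raise ValueError("metadataPrefix is required without a resumptionToken")
        params["metadataPrefix"] = metadata_prefix
        if from_date is not None:
            params["from"] = from_date
        if set_spec is not None:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# "rif" is an assumed prefix; the token value is a placeholder.
first_page = list_records_url(BASE_URL, metadata_prefix="rif", from_date="2011-01-01")
next_page = list_records_url(BASE_URL, resumption_token="token-from-previous-response")
print(first_page)
print(next_page)
```

The `from` argument is what gives incremental harvesting: a harvester only asks for records changed since its last visit, which leaves the contributor in control of what is exposed and when.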

Interoperability Strategy – Concepts/Entities

Data Connections Strategy
“To enable data to be more easily connected with other data and with the broader research enterprise”
Establish a set of infrastructure elements to promote use of shared concepts and entities
Exploit these elements in the national discovery portal to group datasets with common concepts/entities – serendipitous discovery

Common concepts and entities

Researchers and Research Organisations
Research Projects and Programs
Spatial Location
Research Disciplines
Scientific and Scholarly Terminology
Datasets and scholarly articles
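Grouping datasets by shared concepts and entities amounts to inverting a dataset-to-identifier index. The sketch below assumes each dataset description already carries identifiers for the entities it refers to; the dataset names, the identifier strings and both helper functions are made up for illustration.

```python
from collections import defaultdict

# Made-up dataset descriptions; each lists identifiers of the shared
# entities it refers to (researcher, place, field of research).
datasets = {
    "Coral cover transects": {"person:p1", "place:reef-a", "field:marine"},
    "Water temperature logs": {"place:reef-a", "field:oceanography"},
    "Fish census": {"person:p1", "field:marine"},
}


def group_by_entity(datasets):
    """Invert dataset -> entities into entity -> sorted dataset names."""
    groups = defaultdict(list)
    for name, entities in datasets.items():
        for entity in entities:
            groups[entity].append(name)
    return {entity: sorted(names) for entity, names in groups.items()}


def related_datasets(datasets, name):
    """Datasets sharing at least one entity with the named dataset."""
    shared = datasets[name]
    return sorted(other for other, entities in datasets.items()
                  if other != name and entities & shared)


print(group_by_entity(datasets)["place:reef-a"])
print(related_datasets(datasets, "Water temperature logs"))
```

The grouping only works if different contributors use the same identifier for the same researcher, place or discipline, which is the reason for establishing shared infrastructure for those identifiers.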

Partners for Infrastructure

Whenever possible, recurrently funded organisations with mandates to ensure sustainability:

Office of Spatial Data Management (location)
Research Funding Bodies (research activity)
National Library of Australia (people and organisations)
Australian Bureau of Statistics (research fields - ANZSRC)
Scholarly Societies (terminologies)

Data Connections

Researchers and Research Groups

Partner is National Library of Australia
Already supports a Name Authority service for Australian National Bibliography
Public identifier for the researcher's public persona – many researchers have NLA identifiers already
Already supports an aggregation service for contributors of information about Australian people and organisations
Brokerage to other id systems (VIAF, ORCID)

What’s hard?

Cleaning data in institutional systems so they use common unique persistent keys for researchers, research groups, research projects
Research Management Systems and HR systems are usually proprietary, and without APIs
Name authority issue often ignored in IR practice
National Name Authority Files have poor coverage of science researchers
Library or admin staff resources in Universities will be required for identity matching
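The identity-matching problem can be illustrated with a deliberately naive sketch: records for the same person sit in several institutional systems under different local ids and name spellings. The system names, ids, people and the normalisation rule below are all invented for illustration. The function only proposes candidates, which is why staff time is still needed to confirm each match.

```python
from collections import defaultdict


def normalise(name):
    """Reduce 'Surname, Given' or 'Given Surname' to a (surname, initial) key."""
    name = name.strip().lower()
    if "," in name:
        surname, given = [part.strip() for part in name.split(",", 1)]
    else:
        parts = name.split()
        surname, given = parts[-1], " ".join(parts[:-1])
    return surname, given[:1]


def candidate_matches(records):
    """Group (system, local_id, name) records whose normalised names collide.

    Collisions are only candidates: a person still has to confirm them.
    """
    buckets = defaultdict(list)
    for system, local_id, name in records:
        buckets[normalise(name)].append((system, local_id))
    return {key: ids for key, ids in buckets.items() if len(ids) > 1}


# Made-up records from three hypothetical institutional systems.
records = [
    ("hr", "E1001", "Smith, Jane"),
    ("grants", "G-77", "Jane Smith"),
    ("repository", "u42", "J. Smith"),
    ("hr", "E2002", "Lee, Alex"),
]
print(candidate_matches(records))
```

A key as coarse as surname plus initial will also group different people who share a name, so a real process would need further evidence (affiliation, field, publications) and a shared persistent identifier once a match is confirmed.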

Research Activity Infrastructure

Partners are major funding bodies – Australian Research Council, National Health and Medical Research Council

Web service information systems for grants
Identifiers, definitive information, Linked Data URIs
Status: Concept design phase… (CERIF, VIVO)
In principle agreement

Location Infrastructure

Partner - Office of Spatial Data Management
First data set is Gazetteer of Australia 2.0, which is the combined set of place names with centroid point location from all States and Territories
Now available freely and without charge

Location Infrastructure

Coming soon:
  WFS-G interface to Gazetteer (Open Geospatial Consortium Gazetteer Profile of Web Feature Service - WFS 1.1 protocol and GML 3.1 binding)
  Best practice web spatial search and browse interface of Gazetteer
Stage 2: boundaries, marine, historical, indigenous, crowd sourcing
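For readers unfamiliar with WFS, a client talks to such an interface with plain HTTP requests. The sketch below builds a WFS 1.1.0 `GetFeature` request in key-value-pair form. The endpoint URL is a placeholder, and the feature type name is an assumption about what a gazetteer service might expose; neither comes from the planned service. No request is sent.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and feature type, for illustration only.
GAZETTEER_URL = "https://gazetteer.example.gov.au/wfs"
FEATURE_TYPE = "iso19112:SI_LocationInstance"


def get_feature_url(base_url, type_name, max_features=10):
    """Build a WFS 1.1.0 GetFeature request using key-value-pair encoding."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,
        "maxFeatures": str(max_features),
    }
    return base_url + "?" + urlencode(params)


url = get_feature_url(GAZETTEER_URL, FEATURE_TYPE, max_features=5)
print(url)
```

A server following the WFS 1.1 protocol with the GML 3.1 binding would answer such a request with a GML feature collection, which a client could then parse for place names and point locations.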

Field of Research Infrastructure

Partner - Australian Bureau of Statistics
Web service information publication of the official Australian and New Zealand Standard Research Classification (ANZSRC)
Identifiers, definitive information, URIs
Potential for other classifiers
In principle agreement

Terminology/Vocabulary Infrastructure

A set of online services to support the creation, management, and publication of human and machine-readable terminologies for use by the Australian research and higher education sector
Promotes the use of standardised terminology in data and dataset descriptions to enable data integration within and across disciplines

ANDS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative

Executive Director: Ross.Wilkinson@ands.org.au

More Information

http://ands.org.au/about-ands.html (About ANDS)
http://ands.org.au/researchers/manage-data.html (Data Management for Researchers)
http://ands.org.au/guides/cc-and-data.html (Creative Commons and Data)
http://services.ands.org.au/ (Research Data Australia)
http://ands.org.au/guides/content-providers-guide.html (Content Providers Guide)
http://ands.org.au/resource/techdocs.html (Technical Resources)

More Information

http://ands.org.au/dataconnections.pdf (ANDS guide will be published soon)
http://ands.org.au/guides/ardc-party-infrastructure-awareness.html (ARDC Party Infrastructure Awareness Guide)
http://ands.org.au/resource/topics/ardc-location-infrastructure.pdf (ARDC Location Infrastructure)
http://ands.org.au/guides/ardc-activity-infrastructure-awareness.html (ARDC Activity Infrastructure Awareness Guide)