Date post: | 13-Jan-2015 |
Category: | Education |
Upload: | andrew-treloar |
Data Management: International challenges, National Infrastructure, and Institutional Responses - an Australian Perspective
Dr Andrew Treloar, Director of Technology
Australian National Data Service
INTERNATIONAL CHALLENGES
Inconvenient data
DOI: 10.1098/rsta.2005.1569
Imprisoned data
DOI: 10.1098/rsta.2006.1793
Invisible data
DOI: 10.1098/rsta.2006.1793
Inaccessible data
Incomprehensible data
ands.org.au
Survey ID   Ind. Cat.(O)   T-PC   F-Views   A-Convenience
12345       O              Y      a         sa

Date       Depth (m)   Temperature (Celsius)   Salinity (ppt)   Sigma-T (kg m^-3)
30/10/80   10          -1.875                  34.555           27.841

Date       Depth   Temperature   Salinity   Density
30/10/80   10      -1.875        34.555     27.841
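The difference between the labelled and unlabelled tables above can be sketched in code. This is a minimal illustration, not an ANDS tool: the `Column` class, the `describe` helper, and the wording of each description are assumptions introduced here, while the column names, units, and values come from the sample row. The point is only that a value becomes interpretable once its unit travels with it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One column of a data table, with enough context to interpret it."""
    name: str
    unit: str         # unit of measurement, or a format for non-numeric values
    description: str  # plain-language meaning (wording here is illustrative)


# Hypothetical column descriptions for the sample row shown above.
# Names and units follow the labelled table; descriptions are assumptions.
COLUMNS = [
    Column("date", "DD/MM/YY", "Date of the observation"),
    Column("depth", "m", "Depth at which the sample was taken"),
    Column("temperature", "degrees Celsius", "Water temperature"),
    Column("salinity", "ppt", "Salinity in parts per thousand"),
    Column("sigma_t", "kg m^-3", "Sigma-T value"),
]

# The raw values exactly as they appear in the tables.
ROW = ["30/10/80", "10", "-1.875", "34.555", "27.841"]


def describe(row, columns):
    """Pair each raw value with its column name and unit."""
    if len(row) != len(columns):
        raise ValueError("row and column descriptions differ in length")
    return [f"{col.name}: {value} [{col.unit}]" for value, col in zip(row, columns)]


for line in describe(ROW, COLUMNS):
    print(line)
```

Without the `COLUMNS` list, the row is just five strings, which is the situation the unlabelled table leaves a later reader in.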
Summary
Not a first-class object
Unmanaged
Disconnected
Unfindable
Unreusable
Why re-use data?
Efficiency
Validation
Integrity
Value for money
Self-interest
Astronomy case study
Hubble Space Telescope (HST), operating since 1990
Observations are proposed and, if accepted, data is collected and made available to the proposers, who then write a research paper
Each year around 1,000 proposals are reviewed and approximately 200 are selected, for a total of 20,000 individual observations
Data is stored at the Space Telescope Science Institute and made available after an embargo period
More research papers are now written from “second use” of the research data than from the use initially proposed
Source: http://archive.stsci.edu/hst/bibliography/pubstat.html
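As a rough worked calculation from the HST figures above, the selection rate and the number of observations per selected proposal can be derived directly. The function names are introduced here for illustration, and the slide does not say whether the 20,000 observations are per year or cumulative, so the second ratio should be read as indicative only.

```python
# Figures quoted on the slide. Whether the 20,000 observations are per year
# or cumulative is not stated, so the per-proposal ratio is indicative only.
PROPOSALS_REVIEWED = 1_000
PROPOSALS_SELECTED = 200
TOTAL_OBSERVATIONS = 20_000


def acceptance_rate(selected, reviewed):
    """Fraction of reviewed proposals that are selected."""
    return selected / reviewed


def observations_per_proposal(observations, selected):
    """Average number of individual observations per selected proposal."""
    return observations / selected


rate = acceptance_rate(PROPOSALS_SELECTED, PROPOSALS_REVIEWED)
per_proposal = observations_per_proposal(TOTAL_OBSERVATIONS, PROPOSALS_SELECTED)

print(f"Acceptance rate: {rate:.0%}")
print(f"Observations per selected proposal: {per_proposal:.0f}")
```

On these numbers about one proposal in five is selected, which helps explain why archived data attracts so much later use by researchers who never held the original observing time.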
Cancer microarray trial case study
Piwowar et al., “Sharing Detailed Research Data Is Associated with Increased Citation Rate”
http://www.plosone.org/article/info:doi/10.1371/journal.pone.0000308
Looked at the citation history of cancer microarray clinical trial publications
Found that publicly available data was associated with a 69% increase in citations, independent of journal impact factor, date of publication, and author country of origin
Alzheimer’s Disease Neuroimaging Initiative
Collaborative effort to find brain biomarkers for Alzheimer’s disease
Key: all brain scans and other data are freely available to the scientific community without embargo
Over 3K full downloads and 1M scan downloads by over 400 investigators worldwide
Over 100 publications
http://www.fnih.org/work/areas/chronic-disease/adni
Institut Douglas CC BY-NC-ND
NATIONAL INFRASTRUCTURE
National approaches
A number of different countries: UK, US, DE, NL
Different environments => different ecosystems, and so some local tradeoffs
But some common themes are emerging:
Do the things that only you can do
Be the ‘voice for data’
Prime the pump
Australian National Data Service
An initiative of the Australian Government being conducted as part of the National Collaborative Research Infrastructure Strategy ($A24M) and the Super Science Initiative ($A48M)
A collaboration between Monash University, the Australian National University and CSIRO
Nearly 50 staff, funded to mid 2013
More researchers re-using more data more often
Data as a first-class object
ANDS is enabling the transformation of:
Data that are: Unmanaged, Disconnected, Invisible, Single use
into
Collections that are: Managed, Connected, Findable, Reusable
so that Australian researchers can easily discover, access and re-use data
Defining characteristics of ANDS
Building national services
Engaging with institutions, not researchers (mostly)
Working within funding constraints: use, not amount!
Building the Australian Research Data Commons
ANDS Programs
Frameworks and Capability
Seeding the Commons
Data Capture
Metadata Stores
ARDC Core
Public Sector Data
Applications
Spending profile
INSTITUTIONAL RESPONSES
Driven by the Australian Code for the Responsible Conduct of Research
Equivalent of UKRIO’s Code of Practice for Research: Promoting good practice and preventing misconduct
Takes significant time to get accepted
ANDS providing models of good practice
Seeding the Commons; Unmanaged -> Managed (U->M)
Data management policy and planning
Retrospective data description
Different selection mechanisms
Seeding the Commons; Unmanaged -> Managed (U->M)
Fixing the past
Improving internal CRIS systems
Better integration
Moving beyond publications
Better links to data collection descriptions
Seeding the Commons, Metadata Stores; Disconnected -> Connected (D->C)
Facilitating easier/better capture of data and metadata from selected ‘instruments’
Making the right thing easier
Improving quality of metadata
Data Capture; Unmanaged -> Managed (U->M), Single use -> Reusable (S->R)
Fixing the future
Describing institutions’ research data assets
Series of metadata stores rollouts plus some ancillary activity
Metadata Stores, Seeding the Commons, Data Capture; Disconnected -> Connected (D->C), Invisible -> Findable (I->F)
ONGOING ISSUES
Country-Institution-Discipline
Who wins?
Who should win?
Sustainability, sustainability, sustainability…
Institutional activity
National services/resources
Developed software
Priming the pump, or continuing to pump?
If institutions/researchers/disciplines don’t care, why should the funders?
Role of Government