Date posted: 15-Jul-2015
The Horizon 2020 Open Data Pilot
Sarah Jones
Digital Curation Centre, Glasgow
Twitter: @sjDCC
FOT-Net Data Stakeholder Meeting on Open Data and Data Re-use in Horizon 2020, 10th March 2015, ERTICO, Brussels
What is the Digital Curation Centre?
“a centre of expertise in digital information curation with a focus on building capacity, capability and skills
for research data management across the UK's higher education research community”
www.dcc.ac.uk
Benefits and drivers
WHY SHARE DATA (OPENLY)?
Image CC-BY-NC-SA by Wonderwebby www.flickr.com/photos/wonderwebby/2723279491
Science as an open enterprise
https://royalsociety.org/policy/projects/science-public-enterprise/Report
“Much of the remarkable growth of scientific understanding in recent centuries is due to open practices; open communication and deliberation
sit at the heart of scientific practice.”
The Royal Society report calls for ‘intelligent openness’ whereby data are accessible, intelligible, assessable and usable.
Faster scientific breakthroughs
www.nytimes.com/2010/08/13/health/research/13alzheimer.html?pagewanted=all&_r=0
“It was unbelievable. It's not science the way most of us have practiced in our careers. But we all realised that we would never get biomarkers unless all of us parked our egos and intellectual property noses outside the door and agreed that all of our data would be public immediately.”
Dr John Trojanowski, University of Pennsylvania
Increased use and economic benefit
The case of NASA Landsat satellite imagery of the Earth’s surface:
UP TO 2008
• Sold through the US Geological Survey for US$600 per scene
• Sales of 19,000 scenes per year
• Annual revenue of $11.4 million
SINCE 2009
• Freely available over the internet
• Google Earth now uses the images
• Transmission of 2,100,000 scenes per year
• Estimated to have created value for the environmental management industry of $935 million, with direct benefit of more than $100 million per year to the US economy
• Has stimulated the development of applications from a large number of companies worldwide
http://earthobservatory.nasa.gov/IOTD/view.php?id=83394&src=ve
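The Landsat figures can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: the variable names are our own, and the numbers are those quoted on the slide.

```python
# Figures quoted for Landsat imagery (variable names are illustrative).
PRICE_PER_SCENE_USD = 600          # price per scene up to 2008
SCENES_SOLD_PER_YEAR = 19_000      # annual sales up to 2008
SCENES_FREE_PER_YEAR = 2_100_000   # annual transmissions since 2009

# Revenue under the paid model: price multiplied by volume.
annual_revenue_usd = PRICE_PER_SCENE_USD * SCENES_SOLD_PER_YEAR

# How many times more scenes are distributed under the open model.
usage_multiplier = SCENES_FREE_PER_YEAR / SCENES_SOLD_PER_YEAR

print(f"Annual revenue up to 2008: ${annual_revenue_usd / 1e6:.1f} million")
print(f"Scenes distributed since 2009 vs before: {usage_multiplier:.1f} times as many")
```

The revenue works out to the $11.4 million quoted, while the number of scenes distributed rose more than a hundredfold once the imagery was free.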
HORIZON 2020 OPEN DATA PILOT
Image CC-BY-NC-SA by Tom Magllery www.flickr.com/photos/lwr/13442910354
Why open access and open data?
“The European Commission’s vision is that information already paid for by the
public purse should not be paid for again each time it is accessed or used, and that
it should benefit European companies and citizens to the full.”
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
H2020 open data pilot
• Seven areas are participating in the pilot, which correspond to about €3 billion or 20% of the overall Horizon 2020 budget in 2014 and 2015
• Projects in other areas can opt in on a voluntary basis
• Participants can opt out at proposal stage or during the lifetime of the project
• Reasons for exemption to be explained in the DMP
Guidelines on Data Management in Horizon 2020
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
Which data does the pilot apply to?
Data, including associated metadata, needed to validate the results in scientific publications
Other curated and/or raw data, including associated metadata, as specified in the DMP
Doesn’t apply to all data (researchers to define as appropriate)
Don’t have to share data if inappropriate – exemptions apply
Key requirements of the open data pilot
1. Deposit in a research data repository
2. Make it possible for third parties to access, mine, exploit, reproduce and disseminate data – free of charge for any user
3. Provide information on the tools and instruments needed to validate the results (or better still provide the tools)
Image CC-BY-NC-SA by adesigna www.flickr.com/photos/adesigna/4090782772
Data Management Plans
Projects participating in the pilot will be required to develop a Data Management Plan (DMP), in which they will specify what data will be open.
• What types of data will the project generate/collect?
• What standards will be used?
• How will this data be shared/made available? If not, why?
• How will this data be curated and preserved?
Note that the Commission does NOT require applicants to submit a DMP at the proposal stage. DMPs are a deliverable for those participating in the pilot.
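One way to keep track of the four DMP questions is a small structured record. The sketch below is purely illustrative: the class `DataManagementPlan`, its field names and the example answers are hypothetical and are not an official Commission template.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """Hypothetical record mirroring the four DMP questions on the slide."""
    data_types: str = ""     # What types of data will the project generate/collect?
    standards: str = ""      # What standards will be used?
    sharing: str = ""        # How will the data be shared, or why not?
    preservation: str = ""   # How will the data be curated and preserved?

    def unanswered(self) -> list[str]:
        """Return the names of the questions that still have no answer."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Example values only; a real plan would be far more detailed.
plan = DataManagementPlan(
    data_types="Sensor logs and survey responses",
    sharing="Deposit in a research data repository under CC-BY",
)
print(plan.unanswered())  # ['standards', 'preservation']
```

Because the DMP is a deliverable rather than part of the proposal, a record like this can start mostly empty and be filled in as the project progresses.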
Data sharing: degrees of openness
Open
Content that can be freely used, modified and shared by anyone for any purpose
Restricted
Limits on who can use the data, how or for what purpose
- Charges for use
- Data sharing agreements
- Restrictive licences
- Peer-to-peer exchange
- …
Closed
- Unable to share
- Under embargo
Five star open data http://5stardata.info
1. online under an open licence
2. structured data
3. non-proprietary formats
4. use URIs to denote things
5. link data to provide context
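The five star open data scheme is cumulative: each star assumes the criteria for the lower stars are already met. A minimal sketch of that logic follows, assuming the five criteria are checked in the order listed at 5stardata.info. The function name `star_rating` and the example dataset are our own illustration.

```python
# The five cumulative criteria, in the order given by the five star scheme.
FIVE_STAR_CRITERIA = [
    "online under an open licence",
    "structured data",
    "non-proprietary formats",
    "use URIs to denote things",
    "link data to provide context",
]


def star_rating(criteria_met: set[str]) -> int:
    """Count stars, stopping at the first criterion that is not met."""
    stars = 0
    for criterion in FIVE_STAR_CRITERIA:
        if criterion not in criteria_met:
            break
        stars += 1
    return stars


# Example: a CSV file published on the web under an open licence.
csv_on_web = {
    "online under an open licence",
    "structured data",
    "non-proprietary formats",
}
print(star_rating(csv_on_web))  # 3
```

Stopping at the first unmet criterion means that a dataset in a non-proprietary format that is not structured still earns only one star.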
How to make data open?
1. Choose your dataset(s): What can you make open? You may need to revisit this step if you encounter problems later.
2. Apply an open licence: Determine what IP exists. Apply a suitable licence e.g. CC-BY or CC0
3. Make the data available: Provide the data in a suitable format. Use repositories.
4. Make it discoverable: Post on the web, register in catalogues…
https://okfn.org
Data licensing
This DCC how-to guide outlines pros and cons of each approach and gives practical advice on how to implement your licence.
www.dcc.ac.uk/resources/how-guides/license-research-data
• Do you own the rights or have permission to redistribute?
• Do you need to place restrictions on who can use the data or how?
EUDAT licensing wizard
http://ufal.github.io/lindat-license-selector
Search or browse through a list of possible licences, or answer questions to determine which is most suitable
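The two licensing questions can be read as a simple decision path. The sketch below is a simplification for illustration only: `suggest_licence` and its parameters are hypothetical, it is not the logic of the EUDAT wizard, and it is no substitute for the DCC guide or legal advice. It relies only on the fact that CC-BY requires attribution while CC0 waives rights.

```python
def suggest_licence(has_rights: bool, needs_restrictions: bool,
                    require_attribution: bool = True) -> str:
    """Hypothetical helper walking through the two licensing questions."""
    # Question 1: do you own the rights or have permission to redistribute?
    if not has_rights:
        return "Not yet licensable: obtain permission from the rights holder"
    # Question 2: do you need to restrict who can use the data or how?
    if needs_restrictions:
        return "Restricted: data sharing agreement or restrictive licence"
    # Otherwise an open licence fits: CC-BY keeps attribution, CC0 waives rights.
    return "CC-BY" if require_attribution else "CC0"


print(suggest_licence(has_rights=True, needs_restrictions=False))
```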
Metadata standards
• Good metadata is key for research data access and re-use
• Many disciplines have formalised community metadata standards
• Use relevant standards for interoperability
www.dcc.ac.uk/resources/metadata-standards
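As a rough illustration of what a basic metadata record might look like, the sketch below uses a handful of Dublin Core element names. The choice of required fields, the helper `missing_fields` and all values are assumptions made for this example. Disciplinary standards define their own mandatory elements.

```python
import json

# Hypothetical minimum set for this sketch; real standards define their own.
REQUIRED_FIELDS = ("title", "creator", "date", "identifier", "rights")


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Placeholder values; the identifier (e.g. a DOI) has not been assigned yet.
record = {
    "title": "Example dataset",
    "creator": "A. Researcher",
    "date": "2015-03-10",
    "rights": "CC-BY",
}

print(missing_fields(record))  # ['identifier']
print(json.dumps(record, indent=2))
```

Checking records against an agreed field list before deposit is one simple way to keep metadata consistent across a project.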
Data catalogues
Institutional services e.g. DataFinder at the University of Oxford
National services e.g. Research Data Australia and RDDS pilot in the UK
Data centres and community initiatives e.g. FOT Data Catalogue, B2FIND etc
Data repositories
http://databib.org
http://service.re3data.org/search
Zenodo
• Joint effort by OpenAIRE-CERN
• Multidisciplinary repository
• Multiple data types
– Publications
– Long tail of research data
• Citable data (DOI)
• Links funding, publications, data & software
www.zenodo.org
• Does your publisher or funder suggest a repository?
• Are there data centres or community databases for your field?
• Does your university offer support for long-term preservation?
EUDAT services
EUDAT offers a pan-European solution, providing a generic set of services to ensure a minimum level of interoperability
Building common data services in close collaboration with 25+ communities
www.eudat.eu
EUDAT B2 service suite
Covering both access and deposit, from informal data sharing to long-term archiving, and addressing identification, discoverability and computability of both long-tail and big data, EUDAT’s services will address the full lifecycle of research data
Institutional RDM support services
Diagram courtesy of Sally Rumsey, University of Oxford
University of Edinburgh Research Data Management Roadmap
www.ed.ac.uk/schools-departments/information-services/about/strategy-planning/rdm-roadmap
Research Data Oxford
http://researchdata.ox.ac.uk
Support on Data Management Plans
• Checklist on what to include
• How-to guide on developing a plan
• Guidance on assessing plans (forthcoming)
• Webinars and training materials
• DMPonline tool
• Example DMPs
www.dcc.ac.uk/resources/data-management-plans
DMPonline
• Presents requirements from funders
• Guidance from funder, uni, discipline…
• Example answers
• Ability to share plans with collaborators
• Export into a variety of formats
• …
https://dmponline.dcc.ac.uk
Thanks for listening
DCC guidance, tools & case studies:
www.dcc.ac.uk/resources
Follow us on Twitter:
@digitalcuration and #ukdcc