Introduction to data management planning
Joy Davidson, Digital Curation Centre
Acknowledgements: content contributed by Sarah Jones, Jonathan Rans
Definition of research data
‘Research data’ refers to information, in particular facts or numbers, collected to be examined and considered as a basis for reasoning, discussion or calculation.
In a research context, examples of data include statistics, results of experiments, measurements, observations resulting from fieldwork, survey results, interview recordings and images. The focus is on research data that is available in digital form.
Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020, v.1.0, 11 December 2013, Footnote 5, p3
How does research data fit in with the theme of open science?
“science carried out and communicated in a manner which allows others to contribute, collaborate and add to the research effort, with all kinds of data, results and protocols made freely available at different stages of the research process.”
Research Information Network, Open Science case studies www.rin.ac.uk/our-work/data-management-and-curation/open-science-case-studies
Levels of open data
⭐ make your stuff available on the Web (whatever format) under an open licence
⭐⭐ make it available as structured data (e.g. Excel instead of a scan of a table)
⭐⭐⭐ use non-proprietary formats (e.g. CSV instead of Excel)
⭐⭐⭐⭐ use URIs to denote things, so that people can point at your stuff
⭐⭐⭐⭐⭐ link your data to other data to provide context
Tim Berners-Lee’s proposal for five star open data - http://5stardata.info
“Open data and content can be freely used, modified and shared by anyone for any purpose”
http://opendefinition.org
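As a small illustration of the three-star level above (a non-proprietary format such as CSV rather than Excel), the sketch below writes a table to CSV text using only Python's standard library. The column names and values are invented for the example and are not from the slides.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialise a header and data rows as CSV text (a plain, non-proprietary format)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical measurements, used only to show the shape of the output.
measurements = [
    ("2016-01-12", "site-a", 4.2),
    ("2016-01-13", "site-b", 3.9),
]
csv_text = rows_to_csv(["date", "site", "temperature_c"], measurements)
print(csv_text)
```

The resulting text can be opened in any spreadsheet or text editor, which is the point of choosing an open format.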
How does research data management fit into the picture?
Create
Document
Use
Store
Share
Preserve
• Data Management Planning
• Creating data
• Documenting data
• Accessing/using data
• Storage and backup
• Selecting what to keep
• Sharing data
• Data licensing and citation
• Preserving data
Funders have expectations about data sharing…
“The European Commission’s vision is that information already paid for by the public purse should not be paid for again each time it is accessed or used, and that it should benefit European companies and citizens to the full.”
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
Data management plans requested for those participating in Open Data pilot.
“Datasets are becoming the new instruments of science”
Dan Atkins, University of Michigan
…but RDM is part of good research practice!
DMPs can help
Projects participating in the pilot will be required to develop a Data Management Plan (DMP), in which they will specify what data will be open.
Note that the Commission does NOT require applicants to submit a DMP at the proposal stage.
A DMP is therefore NOT part of the evaluation.
DMPs are a deliverable for those participating in the pilot.
What aspects of RDM should be in a DMP?
• What data will be created (format, types, volume...)
• Standards and methodologies to be used (incl. metadata)
• How ethics and Intellectual Property will be addressed
• Plans for data sharing and access
• Strategy for long-term preservation
A DMP is a plan to share!
How will you name your files?
• Keep it simple!
• Agree methods with partners
• Include dates
• Avoid non-alphanumeric characters
• Use hyphens or underscores, not spaces, e.g. day-sheet, day_sheet
• Order the elements logically
Example from ARM Climate Research Facility www.arm.gov/data/docs/plan
www.jiscdigitalmedia.ac.uk/guide/choosing-a-file-name
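The naming advice above can be sketched as a small helper. This is only one possible convention, assumed for illustration: the element order (project, description, date, version), the `YYYYMMDD` date format, and the function name `build_file_name` are not prescribed by the slides and should be agreed with project partners.

```python
import re
from datetime import date


def build_file_name(project, description, created, version, extension):
    """Build a file name from ordered elements, using only letters, digits,
    hyphens and underscores (an assumed convention, not a standard)."""

    def clean(part):
        # Lower-case, turn runs of whitespace into hyphens, drop other symbols.
        part = part.strip().lower()
        part = re.sub(r"\s+", "-", part)
        return re.sub(r"[^a-z0-9-]", "", part)

    elements = [
        clean(project),
        clean(description),
        created.strftime("%Y%m%d"),  # date element, sorts chronologically
        f"v{version:02d}",           # zero-padded version number
    ]
    return "_".join(elements) + "." + extension.lower().lstrip(".")


example_name = build_file_name("Field Survey", "Day sheet (site A)", date(2016, 3, 7), 2, "csv")
print(example_name)  # field-survey_day-sheet-site-a_20160307_v02.csv
```

Hyphens join words within an element and underscores separate elements, so the parts of a name remain easy to split apart later.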
What is metadata?
What is the difference?
• Metadata
– Standardised
– Structured
– Machine and human readable
(Diagram labels: Metadata, Documentation)
How should you describe your data?
http://www.dcc.ac.uk/resources/metadata-standards
What is the minimum required?
• DataCite metadata used by OpenAIRE
• Citation/disambiguation
– Identifier, e.g. DOI
– Creator
– Title
– Publisher
– Publication Year
• Licensing/access conditions
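A quick way to check a dataset description against the citation fields listed above is sketched below. The dictionary keys, the function name `missing_fields`, and the sample record (including its DOI-like identifier) are hypothetical; this is not the DataCite schema itself, only a check for the five listed elements.

```python
# The five citation elements named on the slide, as assumed dictionary keys.
REQUIRED_FIELDS = ("identifier", "creator", "title", "publisher", "publication_year")


def missing_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical records for illustration only.
complete_record = {
    "identifier": "10.1234/example-dataset",
    "creator": "Example, Researcher",
    "title": "Example survey results",
    "publisher": "Example University",
    "publication_year": 2016,
}
incomplete_record = {
    "identifier": "10.1234/example-dataset",
    "creator": "Example, Researcher",
    "title": "Example survey results",
    "publisher": "",
}

print(missing_fields(complete_record))    # []
print(missing_fields(incomplete_record))  # ['publisher', 'publication_year']
```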
Where will you store the data during your research?
• Your own laptop
• University systems
• Cloud storage
• Combination
Your decision will be based on how sensitive your data are, how robust you need the storage to be, who needs access to the data,
and when they need access to the data!
Which data must be kept?
• Data, including associated metadata, needed to validate the results in scientific publications
• Other curated and/or raw data, including associated metadata, as specified in the DMP
Doesn’t apply to all data (researchers to define as appropriate)
Don’t have to share data if inappropriate – exemptions apply
Exemptions – reasons for opting out
• If results are expected to be commercially or industrially exploited
• If participation is incompatible with the need for confidentiality in connection with security issues
• Incompatible with existing rules on the protection of personal data
• Would jeopardise the achievement of the main aim of the action
• If the project will not generate/collect any research data
• If there are other legitimate reasons to not take part in the Pilot
Can opt out at proposal stage OR during lifetime of project
Should describe issues in the project Data Management Plan
Which additional data might be kept after the project ends?
Five steps to follow
① Could this data be re-used?
② Must it be kept as evidence or for legal reasons?
③ Should it be kept for its potential value?
④ Consider costs – do benefits outweigh cost?
⑤ Evaluate criteria to decide what to keep
5 steps to decide what data to keep: www.dcc.ac.uk/resources/how-guides/five-steps-decide-what-data-keep
Assign persistent identifiers
• They are an alphanumeric code identifying a resource, organisation or individual
• They must be
– Unique
– Persistent
• Ideally they should be actionable too
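"Actionable" here can be read as "resolvable to a web address". As a minimal sketch, assuming the identifier is a DOI, the helper below turns it into a link through the public `https://doi.org/` resolver. The function name `doi_to_url` and the sample DOI are hypothetical, and the check that a DOI starts with `10.` is only a basic sanity test, not full validation.

```python
from urllib.parse import quote

# Common ways a DOI is written; stripped before building the link.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi):
    """Turn a DOI string into an actionable https://doi.org/ link."""
    cleaned = doi.strip()
    for prefix in _PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return "https://doi.org/" + quote(cleaned, safe="/")


example_url = doi_to_url("doi:10.1234/example-dataset")
print(example_url)  # https://doi.org/10.1234/example-dataset
```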
https://ssi-dev.epcc.ed.ac.uk/
Remember to consider physical data, software and models
http://spatialinformationdesignlab.org/project_sites/library/catalog.html
http://www.ukcrcexpmed.org.uk/Coventry_Warwick_CRF/PublishingImages/Tissue%20Bank%201.jpg
Can your data be shared with others?
• PI/researcher
• Data repository and support staff
• Research participants
• Commercial partners
• Secondary data user
How will it be shared?
http://service.re3data.org/search
Zenodo
• Joint effort by OpenAIRE-CERN
• Multidisciplinary repository
• Multiple data types
• Citable data (DOI)
• Links funding, publications, data & software
www.zenodo.org
• Does your publisher or funder suggest a repository?
• Are there data centres or community databases for your discipline?
• Does your university offer support for long-term preservation?
www.dcc.ac.uk/resources/how-guides/license-research-data
Licensing research data
This DCC guide outlines the pros and cons of each approach and gives practical advice on how to implement your licence
CREATIVE COMMONS LIMITATIONS
NC Non-Commercial: What counts as commercial?
ND No Derivatives: Severely restricts use
Licences with these clauses are not open licences
Horizon 2020 Open Access guidelines point to: CC BY or CC0
EUDAT licensing tool
Answer questions to determine which licence(s) are appropriate to use
http://ufal.github.io/lindat-license-selector
Options for open data
• Domain repository
• General repository – Figshare, Zenodo, Dryad
• Institutional repository
• Journal supplementary material
• Departmental web page
Finding external repositories
• General directories: Re3data.org
• Domain-specific directories, e.g. life sciences – Biosharing.org
• Data journal recommendations: Edinburgh research data blog, Sources of dataset peer review
• Funding body recommendations, e.g. Wellcome Trust, Data repositories and database sources
Considerations
• There may be an accepted repository used by peers or required by funders
• Multidisciplinary studies may not have an obvious home
• Data types and volumes will impact on decision
How will you make your data discoverable?
http://ckan.data.alpha.jisc.ac.uk/dataset
https://www.researchfish.com/
http://gtr.rcuk.ac.uk/
http://researchdata.gla.ac.uk/
Options for closed data
• Institutional data archive/vault
• Safe havens (e.g. secure patient data)
• 3rd party data archiving
• Cloud storage
• Institutional servers – the ‘do nothing’ option
Approach: as open as possible, as closed as necessary
Image: ‘Balancing rocks’ by Viewminder CC-BY-SA-ND www.flickr.com/photos/light_seeker/7780857224
Refer to free guides and briefing papers
www.dcc.ac.uk/resources/
Guidelines from the Commission
• Factsheet on Open Access – https://ec.europa.eu/programmes/horizon2020/sites/horizon2020/files/FactSheet_Open_Access.pdf
• Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020 – http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
• Guidelines on Data Management in Horizon 2020 – http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
Make use of free tools
https://dmponline.dcc.ac.uk
What is DMPonline?
• A web-based tool to help researchers write Data Management and Sharing Plans
• Includes requirements and guidance from funders, universities and other groups
• Developed by the Digital Curation Centre
Good RDM helps you comply with mandates but also leads to…
• More visible research outputs and increased impact, even for negative results
• Easier outputs reporting
• Better and more reproducible research!