Griffith University’s Journey in Data Citation
Natasha Simons
Senior Data Management Specialist
Australian National Data Service
Located at: Griffith University, Brisbane, Australia
http://orcid.org/0000-0003-0635-1998
Tw: @n_simons
ANDS Webinar 5 June 2014
About Griffith University
Established in 1971 and opened in 1975
Now has five South-East Queensland campuses
Around 43,000 students and 4,300 staff
26 schools and departments in four academic groups:
• Arts, Education and Law
• Business
• Health
• Science, Environment, Engineering and Technology
Image credit: Danny Munnerley, http://www.flickr.com/photos/munnerley/6381877583/
Griffith’s research profile
32 research centres and institutes
Priority areas:
• Water science
• Drug discovery and infectious diseases
• Asian politics, security and development
• Climate change adaptation
• Criminology and crime prevention
• Music, the arts and the Asia Pacific
• Sustainable tourism
• Chronic disease prevention
• Physical sciences
• Environmental sciences
• Nursing
• Education
Image credit: Anne Ruthmann, http://www.flickr.com/photos/annemarlow/8392238157/
Research infrastructure and data management
Strong commitment from University leaders to improving data management
Staff resources – operational and project related
Successful in seeking funding from ANDS and NeCTAR to build national and local infrastructure
Strong emphasis on seeking internal funds and working with researchers on grants for funds to develop, enhance and support institutional tools
Policy frameworks and service models for data management support under discussion
What does a data citation look like?
How is data cited?
What do you need to pack for your data citation journey?
Image credit: http://www.peregrineadventures.com/blog/13/02/2012/great-packing-debate
There are really only two things you need before you start on a data citation journey:
1. Some research data collections at your institution that have open, embargoed or mediated access.
2. A publicly available metadata record that describes each of these collections and provides access to them.
Packing for the journey
At Griffith, we have:
1. Research Data Repository - http://equella.rcs.griffith.edu.au/research/logon.do
2. Research Hub (metadata store/researcher profile system) - http://research-hub.griffith.edu.au
Packing for the journey
On your journey, you may also need:
1. Management support: Malcolm Wolski, Director, eResearch Services & Scholarly Application Development, Division of Information Services, Griffith University
2. Technical support: Arve Solland, Senior Developer, eResearch Services & Scholarly Application Development, Division of Information Services, Griffith University
When and why, DOI?
2011
August – ‘PIDs for data’ options paper, recommended DOIs
August – ANDS launched Cite My Data service pilot
September to December – signed up; developed m-2-m scripts, minted DOIs
2012
c. May – Put ‘Cite this collection’ feature in Griffith Research Hub
October – Commenced data citation project
2013
May – Concluded data citation project
September – produced DOI guidelines; developed roadmap
When and why, DOI?
Griffith needed a persistent identifier that would:
• Fill gaps in persistent identifiers for scholarly works
• Replace long and incomprehensible URLs for metadata
• Signal long-term management of our research data collections
• Contribute to the semantic vision for data in the Research Hub
• Later: foundation for data citation.
When and why, DOI?
We chose DOIs to meet our needs because they:
• Are a global persistent identifier, already used for many scholarly publications
• Can be assigned to research data, theses, grey lit and even software code
• Improve visibility of, and access to, research data
• Gave us responsibility for managing persistent access to our data collections
• Won’t break when IR software is re-indexed (as handles sometimes do)
Later, because they:
• Facilitate data citation
• Greatly assist tracking impact of data sets through collection of metrics and altmetrics based on DOI
When and why, DOI?
The ANDS Cite My Data service provided:
Partnership with international DOI registration agency: DataCite
Minting DOIs for metadata records about open, mediated or embargoed research data, theses, grey literature (even software code)
Machine-to-machine workflow
Easily achieved kernel metadata
Trial in safe test environment
High level documentation for the M-2-M provided by ANDS
High level information on data citation on the ANDS website
Free!
And so we became the first guinea pigs of the Cite My Data Service….
Cite My Data – how does it work?
1. Sign agreement to use the service
2. ANDS give you an institutional id
3. Prepare your m-2-m script (includes required metadata for
each DOI: title, creator, publisher, publication year,
identifier)
4. Execute script against Cite My Data service
5. Cite My Data service returns DOIs
6. Store DOIs in own system
7. Create citation element
8. Make citation element available in RIF-CS feed for ANDS harvester
DOI scripts: https://github.com/gu-eresearch/ANDSDOIScripts
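As a minimal sketch of steps 3 and 7 only (not the Griffith scripts linked above, and not the Cite My Data API), the Python below checks that a record carries the five required metadata fields and then assembles a citation string. The `Creator (PublicationYear): Title. Publisher. Identifier` pattern follows the DataCite recommended citation format; the function names, dictionary keys and sample record are assumptions made for illustration.

```python
# The five fields the slide lists as required for each DOI.
# The key names are illustrative, not taken from any ANDS schema.
REQUIRED_FIELDS = ("title", "creator", "publisher", "publication_year", "identifier")


def missing_fields(record):
    """Return the required fields that are absent or blank in a record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f, "")).strip()]


def build_citation(record):
    """Format a citation as: Creator (Year): Title. Publisher. Identifier."""
    missing = missing_fields(record)
    if missing:
        raise ValueError("missing required metadata: " + ", ".join(missing))
    return "{creator} ({publication_year}): {title}. {publisher}. {identifier}".format(**record)


# Hypothetical record; the DOI is a placeholder, not a minted identifier.
sample = {
    "creator": "Example, A.",
    "publication_year": 2014,
    "title": "Example survey data",
    "publisher": "Griffith University",
    "identifier": "http://dx.doi.org/10.1234/example",
}

print(build_citation(sample))
```

Checking for missing fields before minting keeps incomplete records out of the DOI request, and the same record can then feed the citation element in step 7.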
Decisions, decisions…
• What are the criteria for assigning a DOI to a research data collection?
• At what level of granularity should a DOI be applied?
• Should the DOI link to the landing page or the actual data? Which landing page?
• What if the data is changed e.g. updated? Should a new DOI be issued?
• Should researchers be able to mint the DOI or should we mint it for them?
• How are DOIs assigned if the research data is the result of a collaboration between various institutions?
• What happens to the DOIs we have minted if ANDS closes shop?
• Can you cite data without a DOI?
Implementing DOIs for Research Data D-Lib article http://dx.doi.org/10.1045/may2012-simons
Developing guidelines
We found answers to our questions and wrote them up in guidelines:
Digital Object Identifiers (DOIs): Introduction and Management Guide
Available for download from the ANDS website:
http://ands.org.au/cite-data/griffith_doi_guidelines-4.pdf
ANDS DOI FAQs http://ands.org.au/cite-data/doi_q_and_a.html
Documented our experiences in the Gold Standard Project @ Griffith blog: http://ands-gold-griffith.blogspot.com.au/
Data Citation
Discovery
Data Citation engagement experiences
Established a blog - http://data-citation-griffith.blogspot.com.au/
Spoke with librarians about citation practices in different disciplines
Included data citation as part of standard consultations with a group in Health & an individual in environmental economics
Notifications workflows:
• Investigated Dryad automated notifications workflows
• Modified their depositor notification
• Manually emailed collections owners of new collections
• Notifications added to technical requirements for data deposit
Reviewed existing information and workflows:
• Griffith policies and procedures
• Academic style guides
• Training materials and guides
Included data citation in new Best practice guidelines for researchers: managing research data and primary materials
Lesson #1: One size will not fit all
• Disciplines
• Citation practices
• Style guides
• Publishing protocols
• Target audiences
• Types of research output
• Usage of metrics
• Age and career stage
• Attitudes to open access
• Motivations
• Technical know-how
Image credit: Taki Steve, http://www.flickr.com/photos/13519089@N03/1380483002/
Lesson #2: Choose your time
Find ‘hooks’ in the researchers’ workflows:
• e.g. point of data deposit
• e.g. final report on funded research
• e.g. through data planning
Long term goal should be to get in early - improving the training and supporting artefacts (style guides, bibliographic management software) that introduce new students and researchers to the principles of citation
Image credit: Todd Lappin, http://www.flickr.com/photos/telstar/433029904/
Lesson #3: Need-to-know basis
A depositor shouldn’t have to know what a DOI is or where it comes from, or be asked to make a decision about whether they want one or not
Minting DOIs should be done automatically for collections that meet the rules defined by the ‘publisher’ of the deposited data (in this case, Griffith University) and the DOI registration agency
Image credit: Taki Steve, http://www.flickr.com/photos/13519089@N03/1380483002/
Lesson #4: Be honest and realistic with researchers
Be honest about the evidence base – they’re researchers so they will ask!
Be honest about the lack of rewards within the current system and have empathy – researchers know better than us what they do and don’t get rewarded for
Lesson #5: Not everything can be solved now or by you alone
[Diagram labels: Culture of data citation; Publisher policies; Style guides; Tools e.g. EndNote, Zotero; Research quality exercises; Information and training; Institutional procedures; Data repositories; Identifier registration agencies; Bibliometrics; Altmetrics; Funder mandates]
[Diagram legend: We mostly know what we are doing with these; We’re investigating these now and in the near future; Collective action is needed for change in these areas]
Data citation experiences
D-Lib article: Growing Institutional Support for Data Citation: Results of a Partnership between Griffith University and the Australian National Data Service
http://dx.doi.org/10.1045/november2013-simons
What Griffith University are doing to establish a culture of data citation:
https://www.youtube.com/watch?v=jDsD5cbIeZU
On the ‘to do’ list
But we didn’t conquer the world…
On the ‘to do’ list:
• Embedding DOIs into automated data collection workflows
• Minting DOIs for grey literature: theses, reports, discussion papers etc.
• Improving links between research publications and underlying data
• Reviewing DOI guidelines, rules and workflows at future points in time
• Embedding metadata formats, such as COinS, into the landing pages to assist import into citation tools
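The last item can be sketched in Python. COinS embeds an OpenURL ContextObject as a URL-encoded string in the `title` attribute of an empty `<span class="Z3988">`, which citation tools can read from the landing page. This is an illustrative sketch, not Griffith's implementation: the function name, the choice of the Dublin Core key set and the sample values are assumptions.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, creator, publisher, year, identifier):
    """Build a COinS span describing one data collection (Dublin Core keys)."""
    pairs = [
        ("ctx_ver", "Z39.88-2004"),                    # OpenURL ContextObject version
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:dc"),    # Dublin Core metadata format
        ("rft.title", title),
        ("rft.creator", creator),
        ("rft.publisher", publisher),
        ("rft.date", str(year)),
        ("rft.identifier", identifier),
    ]
    # URL-encode the key/value pairs, then HTML-escape the result
    # so that '&' becomes '&amp;' inside the attribute value.
    query = urlencode(pairs)
    return '<span class="Z3988" title="{}"></span>'.format(escape(query, quote=True))


# Hypothetical values; the DOI is a placeholder.
print(coins_span("Example survey data", "Example, A.",
                 "Griffith University", 2014, "doi:10.1234/example"))
```

The span renders as nothing on the page, so it can be added to a landing page template without changing its appearance.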
Reflections
Easy lessons learnt:
• Do what you can with what you have available
• Technical minting and maintaining of DOIs is relatively easy
• Cite My Data service is straightforward
• Getting the citation element is also relatively easy
• There are a lot of materials available now on DOIs (infrastructure) and on data citation (researchers) so don’t reinvent the wheel
• You could decide to set up an administrator interface for minting and maintaining the DOIs (e.g. the way TERN have done this). This would run over the top of the m-2-m scripts.
Reflections
Hard lessons learnt:
• Establishing workflows for DOIs and data citation is not easy if you don’t know when researchers are going to publish their data and if data publication is not routine
• Data citation is not (yet) common practice, but there is a large international community that supports data citation as a principle and encourages it in practice
• There is a growing body of evidence on a positive link between open data and citation counts