Research Data Management: A Tale of Two Paradigms

Date post: 18-Jun-2015
Description:
Presentation by Martin Donnelly, Digital Curation Centre, University of Edinburgh. Invited talk at a workshop for 'Scotland's National Collections and the Digital Humanities,' a knowledge-exchange project hosted at the University of Edinburgh. 2 May 2014. http://www.blogs.hss.ed.ac.uk/archives-now/
Transcript
Page 1: Research Data Management: A Tale of Two Paradigms

Research Data Management: a tale of two paradigms

‘Scotland's National Collections and the Digital Humanities’ workshop series #2

Tom Phillips, A Humument (1970, 1986, 1998, 2004, 2012…)

Martin Donnelly, Digital Curation Centre, University of Edinburgh

Edinburgh, 2 May 2014

Page 2: Research Data Management: A Tale of Two Paradigms

Overview

1. Introductions and definitions
- The Digital Curation Centre
- Research data management
- What do we mean by ‘data’, exactly?

2. Data as a hot topic: politics and practical concerns

3. Data in/and the Arts and Humanities
- How the Arts and Humanities differ
- Strengths and weaknesses
- Reflections on opportunities for exploration at national level

4. Resources

Page 3: Research Data Management: A Tale of Two Paradigms

1. INTRODUCTIONS AND DEFINITIONS

Page 4: Research Data Management: A Tale of Two Paradigms

The Digital Curation Centre

The DCC (est. 2004) is…

A UK centre of expertise in digital preservation. Emerged from the e-Journal preservation field, now with a particular focus on research data management (RDM)

Based across three sites: Universities of Edinburgh, Glasgow and Bath

Working with a number of UK universities to identify gaps in RDM provision and raise capabilities across the sector

Also involved in a variety of national and international collaborations…

Page 5: Research Data Management: A Tale of Two Paradigms

DCC networks and partnerships

Page 6: Research Data Management: A Tale of Two Paradigms

What is research data management?

“the active management and appraisal of data over the lifecycle of scholarly and scientific interest”

Data management is a part of good research practice.

- RCUK Policy and Code of Conduct on the Governance of Good Research Conduct

Page 7: Research Data Management: A Tale of Two Paradigms

The old way of doing things

1. Researcher collects data (information)

2. Researcher interprets/synthesises data

3. Researcher writes paper based on data

4. Paper is published (and preserved)

5. Data is left to benign neglect, and eventually ceases to be accessible

Page 8: Research Data Management: A Tale of Two Paradigms

The new way of doing things

The DataONE lifecycle model: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze

SHARE …and RE-USE

Page 9: Research Data Management: A Tale of Two Paradigms

Other models are available…

Ellyn Montgomery, US Geological Survey

Page 10: Research Data Management: A Tale of Two Paradigms

Helicopter view: What are the benefits of RDM?

TRANSPARENCY: The data that underpins research can be made open for anyone to scrutinise and to attempt to replicate the findings.

EFFICIENCY: Data collection can be funded once, and used many times for a variety of purposes.

RISK MANAGEMENT: A pro-active approach to data management reduces the risk of inappropriate disclosure of sensitive data, whether commercial or personal.

PRESERVATION: Lots of data is unique, and can only be captured once. If lost, it can’t be replaced.

Page 11: Research Data Management: A Tale of Two Paradigms

Okay, but what is ‘data’ exactly?

Definitions vary from discipline to discipline, and from funder to funder…

Here’s a science-centric definition: “The recorded factual material commonly accepted in the scientific community as necessary to validate research findings.” (US Office of Management and Budget, Circular A-110)

[Addendum: This policy applies to scientific collections, known in some disciplines as institutional collections, permanent collections, archival collections, museum collections, or voucher collections, which are assets with long-term scientific value. (US Office of Science and Technology Policy, Memorandum, 20 March 2014)]

And another from the visual arts: “Evidence which is used or created to generate new knowledge and interpretations. ‘Evidence’ may be intersubjective or subjective; physical or emotional; persistent or ephemeral; personal or public; explicit or tacit; and is consciously or unconsciously referenced by the researcher at some point during the course of their research.”

(Leigh Garrett, KAPTUR project: see http://kaptur.wordpress.com/2013/01/23/what-is-visual-arts-research-data-revisited/)

Page 12: Research Data Management: A Tale of Two Paradigms

A few questions around data in the Arts and Humanities

Are the goals – or indeed the concepts – of evidence, facts, validation, replication still central in disciplines reliant on subjectivity, interpretation, argument and qualities of expression?

How do we identify, preserve and share ephemera, emotions, the unconscious…? How do we protect rights around creative data? What are the financial/ownership issues accompanying creative/Arts research?

Is it clear where creative research begins and ends? How can we differentiate between funded research and unfunded personal work?

What problems are introduced by practice-driven research?

To what extent is non-digital material a problem? Can we share approaches to this with other subject areas (e.g. biology, geology)?

What other characteristics do Arts and Humanities data have in common with those of the Sciences? Which other disciplines share these issues more generally?

Page 13: Research Data Management: A Tale of Two Paradigms

2. POLITICS AND PRACTICAL CONCERNS

Page 14: Research Data Management: A Tale of Two Paradigms

A hot topic: 5 years of front pages…

- Nature, 09/08
- Economist, 02/10
- Popular Science, 11/11
- Science, 02/11
- Nature, 09/09
- ACM, 12/08
- InformationWeek, 08/10
- Computerworld, 11/12

Page 15: Research Data Management: A Tale of Two Paradigms

Technology

Developments in sensor technology, networking and digital storage enable new research and scientific paradigms

As costs also fall, possibilities for data sharing, citation and re-use become much more widespread

Journals dedicated solely to publishing data have even started to appear. That’s not to say it’s an entirely new thing: journals have always published data, just never before at such scale…

Page 16: Research Data Management: A Tale of Two Paradigms

Rosse, from Philosophical Transactions of the Royal Society (MDCCCLXI) (or 1861 if you’d prefer)

Page 17: Research Data Management: A Tale of Two Paradigms

Repurposing / VfM via data re-use

Ships’ log books build picture of climate change 14 October 2010

You can now help scientists understand the climate of the past and unearth new historical information by revisiting the voyages of First World War Royal Navy warships.

Visitors to OldWeather.org will be able to retrace the routes taken by any of 280 Royal Navy ships. These include historic vessels such as HMS Caroline, the last survivor of the 1916 Battle of Jutland still afloat. By transcribing information about the weather and interesting events from images of each ship's logbook, web volunteers will help scientists build a more accurate picture of how our climate has changed over the last century.

http://www.nationalarchives.gov.uk/news/503.htm

Detail from Royal Navy Recruitment poster, RNVR Signals branch, 1917 (Catalogue reference: ADM 1/8331)

Endeavour, 1768-71 (Captain Cook)

HMS Beagle, 1830-34

HMS Torch, 1918

Page 18: Research Data Management: A Tale of Two Paradigms

Government pressure/support

6.9 The Research Councils expect the researchers they fund to deposit published articles or conference proceedings in an open access repository at or around the time of publication. But this practice is unevenly enforced. Therefore, as an immediate step, we have asked the Research Councils to ensure the researchers they fund fulfil the current requirements. Additionally, the Research Councils have now agreed to invest £2 million in the development, by 2013, of a UK ‘Gateway to Research’. In the first instance this will allow ready access to Research Council funded research information and related data but it will be designed so that it can also include research funded by others in due course. The Research Councils will work with their partners and users to ensure information is presented in a readily reusable form, using common formats and open standards.

http://www.bis.gov.uk/assets/biscore/innovation/docs/i/11-1387-innovation-and-research-strategy-for-growth.pdf

Page 19: Research Data Management: A Tale of Two Paradigms

(Aside: Open Data)

Open Data is a philosophy, underpinned by pragmatism… transparency + utility.

“Open data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control.” – Wikipedia

Governments, cities etc. are all getting on board

Open Knowledge Foundation is basically the political / activist wing: http://okfn.org/

From the government / industry side, we have the Open Data Institute: http://theodi.org/

Page 20: Research Data Management: A Tale of Two Paradigms

Risk management

Controversial FOI requests to…
- University of East Anglia
- Queen's University Belfast
- University of Stirling

Page 21: Research Data Management: A Tale of Two Paradigms

Research quality and integrity

- Reinhart & Rogoff (2010) “Growth in a Time of Debt” - paper not peer-reviewed, data not initially made available…
- Very influential and repeatedly cited by politicians to lend weight to economic strategy
- Multiple issues (selective exclusions, unconventional weightings, coding error) identified by a postgrad researcher attempting to replicate the paper’s findings
- Widespread embarrassment, but at least the errors were discovered!

Page 22: Research Data Management: A Tale of Two Paradigms

Why don’t we live in a data sharing utopia?

Four main reasons…
- Lack of understanding of the fundamental issues
- Lack of joined-up thinking within institutions, countries, internationally…
- Issues around ownership / privacy
- Technical/financial limitations and the need for appraisal

National bodies may be well-placed to address some of these

Page 23: Research Data Management: A Tale of Two Paradigms

What do research funders have to say? (i)

Seven “Common Principles on Data Policy” – Data as a public good; Preservation; Discovery; Confidentiality; Right of first use; Recognition; Public funding for RDM

Six of the seven RCUK councils require data management plans, or equivalent, at the application stage

The seventh (EPSRC) requires nothing short of an institutional data infrastructure

Page 24: Research Data Management: A Tale of Two Paradigms

3. DATA IN THE ARTS AND HUMANITIES

Kailie Parrish, “In My Dreams” http://datavisualization.ch/showcases/in-my-dreams/

Page 25: Research Data Management: A Tale of Two Paradigms

What do research funders have to say? (ii)

AHRC requires that significant electronic resources or datasets are made available in an accessible repository for at least three years after the end of the grant

AHRC used to run several data services. Most stopped being funded in 2008, but the Archaeology Data Service remains at York, and the Visual Arts Data Service at UCA.

ESRC applicants submit a statement on data sharing in the relevant section of the Je-S form, and provide a two-page data management and sharing plan addressing 9 distinct themes

Datasets must be offered to the UK Data Archive on conclusion of the project

Page 26: Research Data Management: A Tale of Two Paradigms

Problems re. data in the Arts and Humanities

Some characteristics of Arts and Humanities data are likely to require a different kind of handling from that afforded to other disciplines

Arts ‘data’ is often personal, and creative data in particular may not be factual in nature. Furthermore, it may be quite valuable or precious to its creator. What matters most may not be the content itself, but rather the presentation, the arrangement, the quality of expression…

This tends to be why Open Access embargoes are often longer in the Arts and Humanities than other areas

Digital ‘data’ emerging in the Arts is as likely to be an outcome of the creative research process as an input to a workflow. This is at odds with the scientific method, and how most RDM resources are described.

Page 27: Research Data Management: A Tale of Two Paradigms

Scientific and other methods…

The scientific method is a body of techniques for investigating phenomena, acquiring new knowledge, or correcting and integrating previous knowledge.

To be termed scientific, a method of inquiry must be based on empirical and measurable evidence subject to specific principles of reasoning.

The Oxford English Dictionary defines the scientific method as: “a method or procedure that has characterized natural science since the 17th century, consisting in systematic observation, measurement, and experiment, and the formulation, testing, and modification of hypotheses.”

Source: http://en.wikipedia.org/wiki/Scientific_method

An art methodology differs from a science methodology, perhaps mainly insofar as the artist is not always after the same goal as the scientist. In art it is not necessarily all about establishing the exact truth so much as making the most effective form (painting, drawing, poem, novel, performance, sculpture, video, etc.) through which ideas, feelings, perceptions can be communicated to a public. With this purpose in mind, some artists will exhibit preliminary sketches and notes which were part of the process leading to the creation of a work. Sometimes, in Conceptual art, the preliminary process is the only part of the work which is exhibited, with no visible end result displayed. In such a case the "journey" is being presented as more important than the destination.

Source: http://en.wikipedia.org/wiki/Art_methodology

Page 28: Research Data Management: A Tale of Two Paradigms

Strengths and weaknesses re. data in the Arts and Humanities

There’s nothing new about data re-use in the Arts and Humanities; it’s an integral part of the culture, and always has been

Think Kristeva’s intertextuality, Barthes’ ‘galaxy of signifiers’, Shakespeare’s plots, Lanark’s assorted ‘plagiarisms’, Edwin Morgan’s ‘found’ newspaper poems, Marcel Duchamp, variations on a theme, collage and intermedia art, T.S. Eliot, sampling/hip-hop, etc. (http://www.slideshare.net/martindonnelly/data-reuse-in-the-arts)

However, it’s often more fraught than data re-use in other areas (such as the Sciences)

For starters, people tend not to think of their sources or influences as ‘data’, and the value and referencing systems are quite different

Furthermore, practice/praxis-based research is pretty much the sole preserve of the Humanities, and research/production methods are not always rigorously methodical/linear…

Page 29: Research Data Management: A Tale of Two Paradigms

National roles around Arts and Humanities data?

REFUGE: Many universities are developing data repositories for their funded research data, but a comparatively high proportion of Arts research does not receive external funding, so there’s less incentive for the institutions to provide support (no stick, and little demand from researchers)

APPRAISAL, STEWARDSHIP AND DISCOVERY: Furthermore, it is (probably/usually) preferable for data to be deposited in discipline- or domain-specific repositories. There’s a gap in the market, and national bodies are already experienced in managing large digital collections.

SUPPORT AND ADVOCACY: Humanities scholars are entirely comfortable with the use of primary and secondary sources. It just requires a little translation for the core concepts of RDM to become meaningful in an Arts and Humanities context. The trust is already there.

Page 30: Research Data Management: A Tale of Two Paradigms

4. RESOURCES

Page 31: Research Data Management: A Tale of Two Paradigms

i. Arts-centric resources

DCC and University of the Arts London were both involved in the KAPTUR project: http://kaptur.wordpress.com

DCC subsequently ran an institutional engagement with UAL between 2011 and 2013, which developed…
- A data management guidance web area: http://www.arts.ac.uk/research/research-environment/research-management/data-management/
- An institutional policy: http://www.arts.ac.uk/media/research/documents/UAL-Research-Data-Management-Policy.pdf
- A UAL data management planning template in http://dmponline.dcc.ac.uk

A UAL data community-of-practice is being launched, with support of the senior management

Events
- RDMF10: “Research data management in the Arts and Humanities”, Oxford, September 2013
- UoE Digital Humanities workshop: “Managing Humanities Research Data”, Edinburgh, January 2014

Page 32: Research Data Management: A Tale of Two Paradigms

ii. Other DCC resources

Publications: Briefing Papers and How-To Guides

Training: e.g. DC101 events and Curation Reference Manual

Advice: e.g. Disciplinary metadata, www.dcc.ac.uk/resources/metadata-standards

Tools: DMPonline, CARDIO, Data Asset Framework, DRAMBORA

Page 33: Research Data Management: A Tale of Two Paradigms

iii. Further resources

JISC Services
- RDM resources, www.jisc.ac.uk/guides/research-data-management
- EDINA and Mimas (national data centres)
- JISCMRD projects (Phase 1 (2009-2011) and Phase 2 (2011-2013)) covered a wide range of topics, including infrastructure, planning, training, support and guidance, events and tools

Universities
- Great RDM materials are available from Edinburgh, Cambridge, Oxford, Glasgow, Bristol, and many other places

Alliance of Digital Humanities Organizations (ADHO) http://digitalhumanities.org/

Page 34: Research Data Management: A Tale of Two Paradigms

iv. Further reading

“Ten recommendations for libraries to get started with research data management: Final report of the LIBER working group on E-Science / Research Data Management” - Christensen-Dalsgaard et al. (LIBER, 2012)

“Curating research data: the potential roles of libraries and information professionals”, Nielsen & Hjørland (2014) Journal of Documentation, Vol. 70 Iss: 2, pp.221 - 240

For more on potential future roles for librarians, see slides from Open Repositories 2013 workshop: http://tinyurl.com/whyte-OR13

Two recent surveys about libraries and data…
- USA & Canada – “Academic Libraries and Research Data Services: Current practices and plans for the future” - Tenopir, Birch & Allard, University of Tennessee (Association of College & Research Libraries, June 2012)
- UK – “Research data management and libraries: Current activities and future priorities” - Cox & Pinfield, Information School, University of Sheffield (Journal of Librarianship and Information Science, June 2013)

Page 35: Research Data Management: A Tale of Two Paradigms

Last slide: take-home messages

Research data management (RDM) is…
- An integral part of doing quality research in the 21st century
- Increasingly expected / required by funders, publishers and others
- An opportunity for new discoveries and different approaches to research
- A safeguard against inappropriate data disclosure
- Sometimes complicated in the Arts and Humanities!

And hence… an activity that requires careful planning and consideration, and – ideally – coordination and support at many levels

Page 36: Research Data Management: A Tale of Two Paradigms

Thank you

Questions?

Image credits
- Slide 2 (forest) – http://assets.worldwildlife.org/photos/934/images/hero_small/forest-overview-HI_115486.jpg?1345533675
- Slide 3 (dictionary) – http://www.flickr.com/photos/dougbelshaw/
- Slide 13 (politics) – https://www.flickr.com/photos/junglearctic/
- Slide 22 (utopia) – http://www.flickr.com/photos/burningmax/
- Slide 30 (Thierry) – https://twitter.com/AFC_Fisher/
- Slide 36 (love note) – http://www.edawax.de/wp-content/uploads/2013/01/Metadata_love250.jpg

Thanks to Sarah Callaghan, PREPARDE, for the Rosse example

This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License.

For more about DCC services see www.dcc.ac.uk or follow us on twitter @digitalcuration and #ukdcc

Martin Donnelly

Digital Curation Centre

University of Edinburgh

@mkdDCC

