
Establishing Trust in Data Curation: OAIS and TRAC applied to a Data Staging Repository (DataStaR)

Gail Steinhart, Cornell University Library

Ann Green, Digital Life Cycle Research & Consulting

Dianne Dietrich, Cornell University Library

IASSIST 2009

Image courtesy of the Cornell Biological Field Station

What exactly is a data staging repository?

[Diagram: a researcher creates a data set and metadata and uploads them to DataStaR, where they can be shared with colleagues; publishing pushes the data set and metadata to a permanent repository (domain or institutional), which disseminates the data set to users.]
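As a rough illustration of that staging pattern, the Python sketch below models the create, upload, share, and publish steps. Every class, method, and name in it is a hypothetical stand-in, not DataStaR's actual interface.

# A minimal sketch of the staging workflow in the diagram above.
# All class, method, and field names are illustrative assumptions,
# not DataStaR's actual interfaces.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DataSet:
    name: str
    metadata: dict = field(default_factory=dict)
    shared_with: list = field(default_factory=list)
    published_to: Optional[str] = None

class StagingRepository:
    """Holds data sets and metadata temporarily, before publication."""

    def __init__(self):
        self.holdings = {}

    def upload(self, ds: DataSet) -> None:
        self.holdings[ds.name] = ds

    def share(self, name: str, colleague: str) -> None:
        # Staged data can be shared with colleagues before publication.
        self.holdings[name].shared_with.append(colleague)

    def publish(self, name: str, permanent_repo: str) -> DataSet:
        # Publishing hands the data set and its metadata off to a permanent
        # (domain or institutional) repository and removes it from staging.
        ds = self.holdings.pop(name)
        ds.published_to = permanent_repo
        return ds

staging = StagingRepository()
ds = DataSet("limnology-2009", metadata={"title": "Field station survey"})
staging.upload(ds)
staging.share("limnology-2009", "colleague@example.edu")
published = staging.publish("limnology-2009", "institutional repository")
print(published.published_to)  # -> institutional repository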


Where does it fit in the life cycle?

But DataStaR isn’t a preservation repository...

“...if repository developers and administrators are guided by a reference model, they are more likely to consider the right issues.”

Allinson 2006: OAIS as a Reference Model for Repositories: An Evaluation

“A repository is Trusted if it can demonstrate its capacity to fulfill its specified functions, and if those (...) functions satisfy (...) minimal criteria which all trusted repositories are assumed to require.”

DigitalPreservationEurope 2008: Repository Planning Checklist and Guidance

An OAIS view of DataStaR

[Diagram, building on the workflow above and annotated with OAIS package types: the data set and metadata uploaded to DataStaR constitute a “pre”-SIP; DataStaR keeps its own AIP and shares DIPs with colleagues; publishing sends a SIP to the permanent repository (domain or institutional), which creates an AIP and disseminates DIPs to users.]
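To make the package hand-offs concrete, here is a minimal sketch of the flow under the OAIS labels in the diagram. SIP, AIP, and DIP are OAIS terms (“pre”-SIP is the diagram's label for the staged package); the function names and dictionary layout are illustrative assumptions.

# Sketch of the OAIS information-package flow shown in the diagram.
# OAIS defines SIP (Submission), AIP (Archival), and DIP (Dissemination)
# Information Packages; the "pre"-SIP is DataStaR's staged working package.
# All function names here are hypothetical.

def make_pre_sip(data_set, metadata):
    """Data set plus (possibly incomplete) metadata staged in DataStaR."""
    return {"type": "pre-SIP", "data": data_set, "metadata": metadata}

def make_sip(pre_sip):
    """On publish, the completed pre-SIP becomes a SIP for the permanent repository."""
    return {"type": "SIP", **{k: pre_sip[k] for k in ("data", "metadata")}}

def ingest(sip):
    """The permanent repository packages the SIP for archival storage."""
    return {"type": "AIP", "data": sip["data"], "metadata": sip["metadata"]}

def disseminate(aip):
    """An access copy derived from the AIP, delivered to a user."""
    return {"type": "DIP", "data": aip["data"], "metadata": aip["metadata"]}

pre_sip = make_pre_sip("survey.csv", {"title": "Limnology survey"})
dip = disseminate(ingest(make_sip(pre_sip)))
print(dip["type"])  # -> DIP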

Look at other approaches to implementation


Put TRAC into context


What did DataStaR need...

Three things:

• Data depositor agreement
• Set of repository policies
• System documentation


Data deposit agreement


Repository policies


System documentation

How did we do?

Number and percentage of TRAC criteria addressed by each instrument (depositor agreement, repository policies, system documentation/requirements):

TRAC section        Depositor agreement   Repository policies   System doc/requirements
A (24 criteria)     6 (25%)               9 (38%)               3 (13%)
B (44 criteria)     4 (9%)                14 (32%)              30 (68%)
C (16 criteria)     0                     0                     6 (38%)

• Section A criteria (organizational infrastructure) are addressed mainly by policy.
• Section B and C criteria (digital object management; technologies, technical infrastructure, and security) are addressed mainly (but not exclusively) by the system.
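As a sanity check, the snippet below recomputes the table's percentages from the reported counts. The counts are taken from the slide; the variable names and the rounding helper are ours.

# Recomputing the coverage table above from the reported counts.
# The counts come straight from the slide; the code is only illustrative.

from math import floor

def pct(n: int, total: int) -> int:
    return floor(100 * n / total + 0.5)  # round half up, matching the slide

criteria_totals = {"A": 24, "B": 44, "C": 16}
addressed = {
    # section: (depositor agreement, repository policies, system doc/requirements)
    "A": (6, 9, 3),
    "B": (4, 14, 30),
    "C": (0, 0, 6),
}
instruments = ("agreement", "policies", "system")

for section, counts in addressed.items():
    total = criteria_totals[section]
    row = ", ".join(
        f"{label} {n} ({pct(n, total)}%)"
        for label, n in zip(instruments, counts)
    )
    print(f"Section {section} ({total} criteria): {row}")
# Section A (24 criteria): agreement 6 (25%), policies 9 (38%), system 3 (13%)
# Section B (44 criteria): agreement 4 (9%), policies 14 (32%), system 30 (68%)
# Section C (16 criteria): agreement 0 (0%), policies 0 (0%), system 6 (38%)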

What didn’t we do?

TRAC section          Address at transition to production system   Not relevant to DataStaR
A (24 criteria)       11 (46%)                                      1 (4%)
B (44 criteria)       0                                             8 (18%)
C (16 criteria)       10 (63%)                                      0
TOTAL (84 criteria)   21 (25%)                                      9 (11%)

We are making an effort to address 64% of the TRAC criteria in the pilot phase.
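(That figure follows from the table: 21 deferred plus 9 not-relevant criteria leaves 84 − 30 = 54 criteria being addressed now, and 54/84 ≈ 64%.)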


Some observations

• Understanding/interpreting the criteria is a lot of work.
• The right tools might simplify policy development.
• The right software might simplify system specification.
• Compiling/presenting evidence: for auditors, or for users?
• Picking your partners...

TRAC has a lot to offer, even if long-term preservation isn’t your focus.


datastar.mannlib.cornell.edu

Thank you.

Gail Steinhart, GSS1@cornell.edu

Ann Green, green.ann@gmail.com

Dianne Dietrich, dd388@cornell.edu

Image courtesy of the Cornell Biological Field Station

This material is based upon work supported by the National Science Foundation under Grant No. III-0712989. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.