Can a Consortium Build a Viable Preservation Repository?

Presentation at CNI, March 31, 2014

Bradley Daigle (APTrust – University of Virginia)

Stephen Davis (Columbia University)

Linda Newman (University of Cincinnati)

Suzanne Thorin (APTrust – University of Virginia)

Scott Turnbull (APTrust – University of Virginia)

www.aptrust.org

Academic Preservation Trust

Academic Preservation Trust, a consortium of 17 institutions, is taking a community approach in building and managing a repository infrastructure that will provide long-term preservation of the scholarly record. APTrust will also be a DPN first node.

APTrust Institutions

Columbia University

Johns Hopkins University

Indiana University

North Carolina State University

Penn State University

Stanford University

Syracuse University

University of Chicago

University of Cincinnati

University of Connecticut

University of Maryland

University of Miami

University of Michigan

University of North Carolina

University of Notre Dame

University of Virginia

Virginia Tech

APTrust is hosted by the University of Virginia, which fully supports 5 ½ staff, including space and equipment.

Program Director

Lead Engineer

Junior Engineer

Systems Engineer

Content Lead (1/2 time)

Membership Dues

Member dues: $20,000 annually

Supports partner meetings, conference travel, contract and cloud services, marketing, and the web site

What is the problem we are trying to solve?

Columbia University

University of Cincinnati

University of Virginia

Columbia University – Use Case 1

Columbia University Libraries / Information Services has made commitments …

to granting agencies to provide long-term digital archiving for digital content created with grant funds

to third-party content creators to provide permanent access to born-digital content acquired from them

to continuing to collect and preserve archival collections, now partly or wholly born-digital content

to permanently preserve University-generated archival and research content


Columbia University – Use Case 2

We must preserve the content of …

Local Digitization Projects

Preservation-Related Digitization

Institutional Repository / Data Sets

Born Digital Archival Content

Archived Web Sites

Super Dark Archives – highly secure


Columbia University – Questions

Why create our own single-institution long-term preservation repository?

Why divert scarce existing CUL/IS internal equipment funds to storage on a permanent basis?

Why divert scarce existing CUL/IS staff time to creation, enhancement and maintenance of our own local preservation repository, permanently?

Why undergo the costs and staff investment in obtaining local TRAC certification?

University of Cincinnati – Use Case

Question: Why is digital preservation important to us?

Answer: We have digital collections where the original source material has deteriorated or is about to be intentionally destroyed (magnetic tapes, nitrate negatives considered flammable). The digital object is THE ONLY object.

Magnetic tape image by Daniel P. B. Smith. Released under the GNU Free Documentation License. http://en.wikipedia.org/wiki/File:Magtape1.jpg

Nitrate negative from Cincinnati Subway and Street Improvements (digital collection) http://drc.libraries.uc.edu/handle/2374.UC/702759

University of Cincinnati – Use Case

Question: Why is digital preservation important to us?

Answer:

We just moved a repository system from Columbus, Ohio to our Cincinnati campus.

10 TB of data, in 16 different VMDKs (virtual machine disk images), was transferred over the internet.

Checksums were created for each VMDK and verified upon receipt, some taking 24 hours to calculate.

Checksums were also created for one-million+ files, compared with info in the repository database, and re-compared after the storage format was changed (from VMDK to NFS).
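As a rough illustration of the kind of check described above, the sketch below streams each file through a hash and compares the result with a previously recorded value. The slides do not say which checksum algorithm or tooling Cincinnati used; SHA-256, the function names, and the `recorded` mapping (standing in for the repository database) are all assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Stream a file through SHA-256 so a large disk image is never held fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_records(root, recorded):
    """Return the relative paths that are missing or whose checksum differs.

    `recorded` is a hypothetical {relative_path: hex_checksum} mapping, standing in
    for the checksums held in a repository database.
    """
    root = Path(root)
    failures = []
    for rel_path, expected in sorted(recorded.items()):
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(rel_path)
    return failures
```

Reading in fixed-size chunks keeps memory use flat, which matters when a single disk image can take many hours to hash.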

University of Cincinnati – Use Case

Question: Why is digital preservation important to us?

Answer: (continued)

We decided to test a full backup and restore. This took over a week, and we discovered that 16 of our digital assets were corrupt. We diagnosed the cause, adjusted, and repeated without error – but if we had not been comparing before and after checksums of all files we would not have known about the corruption.

This process took 1.5 months and offered a striking example of the care that must be taken to avoid losing data when moving large amounts of it.
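The before-and-after comparison described here can be sketched as a diff of two checksum listings. This is only an illustration: the function name, the report layout, and the sample paths and checksum strings are invented for the example and do not come from the presentation.

```python
def compare_manifests(before, after):
    """Compare two {relative_path: checksum} mappings taken before and after a move."""
    missing = sorted(set(before) - set(after))       # files lost in transit
    unexpected = sorted(set(after) - set(before))    # files that appeared from nowhere
    changed = sorted(
        path for path in set(before) & set(after) if before[path] != after[path]
    )                                                # same path, different content
    return {"missing": missing, "unexpected": unexpected, "changed": changed}


# Made-up sample data: one file altered, one lost, one new.
before = {"img/001.tif": "aa11", "img/002.tif": "bb22", "img/003.tif": "cc33"}
after = {"img/001.tif": "aa11", "img/002.tif": "ff00", "img/004.tif": "dd44"}
report = compare_manifests(before, after)
```

A non-empty `changed` list is the case the slide warns about: the restore appears to succeed, and only the checksum comparison reveals the corruption.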

University of Cincinnati – Use Case

Question: Why is digital preservation important to us?

Answer: Our credibility is at stake. We want to be believed.

Photograph; President Nixon with Elvis Presley; 20 Dec 1970; Richard Nixon Presidential Library and Museum, Yorba Linda, California.

http://www.nixonlibrary.gov/forresearchers/find/av/photo/images/12_20_70_3.gif

University of Cincinnati – Use Case

Question: Why is digital preservation important to us?

Answer: (continued)

We are promoting a new digital repository to our faculty. Its raison d'être – why researchers should deposit their digital assets in this repository rather than, or in addition to, several short-term delivery systems on our campus – is long-term persistence.

We have promised that their assets will also be preserved in a dark archive such as the Academic Preservation Trust. We have stated that preservation means bit-level integrity and format migration.

We have asserted that the Libraries’ traditional mission of preservation of the cultural record now applies to the digital scholarly record.

University of Virginia Use Case

Integral part of our preservation and curatorial landscape

Soup to nuts process for analogue materials

◦ Selection

◦ Digitization

◦ Management

◦ Stewardship

UVa – continued

Born Digital

◦ It is all about transfer

◦ Disk images awaiting arrangement

◦ Need and I/O space

◦ Digital Scholarship

Wish we had this years ago

UVa Landscape

Local disk (please only temporary) / scratch disk

Spinning disk – still only backup

Local HSM – local tape backup

APTrust – more robust preservation actions

DPN – dark archive

Basic Technology Goals

Simple submission packaging – BagIt

Strong Chain of Custody – Logging

Format-agnostic basic preservation – Fixity

Strong auditing and reporting – PREMIS

Easily reference items between systems – Identifiers

Simple distribution package for restoration – BagIt
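To make the packaging and fixity goals concrete, here is a minimal sketch of a BagIt-style bag built with only the Python standard library. It is not APTrust's implementation: the slides do not state a BagIt version, a manifest algorithm, or any required tag files, so the `BagIt-Version: 1.0` declaration, the SHA-256 manifest, and the function names `make_bag` and `validate_bag` are assumptions for illustration. The sketch also assumes small files with unique names.

```python
import hashlib
import shutil
from pathlib import Path


def make_bag(source_files, bag_dir):
    """Copy files into bag_dir/data and write bagit.txt plus a SHA-256 payload manifest."""
    bag = Path(bag_dir)
    data = bag / "data"
    data.mkdir(parents=True)
    lines = []
    for src in sorted(Path(s) for s in source_files):
        dest = data / src.name
        shutil.copyfile(src, dest)
        checksum = hashlib.sha256(dest.read_bytes()).hexdigest()
        lines.append(f"{checksum}  data/{src.name}\n")
    # Bag declaration: the version string here is an assumption, not taken from the slides.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n", encoding="utf-8"
    )
    (bag / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")
    return bag


def validate_bag(bag_dir):
    """Return payload paths that are missing or fail their fixity check."""
    bag = Path(bag_dir)
    problems = []
    manifest = (bag / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        expected, rel_path = line.split(maxsplit=1)
        target = bag / rel_path
        if not target.is_file():
            problems.append(rel_path)
        elif hashlib.sha256(target.read_bytes()).hexdigest() != expected:
            problems.append(rel_path)
    return problems
```

Because the manifest travels inside the bag, the same check can be run at submission and again at restoration, which is what lets one format serve both goals on the slide.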

Flow of Content in APTrust

[Diagram. Submission Bag: Metadata (TagFiles); Preservation Files (data/File1, data/File2, data/File3). Ingest: break apart bag and manage as separate, related Fedora objects (Intellectual Object; Generic File1; Generic File2; Generic File3). Restore: repackage to same bag format. DPN Bags (four shown): bagged separately in DPN to support versioning.]

Challenges

Abstracting away from specific repository software

Identifying content across distributed systems

Scaling solutions are still a mixed bag

Managing dependencies in a consortium

Deleting content requires some more work

Sustainability of Service

Common development frameworks – Hydra

Use available cloud services – AWS

Align with evolving preservation ecosystem – OAIS & DDP

◦ Fedora 4

◦ Standards like OAIS and DDP

APTrust and TRAC Certification

APTrust is committed to working toward TRAC certification.

APTrust is the first-ever repository to be built from the ground up taking TRAC into account.

A Certification Working Group has been established and will be advising and consulting with the APTrust staff and partners on TRAC objectives.

Initial development work is proceeding at the level of Digital Object Management and Infrastructure.


Examples of TRAC Requirements

“The repository shall have an appropriate succession plan, contingency plans, and/or escrow arrangements in place in case the repository ceases to operate or the governing or funding institution substantially changes its scope.”

“The repository shall have short- and long-term business planning processes in place to sustain the repository over time.”

“The repository shall have contracts or deposit agreements which specify and transfer all necessary preservation rights, and those rights transferred shall be documented.”

“The repository shall have the appropriate number of staff to support all functions and services.”

“The repository shall have and use a convention that generates persistent, unique identifiers.”

Academic Preservation Trust – part of the evolving national digital preservation infrastructure

“The Task Force envisions the development of a national system of digital archives, which it defines as repositories of digital information that are collectively responsible for the long-term accessibility of the nation’s social, economic, cultural and intellectual heritage instantiated in digital form.”

Preserving Digital Information. Report of the Task Force on Archiving of Digital Information, commissioned by The Commission on Preservation and Access and the Research Libraries Group. May 1, 1996. Executive Summary, iii.

