How to develop a digital preservation strategy

Post on 10-Apr-2017


@ulcc www.ulcc.ac.uk

How to develop a digital preservation strategy
Ed Pinsent, ULCC

IRMS Ireland
ARA Ireland
19 November 2015


About Ed Pinsent

• Ed Pinsent, Digital Archivist at ULCC since 2004
• Teaches digital preservation on the DPTP
• Background as archivist / records manager
• Experience in web-archiving, repository management, metadata projects, migration, digitisation, project management, etc.

• See more at digital archives blog http://dart.blogs.ulcc.ac.uk/


Outline

1. Introduction
2. High-level strategies
3. Detailed strategies
4. Metadata!
5. Possible implementation scenarios
6. Where to get help


Introduction

• Start small – and grow
• Do it in stages


4 workpackages

WP1: High-level strategies
WP2: Detailed strategies
WP3: Metadata!
WP4: Possible implementation scenarios


Assumptions

• You already know about…
– Your users (depositors, searchers, internal staff)
– Your collections
– Your access needs


Workpackage 1

• Ask a few fundamental questions…
• WHY do I want to do digital preservation?
• WHAT will I be preserving?
• WHO am I preserving it for?
• HOW. What capacity do I have for doing it?


Why preserve digital content?

• Legal need?
• Mandate?
• Continuity?
• Business reasons?
• Stakeholder need?
• Seeking improvements?


What are we preserving?

• How many files?
• What formats?
• Where are they?

• Start to narrow the task
• Aim to reduce the digital preservation burden


Who wants it?

• Who benefits from digital preservation?


How will we preserve?

• Capacity
– IT
– Skills
– Resource
– Staff
– Money


Things to do: Survey

• Preparatory surveys of digital content
• Checklist of file formats in scope
• Ownership of collections
• Whereabouts of collections
• Size of collections
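
A preparatory survey of this kind can be started with a very small script. The following is a minimal sketch, not a recommended tool: it assumes the collection sits on a file system you can read, and the function name `survey_collection` and the shape of the returned summary are illustrative choices made here. File extensions are used only as a rough hint at format.

```python
from collections import Counter
from pathlib import Path


def survey_collection(root):
    """Summarise one collection: where it is, how many files, how big, which extensions."""
    root = Path(root)
    formats = Counter()
    file_count = 0
    total_bytes = 0
    for path in root.rglob("*"):
        if path.is_file():
            file_count += 1
            total_bytes += path.stat().st_size
            # An extension is only a hint; it does not prove the real format.
            formats[path.suffix.lower() or "(none)"] += 1
    return {
        "location": str(root),
        "file_count": file_count,
        "total_bytes": total_bytes,
        "formats": dict(formats),
    }
```

Running it once per collection and keeping the summaries side by side gives a first evidence base for the format checklist and size estimates. Ownership still has to be recorded by hand.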


Things to do: Users

• Do initial surveys of user community
– User community could be internal staff, or external users
• Consider “use cases”
• Prepare for your access strategy


Things to do: Capacity

• Perform gap analysis of your organisation
• Start a discussion with IT
• Get a developer on the team
• Appoint a project manager
• Grow your skillsets
• Look for suitable training packages


Outcomes of WP1

• Findings, evidence base
• Use for a report, to make business case, to start requirements gathering…


Workpackage 2

• How will you treat digital objects?
• Potentially committing to a new workload, which will be expensive
• Look for economies
• Define the task and expected results
• This part of the talk will focus on Migration of Files


Tasks might include…

• Study of file formats & behaviours
• Study of conversion tools
• Testing the above
• Doing it against defined acceptance criteria
• Documenting the outcomes
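
One way to make "defined acceptance criteria" concrete is to record which properties of a file must survive a test conversion and compare them before and after. The sketch below is illustrative only: the class `ConversionResult`, the function `check_acceptance` and the property names are hypothetical, and how the properties are measured depends on whichever tools you are testing.

```python
from dataclasses import dataclass


@dataclass
class ConversionResult:
    """Properties measured before and after one test conversion (hypothetical names)."""
    source_name: str
    before: dict
    after: dict


def check_acceptance(result, must_match):
    """Return the properties in must_match that did not survive the conversion."""
    failures = []
    for prop in must_match:
        if result.before.get(prop) != result.after.get(prop):
            failures.append(prop)
    return failures


# Example: a font substitution is tolerated, a change in page count is not.
result = ConversionResult(
    source_name="report.doc",
    before={"page_count": 12, "has_images": True, "font": "Arial"},
    after={"page_count": 12, "has_images": True, "font": "Liberation Sans"},
)
failures = check_acceptance(result, must_match=["page_count", "has_images"])
print(failures or "accepted")
```

The list passed as `must_match` is where the archivists' answer to "how much loss will we accept?" gets written down, and the returned failures are part of the documented outcome.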


And working with other strategies…

• Archivists – how much loss will we accept?
• Users – what formats suit them?
• Collections – their digital nature
• Other stakeholders
• Building use cases


Outcome of WP2

• Evidence base passed to IT


Workpackage 3


Why metadata matters

• Gain intellectual control
• Manage digital objects
• Document the nature of digital objects (technical metadata)
• Document your actions (migrations, changes)
• Document deposits
• Prove the authenticity of your content, increasing trust and confidence (checksums, audit trails…)
• Manage rights, access, FOI, IPR…
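
As a rough illustration of how a checksum, a deposit note and an audit trail can live together in one record, here is a minimal sketch. It does not follow any metadata standard: the field names and the functions `new_record` and `log_event` are assumptions made for this example, and SHA-256 is simply one commonly used checksum algorithm.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute a SHA-256 checksum without loading the whole file into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def new_record(path, depositor):
    """Create a record at deposit time (field names are illustrative)."""
    path = Path(path)
    return {
        "file_name": path.name,
        "size_bytes": path.stat().st_size,
        "sha256": sha256_of(path),
        "depositor": depositor,
        "events": [],
    }


def log_event(record, event_type, detail):
    """Append a timestamped entry to the record's audit trail."""
    record["events"].append({
        "type": event_type,
        "detail": detail,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return record
```

A migration would then be logged with something like `log_event(record, "migration", "converted to access copy")`, so the record shows both what the object is and what has been done to it.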


Outcome of WP3

• Metadata policy passed to IT
• Requirements passed to vendor


Workpackage 4

• Possible implementation scenarios
– Different areas of concentration
– Different benefits for different audiences
– How it could grow
– Get there incrementally and still reap benefits as you grow
– Costs: understand them, break them down, phase them in over time


Scenario #1: storage only

Focus: Bit-level preservation

• Dedicated storage, offsite copies, security, backup, checksums, validation, etc.

• Access / migration strategy deferred
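
Checking that an offsite or backup copy still matches the original is the core routine of bit-level preservation. The sketch below assumes both copies are reachable as directories. The function names and the three status labels ("ok", "changed", "missing") are choices made for this example, not an established convention.

```python
import hashlib
from pathlib import Path


def file_checksum(path):
    """SHA-256 of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_checksum(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(root, manifest):
    """Compare a copy of the collection against a previously built manifest."""
    root = Path(root)
    report = {}
    for relative_name, expected in manifest.items():
        candidate = root / relative_name
        if not candidate.is_file():
            report[relative_name] = "missing"
        elif file_checksum(candidate) != expected:
            report[relative_name] = "changed"
        else:
            report[relative_name] = "ok"
    return report
```

Build the manifest once from the primary storage, keep it somewhere safe, and run `verify_copy` against each copy on a schedule. Anything not reported as "ok" needs investigating.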


How it can grow

• Increased storage capacity

• Different types / levels of storage

• Tools for compression

• Improvements to technical metadata

• Archivists become sysadmins


Scenario #2: Access

Focus: Create access copies and serve them to users

• Migration approach

• Tools and workflows for migration

• Digital objects linked to descriptive catalogues

• Additional storage for access copies


How it can grow

• Automatic creation of access copies

• More, better conversion tools

• More catalogues / metadata

• Dedicated user API for viewing content

• Multiple format support

• Sharing / consortial arrangements


Scenario #3: For archivists

Focus: Create great preservation copies and associated metadata

• Technical metadata / file format tools

• API for archivists

• Migration strategy

• Database to index and search
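
File format tools generally identify a format from the bytes at the start of a file, not from its extension. The following is a deliberately tiny sketch of that idea. The signature table covers only a handful of well-known formats, and the function name `identify_format` is an assumption of this example. Real identification tools draw on far larger signature registries.

```python
# Leading byte signatures for a few common formats.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_format(path):
    """Guess a file's format from its first bytes; return "unknown" if nothing matches."""
    with open(path, "rb") as handle:
        header = handle.read(16)
    for signature, name in SIGNATURES:
        if header.startswith(signature):
            return name
    return "unknown"
```

Comparing this result with the file's extension is a cheap way to spot mislabelled files before planning any migration.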


How it can grow

• Develop migration strategy for more formats

• Preservation metadata and audit trails

• Improvements in technical metadata

• More file format identification tools

• Additional security / validation

• More powerful database, improved reporting


Scenario #4: for records managers

Focus: Producer engagement, user needs, compliance

• Captures born-digital records and keeps them long-term

• Normalisation policy and conversion tools

• Access may be deferred, must be controlled


How it can grow

• Better workflows

• Better security

• Better audit trails and record-keeping

• Accessible, protected copies of content for external auditors and internal staff

• Better producer relations

• Transfer methods

• Early capture

• Additional security / authenticity assurance


Scenario #5: income generation

Focus: Generate income streams from preserved content

• Content served to a paying public

• Adds a paywall mechanism or shopping cart

• Protects assets


How it can grow

• Social media / tagging

• User interaction

• Generates interest

• Licenses reuse of (e.g.) images


Conclusion

• Where to get help and advice
• http://www.dpconline.org
• http://www.dcc.ac.uk
• http://www.dptp.org
• http://www.clir.org
• http://blogs.loc.gov/digitalpreservation/


Strategic tools for planning digital preservation

• Drambora - http://www.repositoryaudit.eu/
• DPCMM - http://www.savingthedigitalworld.com/resources/digital-preservation-capability-maturity-model
• AIDA - http://aida.jiscinvolve.org/wp/
• CARDIO - http://cardio.dcc.ac.uk/
• LOC levels - http://blogs.loc.gov/digitalpreservation/2012/11/ndsa-levels-of-digital-preservation-release-candidate-one/
• PLATO - http://www.ifs.tuwien.ac.at/dp/plato/intro/
• SCOUT - http://scout.keep.pt/web/