Metadata For Preservation Delos

Date post: 14-May-2015
Upload: digitalpreservationeurope
Page 1: Metadata For Preservation Delos

Metadata for Preservation

Priscilla Caplan, Florida Center for Library Automation

Page 2: Metadata For Preservation Delos

Outline

Do it yourself: let’s invent some preservation metadata

The OAIS Information Model

Metadata standards for preservation
  general preservation metadata standards
  format-specific technical metadata
  packaging standards

Problems/issues/interesting things

Page 4: Metadata For Preservation Delos

first things first

what is metadata?

how do we normally classify different types of metadata?

what is preservation metadata?

metadata related to the preservation management of information resources; for example, metadata used to document, or created as a result of, preservation processes performed on information resources.

information that supports and documents the long-term preservation of information materials.

Page 6: Metadata For Preservation Delos

Fixity

Viability

Renderability

Description

Secure storage

Media management

Availability

Identity

Capture

Selection

Understandability

Authenticity

Format strategies (migration, emulation, ...)

Authentication

Documentation

Preservation Pyramid

preservation metadata

Page 7: Metadata For Preservation Delos

fixity

the quality of not being altered or deleted

threatened by insecure storage and media degradation

Page 8: Metadata For Preservation Delos

metadata supporting fixity

a message digest (checksum)
the algorithm used to generate it
when it was last calculated
who did the calculation
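The four items above can be captured in one small record. The following is a minimal sketch, not a prescribed implementation: the function names, the dictionary keys, and the default agent name `example-repository` are assumptions made for illustration, and SHA-256 is just one possible algorithm.

```python
import hashlib
from datetime import datetime, timezone


def fixity_record(data: bytes, algorithm: str = "sha256",
                  agent: str = "example-repository") -> dict:
    """Build a fixity record for a byte sequence (key names are illustrative)."""
    digest = hashlib.new(algorithm, data).hexdigest()
    return {
        "message_digest": digest,                                 # the checksum
        "algorithm": algorithm,                                   # how it was generated
        "calculated_at": datetime.now(timezone.utc).isoformat(),  # when it was calculated
        "calculated_by": agent,                                   # who did the calculation
    }


def is_unaltered(data: bytes, record: dict) -> bool:
    """Recompute the digest with the recorded algorithm and compare."""
    current = hashlib.new(record["algorithm"], data).hexdigest()
    return current == record["message_digest"]


record = fixity_record(b"content of the preserved file")
```

A later fixity check calls `is_unaltered` with the bytes read back from storage; a `False` result means the object was altered or corrupted after the record was made.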

Page 9: Metadata For Preservation Delos

viability

the quality of being readable from media

threatened by media degradation and media obsolescence

Page 10: Metadata For Preservation Delos

metadata supporting viability

the type of medium used to store the object
the age of the specific unit
the date the object was written to the unit
performance metrics for the medium (MTTF)
usage metrics for the unit

Page 11: Metadata For Preservation Delos

renderability

the quality of being displayable, playable, or otherwise usable

threatened by format obsolescence

Page 12: Metadata For Preservation Delos

authenticity

the object is what it purports to be; both the source and the content are verifiable

threatened by unknown provenance, undocumented alterations

Page 13: Metadata For Preservation Delos

metadata supporting authenticity

the source of the object
a history of the custody of the object
a record of any changes to the object
a digital signature (maybe)

Page 14: Metadata For Preservation Delos

OAIS

Page 15: Metadata For Preservation Delos

OAIS Information Model

Page 16: Metadata For Preservation Delos

representation information

the information that is needed to make a Content Data Object understandable to a Designated Community

Structural: the format is BIFF8; column 1 is a date (yyyy-mm-dd), column 2 is a decimal

Semantic: this is a daily business log for XYZ Corp.; col. 1 is the date of business, col. 2 is gross take in Euros

Page 17: Metadata For Preservation Delos

representation information

may be recursive

Structural: the format is BIFF8
  format specification for BIFF8 (in PDF)
    format specification for PDF
  rules for rendering as a spreadsheet
  column 1 is a date (yyyy-mm-dd), column 2 is a decimal

Semantic: this is a daily business log for XYZ Corp.; col. 1 is the date of business, col. 2 is gross in Euros
  currency equivalence chart
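The recursion in this slide can be sketched as a small tree in which each piece of representation information may itself depend on further representation information. This is an illustration only: the class `RepInfo` and the function `flatten` are hypothetical names, not part of OAIS, and the nesting of the PDF specification under the BIFF8 specification is an assumed reading of the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RepInfo:
    """One piece of representation information and what it depends on."""
    description: str
    depends_on: List["RepInfo"] = field(default_factory=list)


def flatten(info: RepInfo, depth: int = 0) -> List[str]:
    """List an item and, recursively, everything needed to understand it."""
    lines = ["  " * depth + info.description]
    for dependency in info.depends_on:
        lines.extend(flatten(dependency, depth + 1))
    return lines


pdf_spec = RepInfo("format specification for PDF")
biff8_spec = RepInfo("format specification for BIFF8 (in PDF)", [pdf_spec])
structural = RepInfo(
    "Structural: the format is BIFF8",
    [biff8_spec, RepInfo("rules for rendering as a spreadsheet")],
)
semantic = RepInfo(
    "Semantic: col. 2 is gross in Euros",
    [RepInfo("currency equivalence chart")],
)

for line in flatten(structural) + flatten(semantic):
    print(line)
```

In practice the recursion stops where the Designated Community is assumed to understand the material without further help.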

Page 18: Metadata For Preservation Delos

preservation descriptive information

The information necessary to preserve the Content Information

reference = identifier(s)

context = relation to other Content Information

provenance = history of creation, modification, custody

fixity = checksums and similar mechanisms

Page 19: Metadata For Preservation Delos

packaging information

the information which, either actually or logically, binds, identifies and relates the Content Information and Preservation Descriptive Information

Page 20: Metadata For Preservation Delos

Standards

Page 21: Metadata For Preservation Delos

metadata standards for preservation

General preservation metadata standards
  PREMIS (Preservation Metadata: Implementation Strategies)
  LMER (Long-term Preservation Metadata for Electronic Resources)

Format-specific technical metadata
  Z39.87 NISO/AIIM Technical metadata for digital still images
  AES-X098B core audio metadata

Packaging standards
  METS (Metadata Encoding and Transmission Standard)
  MPEG-21 Digital Item Declaration Language

Page 22: Metadata For Preservation Delos

general standards

PREMIS (Preservation Metadata: Implementation Strategies)

LMER (Long-term Preservation Metadata for Electronic Resources)

Page 23: Metadata For Preservation Delos

PREMIS

an implementable core set of preservation metadata

defines preservation metadata as “the information a repository uses to support the digital preservation process”

defines core as what most repositories need to know most of the time

but what is implementable?

Page 24: Metadata For Preservation Delos

Implementable preservation metadata ...

is precisely defined
can be automatically supplied
can be automatically processed
  e.g. prefer coded values from authority lists
is implementation independent
is based on a rigorous data model

Page 25: Metadata For Preservation Delos

PREMIS Data Model

Page 26: Metadata For Preservation Delos

Intellectual entity

Set of content that is considered a single intellectual unit for purposes of management and description (e.g., a book, a photograph, a map, a database)

May include other Intellectual Entities (e.g. a website that includes a web page)

Has one or more digital representations

Not described in PREMIS – use descriptive metadata

Examples:
  Planets Newsletter, Issue 3
  “Identical twins” by Diane Arbus (a photograph)
  Digital Curation Centre website

Page 27: Metadata For Preservation Delos

Object

What the repository actually preserves

Three types of object:

FILE: named and ordered sequence of bytes that is known by an operating system

REPRESENTATION: set of files, including structural metadata, that, taken together, constitute a complete rendering of an Intellectual Entity

BITSTREAM: data within a file with properties relevant for preservation purposes (but needs additional structure or reformatting to be stand-alone file)

Page 28: Metadata For Preservation Delos

Example: An IE with two representations

Intellectual Entity: “My dog Ace”
  Representation 1: TIFF version
    File 1: dog.TIFF
  Representation 2: JPEG2000 version
    File 2: dog.JP2

Bitstream 1: Embedded metadata
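The hierarchy in this example can be sketched with plain data classes. The class and field names below are illustrative rather than PREMIS semantic units. The flattened diagram does not say which file the bitstream belongs to, so placing it inside `dog.TIFF` is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bitstream:
    label: str


@dataclass
class File:
    name: str
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Representation:
    label: str
    files: List[File] = field(default_factory=list)


@dataclass
class IntellectualEntity:
    title: str
    representations: List[Representation] = field(default_factory=list)


def file_names(entity: IntellectualEntity) -> List[str]:
    """Collect the names of all files across all representations."""
    return [f.name for rep in entity.representations for f in rep.files]


ace = IntellectualEntity(
    "My dog Ace",
    [
        Representation(
            "TIFF version",
            # Assumption: the embedded-metadata bitstream lives in the TIFF file.
            [File("dog.TIFF", [Bitstream("Embedded metadata")])],
        ),
        Representation("JPEG2000 version", [File("dog.JP2")]),
    ],
)
```

Calling `file_names(ace)` walks both representations and returns the two file names in order.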

Page 29: Metadata For Preservation Delos

Example 2: Another IE with 2 representations

Intellectual Entity: Da Vinci Code by Dan Brown
  Representation 1: Page image version
    File 1: page1.tiff
    File 2: page2.tiff
    File N: pageN.tiff
    File N+1: METS.xml
  Representation 2: ebook version
    File 1: book.lit

Page 30: Metadata For Preservation Delos

Event

An action that involves or impacts at least one Object or Agent associated with or known by the preservation repository

Helps document digital provenance. Can track the history of an Object through the chain of Events that occur during the Object's lifecycle

Examples:
  Validation Event: verify that chapter1.pdf is a valid PDF file
  Ingest Event: transform an OAIS SIP into an AIP (one Event or multiple Events?)
  Migration Event: create a new version of a file in a more current format

Page 31: Metadata For Preservation Delos

Agent

Person, organization, software program associated with an Event or a Right

Not defined in detail in PREMIS

Examples:
  Seamus Ross (a person)
  British Library (an organization)
  DAITSS (a system)
  dioscuri (a software program)

Page 32: Metadata For Preservation Delos

Rights

Rights statement describes one or more rights or permissions granted to the repository
  What is the basis for claiming the right? – statute, copyright, license
  What can the repository do?

Examples:
  because copyright status is public domain, repository can give unrestricted access, make copies and make derivative works
  because of license terms, repository can make up to 10 copies
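The two examples above can be sketched as a rights statement that pairs a basis with the acts it permits. The class name, the act labels (`access`, `copy`, `derive`), and the `may_perform` helper are all hypothetical and are not PREMIS vocabulary. A limit of `None` is used here to mean "unrestricted".

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RightsStatement:
    basis: str  # statute, copyright, or license
    # Maps an act to the maximum number of times it may be performed;
    # None means no limit.
    permitted_acts: Dict[str, Optional[int]] = field(default_factory=dict)


def may_perform(statement: RightsStatement, act: str, times_done: int = 0) -> bool:
    """Return True if the act is granted and any numeric limit is not yet reached."""
    if act not in statement.permitted_acts:
        return False
    limit = statement.permitted_acts[act]
    return limit is None or times_done < limit


# Public domain: unrestricted access, copies, and derivative works.
public_domain = RightsStatement(
    "copyright", {"access": None, "copy": None, "derive": None}
)
# License terms: up to 10 copies, nothing else granted.
licensed = RightsStatement("license", {"copy": 10})
```

With these statements, a repository that has already made 10 copies under the license gets `False` from `may_perform(licensed, "copy", 10)`, while the public-domain statement never refuses a copy.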

Page 33: Metadata For Preservation Delos

some things we say about Objects

object identifier
general technical characteristics, e.g.
  size, format, fixity, inhibitors, creating application
composition level
format specific technical characteristics (use extension)
original name
storage environment
digital signature
relationships to other objects
relationships to agents, events, and rights statements
significant properties

Page 34: Metadata For Preservation Delos

significant properties

the characteristics of digital objects which must be preserved over time in order to ensure the continued accessibility, usability, and meaning of the objects, and their capacity to be accepted as evidence of what they purport to record. (Andrew Wilson)

the characteristics of a particular object subjectively determined to be important to maintain through preservation actions

Page 35: Metadata For Preservation Delos

how could you preserve this apple?

Page 36: Metadata For Preservation Delos

significant properties

performance model: a source file is interpreted through a process to create a performance; in other words, the object is meaningful only as it is perceived

often faceted as content, context, appearance (rendering), structure, and behavior

InSPECT (Investigating the Significant Properties of Electronic Content over Time)

can apply to all objects of a given format, or individual objects

may be in the eye of the beholder

Page 37: Metadata For Preservation Delos

some things we say about Events

event identifier
event type
date and time
detail
outcome information
agents and their roles
objects and their roles
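The items above map naturally onto a record type, and an Object's history is then the time-ordered chain of Events that mention it. This is a rough sketch: the field names, the role labels, and all dates and identifiers are invented for illustration and are not taken from the PREMIS Data Dictionary.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class Event:
    identifier: str
    event_type: str
    date_time: datetime
    detail: str
    outcome: str
    agents: Dict[str, str] = field(default_factory=dict)   # agent -> role
    objects: Dict[str, str] = field(default_factory=dict)  # object id -> role


def history(object_id: str, events: List[Event]) -> List[Event]:
    """Return the Events involving an Object, oldest first."""
    related = [e for e in events if object_id in e.objects]
    return sorted(related, key=lambda e: e.date_time)


# Hypothetical events; dates, identifiers, and roles are made up.
events = [
    Event("ev-2", "validation", datetime(2009, 3, 2, 10, 0),
          "verify that chapter1.pdf is a valid PDF file", "success",
          {"DAITSS": "executing program"}, {"chapter1.pdf": "source"}),
    Event("ev-1", "ingest", datetime(2009, 3, 1, 9, 0),
          "transform an OAIS SIP into an AIP", "success",
          {"DAITSS": "executing program"}, {"chapter1.pdf": "source"}),
    Event("ev-3", "migration", datetime(2009, 3, 5, 14, 30),
          "create a new version of a file in a more current format", "success",
          {"DAITSS": "executing program"},
          {"dog.TIFF": "source", "dog.JP2": "outcome"}),
]

chapter_history = [e.event_type for e in history("chapter1.pdf", events)]
```

Keeping agents and objects as role-labelled links, rather than embedding them in the Event, lets one Event refer to several Objects, as the migration example does.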

Page 38: Metadata For Preservation Delos

Sample Data Dictionary entry

Semantic unit: size

Semantic components: None

Definition: The size in bytes of the file or bitstream stored in the repository.

Rationale: Size is useful for ensuring the correct number of bytes from storage have been retrieved and that an application has enough room to move or process files. It might also be used when billing for storage.

Data constraint: Integer

Object category   Representation   File             Bitstream
Applicability     Not applicable   Applicable       Applicable
Repeatability                      Not repeatable   Not repeatable
Obligation                         Optional         Optional

Examples: 2038927

Creation/Maintenance notes: Automatically obtained by the repository.

Usage notes: Defining this semantic unit as size in bytes makes it unnecessary to record a unit of measurement. However, for the purpose of data exchange the unit of measurement should be stated or understood by both partners.

Page 39: Metadata For Preservation Delos

PREMIS Maintenance Activity

Page 40: Metadata For Preservation Delos

LMER

Authored by Die Deutsche Bibliothek, used in kopal

Explicitly for exchange

Based on the National Library of New Zealand’s data model

Page 41: Metadata For Preservation Delos

a quick aside on archives

Page 42: Metadata For Preservation Delos

format specific technical metadata

What kinds of properties are format-specific?
  number of tracks
  character set
  height, width
  color space
  fonts

Page 43: Metadata For Preservation Delos

format specific metadata “standards”

NISO/AIIM Z39.87-2006, Data Dictionary - Technical metadata for digital still images

AES-X098B, Core audio metadata XML definition (draft)

textMD (now maintained by Library of Congress)

JHOVE and metadata extraction tools

Page 44: Metadata For Preservation Delos

Z39.87

Revised second edition

Defers to PREMIS where elements overlap

XML binding is MIX, maintained by Library of Congress

Page 45: Metadata For Preservation Delos

issues with format-specific metadata

how much of it is useful for preservation?
what would you use it for?
if you can extract it from a file header, do you need to extract it from the file header?
what to do when schema for format-specific metadata also defines general technical metadata?
what is the proper role of registries?

Page 46: Metadata For Preservation Delos

packaging standards

METS (Metadata Encoding and Transmission Standard)
MPEG-21 Digital Item Declaration Language
IMS Global Learning Consortium Content Packaging Standards
Sharable Content Object Reference Model (SCORM)
CCSDS XML Packaging scheme

Page 47: Metadata For Preservation Delos

METS

Page 48: Metadata For Preservation Delos

structure of a METS document

amdSec can include source, provenance, rights, and technical metadata
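As a rough illustration of that structure, the sketch below builds an empty METS skeleton with Python's standard library. The helper names `q` and `build_mets_skeleton` and the ID values are assumptions. The result shows only the top-level sections and the four kinds of administrative metadata, and is not a schema-valid METS document (for example, a real `structMap` needs a `div`, and each metadata section needs wrapped or referenced content).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets_skeleton() -> ET.Element:
    """Create the major METS sections with an empty amdSec breakdown."""
    root = ET.Element(q("mets"))
    ET.SubElement(root, q("metsHdr"))
    ET.SubElement(root, q("dmdSec"), ID="DMD1")  # descriptive metadata
    amd = ET.SubElement(root, q("amdSec"), ID="AMD1")
    # technical, rights, source, and digital provenance metadata
    for name in ("techMD", "rightsMD", "sourceMD", "digiprovMD"):
        ET.SubElement(amd, q(name), ID=f"{name}1")
    ET.SubElement(root, q("fileSec"))
    ET.SubElement(root, q("structMap"))
    return root


mets = build_mets_skeleton()
print(ET.tostring(mets, encoding="unicode"))
```

PREMIS metadata is commonly carried inside these amdSec subsections, for instance Object information under `techMD` and Event information under `digiprovMD`.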

Page 49: Metadata For Preservation Delos

Issues, problems and Interesting things

Page 50: Metadata For Preservation Delos

does preservation metadata actually work?

how best to store in working repositories?

role of centralized registries

what can be automated

best practices for interoperability

Page 51: Metadata For Preservation Delos

references

Priscilla Caplan, Preservation Metadata (DCC Digital Curation Manual) http://www.dcc.ac.uk/resource/curation-manual/chapters/preservation-metadata/

Brian Lavoie, Technology Watch Report: Preservation Metadata http://www.dpconline.org/docs/reports/dpctw05-01.pdf

PREMIS Maintenance Activity http://www.loc.gov/standards/premis/

METS Maintenance Activity http://www.loc.gov/standards/mets/

Page 52: Metadata For Preservation Delos

Creative Commons Licence

This work is licensed under the Creative Commons Attribution 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

