Case History: Library of Congress
Audio-Visual Prototyping Project
METS Opening Day (2003), Revised
For the CUL Metadata Working Group
July 22, 2004
Carl Fleischhauer
Office of Strategic Initiatives, Library of Congress
The AV Project
• Preservation, sense one: reformatting into digital-file form
• Preservation, sense two: sustaining digital objects
• Participation by Motion Picture, Broadcasting, and Recorded Sound Division (M/B/RS) and the American Folklife Center
Reformatting Documentation
• About the source – original disc or tape being reformatted – <amdSec><sourceMD><AMD audio ext schema>
• About the process – how the copy file was made, what devices/tools – <amdSec><digiProvMD><PMD process ext schema>
• About the outcome – characteristics and features of the copy file – <amdSec><techMD><AMD audio ext schema>
Diagram of Extension Schemas
See also: http://lcweb.loc.gov/rr/mopic/avprot/metsmenu2.html
<mets>
  <dmdSec> descriptive metadata
    <MODS> – MODS standard as maintained by LC
  <techMD> technical metadata
    <AMD> audio (file) metadata – rolled our own, using data dictionary from AES
    <MIX> image (file) metadata – standard as maintained by LC, data dictionary from NISO
  <rightsMD> rights and access mgt metadata
    <RMD> access “category” metadata – rolled our own, just tracking categories
  <sourceMD> source metadata
    <AMD> audio (source) metadata – same schema as AMD above
    <MIX> image (source) metadata – same schema as MIX above
  <digiProv> digital provenance metadata
    <PMD> digital provenance metadata – rolled our own, data dictionary from AES, with some simplifications
  <behaviorSec> behavior section – did not use; will METS profiles play this role?
  <fileGrp> file group (inventory) – from METS proper
  <structMap> structural map – from METS proper
</mets>
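The section layout in the diagram can be sketched in code. The following is a minimal illustration, not the project's generator: it uses Python's standard ElementTree with the METS namespace, and the OBJID value and the ID attribute values are made up for the example. Element names follow METS proper (for example digiprovMD, and fileGrp nested inside fileSec).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Return the namespace-qualified form of a METS element name."""
    return f"{{{METS_NS}}}{tag}"


def build_skeleton(objid):
    """Build an empty METS instance with the sections shown in the diagram."""
    root = ET.Element(q("mets"), {"OBJID": objid})
    ET.SubElement(root, q("dmdSec"), {"ID": "DMD1"})
    amd = ET.SubElement(root, q("amdSec"), {"ID": "AMD1"})
    # The four administrative metadata sections, in METS schema order.
    for tag in ("techMD", "rightsMD", "sourceMD", "digiprovMD"):
        ET.SubElement(amd, q(tag), {"ID": f"{tag}1"})
    file_sec = ET.SubElement(root, q("fileSec"))
    ET.SubElement(file_sec, q("fileGrp"), {"USE": "master"})
    ET.SubElement(root, q("structMap"))
    return root


# "afc1941001_sr05" is a hypothetical object identifier.
skeleton = build_skeleton("afc1941001_sr05")
print(ET.tostring(skeleton, encoding="unicode"))
```

The extension schema content (MODS, AMD, MIX, RMD, PMD) would sit inside these sections; the sketch leaves them empty.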
Reference Model for an Open Archival Information System (OAIS)
[Diagram labels: Producers, Ingest, Data Management, Archival Storage, Access, Consumers, Administration, Preservation Planning]
SIPs (Submission Information Packages) will be
produced by the AV preservation activity, ready to
submit to LC’s future digital repository.
AV Project Web Site Home Page http://lcweb.loc.gov/rr/mopic/avprot/
AV Project Extension Schema Page http://lcweb.loc.gov/rr/mopic/avprot/metsmenu2.html
AV Project Initial Data Capture System
MS-Access Database - Collation Input Screen
Top level: work
Second level: sound recordings
Third level: disc sides
Fourth level: cuts
Recorded Sound Processing Section
Content selected for reformatting
1. Initial creation or copying-in of metadata
LC Recording Lab or offsite contractor
Scanning activity
2. Creation of second layer of metadata
3. Return loop to processing, edit and possible addition of third layer of metadata
Workflow Sidebar
The AV METS System Today
OUTCOME ONE: A VIRTUAL DIGITAL OBJECT (SIP)
Logical storage structure based in a UNIX filesystem
master -- family of logical directories where the master files are stored (there is a parallel set of “service” directories)
afc -- “owner” is the American Folklife Center
afc1941001 -- group or aggregate of items, often from an actual collection
sr05 -- item directory (at the level of the digital object, counterpart to a bib record or “line” in a finding aid)
sr05am.wav -- the master file for side A of this disc
sr05bm.wav -- the master file for side B of this disc
Index of master/afc/afc1941001/sr05
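To make the directory levels concrete, here is a small sketch that splits a logical pathname into the five levels listed above. The field names (family, owner, aggregate, item, filename) are labels chosen for this example, not terms from the project's tools.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class LogicalPath(NamedTuple):
    family: str     # "master" (or the parallel "service" family)
    owner: str      # e.g. "afc" for the American Folklife Center
    aggregate: str  # group of items, often an actual collection
    item: str       # item directory, at the level of the digital object
    filename: str   # an individual file within the item


def parse_logical_path(path):
    """Split a relative logical pathname into its five named levels."""
    parts = PurePosixPath(path).parts
    if len(parts) != 5:
        raise ValueError(f"expected 5 path levels, got {len(parts)}: {path!r}")
    return LogicalPath(*parts)


parsed = parse_logical_path("master/afc/afc1941001/sr05/sr05am.wav")
print(parsed)
```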
OUTCOME ONE: VIRTUAL DIGITAL OBJECT
The fileGrp segment of a METS instance “binds” the object
Includes logical pathnames for files; a future switch to persistent names is possible.
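A rough sketch of how a fileGrp can bind files by logical pathname, using Python's standard ElementTree. The file ID values and the OTHERLOCTYPE label are assumptions made for illustration; a later switch to persistent names would change only the href values and the LOCTYPE.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_file_group(use, paths):
    """Return a fileGrp element with one file/FLocat pair per logical path."""
    grp = ET.Element(f"{{{METS_NS}}}fileGrp", {"USE": use})
    for n, path in enumerate(paths, start=1):
        file_el = ET.SubElement(grp, f"{{{METS_NS}}}file", {"ID": f"FILE{n}"})
        ET.SubElement(
            file_el,
            f"{{{METS_NS}}}FLocat",
            {
                "LOCTYPE": "OTHER",
                "OTHERLOCTYPE": "logical path",  # assumed label
                f"{{{XLINK_NS}}}href": path,
            },
        )
    return grp


file_grp = build_file_group("master", ["master/afc/afc1941001/sr05/sr05am.wav"])
print(ET.tostring(file_grp, encoding="unicode"))
```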
OUTCOME TWO: PRESENTATION OF OBJECT
Presentation in Browser
Zoom on Image in Presentation
Interim username/password access management
In the Presentation: Metadata Map for the Dedicated
sourceMD data from the Metadata Map
Extension schema content displayed as name-value pairs
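One way to produce such a name-value display is to walk the XML tree and emit a pair for each attribute and each text value. The sketch below assumes nothing about the real AMD schema; the sample element names and values are invented for the example.

```python
import xml.etree.ElementTree as ET


def name_value_pairs(element, prefix=""):
    """Flatten an XML element into (name, value) pairs in document order."""
    local = element.tag.split("}")[-1]  # drop any namespace part
    name = f"{prefix}/{local}" if prefix else local
    pairs = []
    for attr, value in element.attrib.items():
        pairs.append((f"{name}/@{attr}", value))
    text = (element.text or "").strip()
    if text:
        pairs.append((name, text))
    for child in element:
        pairs.extend(name_value_pairs(child, name))
    return pairs


# Hypothetical fragment, not taken from the AMD extension schema.
SAMPLE = '<source><format>disc</format><speed unit="rpm">78</speed></source>'

pairs = name_value_pairs(ET.fromstring(SAMPLE))
for name, value in pairs:
    print(f"{name}: {value}")
```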
Generator takes data from the database and makes METS XML
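The generator idea can be sketched as follows. This is an illustration under stated assumptions: an in-memory SQLite database stands in for the project database, and the table name, column names, and sample row are hypothetical. Only the METS element names (metsHdr, agent, name) come from METS itself.

```python
import sqlite3
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    return f"{{{METS_NS}}}{tag}"


def generate_mets(conn, objid):
    """Read agent rows for one object and emit a METS root with a metsHdr."""
    root = ET.Element(q("mets"), {"OBJID": objid})
    header = ET.SubElement(root, q("metsHdr"))
    rows = conn.execute(
        "SELECT role, name FROM agent WHERE objid = ? ORDER BY rowid", (objid,)
    )
    for role, name in rows:
        agent = ET.SubElement(header, q("agent"), {"ROLE": role})
        ET.SubElement(agent, q("name")).text = name
    return root


# Hypothetical table and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agent (objid TEXT, role TEXT, name TEXT)")
conn.execute(
    "INSERT INTO agent VALUES (?, ?, ?)",
    ("afc1941001_sr05", "CUSTODIAN", "American Folklife Center"),
)

doc = generate_mets(conn, "afc1941001_sr05")
print(ET.tostring(doc, encoding="unicode"))
```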
Snapshot of the database back end
Selection from the database diagram: tables for METS id, agent information, and structMap data
Selection from the database diagram: tables for extension schema data for image source, video source, and audio source
Selection from the database diagram: tables for digiProv (“digitization process”) information
Builder: the data-entry front end to the database
Builder: template making tool
Builder: tool to shape a structMap using indent, outdent, up, and down. May be used in both template and individual object modes.
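The four shaping operations can be modeled on a flat, ordered list of rows that each carry a depth, which is how many outline editors work. The sketch below is an assumption about how such a tool might behave, not a description of the Builder's code; each operation moves a row together with its descendants.

```python
from dataclasses import dataclass


@dataclass
class Node:
    label: str
    depth: int = 0


def subtree_end(nodes, i):
    """Index just past node i and all of its descendants."""
    j = i + 1
    while j < len(nodes) and nodes[j].depth > nodes[i].depth:
        j += 1
    return j


def indent(nodes, i):
    """Make node i a child of the row above it, if that is structurally valid."""
    if i > 0 and nodes[i - 1].depth >= nodes[i].depth:
        for node in nodes[i:subtree_end(nodes, i)]:
            node.depth += 1


def outdent(nodes, i):
    """Raise node i (and its descendants) one level."""
    if nodes[i].depth > 0:
        for node in nodes[i:subtree_end(nodes, i)]:
            node.depth -= 1


def move_up(nodes, i):
    """Swap node i's subtree with its previous sibling's subtree."""
    j = i - 1
    while j >= 0 and nodes[j].depth > nodes[i].depth:
        j -= 1
    if j >= 0 and nodes[j].depth == nodes[i].depth:
        end = subtree_end(nodes, i)
        nodes[j:end] = nodes[i:end] + nodes[j:i]


def move_down(nodes, i):
    """Swap node i's subtree with its next sibling's subtree."""
    end = subtree_end(nodes, i)
    if end < len(nodes) and nodes[end].depth == nodes[i].depth:
        next_end = subtree_end(nodes, end)
        nodes[i:next_end] = nodes[end:next_end] + nodes[i:end]


nodes = [Node("Disc"), Node("Side A"), Node("Side B"), Node("Cut 1")]
indent(nodes, 1)   # Side A under Disc
indent(nodes, 2)   # Side B under Disc
indent(nodes, 3)   # Cut 1 under Disc
indent(nodes, 3)   # Cut 1 under Side B
move_up(nodes, 2)  # Side B (with Cut 1) moves above Side A
for node in nodes:
    print("  " * node.depth + node.label)
```

Once the outline has the intended shape, the depth values translate directly into nested structMap div elements.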
“Cut wizard” – a “twenty more like this one” tool
Part of MODS descriptive data for a recorded interview with a former enslaved person.
File Association Tool
Tool to append a MODS record
Two samples from the MODS entry and editing tool.
+ repeats the section
x and – delete sections or subsections
Selection from the online data dictionary
Some METS objects, by title
Administration Tool Menu
Example of data entry screen
Blue terms are used to select separate data entry screens
Some Shortcomings
• Cumbersome data entry – many screens, many actions
• Bugs – hard to get them all fixed now that the contractor is gone
• Best if users understand METS and the structMap – barrier to entry for new team members
• Does not include tools for bulk compilation from pre-existing data
Distributed Data Entry
• Each team enters its own data in less cumbersome “local” tools
• Tool for descriptive data, especially copying in and out of the ILS
• Tool for data about the source item and certain technical aspects, copied in and out of MAVIS
• Tool for digiProv data, “the engineers’ form”
• Tool or a MAVIS extension to encode the structMap
Supporting Tools
• Approach being discussed
– Dispersed tools produce XML outputs
– Centralized tool gathers and compiles the various XML data units into a METS instance
– Downstream facility to manage the METS XML documents
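The approach under discussion might look roughly like this: each dispersed tool hands over an XML string, and a central function wraps each unit in the matching METS section. The function names, the ID values, the OBJID, and the placeholder payloads are all assumptions for illustration; only the METS element and attribute names are taken from METS proper.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

# Order in which METS expects the administrative sections to appear.
AMD_ORDER = ("techMD", "rightsMD", "sourceMD", "digiprovMD")


def q(tag):
    return f"{{{METS_NS}}}{tag}"


def wrap_unit(parent, tag, md_id, label, payload_xml):
    """Wrap one tool's XML output in section/mdWrap/xmlData under parent."""
    section = ET.SubElement(parent, q(tag), {"ID": md_id})
    if label == "MODS":
        attrs = {"MDTYPE": "MODS"}
    else:
        attrs = {"MDTYPE": "OTHER", "OTHERMDTYPE": label}
    md_wrap = ET.SubElement(section, q("mdWrap"), attrs)
    ET.SubElement(md_wrap, q("xmlData")).append(ET.fromstring(payload_xml))
    return section


def compile_mets(objid, units):
    """units maps a METS section name to (schema label, XML string)."""
    root = ET.Element(q("mets"), {"OBJID": objid})
    if "dmdSec" in units:
        wrap_unit(root, "dmdSec", "DMD1", *units["dmdSec"])
    amd = ET.SubElement(root, q("amdSec"))
    for tag in AMD_ORDER:
        if tag in units:
            wrap_unit(amd, tag, f"{tag}1", *units[tag])
    return root


# Placeholder payloads; real units would come from the dispersed tools.
units = {
    "digiprovMD": ("PMD", "<pmd/>"),
    "sourceMD": ("AMD", "<amd/>"),
    "techMD": ("AMD", "<amd/>"),
}
instance = compile_mets("afc1941001_sr05", units)
print(ET.tostring(instance, encoding="unicode"))
```

Keeping the section order in one place (AMD_ORDER) means the dispersed tools can deliver their units in any order.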
Supporting Concept
• METS profiles
– LC implementation in early development by Morgan Cundiff
• Rationale
– METS is very flexible, need to narrow use within an organization or community
– Profiles establish limits that make for more efficient tool-building and more efficient work
– Profile-governed objects will enhance interoperability between repositories
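To illustrate how a profile narrows METS, the sketch below checks an instance against two invented rules: exactly one structMap, and fileGrp USE values limited to “master” and “service”. These rules are hypothetical examples of the kind of limit a profile could set, not the content of any LC profile.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def q(tag):
    return f"{{{METS_NS}}}{tag}"


def check_profile(root, allowed_uses=("master", "service")):
    """Return a list of problems; an empty list means the rules are met."""
    problems = []
    struct_maps = root.findall(q("structMap"))
    if len(struct_maps) != 1:
        problems.append(f"expected exactly one structMap, found {len(struct_maps)}")
    for grp in root.iter(q("fileGrp")):
        use = grp.get("USE")
        if use not in allowed_uses:
            problems.append(f"fileGrp USE {use!r} is outside the profile")
    return problems


GOOD = (
    '<mets xmlns="http://www.loc.gov/METS/">'
    '<fileSec><fileGrp USE="master"/></fileSec><structMap/></mets>'
)
BAD = (
    '<mets xmlns="http://www.loc.gov/METS/">'
    '<fileSec><fileGrp USE="thumbnail"/></fileSec></mets>'
)

good_problems = check_profile(ET.fromstring(GOOD))
bad_problems = check_profile(ET.fromstring(BAD))
print(good_problems)
print(bad_problems)
```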
METS Profile for “simple phonodisc”
• Relatively simple object but profile with some detail
• May evolve into a more general profile for a wider range of phonodiscs
• For now: agnostic about administrative metadata
Show examples
METS Profile for “simple phonodisc”
• Limit to discs partly for management of semantics
• Apply judgment to find the right point between specific and general – not much experience yet
• History of reformatting may inhibit our imagination – we still are using terms that fit the source object and not the digital object