OAI-PMH at Yale
Report on the DLF OAI Training Session
November 10, 2005
Charlottesville, VA
Overview
• Review of the protocol
• OAI best practices
• Potential Yale applications
• Next steps for the Metadata Committee
OAI-PMH v.2.0: Basic Concepts
• Data provider: administers systems that expose metadata
• Service provider: uses metadata to build value-added services
• Harvester: a client application that issues OAI-PMH requests
• Repository: a network accessible server that can process OAI-PMH requests
OAI-PMH v.2.0: Basic Concepts
• Resource: the physical or digital object that metadata is "about"
• Item: a constituent of a repository from which metadata about a resource can be disseminated
• Record: metadata in a specific format
• Identifier: a unique identifier that unambiguously identifies an item in a repository; must conform to URI syntax
OAI-PMH v.2.0: Harvesting
• Deleted records
• Sets
• Datestamps
  – ISO 8601
  – UTC
• Selective harvesting
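A minimal sketch of how the harvesting features above combine in one selective-harvest request: datestamps are converted to UTC and written in ISO 8601 form, and an optional set narrows the harvest. The function names and the set name `manuscripts` are assumptions made for illustration; only the argument names (`verb`, `metadataPrefix`, `from`, `until`, `set`) come from the protocol.

```python
from datetime import datetime, timedelta, timezone


def utc_datestamp(moment):
    """Format a timezone-aware datetime as an ISO 8601 UTC datestamp (seconds granularity)."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def selective_harvest_args(metadata_prefix, from_=None, until=None, set_spec=None):
    """Build the arguments of a ListRecords request limited by datestamp and/or set."""
    args = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_ is not None:
        args["from"] = utc_datestamp(from_)
    if until is not None:
        args["until"] = utc_datestamp(until)
    if set_spec is not None:
        args["set"] = set_spec
    return args


# Hypothetical example: a local time five hours behind UTC.
local_time = datetime(2005, 11, 10, 9, 30, 0, tzinfo=timezone(timedelta(hours=-5)))
example_args = selective_harvest_args("oai_dc", from_=local_time, set_spec="manuscripts")
```

A repository that supports only day granularity would expect `YYYY-MM-DD` instead; a harvester can learn the supported granularity from the Identify response.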
OAI-PMH v.2.0 Protocol Features: HTTP
• Request
  – GET baseURL?key=value&…&key=value
  – POST baseURL
    Content-Type: application/x-www-form-urlencoded
    Content-Length: number of characters
    key=value&…&key=value
• Response
  – XML document in message body or error code
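The two request forms above can be sketched with Python's standard library. Nothing is sent over the network here; the code only builds the GET URL and the POST request object. The base URL and the identifier are hypothetical placeholders, not a real repository.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://repository.example.edu/oai"  # hypothetical base URL


def build_get_url(base_url, args):
    """GET form: arguments are appended to the base URL as key=value pairs."""
    return f"{base_url}?{urlencode(args)}"


def build_post_request(base_url, args):
    """POST form: the same key=value pairs travel in a form-encoded message body."""
    body = urlencode(args).encode("ascii")
    return Request(
        base_url,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Content-Length": str(len(body)),
        },
        method="POST",
    )


args = {
    "verb": "GetRecord",
    "identifier": "oai:example.edu:item-1",  # hypothetical identifier
    "metadataPrefix": "oai_dc",
}
get_url = build_get_url(BASE_URL, args)
post_request = build_post_request(BASE_URL, args)
```

`urlencode` percent-encodes reserved characters such as the colons in the identifier, which the protocol requires for argument values in a URL.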
OAI-PMH v.2.0 Protocol Features: XML Response
• XML declaration
  <?xml version="1.0" encoding="UTF-8"?>
• OAI-PMH root element with these attributes:
  – Default namespace declaration
    xmlns="http://www.openarchives.org/OAI/2.0/"
  – Schema instance declaration
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  – Schema location
    xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"
OAI-PMH v.2.0 Protocol Features: XML Response
• responseDate element
  – YYYY-MM-DDThh:mm:ssZ
• request element
  <request key="value" key="value" key="value">baseURL</request>
• response element
  – It has the same name as the verb used in the request.
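To make the response layout concrete, the sketch below parses a small hand-written response and pulls out the three parts described above: responseDate, the request element, and the element named after the verb. The sample document, its base URL, and its set are invented for illustration; only the namespace and element names come from the protocol.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample response; the repository and set are hypothetical.
SAMPLE_RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
                             http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2005-11-10T14:30:00Z</responseDate>
  <request verb="ListSets">http://repository.example.edu/oai</request>
  <ListSets>
    <set>
      <setSpec>manuscripts</setSpec>
      <setName>Manuscripts</setName>
    </set>
  </ListSets>
</OAI-PMH>
"""


def summarize_response(xml_bytes):
    """Return the response date, the echoed request, and whether the verb element is present."""
    root = ET.fromstring(xml_bytes)
    request = root.find("oai:request", OAI_NS)
    verb = request.get("verb")
    # The payload element carries the same name as the verb in the request.
    payload = root.find(f"oai:{verb}", OAI_NS) if verb else None
    return {
        "responseDate": root.findtext("oai:responseDate", namespaces=OAI_NS),
        "baseURL": request.text,
        "verb": verb,
        "has_payload": payload is not None,
    }


summary = summarize_response(SAMPLE_RESPONSE)
```

An error response carries `error` elements instead of a verb-named element, so a real harvester would check for those before reading the payload.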
OAI-PMH v.2.0 Protocol Features
• Multiple metadata formats
  – metadataPrefix
• Flow control
  – resumptionToken
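Flow control with resumptionToken can be sketched as a loop: the harvester repeats the request with the token from the previous response until a response arrives with an empty token or none at all. In this sketch `fetch` is an assumed callable that takes the request arguments and returns the response XML; `fake_fetch` and its two canned pages stand in for a real repository.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch, metadata_prefix):
    """Yield every identifier, following resumptionTokens until the list is complete."""
    args = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(args))
        for header in root.iter(OAI + "header"):
            yield header.findtext(OAI + "identifier")
        token = root.find(f".//{OAI}resumptionToken")
        if token is None or not (token.text or "").strip():
            break  # no token, or an empty one, marks the last chunk
        # resumptionToken is an exclusive argument: send it with the verb only.
        args = {"verb": "ListIdentifiers", "resumptionToken": token.text.strip()}


def _page(identifiers, token):
    """Build a canned ListIdentifiers response (test data only)."""
    headers = "".join(
        f"<header><identifier>{i}</identifier><datestamp>2005-11-10</datestamp></header>"
        for i in identifiers
    )
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>'
        f"{headers}<resumptionToken>{token}</resumptionToken>"
        "</ListIdentifiers></OAI-PMH>"
    )


def fake_fetch(args):
    """Stand-in for an HTTP call: serves two pages keyed on the token."""
    if args.get("resumptionToken") == "page-2":
        return _page(["oai:example.edu:3"], "")
    return _page(["oai:example.edu:1", "oai:example.edu:2"], "page-2")


harvested = list(harvest_identifiers(fake_fetch, "oai_dc"))
```

Passing the fetch function in keeps the loop testable without a network connection; a real harvester would supply one that issues the HTTP request.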
OAI-PMH v.2.0: Requests & Responses
• GetRecord
• Identify
• ListIdentifiers
• ListMetadataFormats
• ListRecords
• ListSets
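The six verbs differ in the arguments they accept. The table below is a sketch based on the OAI-PMH 2.0 specification; the data structure and the `check_request` helper are illustrative names, not part of any library. The helper returns the protocol error code a repository would report, or `None` for a well-formed request.

```python
# verb -> (required arguments, optional arguments, accepts resumptionToken)
VERBS = {
    "GetRecord": ({"identifier", "metadataPrefix"}, set(), False),
    "Identify": (set(), set(), False),
    "ListIdentifiers": ({"metadataPrefix"}, {"from", "until", "set"}, True),
    "ListMetadataFormats": (set(), {"identifier"}, False),
    "ListRecords": ({"metadataPrefix"}, {"from", "until", "set"}, True),
    "ListSets": (set(), set(), True),
}


def check_request(args):
    """Return None if the arguments fit the verb, else 'badVerb' or 'badArgument'."""
    verb = args.get("verb")
    if verb not in VERBS:
        return "badVerb"
    required, optional, resumable = VERBS[verb]
    others = set(args) - {"verb"}
    if resumable and "resumptionToken" in others:
        # An exclusive argument: it may not be combined with any other.
        return None if others == {"resumptionToken"} else "badArgument"
    if not (required <= others) or not (others <= (required | optional)):
        return "badArgument"
    return None
```

For example, `check_request({"verb": "GetRecord", "identifier": "oai:example.edu:item-1"})` reports `badArgument` because metadataPrefix is missing.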
Current Work: Resource Harvesting within the OAI-PMH Framework
• Datestamps
  – Updated record vs. updated resource
• Locating the resource
  – Multiple URLs: splash page, resource, etc.
  – Multiple elements used inconsistently: dc.identifier, dc.format, dc.relation
Current Work: Resource Harvesting within the OAI-PMH Framework
• Complex object formats
  – FOXML
  – METS
  – MPEG-21 DID
  – SCORM
• Other implementations
  – mod_oai
OAI Best Practices: DLF OAI Implementers Workshop
Handouts from the session
1. Project Abstract
2. The Case for OAI
3. OAI “Cheat Sheet”: A Taxonomy of Rapid OAI Deployment Strategies
4. Summary of OAI Metadata Best Practices
5. Summary of the DLF Aquifer MODS Profile
6. OAI Tools
7. OAI Implementation: Administrative Planning
OAI Best Practices: Implementation Decisions
• Collections
  – Develop criteria. Prioritize according to ease of implementation, associated risk, logical dependencies among items, etc.
• Metadata formats
  – Decide which formats to support.
• Technical infrastructure
  – E.g., use a gateway that provides a base URL for multiple individual collections.
OAI Best Practices: Deployment Options
• Emory’s Metadata Migrator
• Static repositories
• UIUC’s OAI FileMakerPro Gateway
• Fedora
• Luna Insight
OAI Best Practices for Data Providers
• Identifiers
  – Should be persistent & unique.
  – Should not be reused.
  – Specification and XML Schema
• Datestamps
  – Use UTC.
  – Support seconds granularity, if possible.
• Deleted records
  – Provide persistent support, if possible.
OAI Best Practices for Data Providers
• Resumption tokens
  – For repositories > 2 MB
• Sets
  – Service providers harvest by set.
  – How should sets be organized?
• About containers
  – Rights
  – Provenance (for 3rd party aggregators)
Implementation Guidelines
Includes:
• Guidelines for Repository Implementers
• Guidelines for Harvester Implementers
OAI Validation
• Reap: OAI command line harvesting
• Repository explorer: for data providers & service providers to test harvesting & searching
• W3C validator for XML schema
• Utf8conditioner: for character encoding problems
• See OAI Tools handout for more info.
OAI Best Practices for Shareable Metadata
The four C’s of shareable metadata
• Consistency
• Coherence
• Context
• Conformance
OAI Best Practices for Shareable Metadata
• Metadata in a shared environment
  – Context & coherence
  – Don’t assume a local user.
• Granularity of description
  – Appropriate for access to the resource
  – Don’t expose records for subordinate items.
• Use of multiple metadata formats
  – Need to be expressed as XML schema
  – Stepped crosswalking to simpler formats.
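One way to picture stepped crosswalking is as a chain of small mappings: a rich local record is first mapped to qualified Dublin Core, and the qualified terms are then reduced to the simple elements they refine. Everything in this sketch is an assumption made for illustration: the local field names, the sample record, and the choice of Dublin Core as the target. It is not a mapping recommended in the session.

```python
# Step 1: hypothetical local field names -> qualified Dublin Core terms.
LOCAL_TO_QUALIFIED = {
    "title": "dc:title",
    "alternative_title": "dcterms:alternative",
    "creator": "dc:creator",
    "date_created": "dcterms:created",
    "extent": "dcterms:extent",
}

# Step 2: qualified terms -> the simple Dublin Core element each one refines.
QUALIFIED_TO_SIMPLE = {
    "dcterms:alternative": "dc:title",
    "dcterms:created": "dc:date",
    "dcterms:extent": "dc:format",
}


def crosswalk(record, mapping, keep_unmapped=False):
    """Apply one crosswalk step; values of fields mapped to the same target are merged."""
    result = {}
    for field, values in record.items():
        target = mapping.get(field, field if keep_unmapped else None)
        if target is not None:
            result.setdefault(target, []).extend(values)
    return result


# Hypothetical local record; "shelf_location" has no shareable equivalent.
local_record = {
    "title": ["Letter to a friend"],
    "alternative_title": ["Untitled letter"],
    "creator": ["Doe, Jane"],
    "date_created": ["1905"],
    "extent": ["2 leaves"],
    "shelf_location": ["Box 4"],
}

qualified = crosswalk(local_record, LOCAL_TO_QUALIFIED)
simple = crosswalk(qualified, QUALIFIED_TO_SIMPLE, keep_unmapped=True)
```

Each step loses some detail (the alternative title becomes a second plain title), which is why the richer formats are worth exposing alongside the simpler one.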
OAI Best Practices for Shareable Metadata
• Relating versions of a resource
  – One-to-One Principle
  – Multiple strategies/compromises
• Document metadata creation practices
  – In OAI responses
  – In external documentation
• Communication with service providers
Potential Applications at Yale: Implementation Goals
• Improve user experience
  – Federated search
• Improve management of resources
  – Finding aids
• Collaborate with institutional partners
  – AMEEL
• Develop digital library infrastructure
  – At Yale and beyond
Potential Applications at Yale: Resources & Roles
• Resources
  – Commitment of stakeholders
  – Analysis of deployment options
  – Server infrastructure
  – Staff hours
• Roles
  – OAI-PMH Implementation Manager
  – Programmers & technical staff
  – Metadata specialists
  – Digital collection curators
Potential Applications at Yale: Sharing Metadata
• 3rd Party Aggregators
  – OAIster
  – DLF Portal
  – MODS Portal
• Registries
  – Registered OAI repositories
  – Institutional Archives Registry
  – OAI Registry at UIUC
Next Steps for the Metadata Committee
• Centralized implementation at Yale? If yes,
  – Relate to other digital library initiatives.
  – Create buy-in.
• Service provider needs
  – Consult with IAC committees.
• Data provider needs
  – Consult with digital collection curators.
Next Steps for the Metadata Committee
• Metadata recommendations
  – Recommend multiple formats
  – Decide upon a common format
    • YES? MODS?
    • Stepped crosswalking from other formats
  – Content & encoding guidelines
  – Metadata creation tools
  – Staffing