Metadata Interaction, Integration, and Interoperability
MODS, MARC and Metadata Interoperability, ALA Conference, June 27, 2005, Chicago, IL
William E. Moen
School of Library and Information Sciences, Texas Center for Digital Knowledge
University of North Texas, Denton, TX 72603
Moen MODS, MARC and Metadata Interoperability -- June 27, 2005 --Chicago, IL 2
Is there a problem?
Many metadata schemes and element sets
  • Well known and documented
  • Less known, with little public documentation
Similar/same content described by different metadata schemes and vocabularies
  • No canonical metadata record for an object
Varied syntaxes for encoding metadata
  • No canonical syntax
A vital and diverse metadata ecology! No problem, unless…
Metadata in the networked environment
Interaction between systems that use metadata
  • Harvesting
  • Searching
Integrating different types of metadata for local information management
  • Technical metadata for digital asset management
Reusing metadata in local applications
  • ONIX metadata in library systems
Interoperability?
Importance of interoperability
Systems and organizations will interoperate
One should actively be engaged in the ongoing process of ensuring that the systems, procedures and culture of an organisation are managed in such a way as to maximise opportunities for exchange and re-use of information, whether internally or externally.
Paul Miller, 2000
Metadata interoperability has to be the underlying principle of networked information management.
Marcia Lei Zeng, 2001
Interoperability
System-oriented definition: The ability of two or more systems or components to exchange information and use the exchanged information without special effort on either system
User-oriented definition: The condition achieved when two or more technical systems can exchange information directly in a way that is satisfactory to users of the systems (AAP)
Interoperability factors
In the context of networked information retrieval
  • Multiple and disparate systems (operating systems, information retrieval systems, etc.)
  • Multiple protocols
  • Multiple formats of data
  • Multiple metadata schemes
  • Multiple vocabularies, ontologies, disciplines
  • Multiple languages
  • Multiple character sets
Preliminary framework for interoperability
In the context of networked information retrieval
Within and across communities
Information communities/Communities of practice
  • Focal community
  • Extended community
  • Extra community
Costs to achieve interoperability vary
Interop Among and Across Communities
[Figure: diagram of focal communities (e.g., Libraries, Archives, Museum, Geospatial, Natural History Museums), extended communities (e.g., Cultural Heritage), and the extra community]
Communities
Communities of practice (Wenger)
Network of professionals
  • work on common problems
  • speak a common language
  • share similar values
  • produce shared meanings
Information communities
Looser affiliation of people
  • creators
  • information managers
  • users
Membership in multiple information communities
Rust’s people & stuff (& agreements) model
[Figure: people and stuff diagram, with the activities Create, Manage, and Use]
People creating stuff for specific information community; stuff used by multiple communities
People managing stuff within context of community of practice
Different communities of practice interested in same stuff
Interoperability cost vs. functionality
Adoption of common standard
  • low cost with low functionality
  • higher functionality but with a greater cost of adoption
No best point on the curve – every point is optimal for some purpose
[Figure: functionality plotted against cost of acceptance, with the labels "Many adopters" and "Few adopters"]
Arms, et al., 2002
So we have …
Many metadata schemes and element sets
Similar/same content described by different metadata schemes and vocabularies
Varied syntaxes for encoding metadata
Which reflect:
  • Community practices, needs, meaning
  • Cost barriers to adopting common standards
  • Lack of knowledge of available standards
  • Not invented here syndrome
Mechanisms for addressing interoperability
  • Crosswalks and mapping
  • Application profiles
  • Registries
  • Resource Description Framework (RDF)
Mapping and crosswalks
Mapping: Intellectual activity that identifies semantically equivalent elements in different metadata schemes
Crosswalk: Documentation resulting from mapping showing the equivalencies and conversion specifications
1998 NISO White Paper on Crosswalks
Unfortunately, the specification of a crosswalk is a difficult and error-prone task requiring in-depth knowledge and specialized expertise in the associated metadata standards
St. Pierre & LaPlant, 1998
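As a rough illustration of the mapping/crosswalk distinction, a crosswalk can be written down as a lookup table and applied mechanically. The sketch below is a minimal example under stated assumptions: the table is a simplified, illustrative set of MARC-style (tag, subfield) to Dublin Core-style element pairs, not an official crosswalk, and the sample record and the `apply_crosswalk` helper are hypothetical.

```python
# Illustrative crosswalk: (MARC tag, subfield) -> Dublin Core element.
# Simplified for demonstration; NOT an official or complete crosswalk.
CROSSWALK = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("650", "a"): "subject",
}


def apply_crosswalk(source_fields, crosswalk):
    """Convert (tag, subfield, value) triples into a target record.

    Returns the converted record and the list of source fields that
    have no equivalent in the crosswalk.
    """
    target = {}
    unmapped = []
    for tag, subfield, value in source_fields:
        element = crosswalk.get((tag, subfield))
        if element is None:
            unmapped.append((tag, subfield, value))
        else:
            # Target elements are treated as repeatable.
            target.setdefault(element, []).append(value)
    return target, unmapped


# Hypothetical source record, for illustration only.
marc_like = [
    ("245", "a", "Metadata interoperability"),
    ("100", "a", "Moen, William E."),
    ("650", "a", "Metadata"),
    ("650", "a", "Crosswalks"),
    ("300", "a", "21 slides"),
]
record, unmapped = apply_crosswalk(marc_like, CROSSWALK)
```

Returning the unmapped fields alongside the converted record makes the information lost in conversion visible, which is one of the places where the expertise noted by St. Pierre & LaPlant comes in.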
Mapping issues
Semantic, structural, and data conversion
One-way or reversible mappings?
Mapping between any two elements:
  • One-to-one
  • One-to-many (repeatable elements; unique more narrowly defined elements)
  • Many-to-one (complete mapping; incomplete mapping)
  • One-to-zero (no semantically equivalent element)
Data conversion
  • From less inclusive to more inclusive format
  • From uncontrolled to controlled vocabulary
Correct and efficient mapping of metadata elements among various formats is the essential condition for ensuring metadata interoperability
Zeng & Xiao, 2001
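The four mapping cases can be told apart mechanically once the equivalences are listed as pairs. The following is a small sketch, assuming a hypothetical source scheme and hypothetical element names; `classify_mappings` is an illustrative helper, not part of any metadata standard or library.

```python
from collections import defaultdict


def classify_mappings(source_elements, pairs):
    """Label each source element by the cardinality of its mapping.

    pairs is a list of (source_element, target_element) equivalences.
    """
    targets_of = defaultdict(set)
    sources_of = defaultdict(set)
    for src, tgt in pairs:
        targets_of[src].add(tgt)
        sources_of[tgt].add(src)

    labels = {}
    for src in source_elements:
        targets = targets_of.get(src, set())
        if not targets:
            labels[src] = "one-to-zero"   # no semantically equivalent element
        elif len(targets) > 1:
            labels[src] = "one-to-many"   # target splits the source element
        elif len(sources_of[next(iter(targets))]) > 1:
            labels[src] = "many-to-one"   # several sources share one target
        else:
            labels[src] = "one-to-one"
    return labels


# Hypothetical element names, for illustration only.
SOURCE = ["title", "author", "editor", "date", "shelf_mark"]
PAIRS = [
    ("title", "title"),
    ("author", "contributor"),
    ("editor", "contributor"),
    ("date", "created"),
    ("date", "issued"),
]
labels = classify_mappings(SOURCE, PAIRS)
```

A many-to-one mapping such as author and editor both going to contributor is also a reminder that a mapping may not be reversible: the distinction is lost in the target record.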
Mapping to an interoperable core
OCLC Office of Research’s Metadata Switch Project
  • Experimental modular services that add value to metadata
Metadata Schema Transformation Web Service (Godby, et al., 2003)
An interoperable core
  • Translations between metadata standards via mapping to and from the core
  • Reducing the number of separate mappings between metadata standards
  • Design of the interoperable core is an open issue
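The reduction in the number of mappings can be made concrete with a little arithmetic: with n schemes, direct one-way mappings between every ordered pair number n(n-1), while routing through a core needs only a mapping to and a mapping from the core per scheme, 2n in total. The sketch below assumes one-way mappings and uses hypothetical scheme names, field names, and a toy core; it does not describe how the OCLC service is actually built.

```python
def direct_mappings(n_schemes):
    """One-way mappings needed between every ordered pair of schemes."""
    return n_schemes * (n_schemes - 1)


def core_mappings(n_schemes):
    """Mappings needed when each scheme maps to and from a shared core."""
    return 2 * n_schemes


# Hypothetical schemes and a toy core with two elements, "title" and "agent".
TO_CORE = {
    "schemeA": {"ttl": "title", "aut": "agent"},
    "schemeB": {"name": "title", "maker": "agent"},
}


def translate(record, source, target):
    """Translate a record from source to target by way of the core.

    Assumes each to-core mapping is one-to-one, so it can be inverted.
    Fields with no core equivalent are dropped.
    """
    to_core = TO_CORE[source]
    from_core = {core: local for local, core in TO_CORE[target].items()}
    out = {}
    for field_name, value in record.items():
        core_element = to_core.get(field_name)
        if core_element in from_core:
            out[from_core[core_element]] = value
    return out


translated = translate({"ttl": "A map", "aut": "Someone"}, "schemeA", "schemeB")
```

For five schemes this is 20 direct mappings against 10 via the core. The price is that anything the core cannot express is lost in transit, which is one reason the design of the core remains an open issue.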
Application profiles
Reuse of elements from different sets, but cannot define new elements
Specify permitted schemes (e.g., date/time formats, controlled vocabulary) for data values
Can refine standard definitions
Application profiles consist of data elements drawn from one or more namespace schemas combined together by implementors and optimised for a particular local application.
Heery & Patel, 2000
By defining application profiles and, most importantly by declaring them, implementers can start to share information about their schemas in order to inter-work with wider groupings.
Heery & Patel, 2000
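One way to picture an application profile is as a table of elements borrowed from existing namespaces, each paired with a rule for its permitted values. The sketch below is a minimal illustration: the choice of elements, the ISO date rule, the three-term genre vocabulary, and the `validate` helper are all assumptions made for the example, not a published profile.

```python
from datetime import date

# Namespace URIs of the element sets the profile draws on.
DC = "http://purl.org/dc/elements/1.1/"
MODS = "http://www.loc.gov/mods/v3"


def is_iso_date(value):
    """Permitted date scheme for this sketch: ISO 8601 calendar dates."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


# Hypothetical profile: (namespace, element) -> rule for permitted values.
# It reuses existing elements and constrains them; it defines none of its own.
PROFILE = {
    (DC, "title"): lambda v: bool(v.strip()),
    (DC, "date"): is_iso_date,
    (MODS, "genre"): lambda v: v in {"map", "thesis", "newspaper"},
}


def validate(record, profile):
    """Return a list of problems found when checking a record against a profile."""
    problems = []
    for key, value in record.items():
        check = profile.get(key)
        if check is None:
            problems.append(f"{key[1]}: element not in profile")
        elif not check(value):
            problems.append(f"{key[1]}: value {value!r} not permitted")
    return problems


good = validate({(DC, "title"): "A map", (DC, "date"): "2005-06-27"}, PROFILE)
bad = validate({(DC, "date"): "June 27, 2005", (DC, "mood"): "calm"}, PROFILE)
```

Declaring the profile as data rather than burying the rules in application code is what makes it shareable, in the spirit of the second Heery & Patel quotation.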
Registries
Metadata registry: An index of metadata terms, official definitions, local variations, and extensions
  • Can enable the reuse of existing elements rather than users/communities reinventing their own
UK Schemas Project: Includes registry of several metadata element sets
EU Cores Project: Includes registry of core vocabularies and profiles; a schema creation tool and Web interface to register schemas
Dublin Core Metadata Registry: Authoritative source for DC; designed to promote the discovery and reuse of existing metadata definitions
The term "registry" covers a broad range of databases, documentation services, or Web-based portals providing access to schemas.
Baker, et al., 2001
Almost universally, registries are seen as our best hope in the medium term for a scalable solution to the problem of mapping and translating between a diversity of schemas.
Baker, et al., 2001
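In its simplest form, a registry of this kind is an index keyed by term identifier that can be searched before anyone coins a new element. The sketch below is a toy in-memory version; the `TermEntry` and `Registry` classes are hypothetical, the definitions are paraphrased for illustration, and the local variation shown is invented. It does not reflect the interface of any of the registries named above.

```python
from dataclasses import dataclass, field


@dataclass
class TermEntry:
    uri: str
    label: str
    definition: str
    local_variations: list = field(default_factory=list)


class Registry:
    """Toy in-memory index of metadata terms (illustrative only)."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Refuse to redefine a term that is already declared.
        if entry.uri in self._entries:
            raise ValueError(f"term already registered: {entry.uri}")
        self._entries[entry.uri] = entry

    def search(self, text):
        """Find existing terms whose label or definition mentions the text."""
        needle = text.lower()
        return [
            entry
            for entry in self._entries.values()
            if needle in entry.label.lower() or needle in entry.definition.lower()
        ]


registry = Registry()
registry.register(TermEntry(
    uri="http://purl.org/dc/elements/1.1/title",
    label="title",
    definition="A name given to the resource.",  # paraphrased
    local_variations=["Title of map sheet"],     # hypothetical local usage
))
registry.register(TermEntry(
    uri="http://purl.org/dc/elements/1.1/creator",
    label="creator",
    definition="An entity primarily responsible for making the resource.",  # paraphrased
))
matches = [entry.label for entry in registry.search("responsible")]
```

Searching before registering is the reuse step: a community looking for an element about responsibility finds creator already defined rather than inventing its own.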
Resource Description Framework (RDF)
Provides a basic grammar for representing metadata terms, their semantics, relationships, etc.
Use of Uniform Resource Identifiers (URIs) to identify namespace schemas where terms are declared and defined
RDF Schemas and XML Schemas, see:
  • Heery & Johnston, 2003
  • Hunter and Lagoze, 2001
  • Baker, et al., 2001
SchemaWeb: gathers information about schemas published on the web
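The basic grammar is the statement: a subject, a predicate, and an object, with URIs identifying the resource and the term. The sketch below writes one such statement in N-Triples syntax using only string handling. The subject URI under example.org is hypothetical, the helper functions are illustrative, and the escaping covers only backslashes and double quotes, so this is not a full serializer.

```python
# Namespace URI for the Dublin Core element set.
DC = "http://purl.org/dc/elements/1.1/"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text):
    """Quote a plain literal, escaping backslashes and double quotes only."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def statement(subject, predicate, obj):
    """Format one subject-predicate-object statement as an N-Triples line."""
    return f"{uri(subject)} {uri(predicate)} {obj} ."


# Hypothetical resource URI; the predicate is the term "title"
# resolved against the DC namespace.
line = statement(
    "http://example.org/item/1",
    DC + "title",
    literal("MODS and MARC"),
)
```

Because the predicate is a full URI, anyone reading the statement can tell which namespace declares and defines the term, regardless of which local scheme produced the record.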
References
Application Profiles: Mixing and Matching Metadata Schemas. Heery & Patel. 2000. http://www.ariadne.ac.uk/issue25/app-profiles/
Combining RDF and XML Schemas to Enhance Interoperability Between Metadata Application Profiles. Hunter & Lagoze. 2001. http://archive.dstc.edu.au/RDU/staff/jane-hunter/www10/paper.html
CORES Project: A Forum on Shared Metadata Vocabularies. http://www.cores-eu.net/
The Dublin Core Metadata Registry. http://www.dublincore.org/dcregistry/
Issues in Crosswalking Content Metadata Standards. St. Pierre & LaPlant. 1998. http://www.niso.org/press/whitepapers/crsswalk.html
Mapping Metadata Elements of Different Formats. Zeng, M. L. & Xiao, L. 2001.
A Metadata Registry for the Semantic Web. Heery & Wagner. 2002. http://www.dlib.org/dlib/may02/wagner/05wagner.html
Metadata Schema Registries in the Partially Semantic Web: The CORES Experience. Heery & Johnston. 2003. http://www.oclc.org/research/projects/mswitch/default.htm
Metadata Switch Project. OCLC. 2004. http://www.oclc.org/research/projects/mswitch/default.htm
SCHEMAS Project: Forum for Metadata Schema Implementers. http://www.schemas-forum.org/
SchemaWeb. http://www.schemaweb.info/default.aspx
A Spectrum of Interoperability. Arms, et al. 2002. http://www.dlib.org/dlib/january02/arms/01arms.html
Two Paths to Interoperable Metadata. Godby, et al. 2003. http://www.siderean.com/dc2003/103_paper-22.pdf
What Terms Does Your Metadata Use? Application Profiles as Machine-Understandable Narratives. Baker, et al. 2001. http://jodi.ecs.soton.ac.uk/Articles/v02/i02/Baker/