FRBR Workshop, OCLC, 2005
Toward an International Sharing and Use of
Subject Authority Data
Marcia Lei Zeng
Athena Salaba
Kent State University
Outline
1. Background information
2. Current State
3. Authority Data
4. Sharing Authority Data
1. Background
1.1 Subject access
Seeking information on a topic is still the predominant user task
Subject access includes:
- Subject searching
- Keyword searching
- Subject browsing
It is still very problematic for the majority of searchers
1.2 Functions of a catalog regarding subject access (1)
Cutter (1897)
- To find a book if the subject is known
- To show what a library has on a given subject (collocate)
- To assist in the choice as to its character (identify)
1.2 Functions of a catalog regarding subject access (2)
FRBR (1998)
- To find entities of Group 1 that have entities from Group 1, 2, 3 as their subject
- To identify
- To select
- To obtain
1.3 What is a subject?
Group 1: Work, Expression, Manifestation, Item
Group 2: Persons, Families, Corporate bodies
Group 3: Concepts, Objects, Place, Event
FRBR – Functional Requirements for Bibliographic Records
Revisiting Group 3?
- Time
- Process
- Event is a combination of place and time
- Concrete vs. abstract concept
- Ranganathan: Personality, Matter, Energy, Space, Time
2. Current State
Subject Authority Data
2.1 Structure (heterogeneous)
2.2 Existing Knowledge Organization Systems/Structures/Schemas (KOS)
2.3 Rules and guidelines
2.4 Communication/Encoding
2.1 Structures
[Figure: types of KOS arranged from natural language to controlled language and from weakly structured to strongly structured]
Term Lists: Pick lists, Synonym Rings, Authority Files, Glossaries/Dictionaries, Gazetteers
Classification & Categorization: Subject Headings, Classification schemes, Taxonomies, Categorization schemes
Relationship Groups: Thesauri, Semantic networks, Ontologies
Structures: Coordination
Pre-coordination ……….. Post-coordination
e.g. subject headings (LCSH, MeSH) ……….. e.g. thesauri (AAT, INSPEC)
Also placed along this spectrum: FAST, UMLS
2.2 Existing KOS (1)
- Library of Congress Subject Headings (LCSH)
- Medical Subject Headings (MeSH)
- ERIC Thesaurus (ERIC)
- Inspec Thesaurus
- Inspec Classification
- Dewey Decimal Classification (DDC)
- Library of Congress Classification (LCC)
- Universal Decimal Classification (UDC)
- HEREIN Thesaurus
- Alexandria Digital Library (ADL) Gazetteer and Thesaurus
- Schlagwortnormdatei (SWD)
- Regensburger Verbundklassifikation (RVK)
- RAMEAU: Répertoire d'autorité-matière encyclopédique et alphabétique unifié
- Art and Architecture Thesaurus (AAT)
- National Agricultural Library Subject Headings
- … …
2.2 Existing KOS (2)
Verbal based: AAT, LCSH, RAMEAU, INSPEC Thesaurus, MeSH
Code based: UDC, RVK, DDC, LCC
Integrated: INSPEC, MeSH Hierarchy
Combined: verbal (V) and code (C)
[Figure: KOS in a global environment, varying by language and structure]
2.3 Rules of KOS Construction
Different rules and guidelines
- AACR2, Z39.19, RAK (Regeln für die alphabetische Katalogisierung), ISO 5964, ISO 2788, IFLA Principles Underlying Subject Heading Languages (SHLs) …
No rules
Indirect/inherent use of rules (by example)
2.4 Communication/Encoding for authority data
MARC
- MARC21 (1xx, 2xx, etc.)
- UNIMARC (1xx, 2xx, etc., with different definitions)
- etc.
Guidelines for Authority Records and References (GARR) (>, <, >>, <<)
NISO Z39.19 (BT, NT, RT, etc.)
XML-based: OWL Web Ontology Language, RDF Schema, Voc-ML, etc.
3. Authority Data
3.1 Use of authority data
Direct use of authority data
- Index
- Identify/Verify
- Search & browse the authority data
Indirect use of authority data
- Searching bibliographic file
- Browsing bibliographic file
Users
- Information professionals
- Searcher/end-user
3.2 Common Authority Data
- Authorized/established term
- Variations
- Related terms
- Notes
- Linked/Parallel terms
- Numbering, International numbering?
- Other: language, rules, links to external resources, roles, etc.
Do we need one authorized term?
Keep USER in mind!
- Preference, language, script
Trends: all are preferred
- Synonym rings (included in NISO Z39.19 now)
3.3 Common Semantic Relationships in Authority Data
Semantic relationships: broad categories
- Equivalence (Use, Used For, UF, See)
- Hierarchical (BT, NT, see also)
- Associative (RT, see also)
More specific relationships, such as:
- Is part of
- Is instance of
- Agent/process
- Process/product
Need for other types of relationships?
- ADL, such as: Overlap; administrativePartOf; SubFeatureOf
- UMLS, such as: Like; Parent; Child; Sibling
- WordNet, such as: Familiarity; derivationally related
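The three broad relationship categories above can be sketched as a small data structure. This is a minimal illustration, not an implementation of any standard: the class name `Thesaurus`, the method names, and the sample terms are all assumptions made for the example. It records each relationship together with its reciprocal, so that a BT link also yields the matching NT link.

```python
from collections import defaultdict

# Reciprocal of each Z39.19-style relationship tag.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "UF": "USE", "USE": "UF"}


class Thesaurus:
    """Toy store of term-to-term relationships (illustrative only)."""

    def __init__(self):
        # term -> relationship tag -> set of target terms
        self.links = defaultdict(lambda: defaultdict(set))

    def relate(self, source, tag, target):
        """Record a relationship and its reciprocal in one step."""
        self.links[source][tag].add(target)
        self.links[target][RECIPROCAL[tag]].add(source)

    def related(self, term, tag):
        """Return the terms linked to `term` by `tag`, sorted."""
        return sorted(self.links[term][tag])

    def preferred(self, term):
        """Follow a USE reference from an entry term to its preferred term."""
        targets = self.links[term]["USE"]
        return next(iter(targets)) if targets else term


# Hypothetical sample terms, one per broad category.
thesaurus = Thesaurus()
thesaurus.relate("Cats", "BT", "Mammals")              # hierarchical
thesaurus.relate("Cats", "UF", "Felines")              # equivalence
thesaurus.relate("Cats", "RT", "Veterinary medicine")  # associative
```

More specific relationship types (is part of, is instance of, and so on) could be added as further tags in `RECIPROCAL`, each with its own inverse.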
Unanswered Question
What authority data currently exist in an authority record?
or
What authority data should be included in an authority record?
4. Sharing Authority Data in a Global Environment
4.1. Challenges
- Structures
- Languages and scripts
- Rules
- Encoding
[Figure: combined verbal (V) and code (C) based KOS in a global environment, varying by language and structure]
4.2. Projects Specifically for Subject Authority Data Sharing
- Construction (not to be discussed here)
- Implementation
  - Projects based on different types of structures
  - Projects involving multiple languages
KOS types: thesaurus; classification scheme; subject heading list; controlled term list; coding system
Project | KOS types marked | Languages involved

Projects based on different structural types of KOS
UMLS | x x x x | multiple languages
HILT | x x x | multiple languages
UC Berkeley DARPA Unfamiliar Metadata Project | x x x | English, French, German, Russian, Spanish
Polish Project | x x x | English, Polish
Megathesaurus, H.W.Wilson | x x | English
Classification Web | x x | English
WebDewey | x x | English
CARMEN | x x | German, English
Finnish Project | x x | Finnish

Projects based on similar structural types of KOS
Renardus | x | multiple languages
MACS | x | English, French, German
Merimee | x | English, French
HEREIN | x | Spanish, French, English
LCSH/MeSH | x | English
MSC/DDC | x | English
SAB/DDC | x | Swedish, English
CAMed | x | English, French
[Figure: three levels at which KOS can be shared: vocabularies, authority files, bibliographic files]
Sharing at Vocabulary Level
KOS adaptation, extension, extraction, translation, etc.
Sharing at Vocabulary Level
1. Direct mapping
National database "Merimee" about the French Heritage
The Thesaurus of Architecture (Le thésaurus de l'architecture) was created and mapped to the Art and Architecture Thesaurus (AAT) and the English Heritage Thesaurus (NMR)
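Direct mapping can be sketched as a lookup table from terms in one vocabulary to their equivalents in another. The term pairs below are hypothetical and are not taken from the actual Merimee, AAT, or NMR mappings; the function names are likewise assumptions made for the example.

```python
# Hypothetical term pairs; not taken from any real thesaurus mapping.
DIRECT_MAP = {
    "église": ["churches"],
    "château": ["castles", "country houses"],  # one-to-many mapping
    "moulin": ["mills"],
}


def translate(term, mapping):
    """Return target-vocabulary equivalents, or an empty list if unmapped."""
    return mapping.get(term, [])


def invert(mapping):
    """Build the reverse mapping (target term -> source terms)."""
    reverse = {}
    for source, targets in mapping.items():
        for target in targets:
            reverse.setdefault(target, []).append(source)
    return reverse
```

Each pair of vocabularies needs its own table, so the number of tables grows quickly as more vocabularies are added.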
Sharing at Vocabulary Level
2. Using a switching system
Renardus project: "a cross-browsing feature based on the DDC and improved subject searching across distributed and heterogeneous European subject gateways."
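A switching system can be sketched as follows: every local vocabulary is mapped once to a shared notation, and terms in different vocabularies are connected through that notation. This is a simplified illustration and not the Renardus implementation; the gateway names, the local terms, and the function `switch` are assumptions, and the DDC-style class numbers are used only as sample switching notations.

```python
# Hypothetical local terms, each mapped to a shared switching notation.
TO_SWITCH = {
    "gateway_a": {"Gardening": "635", "Astronomy": "520"},
    "gateway_b": {"Gartenbau": "635", "Astronomie": "520", "Physik": "530"},
}


def switch(term, source, target, to_switch=TO_SWITCH):
    """Translate `term` from the source vocabulary to the target vocabulary
    by way of the shared notation; returns an empty list if no match."""
    notation = to_switch[source].get(term)
    if notation is None:
        return []
    return sorted(t for t, n in to_switch[target].items() if n == notation)
```

With N vocabularies this needs only N mappings to the switching notation, rather than one mapping for every pair of vocabularies.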
Sharing at Vocabulary Level
3. Creating a superstructure
UMLS® Metathesaurus®
Over 1,000,000 concepts and 4.3 million concept names from more than 100 controlled vocabularies, some in multiple languages
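The idea of a superstructure that groups many names under one concept can be sketched as below. This is only an illustration of the concept-to-names structure, not the UMLS data model or API: the classes, the concept identifier `X0001`, and the source labels are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    concept_id: str
    # Each entry is a (name, source vocabulary, language) triple.
    names: list = field(default_factory=list)


class Metathesaurus:
    """Toy superstructure: one concept, many names from many sources."""

    def __init__(self):
        self.concepts = {}  # concept_id -> Concept
        self.by_name = {}   # lowercased name -> set of concept ids

    def add_name(self, concept_id, name, source, language):
        concept = self.concepts.setdefault(concept_id, Concept(concept_id))
        concept.names.append((name, source, language))
        self.by_name.setdefault(name.lower(), set()).add(concept_id)

    def synonyms(self, name):
        """Return every name sharing a concept with `name`."""
        ids = self.by_name.get(name.lower(), set())
        return sorted({n for cid in ids for n, _, _ in self.concepts[cid].names})


# Hypothetical identifier and source labels.
meta = Metathesaurus()
meta.add_name("X0001", "Heart attack", "VOCAB_A", "en")
meta.add_name("X0001", "Myocardial infarction", "VOCAB_B", "en")
meta.add_name("X0001", "Infarctus du myocarde", "VOCAB_C", "fr")
```

A search on any one name can then be expanded to all names of the same concept, across sources and languages.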
Sharing at Vocabulary Level
4. Creating a superstructure (an index)
UCB Unfamiliar Metadata Vocabularies
Accepts query vocabularies and responds with a ranked list of the system's entry vocabularies, which together form an index to five controlled vocabularies.
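One way to picture an entry vocabulary index is a table that associates ordinary words with the controlled terms that have been assigned to records containing those words. The sketch below ranks controlled terms by simple co-occurrence counts; this scoring is an assumption chosen for brevity and is not the method used by the UC Berkeley project. The records and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical training records: (free-text title, assigned controlled terms).
RECORDS = [
    ("solar power plants", ["Solar energy"]),
    ("wind and solar power", ["Solar energy", "Wind power"]),
    ("wind turbines", ["Wind power"]),
]


def build_index(records):
    """Count how often each word appears in records carrying each term."""
    index = defaultdict(Counter)
    for text, terms in records:
        for word in set(text.lower().split()):
            for term in terms:
                index[word][term] += 1
    return index


def suggest(query, index):
    """Return controlled terms ranked by summed counts for the query words."""
    scores = Counter()
    for word in query.lower().split():
        scores.update(index.get(word, {}))
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = build_index(RECORDS)
ranked = suggest("wind turbines", index)
```

Here `ranked` places "Wind power" ahead of "Solar energy", because both query words occur in records indexed with "Wind power".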
Sharing at Vocabulary Level
5. Creating a superstructure (a virtual index)
CAMed cross-thesaurus searching
Terms are linked in a temporary union list generated by the software in response to a query.
Sharing at Vocabulary Level
6. Linking through a thesaurus server protocol
UCSB Alexandria Digital Library
The Thesaurus Protocol is based on the ANSI/NISO (1993, R2003) Z39.19 thesaurus model and supports downloading, querying, and navigating thesauri.
Sharing at Subject Authority File Level
Direct Mapping -- MACS (Multilingual Access to Subjects)
LCSH and MeSH Mapping Project: sample authority records, Northwestern University Library
Co-occurrence mapping -- works at the application level, i.e., in metadata records: when one record carries subject terms from more than one vocabulary, those terms can be treated as loosely mapped to each other.
[Figure: metadata records, each carrying terms from thesaurus 1 (S1) and terms from thesaurus 2 (S2)]
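Co-occurrence mapping can be sketched by counting how often a term from one vocabulary appears in the same record as a term from the other. The records, the field names `S1` and `S2`, and the `min_count` threshold below are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical metadata records indexed with two vocabularies, S1 and S2.
RECORDS = [
    {"S1": ["Automobiles"], "S2": ["Cars"]},
    {"S1": ["Automobiles", "Roads"], "S2": ["Cars", "Highways"]},
    {"S1": ["Roads"], "S2": ["Highways"]},
]


def cooccurrence(records):
    """Count (S1 term, S2 term) pairs that share a record."""
    pairs = Counter()
    for record in records:
        pairs.update(product(set(record["S1"]), set(record["S2"])))
    return pairs


def loose_mappings(records, min_count=2):
    """Keep only pairs seen together in at least `min_count` records."""
    counts = cooccurrence(records)
    return sorted(pair for pair, n in counts.items() if n >= min_count)
```

The threshold filters out incidental pairs such as ("Automobiles", "Highways"), which occur together only once in this sample. The resulting mappings remain loose: they reflect indexing practice rather than an editorial judgment of equivalence.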
So far,
Functional Requirements for Authority Records (FRAR) covers:
- Names for persons, families, corporate bodies (Group 2)
- Titles (Group 1)
Projects for authority data sharing focus mainly on names:
- ONE Shared Authority Control (ONESAC)
- Virtual International Authority File (VIAF)
- Linking and Exploring Authority Files (LEAF)
- Hong Kong Chinese Authority (Name) (HKCAN)
FRSAR: Functional Requirements for Subject Authority Records
Scope:
focus on FRBR’s Group 3 entities
FRSAR Working Group
contact: Marcia Zeng [email protected]
Maja Zumer
Athena Salaba [email protected]
FRSAR terms of reference
build a conceptual model of Group 3 entities within the FRBR framework (Entities in Group 1 and Group 2 can be used as the subjects of works; but further inclusion of them will depend on the outcomes of the work of the FRANAR Working Group);
provide a clearly defined, structured frame of reference for relating the data that are recorded in subject authority records to the needs of the users of those records; and
assist in an assessment of the potential for international sharing and use of subject authority data both within the library sector and beyond.