Post on 27-Mar-2015

FRBR Workshop, OCLC, 2005

Toward an International Sharing and Use of Subject Authority Data

Marcia Lei Zeng

Athena Salaba

Kent State University

Outline

1. Background information

2. Current State

3. Authority Data

4. Sharing Authority Data

1. Background

1.1 Subject access

Seeking information on a topic is still the predominant user task

Subject access includes:
- Subject searching
- Keyword searching
- Subject browsing

It is still very problematic for the majority of searchers

1.2 Functions of a catalog regarding subject access (1)

Cutter (1897)
- To find a book if the subject is known
- To show what a library has on a given subject (collocate)
- To assist in the choice as to its character (identify)

1.2 Functions of a catalog regarding subject access (2)

FRBR (1998)
- To find entities of Group 1 that have entities from Group 1, 2, 3 as their subject
- To identify
- To select
- To obtain

1.3 What is a subject?

Group 1: Work, Expression, Manifestation, Item

Group 2: Persons, Families, Corporate bodies

Group 3: Concepts, Objects, Place, Event

FRBR – Functional Requirements for Bibliographic Records
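
As a rough illustration of the grouping above, the sketch below models the three entity groups and the idea that a work may take entities from any group as its subject. The class and function names (`Group`, `Entity`, `group_of`) and the sample entities are hypothetical, chosen for this example only; they are not defined by FRBR.

```python
from dataclasses import dataclass, field
from enum import Enum


class Group(Enum):
    """Entity groups as listed on the slide."""
    GROUP_1 = ("Work", "Expression", "Manifestation", "Item")
    GROUP_2 = ("Persons", "Families", "Corporate bodies")
    GROUP_3 = ("Concepts", "Objects", "Place", "Event")


def group_of(entity_type: str) -> Group:
    """Return the group an entity type belongs to (hypothetical helper)."""
    for group in Group:
        if entity_type in group.value:
            return group
    raise ValueError(f"unknown entity type: {entity_type}")


@dataclass
class Entity:
    name: str
    entity_type: str
    subjects: list = field(default_factory=list)  # used only by a Work

    def add_subject(self, other: "Entity") -> None:
        # In this sketch only a Work carries "has as subject" links.
        if self.entity_type != "Work":
            raise ValueError("only a Work has subjects in this sketch")
        self.subjects.append(other)


study = Entity("A study of Hamlet", "Work")
study.add_subject(Entity("Hamlet", "Work"))                    # Group 1
study.add_subject(Entity("Shakespeare, William", "Persons"))   # Group 2
study.add_subject(Entity("Revenge", "Concepts"))               # Group 3
subject_groups = [group_of(s.entity_type).name for s in study.subjects]
print(subject_groups)
```

The point of the sketch is only that the subject relationship is open to all three groups, while Group 3 contains the entities that exist solely in the subject role.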

Revisiting Group 3?
- Time
- Process
- Event is a combination of place and time
- Concrete vs. abstract concept
- Ranganathan: Personality, Matter, Energy, Space, Time

2. Current State

Subject Authority Data

2.1 Structure (heterogeneous)

2.2 Existing Knowledge Organization Systems/Structures/Schemas (KOS)

2.3 Rules and guidelines

2.4 Communication/Encoding

2.1 Structures

Term Lists: Synonym Rings, Authority Files, Glossaries/Dictionaries, Gazetteers, Pick lists

Classification & Categorization: Subject Headings, Classification schemes, Taxonomies, Categorization schemes

Relationship Groups: Ontologies, Semantic networks, Thesauri

[Diagram axes: natural language to controlled language; weakly-structured to strongly-structured]

Structures: Coordination

Pre-coordination ……….. Post-coordination

Pre-coordination: e.g. subject headings (LCSH)

Post-coordination: e.g. thesauri (AAT, INSPEC)

Also shown: MeSH, FAST, UMLS
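
The difference between the two ends of this spectrum can be sketched in a few lines. In the example below, the heading strings, record ids, and function names are invented for illustration: pre-coordination stores a combined heading built by the indexer, while post-coordination stores single-concept terms that the searcher combines at query time.

```python
# Pre-coordination: the indexer combines concepts into one heading string.
# Headings and record ids are made up for illustration.
precoordinated = {
    "rec1": ["Libraries--Automation--History"],
    "rec2": ["Libraries--History"],
}

# Post-coordination: single-concept terms are assigned separately and
# combined by the searcher at query time.
postcoordinated = {
    "rec1": {"Libraries", "Automation", "History"},
    "rec2": {"Libraries", "History"},
}


def search_precoordinated(heading: str) -> list:
    """Match records carrying exactly this prebuilt heading."""
    return sorted(r for r, hs in precoordinated.items() if heading in hs)


def search_postcoordinated(*terms: str) -> list:
    """Boolean AND: records whose term set contains every query term."""
    wanted = set(terms)
    return sorted(r for r, ts in postcoordinated.items() if wanted <= ts)


pre_hits = search_precoordinated("Libraries--Automation--History")
post_hits = search_postcoordinated("Libraries", "History")
print(pre_hits, post_hits)
```

The pre-coordinated search is precise but depends on the searcher knowing the heading; the post-coordinated search is flexible but loses the order and context the heading string carried.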

2.2 Existing KOS (1)

- Library of Congress Subject Headings (LCSH)
- Medical Subject Headings (MeSH)
- ERIC Thesaurus (ERIC)
- Inspec Thesaurus
- Inspec Classification
- Dewey Decimal Classification (DDC)
- Library of Congress Classification (LCC)
- Universal Decimal Classification (UDC)
- HEREIN Thesaurus
- Alexandria Digital Library (ADL) Gazetteer and Thesaurus
- Schlagwortnormdatei (SWD)
- Regensburger Verbundklassifikation (RVK)
- RAMEAU: Répertoire d'autorité-matière encyclopédique et alphabétique unifié
- Art and Architecture Thesaurus (AAT)
- National Agricultural Library Subject Headings
- …

2.2 Existing KOS (2)

Verbal based (V): AAT, LCSH, RAMEAU, INSPEC Thesaurus, MeSH

Code based (C): UDC, RVK, DDC, LCC

Integrated: INSPEC, MeSH Hierarchy

Combined: V and C

Global environment: Language, Structure

2.3 Rules of KOS Construction

- Different rules and guidelines: AACR2, Z39.19, RAK (Regeln für die alphabetische Katalogisierung), ISO 5964, ISO 2788, IFLA Principles Underlying Subject Heading Languages (SHLs) …
- No rules
- Indirect/inherent use of rules (by example)

2.4 Communication/Encoding for authority data

- MARC
  - MARC21 (1xx, 2xx, etc.)
  - UNIMARC (1xx, 2xx, etc. different definition)
  - etc.
- Guidelines for Authority Records and References (GARR) (>, <, >>, <<)
- NISO Z39.19 (BT, NT, RT, etc.)
- XML-based: OWL Web Ontology Language, RDF Schema, Voc-ML, etc.
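
To make the encoding question concrete, here is a minimal sketch of one toy record written out in a flat display that uses the Z39.19 relationship tags named above (BT, NT, RT, plus UF for "used for"). The record content, the tag order, and the function name `to_z3919_display` are assumptions for the example; the same dictionary could equally be serialized into MARC fields or an XML vocabulary, which is where interoperability problems start.

```python
# Tag labels follow the Z39.19 abbreviations; the record is illustrative.
Z3919_ORDER = ["UF", "BT", "NT", "RT"]

record = {
    "term": "Bicycles",
    "UF": ["Bikes"],
    "BT": ["Vehicles"],
    "NT": ["Mountain bikes"],
    "RT": ["Cycling"],
}


def to_z3919_display(rec: dict) -> str:
    """Serialize one record as a flat display, one relation per line."""
    lines = [rec["term"]]
    for tag in Z3919_ORDER:
        for target in rec.get(tag, []):
            lines.append(f"  {tag} {target}")
    return "\n".join(lines)


display = to_z3919_display(record)
print(display)
```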

3. Authority Data

3.1 Use of authority data

Direct use of authority data
- Index
- Identify/Verify
- Search & browse the authority data

Indirect use of authority data
- Searching bibliographic file
- Browsing bibliographic file

Users
- Information professionals
- Searcher/end-user

3.2 Common Authority Data

- Authorized/established term
- Variations
- Related terms
- Notes
- Linked/Parallel terms
- Numbering, International numbering?
- Other: language, rules, links to external resources, roles, etc.
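
The elements listed above could be held in a record structure along the following lines. This is a sketch under stated assumptions: the class name `SubjectAuthorityRecord`, its field names, and the sample values are hypothetical and do not correspond to any particular format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SubjectAuthorityRecord:
    """Hypothetical container mirroring the data elements listed above."""
    authorized_term: str
    variations: list = field(default_factory=list)       # variant forms
    related_terms: list = field(default_factory=list)
    notes: list = field(default_factory=list)
    parallel_terms: dict = field(default_factory=dict)   # language -> term
    record_number: Optional[str] = None                  # local or international
    language: Optional[str] = None
    rules: Optional[str] = None
    external_links: list = field(default_factory=list)

    def all_access_points(self) -> list:
        """Every string a user might search under, authorized term first."""
        return [self.authorized_term, *self.variations,
                *self.parallel_terms.values()]


rec = SubjectAuthorityRecord(
    authorized_term="Cookery",
    variations=["Cooking"],
    parallel_terms={"fr": "Cuisine"},
    language="en",
)
access_points = rec.all_access_points()
print(access_points)
```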

Do we need one authorized term?

Keep USER in mind!
- Preference, language, script

Trends: all are preferred
- Synonym rings (included in NISO Z39.19 now)
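
A synonym ring is the simplest form of the "all are preferred" idea: every member of the ring is an equally valid entry point and none is marked as the authorized term. A minimal sketch, with invented rings and a hypothetical `expand` function:

```python
# A synonym ring treats every member as equally valid for retrieval;
# no member is singled out as "the" authorized term.
rings = [
    {"cars", "automobiles", "motor vehicles"},
    {"films", "movies", "motion pictures"},
]

# Index each member to its ring so any entry term expands to the whole set.
ring_index = {term: ring for ring in rings for term in ring}


def expand(query_term: str) -> set:
    """Return the full ring for a term, or just the term if it has no ring."""
    key = query_term.lower()
    return set(ring_index.get(key, {key}))


expanded = expand("Movies")
print(sorted(expanded))
```

A search system would run the query against all members of `expanded`, so the user's own preferred word works regardless of which form the indexer used.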

3.3 Common Semantic Relationships in Authority Data

Semantic relationships

Broad categories
- Equivalence (Use, Used For, UF, See)
- Hierarchical (BT, NT, see also)
- Associative (RT, see also)

More specific relationships, such as:
- Is part of
- Is instance of
- Agent/process
- Process/product

Need for other types of relationships?
- ADL, such as: Overlap; administrativePartOf; SubFeatureOf
- UMLS, such as: Like; Parent; Child; Sibling
- WordNet, such as: Familiarity; derivationally related
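
The three broad categories can be sketched as typed links with reciprocals, so that recording BT on one term automatically records NT on the other. The `Thesaurus` class, the `INVERSE` table, and the sample terms are assumptions for this example; the more specific relationship types listed above would need additional tags and inverses.

```python
from collections import defaultdict

# Reciprocal of each relationship tag; RT is symmetric.
INVERSE = {"BT": "NT", "NT": "BT", "RT": "RT", "UF": "USE", "USE": "UF"}


class Thesaurus:
    def __init__(self) -> None:
        # term -> tag -> set of target terms
        self.links = defaultdict(lambda: defaultdict(set))

    def relate(self, source: str, tag: str, target: str) -> None:
        """Record a link and its reciprocal so both ends stay consistent."""
        self.links[source][tag].add(target)
        self.links[target][INVERSE[tag]].add(source)

    def broader_chain(self, term: str) -> list:
        """Follow BT links upward; assumes each term has at most one BT."""
        chain = []
        seen = {term}
        while self.links[term]["BT"]:
            term = next(iter(self.links[term]["BT"]))
            if term in seen:  # guard against a cyclic hierarchy
                break
            seen.add(term)
            chain.append(term)
        return chain


t = Thesaurus()
t.relate("Kittens", "BT", "Cats")             # hierarchical
t.relate("Cats", "BT", "Pets")
t.relate("Cats", "RT", "Dogs")                # associative
t.relate("Cats", "UF", "Felines, Domestic")   # equivalence
print(t.broader_chain("Kittens"))
```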

Unanswered Question

What authority data currently exist in an authority record?

or

What authority data should be included in an authority record?

4. Sharing Authority Data in a Global Environment

4.1. Challenges
- Structures
- Languages and scripts
- Rules
- Encoding

4.2. Projects Specifically for Subject Authority Data Sharing

- Construction (not to be discussed here)
- Implementation
  - Projects based on different types of structures
  - Projects involving multiple languages

Projects | KOS types (thesaurus; classification scheme; subject heading list / controlled term list; coding system) | Languages involved

Projects based on different structural types of KOS
UMLS | x x x x | multiple languages
HILT | x x x | multiple languages
UC Berkeley DARPA Unfamiliar Metadata Project | x x x | English, French, German, Russian, Spanish
Polish Project | x x x | English, Polish
Megathesaurus, H.W. Wilson | x x | English
Classification Web | x x | English
WebDewey | x x | English
CARMEN | x x | German, English
Finnish Project | x x | Finnish

Projects based on similar structural types of KOS
Renardus | x | multiple languages
MACS | x | English, French, German
Mérimée | x | English, French
HEREIN | x | Spanish, French, English
LCSH/MeSH | x | English
MSC/DDC | x | English
SAB/DDC | x | Swedish, English
CAMed | x | English, French

[Diagram: three KOS, each with three levels: vocabularies, authority files, bibliographic files]

Sharing at Vocabulary Level

KOS adaptation, extension, extraction, translation, etc.

Sharing at Vocabulary Level

1. Direct mapping

National database "Mérimée" about the French heritage

The Thesaurus of Architecture (Le thésaurus de l'architecture) was created and mapped to the Art and Architecture Thesaurus (AAT) and the English Heritage Thesaurus (NMR).
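
Direct mapping amounts to a lookup table from each source term to its counterpart in every target vocabulary. The sketch below assumes one-to-one mappings; the term pairs are illustrative placeholders and are not taken from the actual Mérimée, AAT, or NMR data.

```python
# Illustrative term pairs only; these are not taken from the actual
# Mérimée, AAT, or NMR mappings.
fr_to_aat = {"église": "churches", "château": "castles"}
fr_to_nmr = {"église": "CHURCH", "château": "CASTLE"}


def map_term(term: str, *mappings: dict) -> list:
    """Look a source term up in each target mapping; None marks a gap."""
    return [m.get(term) for m in mappings]


def invert(mapping: dict) -> dict:
    """Build the reverse lookup; assumes the mapping is one-to-one."""
    return {target: source for source, target in mapping.items()}


mapped = map_term("église", fr_to_aat, fr_to_nmr)
back = invert(fr_to_aat)["castles"]
print(mapped, back)
```

In practice many mappings are one-to-many or only partial, which is why the `None` gap marker is kept rather than raising an error.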

Sharing at Vocabulary Level

2. Using a switching system

Renardus project: "a cross-browsing feature based on the DDC and improved subject searching across distributed and heterogeneous European subject gateways."
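
In a switching system each local vocabulary is mapped once to a common hub, and translation between any two vocabularies goes through that hub. The sketch below uses DDC class numbers as the hub, as in the Renardus description; the gateway names and local terms are invented, and this is not a description of how Renardus itself was implemented.

```python
# Each local vocabulary is mapped once to the switching language (here,
# DDC class numbers). Gateway names and local terms are invented; 020 and
# 610 are used only as example hub values.
to_hub = {
    "gateway_a": {"Library science": "020", "Medicine": "610"},
    "gateway_b": {"Bibliothekswesen": "020", "Medizin": "610"},
}


def switch(term: str, source: str, target: str) -> list:
    """Translate a term from one vocabulary to another via the hub class."""
    hub_class = to_hub[source].get(term)
    if hub_class is None:
        return []
    return sorted(t for t, c in to_hub[target].items() if c == hub_class)


def mappings_needed(n_vocabularies: int) -> tuple:
    """(pairwise direct mappings, hub mappings) for n vocabularies."""
    return n_vocabularies * (n_vocabularies - 1) // 2, n_vocabularies


print(switch("Medicine", "gateway_a", "gateway_b"))
print(mappings_needed(5))
```

The second function shows the main attraction of the approach: with five vocabularies, direct mapping needs ten pairwise mappings while a hub needs five.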

Sharing at Vocabulary Level

3. Creating a superstructure

UMLS® Metathesaurus®

Over 1,000,000 concepts and 4.3 million concept names from more than 100 controlled vocabularies, some in multiple languages
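
The superstructure idea is that names from many source vocabularies are grouped under one concept identifier. A minimal sketch follows; the identifiers, source names, and rows are invented and do not reproduce Metathesaurus content or its file formats.

```python
from collections import defaultdict

# (concept id, source vocabulary, name). Ids and rows are invented for the
# sketch and do not reproduce Metathesaurus content.
rows = [
    ("X001", "VocabA", "Heart attack"),
    ("X001", "VocabB", "Myocardial infarction"),
    ("X001", "VocabC", "Infarctus du myocarde"),
    ("X002", "VocabA", "Headache"),
]

names_by_concept = defaultdict(list)
concept_by_name = {}
for concept_id, source, name in rows:
    names_by_concept[concept_id].append((source, name))
    concept_by_name[name.lower()] = concept_id


def synonyms_across_sources(name: str) -> list:
    """All names, from any source, that share the concept of the given name."""
    concept_id = concept_by_name.get(name.lower())
    if concept_id is None:
        return []
    return [n for _, n in names_by_concept[concept_id]]


cross = synonyms_across_sources("heart attack")
print(cross)
```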

Sharing at Vocabulary Level

4. Creating a superstructure (an index)

UCB Unfamiliar Metadata Vocabularies: accepts query vocabularies and responds with a ranked list of the system's entry vocabularies, which is an index to five controlled vocabularies.
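
One simple way to produce such a ranked list is to count how often each controlled term has been assigned to records containing the query words. The sketch below takes that approach as an assumption; it is not the project's actual algorithm, and the training data and the function name `rank_entry_terms` are invented.

```python
from collections import Counter, defaultdict

# Training pairs: free-text words seen in records that carry a controlled
# term. All data are invented for the sketch.
training = [
    ("solar panel efficiency", "Photovoltaic power generation"),
    ("solar cell materials", "Photovoltaic power generation"),
    ("solar wind measurements", "Space plasmas"),
]

word_to_terms = defaultdict(Counter)
for text, controlled_term in training:
    for word in set(text.split()):
        word_to_terms[word][controlled_term] += 1


def rank_entry_terms(query: str) -> list:
    """Rank controlled terms by how often they co-occur with query words."""
    scores = Counter()
    for word in query.lower().split():
        scores.update(word_to_terms.get(word, {}))
    return [term for term, _ in scores.most_common()]


ranking = rank_entry_terms("solar panel")
print(ranking)
```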

Sharing at Vocabulary Level

5. Creating a superstructure (a virtual index)

CAMed: cross-thesaurus searching. Terms are linked in a temporary union list generated by the software in response to a query.
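
What makes the index "virtual" is that nothing is merged in advance: the union list exists only for the duration of a query. A minimal sketch of that behaviour, with two invented thesauri and a hypothetical `temporary_union_list` function:

```python
# Two small thesauri held separately; nothing is merged in advance.
thesauri = {
    "T1": ["Heart diseases", "Heart surgery", "Lung diseases"],
    "T2": ["Heart failure", "Cardiology"],
}


def temporary_union_list(query: str) -> list:
    """Build a merged, sorted list of matching terms for this query only."""
    needle = query.lower()
    hits = [
        (term, name)
        for name, terms in thesauri.items()
        for term in terms
        if needle in term.lower()
    ]
    return sorted(hits)


union = temporary_union_list("heart")
print(union)
```

Because the list is rebuilt per query, each source thesaurus can be updated independently without maintaining a permanent merged file.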

Sharing at Vocabulary Level

6. Linking through a thesaurus server protocol

UCSB Alexandria Digital Library: the Thesaurus Protocol is based on the ANSI/NISO Z39.19 (1993, R2003) thesaurus model and supports downloading, querying, and navigating thesauri.

Sharing at Subject Authority File Level

Direct Mapping -- MACS (Multilingual Access to Subjects)

LCSH and MeSH Mapping Project, sample authority records, Northwestern University Library

Co-occurrence mapping -- works at the application level, i.e., in metadata records, where the group of subject terms can actually result in loosely-mapped terms.

[Diagram: metadata records, each holding terms from thesaurus 1 and terms from thesaurus 2]
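
Co-occurrence mapping can be sketched by counting how often a term from one thesaurus is assigned to the same metadata record as a term from the other. The records, the threshold, and the function name `loose_mappings` below are assumptions for the example.

```python
from collections import Counter
from itertools import product

# Each metadata record carries terms from two thesauri (invented data).
records = [
    {"t1": {"Cookery"}, "t2": {"Cooking", "Recipes"}},
    {"t1": {"Cookery"}, "t2": {"Cooking"}},
    {"t1": {"Gardening"}, "t2": {"Horticulture"}},
]

# Count every (thesaurus 1 term, thesaurus 2 term) pair seen in one record.
pair_counts = Counter()
for rec in records:
    pair_counts.update(product(rec["t1"], rec["t2"]))


def loose_mappings(term: str, min_count: int = 2) -> list:
    """Thesaurus-2 terms co-assigned with `term` at least min_count times."""
    return sorted(
        t2 for (t1, t2), n in pair_counts.items()
        if t1 == term and n >= min_count
    )


strong = loose_mappings("Cookery")
print(strong)
```

The mappings are "loose" because they reflect indexing practice rather than an intellectual decision about equivalence; lowering `min_count` admits more candidates at the cost of more noise.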

So far,

Functional Requirements for Authority Records (FRAR) covers:
- Names for persons, families, corporate bodies (Group 2)
- Titles (Group 1)

Projects for authority data sharing focus mainly on names:
- ONE Shared Authority Control (ONESAC)
- Virtual International Authority File (VIAF)
- Linking and Exploring Authority Files (LEAF)
- Hong Kong Chinese Authority (Name) (HKCAN)

FRSAR: Functional Requirements for Subject Authority Data

Scope:

focus on FRBR’s Group 3 entities

FRSAR Working Group

contact: Marcia Zeng mzeng@kent.edu

Maja Zumer

Athena Salaba asalaba@kent.edu

FRSAR terms of reference

build a conceptual model of Group 3 entities within the FRBR framework (Entities in Group 1 and Group 2 can be used as the subjects of works; but further inclusion of them will depend on the outcomes of the work of the FRANAR Working Group);

provide a clearly defined, structured frame of reference for relating the data that are recorded in subject authority records to the needs of the users of those records; and

assist in an assessment of the potential for international sharing and use of subject authority data both within the library sector and beyond.