Extending Models for Controlled
Vocabularies to Classification Systems:
Modelling DDC with FRSAD
Joan S. Mitchell
OCLC, Inc.
Marcia Lei Zeng
Kent State University
Maja Žumer
University of Ljubljana, Slovenia
The big question
Can the FRSAD conceptual model be extended
beyond subject authority data (its original focus) to
model classification data?
Outline
1. From Knowledge Organisation Systems (KOS)
to data and conceptual models
2. FRSAD conceptual model
3. FRSAD model for classification systems
4. DDC case study
5. Findings and limitations
6. Future work
1. From Knowledge Organisation Systems
to Data and Conceptual Models:
Timeline
1876 DDC
1898 LCSH
1905 UDC
1967 TEST*
1974 ISO 2788*
1985 ISO 5964*
1998 FRBR
2004-2009 SKOS, OWL
2009 FRAD
2010 FRSAD
* TEST: Thesaurus of engineering and scientific terms
* ISO 2788 (1974) Guidelines for the Establishment and Development of Monolingual Thesauri
* ISO 5964 (1985) Guidelines for the Establishment and Development of Multilingual Thesauri
From Knowledge Organisation Systems
to Data and Conceptual Models:
Modelling efforts
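[Figure: the same timeline, with entries labelled by the kind of system they address. Labels shown: Classification (1876, 1905), Subject headings (1898), Thesauri (1967; ISO 2788, 1974; ISO 5964, 1985), KOS, ontology. Models shown: FRBR (1998), FRAD (2009), FRSAD (2010), SKOS and OWL (2004-2009).]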
Thesauri: mostly comply with ISO 2788 and ISO 5964.
Subject heading schemes: have adopted the basic structure of the thesaurus since the 1990s.
Classification systems: follow differing practices and are usually constructed
according to system-specific conventions and examples.
The “FRBR family”
FRBR: the original framework
All entities, focusing on Group 1 entities: work, expression, manifestation, item
Published 1998
FRAD: Functional Requirements for Authority Data
Focusing on Group 2 entities: person, corporate body, family
Published 2009
FRSAD: Functional Requirements for Subject Authority Data
Focusing on Group 3 entities
FRSAR WG established in 2005
Published 2010
The core of the FRSAD conceptual model
FRSAD Part 1: WORK has as subject THEMA /
THEMA is subject of WORK
FRSAD Part 2: THEMA has appellation NOMEN /
NOMEN is appellation of THEMA
NOMEN = any sign or sequence of
signs (alphanumeric characters,
symbols, sound, etc.) that a thema
is known by, referred to or
addressed as
Note: in a given controlled vocabulary and within a domain,
a nomen should be an appellation of only one thema.
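The two core relationships and the note on nomens can be read as a small data model. The following is a minimal sketch, not part of the FRSAD report: the class names `Nomen`, `Thema` and `Vocabulary`, the `label` field and the method names are all hypothetical, and the registry simply enforces the rule that a nomen is an appellation of only one thema within a scheme.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Nomen:
    """Any sign or sequence of signs that a thema is known by."""
    value: str
    scheme: str  # the controlled vocabulary the nomen belongs to


@dataclass
class Thema:
    """Anything used as a subject of a work."""
    label: str  # informal working label (an assumption, not a FRSAD attribute)
    nomens: list[Nomen] = field(default_factory=list)


class Vocabulary:
    """Hypothetical registry: one thema per nomen within a scheme."""

    def __init__(self) -> None:
        self._owner: dict[Nomen, Thema] = {}

    def add_appellation(self, thema: Thema, nomen: Nomen) -> None:
        owner = self._owner.get(nomen)
        if owner is not None and owner is not thema:
            raise ValueError(
                f"{nomen.value!r} already names another thema in {nomen.scheme}"
            )
        if owner is None:
            self._owner[nomen] = thema
            thema.nomens.append(nomen)

    def thema_of(self, nomen: Nomen) -> Optional[Thema]:
        return self._owner.get(nomen)
```

A thema may carry many nomens, but registering the same nomen for a second thema raises an error, which mirrors the note above.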
The ‘has appellation’ relationship between
thema and nomen in a controlled vocabulary:
An example of nomens in an authority record for a chemical compound
[Figure: authority record with labelled fields Nomen 1-8 and Nomen 9]
Source: STN Database Summary Sheet: USAN (The USP Dictionary of U.S.
Adopted Names and International Drug Names)
Nomens in different types of KOS
themas represented by:
• thesauri: terms (preferred & non-preferred)
• classification schemes: notations
• subject heading systems: terms of pre-coordinated strings
• taxonomies: category labels (with or without notations)
• controlled lists: terms or identifiers
• … …
2.2 Relationships
(1) Thema-to-thema relationships
Hierarchical
• The generic relationship
• The hierarchical whole-part relationship
• The instance relationship
• Other hierarchical relationships
Associative
• The most commonly considered categories are listed in the report
Other thema-to-thema relationships are domain- or
implementation-dependent
2.2 Relationships
(2) Nomen-to-nomen relationships
Equivalence
• Two nomens are considered equivalent only if they are appellations of the same thema in a controlled vocabulary.
Partitive
• An instance of a nomen may have parts.
• A whole-part relationship may exist between a nomen and its components.
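The equivalence and partitive relationships between nomens can be illustrated with two small functions. This is a sketch under stated assumptions: the thema identifiers (`thema:A`, `thema:B`) and the function names are hypothetical, the nomen strings are taken from the DDC examples in this deck, and splitting on "/" is only one possible way to obtain the components of a nomen.

```python
# Nomen -> thema identifier within one controlled vocabulary.
# The identifiers on the right are hypothetical placeholders.
APPELLATIONS = {
    "025.04": "thema:A",
    "http://dewey.info/class/025.04/": "thema:A",
    "T2—43414": "thema:B",
}


def are_equivalent(nomen_a: str, nomen_b: str, appellations: dict) -> bool:
    """Two nomens are equivalent only if both name the same thema."""
    thema_a = appellations.get(nomen_a)
    thema_b = appellations.get(nomen_b)
    return thema_a is not None and thema_a == thema_b


def components(nomen: str, separator: str = "/") -> list:
    """Return the parts of a compound nomen (whole-part relationship)."""
    return [part.strip() for part in nomen.split(separator) if part.strip()]


CAPTION = (
    "Computer science, information & general works/"
    "Library & information sciences/"
    "Operations of libraries, archives, information centers/"
    "Information storage and retrieval systems"
)
```

Here `are_equivalent("025.04", "http://dewey.info/class/025.04/", APPELLATIONS)` holds because both strings name the same thema, while `components(CAPTION)` yields the four segments of the full caption.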
2.3 Attributes
Some general attributes of thema and nomen are
proposed
(1) thema attributes: type of thema, scope note
In an implementation themas can be organized based on
category, kind, or type
(2) nomen attributes: see next slide
In an implementation additional attributes may be
recorded
Nomen attributes
Type of nomen (identifier, controlled name, …)
Scheme (LCSH, DDC, UDC, ULAN, ISO 8601…)
Reference source of nomen (Encyclopaedia Britannica…)
Representation of nomen (alphanumeric, sound, visual,...)
Language of nomen (English, Japanese, Slovenian,…)
Script of nomen (Cyrillic, Thai, Chinese-simplified,…)
Script conversion (Pinyin, ISO 3602 Romanisation of Japanese…)
Form of nomen (full name, abbreviation, formula…)
Time of validity of nomen (until xxxx, after xxxx, from… to …)
Audience (English-speaking users, scientists, children …)
Status of nomen (provisional, accepted, official,...)
Note: the values in parentheses are examples of attribute values,
not exhaustive lists.
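The nomen attributes listed above map naturally onto a record type. The sketch below is illustrative only: the class name `NomenRecord`, the field names and the choice of which attributes are optional are assumptions, and the example values are drawn from the parenthesised examples on the slide.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class NomenRecord:
    """One nomen with the general attributes proposed for it."""
    value: str
    nomen_type: str                       # e.g. identifier, controlled name
    scheme: str                           # e.g. LCSH, DDC, UDC
    representation: str = "alphanumeric"  # or sound, visual, ...
    reference_source: Optional[str] = None
    language: Optional[str] = None
    script: Optional[str] = None
    script_conversion: Optional[str] = None
    form: Optional[str] = None            # full name, abbreviation, formula
    time_of_validity: Optional[str] = None
    audience: Optional[str] = None
    status: Optional[str] = None          # provisional, accepted, official


def recorded_attributes(record: NomenRecord) -> dict:
    """Return only the attributes that were actually recorded."""
    return {k: v for k, v in asdict(record).items() if v is not None}


notation = NomenRecord(value="025.04", nomen_type="identifier", scheme="DDC")
```

Leaving most attributes optional reflects the statement that an implementation may record additional attributes or omit those it does not need.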
2.4 The importance of the THEMA-NOMEN model
for subject authority data
Separating what are usually called concepts (or
topics, subjects, classes [of concepts]) from what
they are known by, referred to, or addressed as
A general abstract model, not limited to any
particular domain or implementation
Potential for interoperability within the library
field and beyond
3. FRSAD model for classification systems
• Each class corresponds to a thema
• Notation associated with the class is the nomen
• Thema is the full category description of the class
• Nomen is the symbol (or surrogate) used to
represent the full category description
Nomens:
• DDC number: 025.04
• Full caption: Computer science, information & general works/Library & information sciences/Operations of libraries, archives, information centers/Information storage and retrieval systems
• URI: http://dewey.info/class/025.04/
Thema: Any topic co-extensive with the full meaning of the class
(topics that are functionally equivalent to the class)
Scope note: Text describing or defining the thema or specifying
its scope within a particular system (≠ thema/class)
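The case-study record for class 025.04 can be written out as simple statements, which makes the separation between thema, nomens and scope note explicit. This is a minimal sketch: the identifier `class-025.04`, the predicate strings and the function name `to_statements` are hypothetical, and no scope note text is supplied because the slides do not give one.

```python
THEMA_ID = "class-025.04"  # hypothetical internal identifier for the thema

NOMENS = {
    "DDC number": "025.04",
    "Full caption": (
        "Computer science, information & general works/"
        "Library & information sciences/"
        "Operations of libraries, archives, information centers/"
        "Information storage and retrieval systems"
    ),
    "URI": "http://dewey.info/class/025.04/",
}


def to_statements(thema_id, nomens, scope_note=None):
    """Express a class as (subject, predicate, object) statements.

    Nomens are attached with 'has appellation'; the scope note is kept
    apart as an attribute of the thema, because it is not the thema itself.
    """
    statements = [
        (thema_id, "has appellation", (kind, value))
        for kind, value in nomens.items()
    ]
    if scope_note is not None:
        statements.append((thema_id, "has scope note", scope_note))
    return statements


statements = to_statements(THEMA_ID, NOMENS)
```

All three nomens point at the same thema, so any of them can serve as an access point to the class.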
Thema-to-thema relationships
[Figure labels: associative relationship (shown twice); (poly)hierarchical relationship]
5. Findings and limitations
• FRSAD conceptual model appears to accommodate
DDC data at a broad level
• Topic-to-topic relationships require further study
• The study did not consider the usefulness of
classification data modelled using FRSAD in real-
world applications
6. Future work
• Specify all relationships between Relative Index terms
and classes (see earlier work by Green, Mitchell)
• Investigate DDC translations and mappings in context of
model
[Figure: DDC translations and derived products. Labels shown:]
• DDC 22: French, German, Italian, Swedish Mixed
• A14: Italian, Vietnamese, French, Spanish, Hebrew
• 200 Religion Class
• Guide (French): DDC 22, A14
• DDC Sachgruppen (German)
• DDC Summaries
• Language lists shown: English, French, Italian, Rhaeto-Romansch; Afrikaans, Arabic, Chinese, French, German, Norwegian, Portuguese, Russian, Scots Gaelic, Spanish, Swedish
Thema-to-thema relationships (complex case):
T2—43414 (22) = T2—43414 (22/ger), but . . .
DDC 22:
T2—43414 Giessen district (Giessen Regierungsbezirk)
Including *Lahn River
Lahn River: not equivalent to thema/class T2—43414
DDC 22/ger:
T2—43414 Regierungsbezirk Gießen
T2—434147 Lahn-Dill-Kreis
Hier auch: der Fluss *Lahn (Class here: the Lahn River)
Lahn River: functionally equivalent to thema/class T2—434147
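The complex case can be made concrete by recording where each edition places the topic and how the topic relates to its class. The sketch below rests on an assumption read off the slide labels: a topic in an "including" note is treated as not equivalent to the class, and a topic in a "class here" note as functionally equivalent. The names `Note`, `RELATION_BY_NOTE_TYPE` and the two functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    class_number: str
    edition: str      # "22" or "22/ger", as labelled on the slide
    note_type: str    # "including" or "class-here"
    topic: str


# Assumption: how a topic in each note type relates to the thema/class.
RELATION_BY_NOTE_TYPE = {
    "including": "not equivalent",
    "class-here": "functionally equivalent",
}

NOTES = [
    Note("T2—43414", "22", "including", "Lahn River"),
    Note("T2—434147", "22/ger", "class-here", "Lahn River"),
]


def topic_placements(topic, notes):
    """Map each edition to (class number, relation) for the given topic."""
    return {
        n.edition: (n.class_number, RELATION_BY_NOTE_TYPE[n.note_type])
        for n in notes
        if n.topic == topic
    }


def placed_consistently(topic, notes):
    """True only if every edition places the topic identically."""
    return len(set(topic_placements(topic, notes).values())) == 1
```

For the Lahn River, `placed_consistently("Lahn River", NOTES)` is false: the class T2—43414 matches across the two editions, but the topic sits in different classes with different relations, which is why translations need to be examined within the model.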
6. Future work
• Specify all relationships between Relative Index terms
and classes (see earlier work by Green, Mitchell)
• Investigate DDC translations and mappings in context
of model
• Investigate modelling the Relative Index as a separate
controlled vocabulary to provide a topic-centered
view
• Experiment with modelling other classification
schemes
• Investigate usefulness of classification data modelled
using FRSAD