1 of 52
2000
ISO 12620 and ISO 1087 as They Relate to ISO 11179
Sue Ellen Wright
Kent State University
Institute of Applied Linguistics
© Sue Ellen Wright 2000
2 of 52
Terminology Management vs. Data Element Administration
• Naming: the designation of concepts by terms compared to the naming of data elements
• The documentation of concepts in language as compared to the specification of data elements used in controlled database environments
3 of 52
Defining Terms vs. Specifying Data Elements
• The characterization of concepts with various data categories as compared to the specification of data element attributes
• The fixing of a concept within a concept system, classification system, or ontology
4 of 52
Similarities in Working Methods
• The specification of a subject field
• Definition of the term within the framework of a classification scheme, ontology, thesaurus, etc.
• The assignment of preferred names
• Allowances for synonyms
• The assignment of a unique identifier (concept identifier)
• Mapping of all synonyms to this identifier
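The shared working method (subject field, preferred name, synonyms, one unique identifier) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `ConceptIndex`, the identifier `C0001`, the subject field "writing instruments", and the synonym "graphite pencil" are all hypothetical and come from neither ISO 12620 nor ISO 11179.

```python
class ConceptIndex:
    """Maps every name, preferred or synonym, to one concept identifier."""

    def __init__(self):
        self._ids = {}    # (subject field, name) -> concept identifier
        self._names = {}  # concept identifier -> names, preferred name first

    def register(self, concept_id, subject_field, preferred, synonyms=()):
        for name in (preferred, *synonyms):
            key = (subject_field.casefold(), name.casefold())
            # Within one subject field a name may identify only one concept.
            owner = self._ids.setdefault(key, concept_id)
            if owner != concept_id:
                raise ValueError(
                    f"{name!r} already identifies {owner} in {subject_field!r}"
                )
        self._names[concept_id] = [preferred, *synonyms]

    def identify(self, subject_field, name):
        """Return the concept identifier for a name, or None if unknown."""
        return self._ids.get((subject_field.casefold(), name.casefold()))

    def preferred_name(self, concept_id):
        return self._names[concept_id][0]


index = ConceptIndex()
index.register("C0001", "writing instruments", "lead pencil",
               synonyms=["graphite pencil"])
print(index.identify("writing instruments", "Graphite Pencil"))  # C0001
```

Keying the lookup on the subject field as well as the name is a design choice made for this sketch: it lets the same orthographic form point to different concepts in different subject fields.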
5 of 52
Terminological Working Methods
• The extraction of existing names from text corpora and the universe of discourse
• Documentation of the term in a natural language context (either general or special language)
• No limitations on length (even 80 is too short)
6 of 52
TC 37 Standards
• ISO 12620:1999, Computer Applications in Terminology — Data Categories
• ISO 1087 Terminology — Vocabulary, Terminology Work — Vocabulary — Part 1: Theory and Application
• ISO/FDIS 1087-2: Terminology Work — Vocabulary — Part 2: Computer Applications
• ISO/FDIS 704: Terminology Work — Principles and Methods
7 of 52
ISO 1087 and 12620
• 1087 defines terms used in ISO TC 37 terminology standards (and in the field of terminology management).
• 1087 complies with formatting conventions specified in ISO 10241 for the layout of terminology standards.
8 of 52
ISO 1087 and 12620
• 12620: specification of data element names (so-called data categories) and descriptions for data elements used in terminological databases (termbases)
• Conscious effort to make 12620 look different from 1087 or from the 10241 layout standard (descriptions rather than definitions)
9 of 52
Affinity between 1087 and 12620
• Critical information units defined for the special language of terminology studies are used as data elements in termbases.
• Their function and description as data elements may in some instances differ from the way they are defined in a terminological dictionary or glossary.
10 of 52
Elements Common to 1087 and 12620
• A.1 term
• A.2.1.8 abbreviated form of term
– A.2.1.8.1 abbreviation
– A.2.1.8.4 acronym
– A.2.1.8.5 clipped term
– A.2.1.13 symbol
11 of 52
Elements Common to 1087 and 12620
• A.2.3.5 temporal qualifier
– c) obsolete term
• A.2.9.1 normative authorization
– b) preferred term
– c) admitted term
– d) deprecated term
12 of 52
Elements Common to 1087 and 12620
• A.4 subject field [domaine in French!]
• A.5.1 definition
• A.5.3 context [a text chunk]
• A.8 characteristic
13 of 52
Elements Common to 1087 and 12620
• Types of relations:
• A.6.1 generic relation
• A.6.2 partitive relation
• A.6.3 sequential relation
– A.6.3.1 temporal relation
– A.6.3.2 spatial relation
• A.6.4 associative relation (pragmatic relation)
14 of 52
Elements Common to 1087 and 12620
• Concept Systems & Concept Positions
– A.7.2.2 superordinate concept
– A.7.2.3 subordinate concept
– A.7.2.4 coordinate concept
• Missing elements
– Ontological axioms, rules, and functions
15 of 52
Term vs. Data Element Functions
• Differences between terms and data element names
– Examples: synonym (12620) vs. synonymy (1087)
– homograph vs. homonymy
16 of 52
Differences in Definition
• generic relation
– 1087: Relation between concepts which is established by the division of the superordinate concept into subordinate concepts forming one or more levels, or by the reverse process.
– 12620: A hierarchical concept relation in which the intension of the superordinate concept is a subset of the intension of the subordinate concept.
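The 12620 wording lends itself to a set-based check. The sketch below treats an intension as a plain Python set of characteristic strings and borrows its characteristics from the pencil example in the ISO 704 slides of this deck. The function name `is_generic_superordinate` is hypothetical, and using a proper subset (so that a concept is not superordinate to itself) is an assumption of the sketch, not something the quoted description states.

```python
def is_generic_superordinate(super_intension, sub_intension):
    """True if every characteristic of the broader concept also belongs to
    the narrower concept, and the narrower concept has at least one more."""
    return set(super_intension) < set(sub_intension)


# Intensions modelled as sets of characteristics (illustrative only).
pencil = {"type of writing instrument", "graphite core is writing medium"}
lead_pencil = pencil | {
    "graphite core is fixed",
    "wood casing is removed for usage",
}
mechanical_pencil = pencil | {
    "permanent outer casing",
    "graphite core advances for usage",
}

print(is_generic_superordinate(pencil, lead_pencil))             # True
print(is_generic_superordinate(lead_pencil, mechanical_pencil))  # False
```

The second call returns False because lead pencil and mechanical pencil are coordinate concepts: each adds its own delimiting characteristics to the shared intension of pencil.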
17 of 52
Terminological Entries
• Words that represent concepts extracted from natural language, usually special languages associated with specific subject fields (sometimes called domains in terminology management).
• Words and word strings that appear in chunks of text, almost always called contexts in terminology management.
18 of 52
Scope Statement: ISO/IEC 11179-4
• The definitional rules and guidelines of this Part of the International Standard do not always apply to terminological definitions found in glossaries and language dictionaries ... [which may have] multiple definitions [with different meanings]. ... Data definitions must be unique within a dictionary and have a single meaning.
19 of 52
Lexicographical vs. Terminological Entries
• Terminological entries
– Multilingual terminology management
– Standardization
– Language planning
• Lexicographical entries
– General language dictionaries
– Machine translation lexicons
22 of 52
Lexicographical Entry
• rattle (rat’l) vi. -tled, -tling, [ME …] 1. to make a series of sharp, short sounds in quick succession 2. to go or move with such sounds [a wagon rattling over the stones] 3. to talk rapidly and incessantly; chatter [often with on: rattle on] -vt. 1. to cause to rattle [to rattle the handle of a door] 2. to utter or perform rapidly 3. to confuse or upset; disconcert [to rattle a speaker with catcalls] -n. 1. a quick succession of sharp, short sounds 2. a rattling noise made by air passing through the mucus of a partially closed throat: cf. DEATH RATTLE 3. a noisy uproar; loud chatter 4. a) a series of horny rings at the end of a rattlesnake’s tail, used to produce a rattling sound b) any one of these 5. a device, as a baby’s toy or a percussion instrument, made to rattle when shaken
• Collocation: to rattle around in a house that is too big for one’s needs
24 of 52
Terminological Entry
• rattle
– Subject Field: snakes
– Definition: The tail structure of the poisonous North American genera Crotalus and Sistrurus which has been modified into a series of horny, loose-fitting structures that produce a buzzing sound when vibrated.
– (Synonyms)
– (Equivalent terms in other languages)
25 of 52
Lexical vs. Terminological Entries-1
• L: Is identified using a word (frequently called a headword)
• T: Is identified by a concept, frequently using a code or classification number rather than a word in a natural language
– Slides 22 and 24
26 of 52
Lex-Term Entries-2
• L: Treats multiple polysemic senses of the word based on one etymological derivation
• T: Treats one concept in one entry, and documents terms assigned to that concept
– Slides 22 & 24
27 of 52
Lex-Term Entries-3
• L: Treats homographic lexical units with different derivations in separate entries
– stud (male animal)/stud (fastener, support member)
– bloom (flower)/bloom (ingot)
• T: Treats polysemic assignments of the same orthographic form to different concepts in separate entries
28 of 52
Lex-Term Entries-4
• L: Provides all necessary grammatical information pertaining to the word.
• T: Generally emphasizes only those grammatical differences that may be related to term-concept assignment
– Assumption that readily available lexicographic information applies to terms
29 of 52
Lex-Term Entries-5
• L: Is arranged in strict alphabetical order for easy access
• T: Frequently, but not always, has been arranged to represent logical links in classified hierarchical systems, with alphabetical cross-listing
– Growing acceptance in North America
30 of 52
Lex-Term Entries-6
• L: Describes, or at most, recommends usage
• T: Frequently documents preferred or recommended usage, prescribes usage, or mandates legally binding standardization
– Descriptive vs prescriptive approaches both valid in terminology management
– Data element specification by nature prescriptive
31 of 52
Lex-Term Entries-7
• L: Usually treats a universal set taken from general language
• T: Treats a systematically defined subset of subject-field-specific special language
– Data elements categorically subject-field (context) dependent
32 of 52
Lex-Term Entries-8
• L: Includes a full set of word classes
• T: Is comprised mainly of nouns, verbs, and sometimes adjectives
– Word class inherent in naming rules, but data elements all have a nominal character even when used as attributes
33 of 52
Caveat:
• BUT special language lexicography also deals with technical terms, so the distinction can be misleading or confusing.
35 of 52
1087/12620 vs. 11179 (1)
1087/12620 | 11179
term [main entry, preferred] | data element name
term [synonym] | synonymous name
subject field | context
definition/description | definition
36 of 52
1087/12620 vs. 11179 (2)
1087/12620 | 11179
context | N.A.
source(s) | source document
note | comment
concept identifier | identifier
37 of 52
1087/12620 vs. 11179 (3)
1087/12620 | 11179
classification | classification scheme
keyword | keyword
related concepts | related data reference
concept relation | type of relationship
38 of 52
1087/12620 vs. 11179 (4)
1087/12620 | 11179
[most size limits inappropriate] | size
sets of permissible values | permissible values (domains)
responsibility | responsible organization
39 of 52
1087/12620 vs. 11179 (5)
1087/12620 | 11179
abbreviation (abbreviated form of term) | short name
symbol | symbol
formula | formula
[dealt with in 12200; see slide No. 39] | data element associations
40 of 52
1087/12620 vs. 11179 (6)
1087/12620 | 11179
see also | see also
dates [misc.] | dates [misc.]
administrative [misc.] | similar
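The correspondences in the six comparison slides can be collected into a lookup table and applied to a termbase record. What follows is a rough sketch, not a conversion routine defined by either standard: the dictionary keys are simplified spellings of the category names on the slides, and `TC37_TO_11179`, `to_11179`, and the sample record are hypothetical.

```python
# Left column of the slides (1087/12620) -> right column (11179).
TC37_TO_11179 = {
    "term (main entry, preferred)": "data element name",
    "term (synonym)": "synonymous name",
    "subject field": "context",
    "definition": "definition",
    "source": "source document",
    "note": "comment",
    "concept identifier": "identifier",
    "classification": "classification scheme",
    "keyword": "keyword",
    "related concepts": "related data reference",
    "concept relation": "type of relationship",
    "sets of permissible values": "permissible values (domains)",
    "responsibility": "responsible organization",
    "abbreviation": "short name",
    "symbol": "symbol",
    "formula": "formula",
    "see also": "see also",
}


def to_11179(entry):
    """Split a record into fields with an 11179 counterpart and fields
    without one (for example the terminological 'context', marked N.A.)."""
    mapped, unmapped = {}, {}
    for category, value in entry.items():
        target = TC37_TO_11179.get(category)
        if target is None:
            unmapped[category] = value
        else:
            mapped[target] = value
    return mapped, unmapped


mapped, unmapped = to_11179({
    "term (main entry, preferred)": "rattle",
    "subject field": "snakes",
    "context": "a text chunk in which the term occurs",
})
print(mapped)    # {'data element name': 'rattle', 'context': 'snakes'}
print(unmapped)  # {'context': 'a text chunk in which the term occurs'}
```

The sample output shows why the mapping needs care: the terminological "subject field" lands in the 11179 attribute called "context", while the terminological "context" has no 11179 counterpart at all.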
41 of 52
Term Autonomy
• Repeatability and combinability of data elements
• All synonyms and all foreign language equivalents can be associated with all subordinate data elements
• A definition can reside at the entry level above all these terms.
• There need not be a preferred term
• The concept identifier is the unique element identifying the terminological entry.
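One way to picture term autonomy is as a small data structure in which every term carries its own subordinate data elements. The sketch below is an assumption-laden illustration rather than a 12620 data model: the class and field names, the identifier `C0042`, and the German equivalent "Rassel" are hypothetical, and only the English term, subject field, and definition come from the rattle example in this deck.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    """A term with its own repeatable, subordinate data elements."""
    text: str
    language: str                  # e.g. an ISO 639-2 three-letter code
    status: Optional[str] = None   # "preferred", "admitted", ...; may stay unset
    notes: List[str] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)


@dataclass
class TermEntry:
    """One concept per entry, identified by concept_id rather than by a word."""
    concept_id: str
    subject_field: str
    definition: Optional[str] = None   # resides at the entry level, above the terms
    terms: List[Term] = field(default_factory=list)

    def terms_in(self, language):
        return [t for t in self.terms if t.language == language]


entry = TermEntry(
    concept_id="C0042",
    subject_field="snakes",
    definition="The tail structure of the poisonous North American genera "
               "Crotalus and Sistrurus which has been modified ...",
)
entry.terms.append(Term("rattle", "eng", sources=["hypothetical source"]))
entry.terms.append(Term("Rassel", "deu"))  # illustrative equivalent only
print([t.text for t in entry.terms_in("deu")])  # ['Rassel']
```

No term in this entry is marked as preferred, and the entry remains valid because the concept identifier alone identifies it.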
42 of 52
Language Specification
• Limitations of ISO 639
• Advantage of ISO 639-2, the three letter codes
– Increased number of treated languages
– Enhanced capabilities for expansion to include additional languages
43 of 52
Language of Metadata
• 11179-3, Annex A, multilingual specifications
• data element 1: The field names and the content are in Dutch; data element 2: they are both in German
• Implication: field names can appear in different languages in different parts of a single database entry
44 of 52
Language of Metadata
• TC 37 practice:
– Single human language used throughout a termbase for metalanguage elements
– Possibility of switching human languages for display for different users
– Mixing in a given display environment rare
45 of 52
Language of Metadata
• We have agreed that at the meta level we will map to English equivalents: i.e., field names would be uniform in one language, but content can vary.
• For interchange, standardized L2 data element names can map to English (L1) names, although in individual working environments L2 (Ln) can be used.
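The interchange idea can be sketched as a renaming step: field names in a working language (L2) are replaced by their English (L1) equivalents, and the content is left as it is. The German field names, the sample record, and the function `to_interchange` below are hypothetical illustrations, not standardized data element names.

```python
# Hypothetical German (L2) field names mapped to English (L1) names.
DE_TO_EN = {
    "Benennung": "term",
    "Fachgebiet": "subject field",
    "Definition": "definition",
    "Quelle": "source",
}


def to_interchange(record, field_map):
    """Rename L2 field names to L1 names; the content stays untouched."""
    missing = [name for name in record if name not in field_map]
    if missing:
        raise KeyError(f"no L1 mapping for field(s): {missing}")
    return {field_map[name]: value for name, value in record.items()}


working_record = {"Benennung": "Rassel", "Fachgebiet": "Schlangen"}
print(to_interchange(working_record, DE_TO_EN))
# {'term': 'Rassel', 'subject field': 'Schlangen'}
```

Raising an error for an unmapped field name, rather than passing it through, is a choice made here so that a mixed-language record cannot reach the interchange format unnoticed.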
46 of 52
11179-5.2,b: Phrases
• Synonyms are not definitions. (True!)
• “A phrase is necessary (in most languages)”
• Classic definitions consist of a statement of the genus (a broader concept, not necessarily the immediate superordinate concept)
• Followed by a statement of the differentia (differentiating essential characteristics)
47 of 52
11179-5.2,b: Phrases
• The tail structure of the poisonous North American genera Crotalus and Sistrurus which has been modified ...
• Broader concept:
– Tail structure is generically superordinate to rattle.
• Differentia:
– modified ... horny, loose-fitting structures
– a buzzing sound when vibrated.
• NOT a complete sentence.
48 of 52
Not a Complete Sentence
• [Term/subject] rattle [implied linkage, called a copula: is] :, —, or line break
• the tail structure of the poisonous North American genera Crotalus and Sistrurus which has been modified into a series of horny, loose-fitting structures that produce a buzzing sound when vibrated.
• Definitions are actually predicates, not sentences, but term + definition is a special kind of sentence.
ISO 704: Principles & Methods
• object (visual representation): the set of all lead pencils
• concept: abstraction based on characteristics
• designation (term): lead pencil
category | property (object, level of concreteness) | characteristic (concept, level of abstraction)
composition | made of a long, thin piece of graphite | graphite core
composition | wood casing surrounds graphite | graphite core is encased in wood
colour | casing is yellow | casing may be any colour
composition | at one end there is an eraser | one end may have an eraser
shape | other end sharpened to a point | one end may be sharpened to a point
usage | graphite & casing sharpened for use | graphite & casing sharpened for use
medium | graphite is writing medium | graphite is writing medium
function | used for writing or making marks | used for making marks
50 of 52
Concept Systems (ISO 704)
• writing instrument: used for writing or making marks
– ... pen, pencil, marker ...
– pencil: type of writing instrument; graphite core = writing medium
• pencil
– lead pencil: type of pencil; graphite core is fixed; wood casing is removed for usage (sharpening)
– mechanical pencil: type of pencil; permanent outer casing; graphite core advances for usage
51 of 52
USA TC 37 Participation
• Discontinuance of ANSI support
• Failure to find support from industry
• US participation in SC 3, Computer Applications
• Suggestions for finding funding & participation for SC 1 & 2?
52 of 52
• Call for papers
• Refereed scholarly journal
• Desirability of an issue coming out of this forum
• Contact: Sue Ellen Wright