TMF - a tutorial

Date post: 24-Jan-2016
Page 1: TMF - a tutorial

TMF - a tutorial

TMF - Terminological Markup Framework

Laurent Romary - Laboratoire Loria

Page 2: TMF - a tutorial

Three parts

Part 1: Basic concepts

Part 2: Representing data categories

Part 3: Designing (schemas and) filters

Page 3: TMF - a tutorial

TMF - a tutorial
Part 1: Basic concepts

TMF - Terminological Markup Framework

Laurent Romary - Laboratoire Loria

Page 4: TMF - a tutorial

• Background - ISO etc.

• The need for abstraction

• Structure and content of terminological data - picture virtual-actual

• The meta-model (structural skeleton)

• Describing data categories

• Styles and vocabularies

• XTMF as a mapping tool - examples

• Further work: extending the model to a wider scope (language engineering)

Page 5: TMF - a tutorial

Overview

Page 6: TMF - a tutorial

General principles

Expressing constraints on the representation of computerized terminologies

• What is the underlying structure of computerized terminologies?

• Which data-category is used and under which conditions?

Maintaining interoperability between representations

• Providing a conceptual tool to compare two given formats

Page 7: TMF - a tutorial

Definitions

TMF: Terminological Markup Framework
• Definition of the underlying structures and mechanisms needed for the computer representation of terminological data
• Independence with regard to any specific format

GMT: Generic Mapping Tool
• Abstract XML format equivalent to the underlying model of TMF

Page 8: TMF - a tutorial

Definitions - cont.

TML: Terminological Markup Language
• One specific representation format generated within TMF
• E.g.: DXLT is a possible TML

Page 9: TMF - a tutorial

A family of formats

[Diagram: TMF, the family of formats TML1, TML2, TML3, … TMLi (with DXLT and Geneter as examples), and GMT]

Page 10: TMF - a tutorial

Meta-model

Representing the underlying structure of terminological data

Page 11: TMF - a tutorial

[Diagram: meta-model components linked with cardinalities (*, 1, 0:1)]

• Terminological Data Collection
• Global Information
• Complementary Information
• Terminological Entry
• Terminology-related Information
• Language Section
• Term Section
• Term Component Section

Page 12: TMF - a tutorial

Meta-model description

Terminological Data Collection (TDC)
• A collection of data containing information on concepts of specific concept fields.

Terminological Entry (TE)
• An entry containing information on terminological units (i.e., subject-specific concepts, terms, etc.).
» Example: Domain description, conceptual relations, etc.

Page 13: TMF - a tutorial

Meta-model description - cont.

Language Section (LS)
• The part of a terminological entry containing information related to one language.
» Note: One terminological entry may contain information on one, two or more languages.

Term Section (TS)
• The part of a language section giving information about a term.
» Example: Term status (e.g. abbreviation), usage information (temporal, geographical, etc.)

Page 14: TMF - a tutorial

Meta-model description - cont.

Term Component Section (TCS)
• The part of a term section giving information about components of a term.
» Example: Component grammatical information (part of speech)

Page 15: TMF - a tutorial

Meta-model description - cont.

Global Information (GI)
• Technical and administrative information applying to the entire data collection.
» Example: title of the data collection, revision history

Complementary Information (CI)
• Information supplementary to terminology-related information.
» Example: bibliographical source, documentary language or description thereof.

Page 16: TMF - a tutorial

The structural skeleton

Terminological Data Collection (TDC)
  Global Information (GI)
  Complementary Information (CI)
  Terminological Entry (TE) *
    Language Section (LS) *
      Term Level (TL) *
        Term Component Level (TCL) *

(* marks a level that may be repeated within its parent.)

Page 17: TMF - a tutorial

How does this work?

Walking through an example…

Page 18: TMF - a tutorial

DXLT example

<termEntry id='ID67'>
  <descrip type='subjectField'>manufacturing</descrip>
  <descrip type='definition'>A value between 0 and 1 used in ...</descrip>
  <langSet lang='en'>
    <tig>
      <term>alpha smoothing factor</term>
      <termNote type='termType'>fullForm</termNote>
    </tig>
  </langSet>
  <langSet lang='hu'>
    <tig>
      <term>Alfa ...</term>
    </tig>
  </langSet>
</termEntry>

Page 19: TMF - a tutorial

Identifying the structural skeleton

termEntry = TE
  id='ID67' [attribute]
  subjectField='manufacturing' [typedElement]
  definition='A value…' [typedElement]
  langSet = LS
    lang='en' [attribute]
    tig = TS
      term='alpha smoothing factor' [element]
      termType='fullForm' [typedElement]
  langSet = LS
    lang='hu' [attribute]
    tig = TS
      term='…' [element]

TE: Terminological Entry
LS: Language Section
TS: Term Section

Page 20: TMF - a tutorial

TMF information model

TE
  id='ID67'
  subjectField='manufacturing'
  definition='A value…'
  LS
    lang='en'
    TS
      term='alpha smoothing factor'
      termType='fullForm'
  LS
    lang='hu'
    TS
      term='…'

Page 21: TMF - a tutorial

GMT representation

<struct type="TE">
  <feat type="id">ID67</feat>
  <feat type="subjectField">manufacturing</feat>
  <feat type="definition">A value between 0 and 1 used in ...</feat>
  <struct type="LS">
    <feat type="lang">en</feat>
    <struct type="TS">
      <feat type="term">alpha smoothing factor</feat>
      <feat type="termType">fullForm</feat>
    </struct>
  </struct>
  <struct type="LS">
    <feat type="lang">hu</feat>
    <struct type="TS">
      <feat type="term">Alfa ...</feat>
    </struct>
  </struct>
</struct>
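
The walk from the DXLT entry to its GMT form can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LEVELS` table and the `to_gmt` function are hypothetical names, the style of each child is guessed from its shape (a `type` attribute means typedElement, otherwise element), and a real filter would take styles and vocabularies from the TML description instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: which DXLT elements stand for which meta-model level.
LEVELS = {"termEntry": "TE", "langSet": "LS", "tig": "TS"}

DXLT = """<termEntry id='ID67'>
  <descrip type='subjectField'>manufacturing</descrip>
  <langSet lang='en'>
    <tig>
      <term>alpha smoothing factor</term>
      <termNote type='termType'>fullForm</termNote>
    </tig>
  </langSet>
</termEntry>"""


def to_gmt(node):
    """Convert a DXLT element that maps to a level into a GMT struct."""
    struct = ET.Element("struct", {"type": LEVELS[node.tag]})
    # attribute style: the attribute name is the data category
    for name, value in node.attrib.items():
        ET.SubElement(struct, "feat", {"type": name}).text = value
    nested = []
    for child in node:
        if child.tag in LEVELS:
            nested.append(child)  # deeper level, emitted after the features
        elif "type" in child.attrib:
            # typedElement style: the type attribute names the data category
            ET.SubElement(struct, "feat", {"type": child.get("type")}).text = child.text
        else:
            # element style: the element name is the data category
            ET.SubElement(struct, "feat", {"type": child.tag}).text = child.text
    for child in nested:
        struct.append(to_gmt(child))
    return struct


gmt = to_gmt(ET.fromstring(DXLT))
print(ET.tostring(gmt, encoding="unicode"))
```

Features are emitted before nested levels so that the result respects the order feat, then struct, used in the GMT representation above.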

Page 22: TMF - a tutorial

Structural Skeleton
DCRref (ISO 12620)

DCRi
- DCRref subset
- Application-dependent DCR

Interoperability conditions
GMT

Dialecti
- Expansion structures
- DatCat structural styles
- DatCat vocabulary styles

Terminological Markup Language (TML)

Page 23: TMF - a tutorial

TML à la mode ISO

– Ingredients
  – A structural skeleton
    » (take the TMF meta-model)
  – A reference Data Category Registry
    » ISO 12620 is a good place to find one

– Recipe
  – Choose some data categories from the registry
    » You can even constrain the values of your datcats
  – Associate a style and vocabulary with each datcat
    » You can take inspiration from others (DXLT)
  – Serve it hot to your software guy with a piece of SALT software

Page 24: TMF - a tutorial

GMT

Generic Mapping Tool

Page 25: TMF - a tutorial

Background

Interoperability principle
– If any two TMLs have exactly the same DCS, even though they differ radically in style and vocabulary, they are equivalent.

Consequence
– It is always possible to define a filter from one TML to another when they are interoperable
• GMT is the intermediate representation to do so
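
A rough way to picture the principle, assuming that DCS denotes the set of data categories a TML selects (the slide does not expand the abbreviation): describe each TML as a table from data category to style and vocabulary, and compare only the keys. The two tables and the function name `interoperable` below are hypothetical.

```python
# Each TML is described as: data category -> (style, vocabulary).
# Both descriptions are invented for illustration.
tml_a = {
    "definition": ("typedElement", ("descrip", "definition")),
    "term": ("element", ("term",)),
    "lang": ("attribute", ("lang",)),
}
tml_b = {
    "definition": ("element", ("def",)),
    "term": ("element", ("term",)),
    "lang": ("attribute", ("language",)),
}


def interoperable(first, second):
    """Two TMLs are treated as equivalent when they select the same data categories."""
    return set(first) == set(second)


print(interoperable(tml_a, tml_b))
```

Styles and vocabularies differ between the two tables, yet the comparison ignores them, which is the point of the principle.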

Page 26: TMF - a tutorial

From one TML to another

GMT - Generic Mapping Tool
– an abstract XML representation

• identification of levels
– <struct type="LS">…</struct>
» a recursive element

• representation of data categories
– <feat type="definition">…</feat>

Page 27: TMF - a tutorial

The tmf element

• Description:
– The tmf element is the root element for any valid XTMF document. It contains the global information that corresponds to a terminological data collection, the collection itself, and the complementary information (in particular external resources) needed for describing the various terminological entries.

• Content model:
<!ELEMENT tmf (struct*)>

Page 28: TMF - a tutorial

The struct element

• Description
– The struct element should be used to represent a locus in a given structural skeleton. The struct element is recursive and may also contain feat and/or brack elements to express attributes belonging to the corresponding level of the meta-model.

• Attributes:
– type: level in the meta-model (TDC, TE, LS, TS or TCS)

• Content model:
<!ELEMENT struct ((feat|brack)*, struct*)>
<!ATTLIST struct type (TDC|TE|LS|TS|TCS) #REQUIRED>
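
The content model above can be checked by hand without a DTD validator. The following is a minimal sketch, not a replacement for real validation: `check_struct` is a hypothetical helper that only tests the type attribute and the order "features first, nested levels last", and it ignores text content and the inside of brack elements.

```python
import xml.etree.ElementTree as ET

LEVELS = {"TDC", "TE", "LS", "TS", "TCS"}


def check_struct(struct):
    """Return a list of problems found in a struct element and its descendants."""
    problems = []
    if struct.get("type") not in LEVELS:
        problems.append(f"bad type: {struct.get('type')!r}")
    seen_struct = False
    for child in struct:
        if child.tag == "struct":
            seen_struct = True
            problems.extend(check_struct(child))
        elif child.tag in ("feat", "brack"):
            # (feat|brack)* must come before any nested struct
            if seen_struct:
                problems.append(f"{child.tag} after struct in {struct.get('type')}")
        else:
            problems.append(f"unexpected element: {child.tag}")
    return problems


good = ET.fromstring(
    '<struct type="TE"><feat type="id">ID67</feat>'
    '<struct type="LS"><feat type="lang">en</feat></struct></struct>'
)
bad = ET.fromstring(
    '<struct type="XX"><struct type="LS"/><feat type="lang">en</feat></struct>'
)
print(check_struct(good))
print(check_struct(bad))
```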

Page 29: TMF - a tutorial

The feat element

• Description
– The feat element represents any feature that is directly attached to a locus in the structural skeleton (represented by a struct element).

• The feat element accepts the following attributes:
– type: categorises the feat element through the reference to the name of the corresponding data category.

• Content model (DTD)
– <!ELEMENT feat (#PCDATA | annot)*>
– <!ATTLIST feat type CDATA #REQUIRED>

Page 30: TMF - a tutorial

Bracketing information

Page 31: TMF - a tutorial

Rationale

Describing the context of use of a given data category
– Example 1:
» Classification Code: AG1
» Classification System: Lenoc
– Example 2:
» Transaction type: modification
» Responsible person: Mr. X
» Date: 23 April 1988

Page 32: TMF - a tutorial

Formal model

Hierarchical feature structure
– Constraint: Type given by the 'main' (first) data category

ClassificationGrp
  ClassificationCode: AG1
  ClassificationSystem: Lenoc

Page 33: TMF - a tutorial

GMT description

• Bracketing features

<brack>
  <feat type="classificationCode">xxx</feat>
  <feat type="classificationSystem">Lenoc</feat>
</brack>

Rem.: no type attribute for 'brack'
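
Since brack carries no type of its own, a reader of GMT has to derive the group type from the first feature, as the formal model requires. A small sketch of that reading step follows; `read_brack` is a hypothetical helper, and the value AG1 is borrowed from the earlier classification example.

```python
import xml.etree.ElementTree as ET

brack = ET.fromstring(
    "<brack>"
    '<feat type="classificationCode">AG1</feat>'
    '<feat type="classificationSystem">Lenoc</feat>'
    "</brack>"
)


def read_brack(brack):
    """Return (main data category, all features) for one bracketed group."""
    feats = [(f.get("type"), f.text) for f in brack.findall("feat")]
    main_type = feats[0][0]  # the 'main' data category is the first one
    return main_type, dict(feats)


print(read_brack(brack))
```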

Page 34: TMF - a tutorial

Annotating content

Page 35: TMF - a tutorial

Rationale

Why should we annotate specific content?
– To identify components which are not explicitly expressed as a specific part of a terminological entry
• E.g.: Characteristics of a concept
– To relate a component to another entry or an external resource
• E.g.: bibliographical reference

Page 36: TMF - a tutorial

Formal model

?

Page 37: TMF - a tutorial

XML model

Mixed content
– <!ELEMENT feat (#PCDATA|annot)*>

• Attributes
– type: categorises the annot element through the reference to the name of the corresponding data category.

• Rem.: Problems with mixed content in XML schemas

Page 38: TMF - a tutorial

GMT description

• Annotating information

<feat type="definition">pencil whose
  <annot type="characteristic"> casing </annot>
  is fixed around a central graphite medium which is used for writing or making marks
</feat>
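
Mixed content of this kind can be read with Python's standard `xml.etree.ElementTree`, where the text before an annot sits in `.text` and the text after it in `.tail`. The sketch below uses a shortened copy of the definition; `annotations` and `plain_text` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

feat = ET.fromstring(
    '<feat type="definition">pencil whose'
    '<annot type="characteristic"> casing </annot>'
    "is fixed around a central graphite medium</feat>"
)


def annotations(feat):
    """List (data category, annotated text) for every annot inside a feat."""
    return [(a.get("type"), (a.text or "").strip()) for a in feat.iter("annot")]


def plain_text(feat):
    """Return the feature value with annotations flattened and spaces normalised."""
    return " ".join("".join(feat.itertext()).split())


print(annotations(feat))
print(plain_text(feat))
```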

Page 40: TMF - a tutorial

Representation of relations

Page 41: TMF - a tutorial

XML links

Transparency as to the actual location of a resource (internal vs. external)

May be useful to identify ontologies
– External links between concepts

[Diagram: two pairs of linked entries, entry i and entry j]

Page 42: TMF - a tutorial

Representation in GMT

Two attributes
• target - a pointer to a 'struct' element, used when the feature expresses a relation between the current locus and another locus in the structural skeleton;
• source - a pointer to a 'struct' element, used when the feature is described outside the locus to which it is supposed to be attached.

Page 43: TMF - a tutorial

Some examples

• Simple atomic feature attached directly to a locus:
<feat type="conceptIdentifier">ID67</feat>

• Basic feature whose value is a reference to a locus in the structural skeleton:
<feat type="partWhole" target="TE24"/>

• Basic feature anchored at the locus in the structural skeleton whose id attribute value is "TE24":
<feat type="conceptIdentifier" source="TE24">ID67</feat>

• Compound feature anchored at "TE23" and which makes reference to "TE24":
<feat type="partWhole" source="TE23" target="TE24"/>
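
One possible way to collect such relations from a GMT document is sketched below. It rests on an assumption that goes beyond the content model shown earlier: each struct is given an `id` attribute so that source and target have something to point at. The function `relations` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: struct elements carry an id attribute (not part of the DTD shown earlier).
DOC = """<struct type="TDC">
  <feat type="partWhole" source="TE23" target="TE24"/>
  <struct type="TE" id="TE23"/>
  <struct type="TE" id="TE24"/>
  <struct type="TE" id="TE25">
    <feat type="partWhole" target="TE24"/>
  </struct>
</struct>"""


def relations(root):
    """Return (source id, data category, target id) for every feat with a target."""
    parent = {child: p for p in root.iter() for child in p}
    found = []
    for feat in root.iter("feat"):
        target = feat.get("target")
        if target is None:
            continue  # plain feature, not a relation
        # an explicit source overrides the enclosing locus
        source = feat.get("source") or parent[feat].get("id")
        found.append((source, feat.get("type"), target))
    return found


print(relations(ET.fromstring(DOC)))
```

The first feature is stored outside its locus and names it through source; the second is stored inside TE25, so its locus is simply the enclosing struct.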

Page 44: TMF - a tutorial

Styles and vocabularies

Page 45: TMF - a tutorial

Structural Skeleton
DCRref (ISO 12620)

DCRi
- DCRref subset
- Application-dependent DCR

Interoperability conditions
GMT

Dialecti
- Expansion structures
- DatCat structural styles
- DatCat vocabulary styles

Terminological Markup Language (TML)

Page 46: TMF - a tutorial

Implementing a DatCat

– Definitions:
• 'style': the way a given DatCat is implemented as an XML object…
• 'vocabulary': the symbols needed to express the implementation of a given DatCat in its associated style

– E.g.:
» DatCat: /definition/
» Vocabulary = [def]
» Style = Element
» <def>pencil whose casing …</def>

DatCat value

Page 47: TMF - a tutorial

Implementing a DatCat (Cont.)

– Definition:
• 'anchor': the XML element(s) to which the implementation of a given DatCat can be attached

– E.g.:
<tig>
  <term>alpha smoothing factor</term>
</tig>

Page 48: TMF - a tutorial

Styles - element

Element
• Def.: The DatCat is implemented as an element, child of its anchor
• Vocabularies: the name of the corresponding element
• E.g.:
<def>pencil whose casing …</def>
<term>alpha smoothing factor</term>

DatCat value

Page 49: TMF - a tutorial

Styles - typedElement

typedElement
• Def.: The DatCat is implemented as a generic XML element, which is a child of the anchor, and which is further specified by means of a type attribute. Its content is the value of the feature in the structural skeleton.
• Vocabularies: the element name and the value of the type attribute
• E.g.:
<termNote type='definition'>Bla, bla, bla…</termNote>

DatCat value

Page 50: TMF - a tutorial

Styles - attribute

Attribute
• Def.: The DatCat is implemented as an attribute of its anchor
• Vocabularies: the name of the corresponding attribute
• E.g.:
<termEntry id='ID67'> … </termEntry>
<ldl language='en'> … </ldl>

DatCat value
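
The three styles can be read as three ways of attaching one value to an anchor. A minimal sketch follows, assuming a hypothetical function `implement` that takes the anchor element, the style name, the vocabulary and the DatCat value; it is meant to illustrate the definitions above, not to reproduce any existing tool.

```python
import xml.etree.ElementTree as ET


def implement(anchor, style, vocabulary, value):
    """Attach a DatCat value to its anchor element in the given style."""
    if style == "element":
        (name,) = vocabulary  # vocabulary: the element name
        ET.SubElement(anchor, name).text = value
    elif style == "typedElement":
        name, type_value = vocabulary  # element name and value of the type attribute
        ET.SubElement(anchor, name, {"type": type_value}).text = value
    elif style == "attribute":
        (name,) = vocabulary  # vocabulary: the attribute name
        anchor.set(name, value)
    else:
        raise ValueError(f"unknown style: {style}")
    return anchor


entry = ET.Element("termEntry")
implement(entry, "attribute", ["id"], "ID67")
implement(entry, "typedElement", ["descrip", "subjectField"], "manufacturing")

tig = ET.Element("tig")
implement(tig, "element", ["term"], "alpha smoothing factor")

print(ET.tostring(entry, encoding="unicode"))
print(ET.tostring(tig, encoding="unicode"))
```

Keeping style and vocabulary as data instead of code is what lets the same data category be written out in a different TML by swapping the table entries only.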

Page 51: TMF - a tutorial

ValuedElement TypedValuedElement

