Date post: | 06-Jul-2015 |
Category: | Data & Analytics |
Upload: | jorge-gracia |
Enabling Language Resources to Expose Translations as Linked Data on the Web
Jorge Gracia, Elena Montiel-Ponsoda,
Daniel Vila-Suero, Guadalupe Aguado-de-Cea
Ontology Engineering Group (OEG)
Universidad Politécnica de Madrid (UPM)
Acknowledgments: LIDER and BabeLData projects
9th Language Resources and Evaluation Conference, LREC 2014
Reykjavik (Iceland) 28/05/2014
Outline
Motivation
The translation model
Terminesp: a validating example
Conclusions
Motivation and goals
Motivation
Current multilingual lexica and electronic dictionaries
• Proprietary formats
• Non-standard APIs
• Disconnected from other resources
GOAL: to allow language resources to expose translations as Linked Data on the Web, so that semantically enabled applications can consume them directly, without relying on application-specific formats
Objectives:
• To define a model for representing translations in RDF
• As a proof of concept:
1. Extract translations from the Terminesp terminological
database
2. Represent them in RDF with our model
3. Make them accessible for both human and machine consumption
The translation model
[Figure: Translation (direct equivalent). The Spanish lexicon (LEXICON ES) and the English lexicon (LEXICON EN) each contain a LexicalEntry with a LexicalSense, for “medio de pago” and “payment method” respectively. Both senses refer to the same ontology entity: http://purl.org/goodrelations/v1#PaymentMethods]
[Figure: Translation (cultural equivalence). The English entry “Prime Minister” has a LexicalSense that refers to http://dbpedia.org/ontology/PrimeMinister, while the Spanish entry “Presidente del Gobierno” has a LexicalSense that refers to http://es.dbpedia.org/resource/Presidente_del_Gobierno. The two senses refer to different ontology entities.]
Characteristics of the model
• Translation as a relation between senses
• The translation relation is reified, so additional information can be attached to it
• Support for a variety of translation categories
• Translation categories are clearly separated from the model, so there is no commitment to specific views or translation theories
• Translation sets group translations that come from the same language resource or belong to the same organization, for instance
• Re-use of well-established vocabularies (DC, DCAT, etc.) for provenance and additional information
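The characteristics above can be sketched as plain data structures. The following is only an illustration, not the authors' implementation: the Python class and field names are hypothetical, the translation direction (English to Spanish) is an arbitrary choice, and the two examples reuse the terms and URIs from the direct-equivalent and cultural-equivalence slides. The point is that a reified translation is an object in its own right, so a category or a confidence value can be attached to it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class LexicalSense:
    """A sense of a lexical entry, pointing to an ontology entity."""
    written_rep: str
    lang: str
    reference: str  # URI of the ontology entity the sense refers to


@dataclass(frozen=True)
class Translation:
    """A reified translation: a relation between two senses that can carry
    extra information (hypothetical field names)."""
    source: LexicalSense
    target: LexicalSense
    category: Optional[str] = None      # e.g. "directEquivalent"
    confidence: Optional[float] = None  # mirrors translationConfidence


@dataclass
class TranslationSet:
    """Groups translations, e.g. those coming from the same resource."""
    label: str
    translations: List[Translation] = field(default_factory=list)

    def by_category(self, category: str) -> List[Translation]:
        return [t for t in self.translations if t.category == category]


PAYMENT = "http://purl.org/goodrelations/v1#PaymentMethods"

example_set = TranslationSet(
    label="illustrative-en-es",  # hypothetical label
    translations=[
        # Direct equivalent: both senses share the same ontology reference.
        Translation(
            LexicalSense("payment method", "en", PAYMENT),
            LexicalSense("medio de pago", "es", PAYMENT),
            category="directEquivalent",
        ),
        # Cultural equivalence: the senses point to different entities.
        Translation(
            LexicalSense("Prime Minister", "en",
                         "http://dbpedia.org/ontology/PrimeMinister"),
            LexicalSense("Presidente del Gobierno", "es",
                         "http://es.dbpedia.org/resource/Presidente_del_Gobierno"),
            category="culturalEquivalent",
        ),
    ],
)

for t in example_set.by_category("culturalEquivalent"):
    print(t.source.written_rep, "->", t.target.written_rep)
```

Keeping the category as a separate value rather than a subclass of the relation mirrors the separation between the model and the translation categories.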
[Figure: the Translation Module (http://purl.org/net/translation.owl) and the Translation Categories (http://purl.org/net/translation-categories).
Classes: LexicalSense, Translation, TranslationSet, Resource.
Properties: translationSource, translationTarget, translationCategory, translationConfidence (double), context, tran.
Translation categories shown: directEquivalent, culturalEquivalent, lexicalEquivalent.]
Terminesp, a validating example
TERMINESP
• Multilingual terminological database
• Terms and definitions from Spanish technological
standards
• More than 30K terms in Spanish, with translations into
English, German, French, Italian, …
[Figure: the Terminesp term “red”@es and its translation “network”@en, modelled with lemon.
Instances and their classes: terminesp:lexiconES and terminesp:lexiconEN (lemon:Lexicon); terminesp:38756es and terminesp:38756en (lemon:LexicalEntry); terminesp:38756es-sense and terminesp:38756en-sense (lemon:LexicalSense); terminesp:38756 (skos:Concept); terminesp:38756es-en-TR (tr:Translation).
Properties shown: lemon:entry, lemon:form, lemon:writtenRep, lemon:sense, lemon:reference, tr:translationSource, tr:translationTarget.
The written representations are attached through lemon:LexicalForm nodes.]
[Figure: the same translation seen from its translation set.
Instances and their classes: terminesp:es-en-transet (tr:TranslationSet); terminesp:38756es-en-TR (tr:Translation); terminesp:38756es-sense and terminesp:38756en-sense (lemon:LexicalSense).
Properties shown: tr:tran, tr:translationSource, tr:translationTarget, tr:translationCategory.
The translation category is trcat:directEquivalent.]
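As a rough sketch of how the Terminesp example above (term 38756, “red” / “network”) might be written out as RDF, the following uses only the Python standard library to produce N-Triples. Several things here are assumptions rather than facts from the slides: the `tr:` and `trcat:` namespace URIs are guessed from the module addresses by appending `#`, the Spanish sense is taken as the translation source and the English one as the target (suggested by the `es-en` naming), and the helper functions are hypothetical. The property name `tran` is copied as it appears on the slide.

```python
# Base URI taken from the lexicon URIs shown in the SPARQL results slide.
TERMINESP = "http://linguistic.linkeddata.es/data/terminesp/"
# Assumed namespace URIs (module address plus "#").
TR = "http://purl.org/net/translation#"
TRCAT = "http://purl.org/net/translation-categories#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def translation_triples(term_id, src_lang, tgt_lang, category):
    """Return (subject, predicate, object) URI triples for one reified
    translation and its membership in a translation set."""
    src_sense = f"{TERMINESP}{term_id}{src_lang}-sense"
    tgt_sense = f"{TERMINESP}{term_id}{tgt_lang}-sense"
    translation = f"{TERMINESP}{term_id}{src_lang}-{tgt_lang}-TR"
    translation_set = f"{TERMINESP}{src_lang}-{tgt_lang}-transet"
    return [
        (translation, RDF_TYPE, TR + "Translation"),
        (translation, TR + "translationSource", src_sense),
        (translation, TR + "translationTarget", tgt_sense),
        (translation, TR + "translationCategory", TRCAT + category),
        (translation_set, RDF_TYPE, TR + "TranslationSet"),
        (translation_set, TR + "tran", translation),  # name as on the slide
    ]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = translation_triples("38756", "es", "en", "directEquivalent")
print(to_ntriples(triples))
```

Because the translation is a resource with its own URI, further statements (provenance, confidence, context) can be added as extra triples on that same subject without touching the senses.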
Before
• MS Access database and a Web search interface
• Non-standard formats and vocabularies
• Data “invisible” to software agents
• Translations implicit, not explicit
Now
• Published on the Web as Linked Data
• Modelled using lemon and well-established vocabularies
• Dereferenceable URIs
• Data “visible” to software agents
• Translations were made explicit
• Web search interface for human consumption
• SPARQL endpoint for machine consumption
Terminesp for machine consumption – SPARQL endpoint
http://linguistic.linkeddata.es/terminesp/sparql-editor/
Written representation (target) | Lexicon (target)
network | http://linguistic.linkeddata.es/data/terminesp/lexiconEN
Netzwerk (in der Netzwerktopologie) | http://linguistic.linkeddata.es/data/terminesp/lexiconDE
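A query that returns rows of this shape (target written representation and target lexicon) could look roughly like the sketch below. The actual query used in the presentation is not shown in the slides, so this is a guess based on the properties that appear in the example diagrams. The namespace URIs for `lemon:` and `tr:` are assumptions, the helper functions are hypothetical, and `http://example.org/sparql` is a placeholder, since the slides give the address of the query editor rather than of a raw endpoint. The code only builds the query text and a GET URL; it does not contact any server.

```python
from urllib.parse import urlencode

# Assumed namespace URIs; verify them against the published data.
PREFIXES = """PREFIX lemon: <http://lemon-model.net/lemon#>
PREFIX tr: <http://purl.org/net/translation#>
"""


def build_translation_query(written_rep, lang):
    """Build a SPARQL query that looks up the translations of a term,
    following the path entry -> sense -> translation -> sense -> entry."""
    # Escape backslashes and double quotes for use in a SPARQL literal.
    literal = written_rep.replace("\\", "\\\\").replace('"', '\\"')
    return PREFIXES + f"""SELECT DISTINCT ?targetRep ?targetLexicon WHERE {{
  ?sourceEntry lemon:form ?sourceForm ;
               lemon:sense ?sourceSense .
  ?sourceForm lemon:writtenRep "{literal}"@{lang} .
  ?translation tr:translationSource ?sourceSense ;
               tr:translationTarget ?targetSense .
  ?targetEntry lemon:sense ?targetSense ;
               lemon:form ?targetForm .
  ?targetForm lemon:writtenRep ?targetRep .
  ?targetLexicon lemon:entry ?targetEntry .
}}"""


def build_request_url(endpoint, query):
    """Encode the query as a GET request URL (standard `query` parameter)."""
    return endpoint + "?" + urlencode({"query": query})


query = build_translation_query("red", "es")
url = build_request_url("http://example.org/sparql", query)  # placeholder
print(query)
```

The resulting URL can be fetched with any HTTP client, typically with an `Accept: application/sparql-results+json` header, once the real endpoint address is known.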
Terminesp for human consumption – Web interface
http://linguistic.linkeddata.es/terminesp/search/
Conclusions
Our proposal
• Model to represent translations as Linked Data on the Web
• Terminesp as a validating example
Next steps
• Standardization through the W3C Ontolex Community Group
• Study possible reuse of ITS 2.0 elements
• Links of Terminesp to external resources (e.g., BabelNet)
Thanks for your attention!