Ontology matching
Jerome Euzenat
Montbonnot
Thanks to Pavel Shvaiko and Natasha Noy for our collaboration on former versions of these slides
What you have learned so far
- Data can be expressed in RDF
- Linked through URIs
- Modelled with OWL ontologies
- Retrieved through SPARQL queries
Jerome Euzenat Ontology matching 2 / 36
Being serious about the semantic web
- It is not one person’s ontology
- It is not several people’s common ontology
- It is many people’s many ontologies
- So it is a mess, but a meaningful mess.
Heterogeneity is not a bug, it is a feature
Ontology heterogeneity
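[Figure: two ontologies side by side. Labels in the first: Item, DVD, Book, Paperback, Hardcover, CD, Person; price, title, doi, creator, pp, author; integer, string, uri. Labels in the second: Monograph, Essay, Literary critics, Politics, Biography, Autobiography, Literature, Human, Writer; pages, isbn, author, title, subject.]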
Heterogeneity problem
Resources being expressed in different ways must be reconciled before being used. Mismatch between formalized knowledge can occur when:
- different languages are used (OWL vs. Topic maps);
- different terminologies are used:
  - English vs. Chinese;
  - Book vs. Monograph.
- different models are used:
  - different classes: Autobiography vs. Paperback;
  - classes vs. property: Essay vs. literarygenre;
  - classes vs. instances: one physical book as an instance vs. one work as an instance.
- different scopes and granularity are used:
  - only books vs. cultural items vs. any product;
  - books detailed to the print and translation level vs. books as works.
How can we address the problem?
[Figure: the matching process takes a first ontology, a second ontology, an initial alignment, parameters and resources as input, and produces the resulting alignment.]
Ontology alignment
[Figure: the two ontologies of the "Ontology heterogeneity" slide, now connected by an alignment whose correspondences carry the relations ≥ and ≤.]
Expressive alignments (EDOAL)
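[Figure: two expressive correspondences drawn between the ontologies. Book restricted to author = topic is equivalent (=) to Autobiography; Volume restricted to size ≤ 14 is subsumed (⊑).]
∀x, Pocket(x) ⇐ Volume(x) ∧ size(x, y) ∧ y ≤ 14
∀x, Book(x) ∧ author(x, y) ∧ topic(x, y) ≡ Autobiography(x)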
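The two correspondences can be read as rules over instance data. The following is a minimal Python sketch of that reading, not EDOAL syntax and not the Alignment API: the dictionary representation of instances, the field names (`types`, `size`, `author`, `topic`) and the sample records are all assumptions made for illustration.

```python
def is_pocket(item):
    """Pocket(x) <= Volume(x) and size(x, y) and y <= 14."""
    size = item.get("size")
    return "Volume" in item["types"] and size is not None and size <= 14


def is_autobiography(item):
    """Autobiography(x) == Book(x) with some y that is both author and topic."""
    authors = set(item.get("author", []))
    topics = set(item.get("topic", []))
    return "Book" in item["types"] and bool(authors & topics)


# Hypothetical sample instances (invented for this sketch).
my_life = {
    "types": {"Book"},
    "author": ["Bertrand Russell"],
    "topic": ["Bertrand Russell"],
}
la_chute = {"types": {"Book"}, "author": ["Albert Camus"], "topic": ["Guilt"]}
small_volume = {"types": {"Volume"}, "size": 12}
```

Here `is_autobiography(my_life)` holds because the same value appears as author and as topic, which is exactly the shared variable y in the second formula.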
Transformation and mediation
Query:
  SELECT x.isbn
  WHERE x : Autobiography
  AND x.author = "Bertrand Russell"
mediator
Reformulated query:
  SELECT x.doi
  WHERE x : Book
  AND x.author = "Bertrand Russell"
  AND x.topic = "Bertrand Russell"
Answer: x.doi=http://dx.doi.org/10.1080/041522862X
Reformulated answer: x.isbn=041522862X
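A mediator of this kind does two things: it rewrites the query and it translates the answer back. The sketch below illustrates both in Python under strong assumptions: queries are plain dictionaries (a made-up representation), and the doi-to-isbn rule is inferred from the single example on the slide (the isbn is what follows the prefix `http://dx.doi.org/10.1080/`), which need not hold in general.

```python
# Assumption: prefix taken from the one example answer on the slide.
DOI_PREFIX = "http://dx.doi.org/10.1080/"


def reformulate(query):
    """Rewrite a query about Autobiography into a query about Book."""
    if query["class"] != "Autobiography":
        return query
    author = query["where"]["author"]
    # Autobiography(x) is a Book whose author is also its topic.
    return {
        "select": "doi",
        "class": "Book",
        "where": {"author": author, "topic": author},
    }


def translate_answer(doi):
    """Turn a doi answer back into the isbn the first peer asked for."""
    if not doi.startswith(DOI_PREFIX):
        raise ValueError("unexpected doi format: " + doi)
    return doi[len(DOI_PREFIX):]


query = {
    "select": "isbn",
    "class": "Autobiography",
    "where": {"author": "Bertrand Russell"},
}
reformulated = reformulate(query)
isbn = translate_answer("http://dx.doi.org/10.1080/041522862X")
```

In a real setting both functions would be generated from the alignment rather than written by hand.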
Ontology networks
[Figure: a network of five ontologies o1 to o5, each a small hierarchy of classes, connected by the alignments A1,2, A1,3, A2,3, A2,4 and A3,4, whose correspondences include ⊒ and ⊑ relations.]
Why should we deal with this?
Applications of semantic integration
- Catalogue integration
- Schema and data integration
- Query answering
- Peer-to-peer information sharing
- Web service composition
- Agent communication
- Data transformation
- Ontology evolution
- Data interlinking
Applications: Query answering
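[Figure: a first peer (first ontology) and a second peer (second ontology). A Matcher produces an Alignment, from which a Generator produces a mediator. The mediator turns a query into a reformulated query and an answer into a reformulated answer.]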
Applications: Agent communication
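[Figure: a first agent (first ontology) sends a message to a second agent (second ontology). A Matcher produces an Alignment, from which a Generator produces a Translator. The Translator outputs the transformed message; the figure also shows a transformed ontology.]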
Data interlinking
[Figure: a first dataset (first ontology) and a second dataset (second ontology). A Matcher produces an Alignment, from which a Generator produces the links between the two datasets.]
Ontology matching in three steps
Reconciliation can be performed in 3 steps:
1. Match: a Matcher compares the ontologies o and o′ and thereby determines the alignment A;
2. Generate: a Generator produces a processor (for merging, transforming, etc.) from A, for instance a Transformation;
3. Apply the processor.
On what basis can we match?
- Content: relying on what is inside the ontology
  - Name, comments, alternate names, names of related entities: NLP, IR, etc.
  - Internal structure: constraints on relations, typing
  - External structure: relations between entities: data mining, discrete mathematics
  - Extension: statistics, data analysis, data mining, machine learning
  - Semantics (models): reasoning techniques
- Context: the relations of the ontology with the outside
  - Annotated resources
  - The web
  - External ontologies: DBpedia, etc.
  - External resources: WordNet, etc.
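As an illustration of the simplest content-based technique, name comparison, here is a minimal Python sketch using the standard library's `difflib.SequenceMatcher`. The function names, the 0.6 threshold and the choice of this particular string measure are assumptions for the example; actual matchers typically combine several string, linguistic and structural measures.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_matches(names1, names2, threshold=0.6):
    """For each name of the first ontology, keep its most similar name
    in the second ontology if the similarity reaches the threshold."""
    result = []
    for n1 in names1:
        best = max(names2, key=lambda n2: name_similarity(n1, n2))
        score = name_similarity(n1, best)
        if score >= threshold:
            result.append((n1, best, score))
    return result


# Property names taken from the two example ontologies.
first = ["title", "author", "price"]
second = ["title", "author", "pages", "isbn", "subject"]
candidates = best_matches(first, second)
```

Identical names such as title and author are found this way, but a purely lexical measure cannot relate Book to Monograph or Person to Human, which is why the other bases listed above are needed.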
Name similarity
[Figure: the two ontologies of the "Ontology heterogeneity" slide, with entities related on the basis of their names (for example title and author, which appear in both); one correspondence is labelled ≥.]
Structure similarity
[Figure: the two ontologies of the "Ontology heterogeneity" slide, with entities related on the basis of their structure (properties, datatypes such as integer, string and uri, and relations to classes such as Person, Human and Writer).]
Instance similarity
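[Figure: the class hierarchies of the two ontologies (Item, DVD, Book, Paperback, Hardcover, CD; Monograph, Essay, Literary critics, Politics, Biography, Autobiography, Literature), related through shared instances: "Bertrand Russell: My life" and "Albert Camus: La chute".]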
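One common way to exploit shared instances is to compare class extensions with a set measure. The sketch below uses the Jaccard coefficient; the assignment of the two books from the slide to classes is an assumption made for illustration, as the extracted slide does not say which class each instance belongs to.

```python
def jaccard(ext1, ext2):
    """Jaccard similarity between two class extensions (sets of instances)."""
    ext1, ext2 = set(ext1), set(ext2)
    union = ext1 | ext2
    return len(ext1 & ext2) / len(union) if union else 0.0


# Assumed extensions, built from the two instances named on the slide.
book = {"Bertrand Russell: My life", "Albert Camus: La chute"}
autobiography = {"Bertrand Russell: My life"}
literature = {"Albert Camus: La chute"}

book_vs_autobiography = jaccard(book, autobiography)
autobiography_vs_literature = jaccard(autobiography, literature)
```

With these extensions, Book overlaps with both Autobiography and Literature, whereas Autobiography and Literature share nothing.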
Combining different techniques
Basic matchers provide candidate correspondences; most systems use several such matchers and then combine and filter their results.
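[Figure: ontologies o and o′ are fed to matchers M, M′ and M′′ (matcher composition), which produce alignments A′, A′′ and A′′′. Aggregation combines them into A′′′′, filtering yields A′′′′′, and iteration may repeat the process before the final alignment A is returned.]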
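Aggregation and filtering can be sketched in a few lines of Python. This is only one possible combination, assumed for illustration: a weighted average for aggregation and a greedy one-to-one selection above a threshold for filtering. The function names, scores and threshold are hypothetical.

```python
def aggregate(matcher_scores, weights):
    """Weighted average of the scores that several matchers give to each
    pair of entities; a pair a matcher did not score counts as 0."""
    total = sum(weights)
    pairs = set().union(*(scores.keys() for scores in matcher_scores))
    return {
        pair: sum(w * s.get(pair, 0.0) for s, w in zip(matcher_scores, weights)) / total
        for pair in pairs
    }


def filter_one_to_one(scores, threshold):
    """Greedy filtering: keep the best-scoring pairs at or above the
    threshold, using each entity at most once."""
    kept, used1, used2 = {}, set(), set()
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    for (e1, e2), score in ranked:
        if score >= threshold and e1 not in used1 and e2 not in used2:
            kept[(e1, e2)] = score
            used1.add(e1)
            used2.add(e2)
    return kept


# Invented scores from a name-based and a structure-based matcher.
name_scores = {("Book", "Monograph"): 0.2, ("Person", "Human"): 0.1, ("author", "author"): 1.0}
structure_scores = {("Book", "Monograph"): 0.8, ("Person", "Human"): 0.7, ("Book", "Essay"): 0.6}

combined = aggregate([name_scores, structure_scores], [1, 1])
alignment = filter_one_to_one(combined, threshold=0.35)
```

The pair (Book, Essay) is dropped both because its combined score is below the threshold and because Book is already matched to Monograph.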
How well do these approaches work?
Ontology Alignment Evaluation Initiative (OAEI)
- Formal comparative evaluation of different ontology-matching tools;
- Run every year since 2004;
- Variety of test cases (in size, in formalism, in content);
- Results consistent across test cases;
- Results very dependent on the tasks and the data (from under 50% of precision and recall to well over 80% if ontologies are relatively similar);
- Progress every year!
http://oaei.ontologymatching.org
Now involved in the SEALS (Semantics Evaluation At Large Scale) project.
Evaluation process
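[Figure: matching o and o′ (with parameters and resources) yields an alignment A; an evaluator compares A with the reference alignment R and returns a measure m.]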
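The usual measures m are precision and recall of A with respect to R, often summarised by their harmonic mean (F-measure). A minimal Python sketch, assuming that an alignment is represented as a set of (entity, entity, relation) triples; the sample correspondences are invented:

```python
def evaluate(alignment, reference):
    """Return (precision, recall, f_measure) of an alignment A
    against a reference alignment R, both given as sets."""
    alignment, reference = set(alignment), set(reference)
    correct = alignment & reference
    precision = len(correct) / len(alignment) if alignment else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical alignment A and reference R.
A = {("Book", "Monograph", "="), ("Person", "Human", "="), ("CD", "Essay", "=")}
R = {
    ("Book", "Monograph", "="),
    ("Person", "Human", "="),
    ("author", "author", "="),
    ("title", "title", "="),
}
precision, recall, f_measure = evaluate(A, R)
```

Two of the three correspondences found are correct (precision 2/3) and two of the four expected ones are found (recall 1/2).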
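Benchmark results (precision and recall curves)
[Figure: precision/recall curves, with recall on the horizontal axis and precision on the vertical axis, both from 0 to 1. Curves shown: 2005 Falcon, 2006 RiMOM, 2007 ASMOV, 2008 Lily, 2009 Lily, 2010 ASMOV, and edna.]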
Tools you should be aware of
- Frameworks
  - Alignment API: used by many tools; provides an exchange format and evaluation tools for OAEI. Alignment server for sharing.
  - PROMPT (a Protégé plug-in): includes a user interface and a plug-in architecture.
  - COMA++: oriented toward database integration (many basic algorithms implemented).
- Matching systems
  - OAEI best performers (Falcon, RiMOM, ASMOV, etc.)
  - Available systems (FOAM, Falcon, COMA++, Aroma, etc.)
The data interlinking problem
[Figure: data interlinking relates URI1, described with ontology o, to URI2, described with ontology o′, through owl:sameAs links.]
Example: Linking INSEE and NUTS
NUTS: Nomenclature of territorial units for statistics
#INSEE   INSEE name       NUTS level   #NUTS
1        Pays             0            34
                          1            142
26       Region           2            344
100      Departement      3            1488
342      Arrondissement
4036     Canton           4
52422    Commune          5
INSEE and NUTS: ontology alignment
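[Figure: alignment between the INSEE ontology (Territoire FR, Pays, Region, Departement, Arrondissement, Commune; properties code, nom, chef-lieu, subdivision; datatypes integer, string) and the NUTS ontology (Region, Country, NUTSRegion, LAURegion; properties name, level, code, hasSubRegion), made of = and ≤ correspondences.]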
Simple alignments are not sufficient
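[Figure: simple correspondences (= and ≤) between the INSEE classes Territoire FR, Region, Departement, Commune with property nom and the NUTS classes Region, NUTSRegion with property name. The instances DEP 75 and COM 75056 on the INSEE side and FR101 on the NUTS side all carry the name Paris.]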
Expressive alignments are necessary
[Figure: expressive correspondences. Region = NUTSRegion restricted to level = 2 and hasParentRegion = FR1; subdivision = hasSubRegion; nom = name.]
Query generation
PREFIX insee: <http://rdf.insee.fr/ontologie-geo-2006.rdf#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?r
FROM <http://rdf.insee.fr/geo/regions-2011.rdf>
WHERE {
  ?r rdf:type insee:Region .
}

PREFIX nuts: <http://ec.europa.eu/eurostat/ramon/ontologies/geographic.rdf#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?n
FROM <http://ec.europa.eu/eurostat/ramon/rdfdata/nuts2008/>
WHERE {
  ?n rdf:type nuts:NUTSRegion .
  ?n nuts:level "2"^^xsd:int .
  ?n nuts:hasParentRegion nuts:FR1 .
}
Query generation
PREFIX insee: <http://rdf.insee.fr/ontologie-geo-2006.rdf#>
PREFIX nuts: <http://ec.europa.eu/eurostat/ramon/ontologies/geographic.rdf#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
CONSTRUCT { ?r owl:sameAs ?n . }
FROM <http://rdf.insee.fr/geo/regions-2011.rdf>
FROM <http://ec.europa.eu/eurostat/ramon/rdfdata/nuts2008/>
WHERE {
  ?r rdf:type insee:Region .
  ?r insee:nom ?l .
  ?n rdf:type nuts:NUTSRegion .
  ?n nuts:name ?l .
  ?n nuts:level "2"^^xsd:int .
  ?n nuts:hasParentRegion nuts:FR1 .
}
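What this query computes is a join on the shared label, restricted by the conditions on level and parent region. The same logic can be sketched in plain Python over in-memory records. The record layout, the URIs and the region names below are invented for illustration; they are not INSEE or NUTS data.

```python
def generate_links(insee_regions, nuts_regions, level=2, parent="nuts:FR1"):
    """Link INSEE regions to the NUTS regions that satisfy the
    restriction (level and parent) and carry the same name."""
    # Index the NUTS regions selected by the restriction by their name.
    candidates = {}
    for region in nuts_regions:
        if region["level"] == level and region["parent"] == parent:
            candidates.setdefault(region["name"], []).append(region["uri"])
    # Join on the label, as the shared ?l variable does in the query.
    return [
        (region["uri"], "owl:sameAs", nuts_uri)
        for region in insee_regions
        for nuts_uri in candidates.get(region["nom"], [])
    ]


# Invented toy data.
insee = [
    {"uri": "insee:R1", "nom": "Alpha"},
    {"uri": "insee:R2", "nom": "Beta"},
]
nuts = [
    {"uri": "nuts:X10", "name": "Alpha", "level": 2, "parent": "nuts:FR1"},
    {"uri": "nuts:X1", "name": "Alpha", "level": 1, "parent": "nuts:FR"},
    {"uri": "nuts:Y20", "name": "Beta", "level": 2, "parent": "nuts:FR2"},
]
links = generate_links(insee, nuts)
```

Only the level-2 region under the requested parent is linked, even though another region has the same name. This is the role of the expressive part of the alignment: it narrows the candidates before labels are compared.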
What does this mean?
- Ontology alignments are schema-level expressions of correspondences;
- They are useful for focussing the search;
- Expressive alignments are necessary;
- They can be turned into SPARQL-based link generators.
But it is also necessary to express instance-level constraints:
- for converting data (e.g., mph vs. m/s);
- for expressing matching constraints on data (e.g., similarity).
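Both kinds of instance-level constraint can be illustrated with a short Python sketch. The function names, the tolerance and the label normalisation are assumptions chosen for the example; only the conversion factor (1 mph = 0.44704 m/s, exact by definition of the mile) is a fixed fact.

```python
import unicodedata

MPH_TO_MS = 0.44704  # 1 mile = 1609.344 m, 1 hour = 3600 s


def mph_to_ms(speed_mph):
    """Convert a speed from miles per hour to metres per second."""
    return speed_mph * MPH_TO_MS


def same_speed(speed_mph, speed_ms, tolerance=0.01):
    """Data conversion constraint: equal once expressed in the same unit."""
    return abs(mph_to_ms(speed_mph) - speed_ms) <= tolerance


def normalise(label):
    """Strip accents, hyphens, case and extra spaces from a label."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.replace("-", " ").casefold().split())


def similar_labels(label1, label2):
    """Similarity constraint: labels match after normalisation."""
    return normalise(label1) == normalise(label2)
```

For example, `similar_labels("Île-de-France", "Ile de France")` holds although the two strings are not identical, which a strict equality join on labels would miss.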
General framework
[Figure: ontology matching between o and o′ produces an alignment A, which drives data interlinking between URI1 and URI2 and yields owl:sameAs links.]
Selected challenges
- Scalability and efficiency
  - Current matchers can be fast, scalable and accurate, but not all at once.
- New sources of matching
  - Context-based matching,
- General purpose matching (vs. special purpose matching)
  - Matcher combination,
  - Matcher selection and self-configuration,
- User involvement,
  - Matching (serendipitously) while working,
  - How to explain alignments?
  - Social and collaborative ontology matching,
- Alignment management: infrastructure and support,
  - How do we maintain alignments when ontologies evolve?
  - Reasoning with alignments,
  - Being robust to incorrect alignments.
and, of course, many others.
Further reading
- “Ontology Matching” by Euzenat and Shvaiko
- Proceedings of ISWC, ASWC, ESWC, WWW conferences, etc.
- Journal of web semantics, Semantic web journal, Journal on data semantics, etc.
- http://www.ontologymatching.org
http://exmo.inrialpes.fr