
Semantic Matching

Fausto Giunchiglia

work in collaboration with Pavel Shvaiko

The Italian-Israeli Forum on Computer Science, Haifa, June 17-18, 2003


Outline

Matching

Syntactic Matching

Semantic Matching

On Implementing Semantic Matching

Conclusions


MATCHING


Application Domains

Generic Model Management

Schema integration

Data warehouses

E-commerce

Semantic query processing

Data Coordination in P2P systems


Matching Problems

1. RDB Schemas

2. OODB Schemas

3. XML Schemas

4. Concept Hierarchies

5. Ontologies


Example of Matching

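[Figure: two web directory trees to be matched. www.google.com: Arts, Organizations, Art History, Music, Baroque, History. www.yahoo.com: Arts&Humanities, Organizations, Art History, Design Art, Baroque, Architecture, History. Links between corresponding nodes are annotated with a similarity coefficient (Sc = 1.0) and a semantic relation (Sr).]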


Matching

Match is an operator that takes two graph-like structures (e.g., database schemas or ontologies) and produces a mapping between elements of the two graphs that correspond semantically to each other


Matching

The problem of matching can be decomposed into two steps:

Extract graphs from the data and conceptual models

Match the resulting graphs (generic matching)


Matching

A mapping element is a 4-tuple $\langle mID, N^1_i, N^2_j, R \rangle$, $i = 1..h$; $j = 1..k$;

where

mID is a unique identifier of the given mapping element;

$N^1_i$ is the i-th node of the first graph, h is the number of nodes in the first graph;

$N^2_j$ is the j-th node of the second graph, k is the number of nodes in the second graph;

R specifies a similarity relation of the given nodes

Mapping is a set of mapping elements

Matching is the process of discovering mappings between two graphs through the application of a matching algorithm
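The definitions above map naturally onto a small data structure. The following is only a sketch of one possible encoding; the names `MappingElement`, `Mapping` and `make_mapping` are illustrative assumptions and do not come from the talk.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class MappingElement:
    """One 4-tuple <mID, node of graph 1, node of graph 2, R>."""
    m_id: str         # unique identifier of the mapping element
    node1: str        # a node of the first graph
    node2: str        # a node of the second graph
    relation: object  # similarity relation R (coefficient or semantic relation)


# A mapping is a set of mapping elements.
Mapping = FrozenSet[MappingElement]


def make_mapping(*elements: MappingElement) -> Mapping:
    """Collect mapping elements into an immutable set."""
    return frozenset(elements)


mapping = make_mapping(
    MappingElement("m1", "telephone", "phone", 0.7),  # syntactic style R
    MappingElement("m2", "telephone", "phone", "="),  # semantic style R
)
```

Keeping `relation` untyped lets the same structure hold either a coefficient in [0,1] or a semantic relation, which is the only difference between the two kinds of matching discussed next.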


Matching: Syntactic AND Semantic

Matching

Syntactic Matching

• R is computed between labels at nodes

• R = [0,1]

Semantic Matching

• R is computed between concepts at nodes

• R = {set-theoretic relations, e.g., $=, \cap, \perp, \subseteq, \supseteq$}


SYNTACTIC MATCHING


Syntactic Matching

A mapping element is a 4-tuple $\langle mID, L^1_i, L^2_j, R \rangle$, where

$L^1_i$ is the label at the i-th node of the first graph;

$L^2_j$ is the label at the j-th node of the second graph;

R specifies a similarity relation in the form of a coefficient, which measures the similarity between the labels of the given nodes

Example: R is a similarity coefficient in [0,1]

Mapping element: <m21, telephone, phone, 0.7>, i.e. R = 0.7
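To make the idea of a label-based coefficient concrete, here is a minimal sketch that scores label pairs with Python's standard `difflib.SequenceMatcher`. This is a stand-in string measure chosen for illustration, not the heuristic used by Cupid or any other system named in the talk; `label_similarity`, `syntactic_match` and the 0.5 threshold are assumptions.

```python
from difflib import SequenceMatcher


def label_similarity(label1: str, label2: str) -> float:
    """Return a similarity coefficient in [0, 1] between two labels."""
    return SequenceMatcher(None, label1.lower(), label2.lower()).ratio()


def syntactic_match(labels1, labels2, threshold=0.5):
    """Build mapping elements <mID, L1, L2, R> for label pairs above threshold."""
    elements = []
    for i, l1 in enumerate(labels1, start=1):
        for j, l2 in enumerate(labels2, start=1):
            score = label_similarity(l1, l2)
            if score >= threshold:
                elements.append((f"m{i}{j}", l1, l2, round(score, 2)))
    return elements


result = syntactic_match(["telephone"], ["phone", "address"])
```

The coefficient only reflects how similar the strings look; it says nothing about whether the two labels denote the same concept.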


Example: Cupid (tentative links)

[Figure: the www.google.com and www.yahoo.com directory trees from the earlier example, shown first with Cupid's tentative links and then with the final result. Links carry similarity coefficients Sc of 0.7, 0.9 or 1.0.]


The State of the Art

Cupid… is a hybrid matching prototype. It exploits linguistic and structural schema matching heuristics, and computes similarity coefficients between nodes of the trees.

Similarity Flooding… is a hybrid matching prototype. It uses fix-point computation to determine correspondences between nodes of the graphs.

COMA… is a composite matching prototype. It provides an extensible library of different matchers which manipulate DAGs and supports various ways of combining final results.

As far as we know, so far only syntactic matching…


SEMANTIC MATCHING


Semantic Matching  

A mapping element is a 4-tuple $\langle mID, C^1_i, C^2_j, R \rangle$, where

$C^1_i$ is the concept of the i-th node of the first graph;

$C^2_j$ is the concept of the j-th node of the second graph;

R specifies a similarity relation in the form of a semantic relation between the extensions of concepts at the given nodes

Possible R's: equality {$=$}, overlapping {$\cap$}, mismatch {$\perp$}, more general/specific {$\supseteq$, $\subseteq$}

Example: <m21, telephone, phone, {=}>
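Since R is defined over the extensions of concepts, the five relations can be illustrated directly with finite sets. The sketch below assumes each extension is available as an explicit Python set, which is a simplification for illustration; the function name `semantic_relation` and the toy extensions are assumptions.

```python
def semantic_relation(ext1: frozenset, ext2: frozenset) -> str:
    """Classify the set-theoretic relation between two concept extensions."""
    if ext1 == ext2:
        return "="   # equality
    if ext1 <= ext2:
        return "⊆"   # first concept is more specific (less general)
    if ext1 >= ext2:
        return "⊇"   # first concept is more general
    if ext1 & ext2:
        return "∩"   # overlapping: some shared instances
    return "⊥"       # mismatch: no shared instances


# Toy extensions (hypothetical instance identifiers).
phones = frozenset({"p1", "p2", "p3"})
telephones = frozenset({"p1", "p2", "p3"})
mobiles = frozenset({"p1"})
cameras = frozenset({"p1", "c1"})
trees = frozenset({"t1"})
```

In practice extensions are not enumerable, which is why the talk later derives these relations from labels, lexical resources and propositional reasoning instead.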


Examples: Analysis of Siblings

[Figure: two is-a trees over labels A, B, C, D, E with nodes numbered 1 to 5. A is the root (node 1) in both; the sibling order of B and C is swapped in the second tree.]

Suppose that we want to match nodes $5_1$ and $2_2$

Cupid: R = 0.8. This is because $A_1 = A_2$, $C_1 = C_2$ and we have the same structures on both sides (the order of links does not matter)

A semantic matching approach compares the concepts $A_1 \cap C_1$ and $A_2 \cap C_2$ and produces $C_{5_1} = C_{2_2}$


Examples: Analysis of Ancestors. Case 1

Suppose that we want to match nodes $5_1$ and $1_2$

[Figure: the two trees being matched, with is-a links over nodes labeled A, B, C, D, E and numbered 1 to 5. The two trees differ significantly in structure.]

Cupid does not find a similarity coefficient between the nodes under consideration, due to the significant differences in structure of the given graphs

Semantic matching: The concept denoted by the label at node $5_1$ is $C_1$, while the concept at node $5_1$ is $C_{5_1} = A_1 \cap C_1$. The concept at node $1_2$ is $C_{1_2} = C_2$. Thus, $C_{5_1} \subseteq C_{1_2}$
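The distinction between the concept of a label and the concept at a node can be sketched as follows: the concept at a node is taken to be the conjunction of the label concepts on the path from the root. The tree encodings below are hypothetical fragments consistent with the text (node 5 of the first graph is C under root A; node 1 of the second graph is C), and the sketch assumes that equal labels denote equal concepts.

```python
def concept_at_node(parent: dict, label: dict, node: int) -> frozenset:
    """Collect the label concepts conjoined on the path from root to node."""
    conjuncts = set()
    while node is not None:
        conjuncts.add(label[node])
        node = parent.get(node)
    return frozenset(conjuncts)


def compare(c1: frozenset, c2: frozenset) -> str:
    """Compare two conjunctions: more conjuncts means a more specific concept."""
    if c1 == c2:
        return "="
    if c1 > c2:
        return "⊆"  # c1 has every conjunct of c2 and more
    if c1 < c2:
        return "⊇"
    return "unknown"  # needs further reasoning


# Hypothetical fragment of graph 1: A (node 1) with child C (node 5).
parent1, label1 = {1: None, 5: 1}, {1: "A", 5: "C"}
# Hypothetical graph 2 for the ancestors case: C is the root (node 1).
parent2, label2 = {1: None}, {1: "C"}
# Hypothetical graph 2 for the siblings case: A (node 1) with child C (node 2).
parent3, label3 = {1: None, 2: 1}, {1: "A", 2: "C"}

c51 = concept_at_node(parent1, label1, 5)
c12 = concept_at_node(parent2, label2, 1)
c22 = concept_at_node(parent3, label3, 2)
```

Here `compare(c51, c12)` yields the more-specific relation and `compare(c51, c22)` yields equality, in line with the two examples in the text. Comparing sets of conjuncts is only a shortcut; the talk's actual proposal is to decide such relations with a SAT check.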


Examples: Analysis of Ancestors. Case 2

Suppose that we want to match nodes $5_1$ and $5_2$

[Figure: the two trees being matched, with is-a links over nodes labeled A, B, C, D, E; one node on the path to node 5 of the second tree is marked *.]

Cupid: R = 0.86. This is because of the identity of labels $A_1 = A_2$, $C_1 = C_2$

Semantic matching: The concept at node $5_1$ is $C_{5_1} = A_1 \cap C_1$, while the concept at node $5_2$ is $C_{5_2} = A_2 \cap * \cap C_2$. Since we have that $A_1 = A_2$ and $C_1 = C_2$, then $C_{5_2} \subseteq C_{5_1}$


Examples: Enriched Analysis of Siblings

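Suppose that we want to match nodes $2_1$ and $2_2$

[Figure: first tree: World (node 1) with is-a children Benelux (node 2), Netherlands and Luxembourg (nodes 3 and 4). Second tree: World (node 1) with is-a child Belgium (node 2).]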

Cupid: R = 0.68. This is mainly because of the entry in the thesaurus specifying Belgium as a part of Benelux, and due to the fact that the nodes with labels $\text{Benelux}_1$ and $\text{Belgium}_2$ are leaves

Semantic matching: We treat $C_{2_1}$ as $\text{Benelux}_1 \setminus \text{Netherlands}_1 \setminus \text{Luxembourg}_1 = \text{Belgium}$.

Thus, $C_{2_1} = C_{2_2}$


ON

IMPLEMENTING

SEMANTIC MATCHING


On Implementation

Semantic Matching

Structure - level

Element - level

Weak Semantics Techniques

Strong Semantics Techniques


Element-level Semantic Matching

Weak Semantics Techniques

Analysis of strings {=}: <phone, telephone, {=}>

Analysis of data types {$=, \cap, \perp, \subseteq, \supseteq$}: <string, integer, {$\perp$}>, <integer, real, {$\subseteq$}>

Analysis of soundex {=}: <Fausto, Phausto, {=}>

Strong Semantics Techniques

Precompiled thesaurus: syn key <Discount, Rebate, {=}>

WordNet: <Art_#1, Humanities_#1, {}>, where #1 … sense number 1 of the word Art according to WordNet


Element-level Semantic Matching (cont.)

Semantic Relations via WordNet

Equality: one concept is equal to another if there is at least one sense of the first concept which is a synonym of the second

Overlapping: one concept overlaps with the other if they have some senses in common

Mismatch: two concepts are mismatched if they have no sense in common

More general: one concept is more general than the other iff there exists at least one sense of the first concept that has a sense of the other as a hyponym or meronym

Less general: one concept is less general than the other iff there exists at least one sense of the first concept that has a sense of the other concept as a hypernym or as a holonym
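The rules above can be sketched over a tiny hand-made sense inventory. Everything in this example is a toy assumption: the sense identifiers, the `WORD_SENSES`, `SYNONYMS` and `NARROWER` tables, and the order in which the rules are tried are illustrative and do not reflect real WordNet content or the authors' implementation.

```python
# Hypothetical sense inventory: word -> set of sense identifiers.
WORD_SENSES = {
    "discount": {"s1"},
    "rebate": {"s2"},
    "vehicle": {"s3"},
    "car": {"s4"},
    "bank": {"s5", "s6"},
    "shore": {"s6", "s7"},
    "phone": {"s8"},
}
# Unordered pairs of senses recorded as synonyms.
SYNONYMS = {frozenset({"s1", "s2"})}
# sense -> senses recorded as its hyponyms or meronyms.
NARROWER = {"s3": {"s4"}}


def wordnet_relation(word1: str, word2: str) -> str:
    """Apply the five rules in a fixed (assumed) order of preference."""
    senses1, senses2 = WORD_SENSES[word1], WORD_SENSES[word2]
    pairs = [(a, b) for a in senses1 for b in senses2]
    if any(frozenset({a, b}) in SYNONYMS for a, b in pairs):
        return "="   # some sense of word1 is a synonym of a sense of word2
    if any(b in NARROWER.get(a, ()) for a, b in pairs):
        return "⊇"   # word1 is more general
    if any(a in NARROWER.get(b, ()) for a, b in pairs):
        return "⊆"   # word1 is less general
    if senses1 & senses2:
        return "∩"   # some senses in common
    return "⊥"       # no sense in common
```

With a real lexical database the three tables would be replaced by lookups of synsets, hyponyms, meronyms and their inverses.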


Structure-level Semantic Matching

We translate the matching problem, namely the two graphs (in particular, the pair of nodes submitted to matching) into a propositional formula and then check for its validity

We check for validity using SAT


Semantic Matching Algorithm

1. Extract the two graphs

2. Compute element-level semantic matching

3. Compute concepts at nodes

4. Construct the propositional formula

5. Run SAT

6. Perform iterations


Semantic Matching Algorithm: Example – (1)

Extract the two graphs

[Figure: the two extracted trees, with is-a links over nodes labeled A, B, C, D, E and numbered 1 to 5.]

• In the case of RDB, XML and OODB schemas, it is necessary to extract useful semantic information, for instance in the form of ontologies


Semantic Matching Algorithm: Example – (2)

Element-level semantic matching. For each node, compute semantic relations holding among all the concepts denoted by labels at nodes under consideration

[Figure: the two trees from step 1, with is-a links over nodes labeled A, B, C, D, E and numbered 1 to 5.]

A1 = A2

B1 = B2

C1 = C2

D1 = D2

E1 = E2


Semantic Matching Algorithm: Example – (3)

Compute concepts at nodes. Suppose we want to find a semantic relation between nodes $5_1$ and $1_2$

[Figure: the two trees from step 1, with a question mark on the link between node $5_1$ and node $1_2$.]

$C_{1_1} = A_1$

$C_{5_1} = A_1 \cap C_1$

$C_{1_2} = C_2$

$C_{5_1} \;?\; C_{1_2}$


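Semantic Matching Algorithm: Example – (4)

Construct the propositional formula. We translate all the semantic relations computed in step 2 into propositional formulas under the following rules:

$A_1 \supseteq A_2 \;\Rightarrow\; A_2 \rightarrow A_1$

$A_1 \subseteq A_2 \;\Rightarrow\; A_1 \rightarrow A_2$

$A_1 = A_2 \;\Rightarrow\; A_1 \leftrightarrow A_2$

$A_1 \perp A_2 \;\Rightarrow\; \neg(A_1 \wedge A_2)$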

[Figure: the two trees from step 1, with a question mark on the link between node $5_1$ and node $1_2$.]

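From step 2 we have: $C_1 \leftrightarrow C_2$

We want to prove that $C_{5_1} \subseteq C_{1_2}$ (we guess the relation between nodes at this stage):

$(A_1 \wedge C_1) \rightarrow C_2$

$(C_1 \leftrightarrow C_2) \rightarrow ((A_1 \wedge C_1) \rightarrow C_2)$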
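The translation step can be sketched as a function from a semantic relation to a small formula tree. The tuple encoding of formulas and the names `translate` and `evaluate` are assumptions made for this illustration; overlapping is left out because no translation rule is given for it here.

```python
def translate(relation: str, a: str, b: str) -> tuple:
    """Translate a semantic relation between a and b into a formula tree."""
    if relation == "⊇":
        return ("->", b, a)           # a more general: b implies a
    if relation == "⊆":
        return ("->", a, b)           # a less general: a implies b
    if relation == "=":
        return ("<->", a, b)          # equality: equivalence
    if relation == "⊥":
        return ("not", ("and", a, b)) # mismatch: never both
    raise ValueError(f"no translation rule for {relation!r}")


def evaluate(formula, env: dict) -> bool:
    """Evaluate a formula tree under a truth assignment env."""
    if isinstance(formula, str):
        return env[formula]
    op = formula[0]
    if op == "not":
        return not evaluate(formula[1], env)
    left, right = evaluate(formula[1], env), evaluate(formula[2], env)
    if op == "and":
        return left and right
    if op == "->":
        return (not left) or right
    if op == "<->":
        return left == right
    raise ValueError(f"unknown operator {op!r}")


axiom = translate("=", "C1", "C2")          # from step 2
goal = ("->", ("and", "A1", "C1"), "C2")    # guessed relation for the nodes
formula = ("->", axiom, goal)               # formula handed to the next step
```

Formula trees like `formula` are what a satisfiability checker would consume in the following step.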


Semantic Matching Algorithm: Example – (5)

Run SAT

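In order to prove that $(C_1 \leftrightarrow C_2) \rightarrow ((A_1 \wedge C_1) \rightarrow C_2)$ is valid, we prove that its negation is unsatisfiable:

$(C_1 \leftrightarrow C_2) \wedge \neg((A_1 \wedge C_1) \rightarrow C_2)$

SAT returns FALSE

Thus, $C_{5_1} \subseteq C_{1_2}$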
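For a formula with three variables, the unsatisfiability check can be reproduced by enumerating all truth assignments. This brute-force loop is only a stand-in for a real SAT solver, used here to make the argument checkable; the function names are assumptions.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication p -> q."""
    return (not p) or q


def negated_formula(a1: bool, c1: bool, c2: bool) -> bool:
    """(C1 <-> C2) and not ((A1 and C1) -> C2)."""
    axiom = (c1 == c2)
    goal = implies(a1 and c1, c2)
    return axiom and not goal


def negated_converse(a1: bool, c1: bool, c2: bool) -> bool:
    """(C1 <-> C2) and not (C2 -> (A1 and C1)), for comparison."""
    return (c1 == c2) and not implies(c2, a1 and c1)


def satisfiable(formula, n_vars: int) -> bool:
    """Return True if some truth assignment makes the formula true."""
    return any(formula(*values)
               for values in product([False, True], repeat=n_vars))


subset_holds = not satisfiable(negated_formula, 3)     # negation unsatisfiable
superset_holds = not satisfiable(negated_converse, 3)  # negation satisfiable
```

The first check succeeds because falsifying the goal needs C1 true and C2 false, which contradicts the axiom. The second fails (take A1 false with C1 and C2 true), so only the more-specific direction is proved and equality is not.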


Semantic Matching Algorithm: Example – (6.1)

Iterations. Iterations are performed re-running SAT

[Figure: several small is-a trees with roots A, F and G over nodes labeled B, C, D, numbered 1 to 4.]

Suppose, that C21 C22

…an oracle tells us that A1 = F2 G2

After this additional analysis we can infer C21 = C22


Semantic Matching Algorithm: Example – (6.2)

Iterations. …to use the result of a previous match

[Figure: two is-a trees rooted at A with nodes numbered 1 to 5; the first contains the labels C, F, D, E and the second the labels C, B, D, E.]

Suppose, that F1 B2

Having found that C41 C42

We can automatically infer that C51 C52


Example: Cupid vs. Semantic Matching

[Figure: the www.google.com and www.yahoo.com directory trees from the earlier example, with links between corresponding nodes annotated by semantic relations.]


Conclusions

We have made a rational reconstruction of the major matching problems and articulated them in terms of the more generic problem of matching graphs

We have identified semantic matching as a new approach for performing generic matching

We have proposed an implementation of semantic matching using SAT


Future Work

Extend to a full graph matcher

How to extract semantics from schemas

Study how to take into account attributes and instances

Develop an efficient implementation of the system

Do a thorough testing of the system


References

Project website: http://www.dit.unitn.it/~p2p/

F. Giunchiglia, P. Shvaiko. "Semantic Matching". Technical Report #DIT-03-013, Trento, 2003. Also to appear in Proc. of ODS at IJCAI-03.

F. Giunchiglia, I. Zaihrayeu. "Making peer databases interact – a vision for an architecture supporting data coordination". In Proc. of the Conference of Information Agents (CIA 2002), Madrid, 2002.

