The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Page 1: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

CoopIS’2001, Trento, Italy

The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Giovanni Modica, Avigdor Gal, Hasan M. Jamil

Page 2: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Motivating example

Page 3: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Preliminaries

Definition: An ontology is an explicit representation of a conceptualization (Gruber 1993).

Conjecture I: Applications in a given domain base their information exchange on some (shared) underlying ontology.

Observation: Applications in a given domain use different ontology representations.

Conjecture II: Given an application A that utilizes an ontology representation $O_A$, and an ontology $O$, there exists an invertible mapping $f_A$ such that

$f_A(O_A) = O$

Page 4: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Problem description

Given two applications A and B, such that A utilizes an ontology representation $O_A$ and B utilizes an ontology representation $O_B$, introduce a mapping $f_{BA}$ such that

$f_{BA}(O_B) = O_A$

In a perfect world:
– $O$ is known.
– $f_A$ is known.
– $f_B$ is known.

$O_A = f_A^{-1}(f_B(O_B))$

Alas:
– $O$ is unknown. At best, an approximation of $O$ exists, in the form of a standard.
– $f_A$ and $f_B$ are unknown: lack of documentation, the mental state of a designer, etc.
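The perfect-world case can be illustrated with a toy sketch in which both mappings into the shared ontology are known and one-to-one. The term names and the dictionaries `f_A` and `f_B` are hypothetical and not taken from the slides. The sketch only shows how $f_{BA}$ would follow by composing $f_B$ with the inverse of $f_A$.

```python
# Hypothetical term-level mappings into a shared ontology O (illustrative names only).
f_A = {"Pick-up Date": "pickup_date", "Return Date": "return_date"}
f_B = {"Pickup Day": "pickup_date", "Dropoff Day": "return_date"}


def invert(mapping):
    """Invert a one-to-one mapping; fail loudly if it is not invertible."""
    inverse = {value: key for key, value in mapping.items()}
    if len(inverse) != len(mapping):
        raise ValueError("mapping is not invertible")
    return inverse


def compose_via_shared(f_b, f_a):
    """Build f_BA as f_A^{-1} applied after f_B, mapping B's terms to A's terms."""
    f_a_inv = invert(f_a)
    return {term_b: f_a_inv[shared] for term_b, shared in f_b.items()}


f_BA = compose_via_shared(f_B, f_A)
print(f_BA)
```

When $O$, $f_A$ or $f_B$ is unknown, as the slide notes, this direct composition is not available.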

Page 5: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Proposed solution

Given two applications A and B, such that A utilizes an ontology representation $O_A$ and B utilizes an ontology representation $O_B$, introduce a mapping $f_{BA}$ such that

$f_{BA} : O_B \times O_A \to [0, 1]$

$f_{BA}$ depends on the ontology representation.

A matching is associated with a “degree of confidence” in the matching.
– 0 identifies non-matching terms.
– 1 identifies a crisp matching.
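One way to picture a confidence-valued mapping is as a table that assigns a value in [0, 1] to every pair of terms. The sketch below is an assumption-laden illustration, not the authors' tool. The function names, the `exact_match` placeholder and the 0.5 threshold are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Confidence = Callable[[str, str], float]


def exact_match(term_b: str, term_a: str) -> float:
    """Placeholder confidence function: 1.0 for identical terms, else 0.0."""
    return 1.0 if term_b == term_a else 0.0


def match_table(o_b: Iterable[str], o_a: Iterable[str],
                f_ba: Confidence) -> Dict[Tuple[str, str], float]:
    """Evaluate f_BA on every (term of O_B, term of O_A) pair."""
    terms_a = list(o_a)
    table = {}
    for term_b in o_b:
        for term_a in terms_a:
            confidence = f_ba(term_b, term_a)
            if not 0.0 <= confidence <= 1.0:
                raise ValueError("confidence must lie in [0, 1]")
            table[(term_b, term_a)] = confidence
    return table


def best_match(table: Dict[Tuple[str, str], float], term_b: str,
               threshold: float = 0.5) -> Optional[str]:
    """Return the term of O_A with the highest confidence, if above the threshold."""
    candidates = [(conf, term_a) for (tb, term_a), conf in table.items()
                  if tb == term_b and conf >= threshold]
    return max(candidates)[1] if candidates else None


table = match_table(["Return Date", "Car Type"],
                    ["Return Date", "Pick-up Date"], exact_match)
print(best_match(table, "Return Date"), best_match(table, "Car Type"))
```

A crisp matcher such as `exact_match` only ever produces 0 or 1. Graded values between the two would come from heuristics that score partial agreement.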

Page 6: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Ontology representation

Dynamic information seeking:
– HTML forms
  • Labels
  • Input fields
  • Scripts
– Assumptions:
  • Labels represent terms in an ontology (e.g., Pick-up Date).
  • Input fields provide constraints on the value domains (e.g., {Day, 1, …, 31}).
  • Scripts, among other things, suggest a precedence relationship (e.g., a Pick-up Location is required before selecting a Car Type).
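As a rough illustration of reading labels and value domains from a form, the sketch below uses Python's standard `html.parser`. It assumes labels are marked up with `<label>` tags and value domains with `<select>`/`<option>`, which is a simplification. The class name `FormFieldParser` and the sample form are hypothetical, and script analysis is not covered.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect <label> texts and <select> value domains from an HTML form."""

    def __init__(self):
        super().__init__()
        self.labels = []    # label texts, in document order
        self.domains = {}   # select name -> list of option values
        self._in_label = False
        self._current_select = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label":
            self._in_label = True
            self.labels.append("")
        elif tag == "select":
            self._current_select = attrs.get("name", "")
            self.domains[self._current_select] = []
        elif tag == "option" and self._current_select is not None:
            self.domains[self._current_select].append(attrs.get("value", ""))

    def handle_endtag(self, tag):
        if tag == "label" and self._in_label:
            self._in_label = False
            self.labels[-1] = self.labels[-1].strip()
        elif tag == "select":
            self._current_select = None

    def handle_data(self, data):
        if self._in_label:
            self.labels[-1] += data


SAMPLE = """
<form>
  <label>Pick-up Date</label>
  <select name="day"><option value="1">1</option><option value="2">2</option></select>
</form>
"""

parser = FormFieldParser()
parser.feed(SAMPLE)
print(parser.labels, parser.domains)
```

Real forms often place label text in table cells or plain text next to a field, so a practical extractor would need layout-based rules rather than relying on `<label>`.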

Page 7: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Ontology representation

Conceptual modeling approach, based on Bunge:
– Terms (things)
– Values
– Composition
– Precedence
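The four constructs could be held in a small data structure such as the one sketched here. The class `Term` and its field names are hypothetical choices for illustration; the slides do not specify a concrete representation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Term:
    """A term (thing) with its value domain, sub-terms and precedence links."""
    name: str
    values: List[str] = field(default_factory=list)    # value domain
    parts: List["Term"] = field(default_factory=list)  # composition
    precedes: List[str] = field(default_factory=list)  # terms that must come after this one


# Example built from the car-rental labels mentioned in the slides.
pickup_date = Term(
    "Pick-up Date",
    parts=[Term("Day", values=[str(day) for day in range(1, 32)])],
)
pickup_location = Term("Pick-up Location", precedes=["Car Type"])
print(pickup_date.parts[0].name, len(pickup_date.parts[0].values))
```

Storing precedence as a list of term names keeps the structure simple. A graph of `Term` references would be an alternative if cycles need to be detected.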

Page 8: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Ontology extraction and matching

[Figure: pipeline diagram starting from a URL (e.g. http://www.avis.com). Phases: Phase 1 Parsing, Phase 2 Labeling, Phase 3 Ontology, Phase 4 Merging. Other box labels: HTML Parsing, DOM Tree, HTML Elements, FORM Elements, Label Identification rules, Form Rendering, Ontology Creation, Target/Candidate Ontology, Target Ontology, Candidate Ontology, Matching Algorithms, Thesaurus, Refined Ontology, KB, KB Submission.]

Page 9: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Phase 1: Parsing

Page 10: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Phase 2: Labeling

Page 11: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Phase 2: Labeling

Page 12: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Phase 2: Labeling

Page 13: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Merging

Heuristics for ontology merging (Frakes and Baeza-Yates, 1992):
– Textual matching: Date ↔ date, Pickup ↔ pickup
– Ignorable characters removal: *Country ↔ country
– De-hyphenation: Pick-up ↔ Pickup, Pickup ↔ Pick up
– Stop terms removal: Date of Return ↔ Return Date
  Stop terms: a, to, do, does, the, in, or, and, this, those, that, … etc.
– Substring matching: Pickup Location Code ↔ Pick-up location (66%)
– Content matching: Dropoff Day (1,..,31) ↔ Return Day (1,..,31) (100%)
  Dropoff ↔ Return
– Thesaurus matching: Dropoff Location ↔ Return Location (100%)
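The label-level heuristics can be sketched as a normalization step followed by a word-overlap score. This is a minimal sketch under stated assumptions, not the authors' algorithm. The scoring rule (shared words divided by the larger word count) is a guess chosen because it gives 2/3 for the substring example. "of" is added to the stop terms because the Date of Return example implies it. The thesaurus is modeled as a hypothetical word-to-canonical-word dictionary, and content matching on value domains is not covered.

```python
import re

# Stop terms from the slide, plus "of" (assumed, implied by "Date of Return").
STOP_TERMS = {"a", "to", "do", "does", "the", "in", "or", "and",
              "this", "those", "that", "of"}


def normalize(label, thesaurus=None):
    """Lowercase, de-hyphenate, drop ignorable characters and stop terms."""
    thesaurus = thesaurus or {}
    text = label.lower().replace("-", "")     # textual matching + de-hyphenation
    text = re.sub(r"[^a-z0-9\s]", "", text)  # ignorable characters such as "*"
    words = [word for word in text.split() if word not in STOP_TERMS]
    return [thesaurus.get(word, word) for word in words]  # thesaurus lookup


def match_score(label_a, label_b, thesaurus=None):
    """Fraction of shared words, relative to the longer label (assumed rule)."""
    words_a = set(normalize(label_a, thesaurus))
    words_b = set(normalize(label_b, thesaurus))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / max(len(words_a), len(words_b))


print(match_score("Date of Return", "Return Date"))
print(match_score("Pickup Location Code", "Pick-up location"))
print(match_score("Dropoff Location", "Return Location", {"dropoff": "return"}))
```

The sketch does not handle the "Pickup ↔ Pick up" case, which would need words to be joined as well as split.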

Page 14: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Phase 4: Merging

Page 15: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Preliminary Results

Two metrics are used for performance analysis (Frakes and Baeza-Yates, 1992):
– Recall (completeness)
– Precision (soundness)

Parameters:
– $t_r$: number of terms retrieved
– $t_m$: number of terms matched
– $t_e$: number of terms effectively matched

Recall: $R = \frac{t_m}{t_r}$

Precision: $P = \frac{t_e}{t_m}$

Page 16: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Preliminary Results

Example:
– # of terms in Ontology1: 20
– # of matches identified: 15. Recall: 75% (15/20)
– # of effective matches: 10. Precision: 66% (10/15)

A third metric combines recall and precision. For a precision value $P$, a recall value $R$ and an importance measure $b$, the combined metric $E$ is calculated as (Frakes and Baeza-Yates, 1992):

$E = 1 - \frac{(1 + b^2) P R}{b^2 P + R}$
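The three metrics can be checked numerically with a short sketch. The function names are hypothetical. The default b = 0.5 is an assumption taken from the chart titles elsewhere in the deck. The combined metric is computed as E = 1 - (1 + b^2) P R / (b^2 P + R).

```python
def recall(t_m: int, t_r: int) -> float:
    """R = t_m / t_r: matched terms over retrieved terms."""
    return t_m / t_r


def precision(t_e: int, t_m: int) -> float:
    """P = t_e / t_m: effectively matched terms over matched terms."""
    return t_e / t_m


def e_metric(p: float, r: float, b: float = 0.5) -> float:
    """E = 1 - (1 + b^2) * P * R / (b^2 * P + R); lower values are better."""
    denominator = b ** 2 * p + r
    if denominator == 0:
        return 1.0  # assumption: treat zero precision and zero recall as the worst score
    return 1 - ((1 + b ** 2) * p * r) / denominator


# Values from the slide's example: 20 terms, 15 matches, 10 effective matches.
r = recall(15, 20)
p = precision(10, 15)
e = e_metric(p, r)
print(f"R={r:.2f} P={p:.2f} E={e:.6f}")
```

With b = 0.5 the example yields E ≈ 0.318182, the value the deck's results table lists for the Substring Names strategy (recall 0.75, precision 0.67).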

Page 17: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Preliminary Results

[Figure: chart "Precision vs. Recall (Avis & Hertz)". Series: Recall, Precision. x-axis: matching strategy (Textual, Ign. Chars., De-hyph., StopTerms, Substring, Substring Names, Content, Thesaurus); y-axis: 0 to 1.]

Strategy          Recall   Precision   E
Textual           0.3      0.33        0.673913
Ign. Chars.       0.3      0.33        0.673913
De-hyph.          0.6      0.33        0.634146
StopTerms         0.6      0.33        0.634146
Substring         0.75     0.40        0.558824
Substring Names   0.75     0.67        0.318182
Content           0.65     0.92        0.148472
Thesaurus         0.65     0.92        0.148472

Page 18: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Preliminary Results

[Figure: chart "E Metric for Hertz vs. Alamo (b=0.5)". Series: Hertz, Alamo. x-axis: matching strategy (Textual, Ign. Chars., De-hyph., StopTerms, Substring, Substring Names, Content, Thesaurus); y-axis: 0 to 1.]

Page 19: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Preliminary Results

[Figure: chart "Learning from Thesaurus". Categories: No Thesaurus, Improved Thesaurus; y-axis: E (b=0.5), 0 to 1; data labels: 0.389534884, 0.479166667.]

Page 20: The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Summary and Future Work

We have introduced:
– Automatic ontology creation
– Automatic matching process
– Preliminary results

Future work oriented towards:
– Incorporation of query facilities into the tool
– Automatic navigation of web sites for ontology extraction
– Dynamic translation of queries against the target ontology into queries against the multiple candidate ontologies

