A Framework for Ontology-Based Knowledge Management System

Jiangning Wu

Dalian University of Technology, China

Introduction

• Introduction

– Background

– Problems

– Solution

– Focus

– Contributions

Research Center of Knowledge Science & Technology, DUT

Background

• The goal of a general KMS is to provide the right knowledge to the right people at the right time and in the right format.

• Through KMSs, users can access and utilize the rich sources of data, information and knowledge stored in different forms.

Problems

• Traditional KMSs are based on the existing data repositories and users’ needs.

• For knowledge discovery, users submit queries to the system and receive knowledge by keyword matching.

• But keyword-based systems cannot understand the meaning of data. They are inflexible and stifle knowledge creation.


Solution

• The emerging ontology-based KMSs can find the content-oriented knowledge that people really want.

• The domain ontology is powerful in knowledge representation and associated inference.


Focus

• We focus on matching projects with domain experts.

• In project management, it is not easy to choose an appropriate domain expert for a certain project if experts’ research areas and the contents of the projects are not understood very well.


Contributions

• Our contribution is to describe experts’ research areas and the contents of the projects with separate ontologies, both based on the same standard subject category of China.

• The matching problem is thereby transformed into calculating the semantic similarities between ontologies.


Contributions

• To calculate the similarity between documents, we propose an integrated method that combines the node-based method and the edge-based method.


Ontology in KR

• Ontology in Knowledge Representation

– Ontology in General

– T.R. Gruber

– Why Ontology

– Our Ontology


Ontology

• Research on knowledge representation has been a focus of the AI and IS disciplines for a number of years.

• Much contemporary research extends the seminal work within the AI discipline, and research in ontology has been one of the beneficiaries.


Ontology

• Research in computational ontology has traditionally sought to develop structure for the purpose of knowledge subsumption.

• Such research aims to develop generic, reusable representations of domain ontology.


T.R. Gruber

• T.R. Gruber claimed: An ontology is an explicit specification of a conceptualization. The term is borrowed from philosophy, where an ontology is a systematic account of existence.

• For knowledge-based systems, what “exists” is exactly that which can be represented.


Ontology

• An ontology in short is an explicit description of a domain:

– concepts

– properties and attributes of concepts

– constraints on properties and attributes

– individuals (often, but not always)

• An ontology defines

– a common vocabulary

– a shared understanding


Why Ontology

• To share common understanding of the structure of information

– among people

– among software agents

• To enable reuse of domain knowledge

– to avoid “re-inventing the wheel”

– to introduce standards to allow interoperability


Why Ontology

• To make domain assumptions explicit

– easier to change domain assumptions (consider a genetics knowledge base)

– easier to understand and update legacy data

• To separate domain knowledge from the operational knowledge

– re-use domain and operational knowledge separately (e.g., configuration based on constraints)

Our Ontology

• The ontology is a collection of concepts and their relationships, and serves as a conceptualized vocabulary to describe an application domain.

• In our study, the ontology is created with Protege, an ontology editor developed at Stanford University.


Our Ontology

• The initial concepts in our ontology are broadly extracted from the standard subject category of China.

• To fit the selected concepts to the projects and domain experts concerned, we developed a tool called Concept Filler, an interface that helps domain experts assign proper concepts and weights manually.


Interface

Our Ontology

• When a concept is specified, it is also assigned a weight between 0 and 1 that indicates its importance.

• The relationships in an ontology are explicitly named, so they reflect the context of the domain knowledge.


Relationships

• Many types of relationships are used in ontology construction, such as the IS-A relation, Kind-of relation, Part-of relation, Substance-of relation, and so on.

• Since the IS-A (hyponym / hypernym) relation is the most common one in ontology representation, it is the only relation used in our research, for simplicity.


Our Ontology

[Figure: Procedures in the Development of the Chinese Ontology. Labels: New Domain, Determine domain, Requirement analysis, Documents, Existing ontologies, Import ontologies, Formalization, Semi-formal ontologies, Criteria, Evaluation, Mature ontologies, Update, Updated ontologies, Document.]


Matching Method

• Matching Method

– Node-based Method

– Edge-based Method

– Shortcomings

– Integrated Method


Considerations

• Calculating the similarity between concepts based on complex relationships is challenging.

• Unfortunately, no method so far deals with this problem effectively.

• Since some similarity calculation methods have been developed for the simplest relation, the IS-A relation, only this kind of relation is retained in our study.


Node-based Method

• Resnik used information content to measure the similarity.

• His point is that the more information content two concepts share, the more similar they are.


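Node-based Method

The similarity of two concepts c1 and c2 is

$$sim(c_1, c_2) = \max_{c \in Sup(c_1, c_2)} \left[ -\log p(c) \right]$$

where Sup(c1, c2) is the set of concepts that subsume both c1 and c2, and p(c) is the probability of encountering an instance of concept c.

Considering that many inherited concepts may have more than one sense, the above formula is modified as

$$sim(t_1, t_2) = \max_{c_1 \in sen(t_1),\, c_2 \in sen(t_2)} \left[ sim(c_1, c_2) \right]$$

where sen(t) is the set of senses of term t.
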
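The node-based measure can be illustrated with a minimal Python sketch. The taxonomy, the concept names and the occurrence counts below are hypothetical and chosen only for illustration. The sketch estimates p(c) from the counts of a concept and all its descendants, which is an assumption about how the probabilities are obtained, and it treats every term as having a single sense.

```python
import math

# Hypothetical IS-A taxonomy (child -> parent); illustrative only.
PARENT = {"C21": "C11", "C22": "C11", "C31": "C21", "C32": "C21", "C33": "C22"}
# Hypothetical occurrence counts of each concept in some corpus.
COUNT = {"C11": 1, "C21": 2, "C22": 3, "C31": 4, "C32": 6, "C33": 4}


def ancestors(c):
    """Return c together with every concept that subsumes it."""
    chain = [c]
    while c in PARENT:
        c = PARENT[c]
        chain.append(c)
    return chain


def probability(c):
    """p(c): share of all occurrences that fall under c or its descendants."""
    total = sum(COUNT.values())
    under_c = sum(n for node, n in COUNT.items() if c in ancestors(node))
    return under_c / total


def resnik_sim(c1, c2):
    """Maximum of -log p(c) over the concepts c that subsume both c1 and c2."""
    common = set(ancestors(c1)) & set(ancestors(c2))
    return max(-math.log(probability(c)) for c in common)


print(resnik_sim("C31", "C32"))  # siblings under C21
print(resnik_sim("C31", "C33"))  # only the root C11 is shared
```

In this toy tree the siblings C31 and C32 share the subsumer C21, while C31 and C33 share only the root, whose probability is 1 and whose information content is therefore zero.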

Edge-based Method

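• Leacock and Chodorow took the length of the shortest path between two concepts and converted this distance into a similarity measure.

$$sim(t_1, t_2) = -\log \left[ \frac{\min_{c_1 \in sen(t_1),\, c_2 \in sen(t_2)} len(c_1, c_2)}{2\, d_{max}} \right]$$

where len(c1, c2) is the length of the shortest path between c1 and c2, and d_max is the maximum depth of the concept hierarchy.
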
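A minimal Python sketch of the edge-based measure follows. The taxonomy and concept names are hypothetical. Two assumptions are made: the path length is counted in nodes rather than edges, so that identical concepts give a finite value, and every term has a single sense, so the minimum over senses is omitted.

```python
import math

# Hypothetical IS-A taxonomy (child -> parent); illustrative only.
PARENT = {"C21": "C11", "C22": "C11", "C31": "C21", "C32": "C21", "C41": "C31"}


def path_to_root(c):
    """Return the chain of concepts from c up to the root."""
    chain = [c]
    while c in PARENT:
        c = PARENT[c]
        chain.append(c)
    return chain


def path_length(c1, c2):
    """Number of nodes on the shortest path between c1 and c2 in the tree."""
    up1, up2 = path_to_root(c1), path_to_root(c2)
    position_in_up2 = {node: i for i, node in enumerate(up2)}
    for i, node in enumerate(up1):
        if node in position_in_up2:
            # Steps up from c1, steps up from c2, plus the shared node itself.
            return i + position_in_up2[node] + 1
    raise ValueError("concepts are not in the same tree")


# Maximum depth of the hierarchy, counted in nodes.
D_MAX = max(len(path_to_root(c)) for c in set(PARENT) | set(PARENT.values()))


def lch_sim(c1, c2):
    """-log of the shortest path length scaled by twice the maximum depth."""
    return -math.log(path_length(c1, c2) / (2 * D_MAX))


print(lch_sim("C31", "C32"))  # siblings
print(lch_sim("C31", "C22"))  # concepts in different branches
```

Because only the path matters, any two concepts at the same distance receive the same value, whatever their weights.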

Shortcomings

• Both the node-based and edge-based methods consider only two concepts in the same concept tree; neither extends to two lists of concepts in different concept trees.

• However, when different documents in the same domain are described with ontology structures, the resulting concept trees are often homogeneous but heteromorphic.


Shortcomings

• The matching problem to be solved here is calculating the similarity between two different concept trees, not between two concepts in the same tree.

• So we need a new method that calculates the similarity between two lists of concepts in different trees, giving a quantified value that shows how similar the documents are.


Shortcomings

• The node-based method does not consider the distance between concepts.

• In the four-layer concept tree, if concepts C21, C31 and C36 have the same sense and equal frequency, the node-based method gives

sim(C21, C31) = sim(C21, C36)


Shortcomings

• However, concepts C21 and C31 are clearly more similar, since C31 is the direct inheritor of C21.


Shortcomings

[Figure: four-layer concept tree. Layer 1: C11. Layer 2: C21, C22, C23. Layer 3: C31, C32, C33, C34, C35, C36. Layer 4: C41, C42, C43, C44.]

Shortcomings

• In contrast to the node-based method, the edge-based method considers only the relationships between concepts and ignores the weights of concepts.

• Concepts C31 and C32 are each connected to C21 by a single edge, so the edge-based method gives both the same similarity value with C21.


Shortcomings

• But if C31 has a bigger weight than C32, C31 is considered to be more important, and the corresponding similarity value between C31 and C21 should be greater.


Integrated Method

• Before the proposed method is applied, the documents related to projects and domain experts are formalized, which results in two vectors containing the concepts with their frequencies.
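As a rough illustration of the formalization step, the sketch below turns a document into a vector of concepts with their frequencies. It assumes the document has already been reduced to a list of concept labels; the function name `formalize` and the labels are hypothetical, and the step that maps document text to concepts is not shown.

```python
from collections import Counter


def formalize(concept_labels):
    """Return a sorted vector of (concept, frequency) pairs for one document."""
    return sorted(Counter(concept_labels).items())


# Hypothetical concept labels assigned to one project document.
project_vector = formalize(["C31", "C21", "C31", "C42"])
print(project_vector)
```

Applying the same function to an expert's documents gives the second vector used in the comparison.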


Integrated Method

• The similarity between c_is and c_jt

[Equation: sim(c_is, c_jt) = log(…), built from the weights w_is and w_jt and the path length len(c_is, c_jt) + 1]

• The modified similarity

[Equation: the maximum of sim(c_is, c_jt) over the senses sen(·) of the two concepts]

Integrated Method

• The similarity between two documents

[Equation: sim(Doc(i), Doc(j)), a double sum of sim(c_is, c_jt) over s = 1, …, m and t = 1, …, n, scaled by m, n and SIM, where SIM = max[sim(c_is, c_jt)]]

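One plausible reading of the document-level similarity, not confirmed by the source, is the sum of all pairwise concept similarities divided by m × n × SIM, where m and n are the sizes of the two concept lists and SIM is the largest pairwise value. The Python sketch below follows that assumed reading. The function name `document_similarity` and the toy pairwise measure are hypothetical, and the pairwise similarity itself is passed in rather than implemented.

```python
from itertools import product


def document_similarity(concepts_i, concepts_j, pair_sim):
    """Average pairwise similarity, normalised by the largest pairwise value.

    Assumed reading: sum of pair_sim over all m * n pairs / (m * n * SIM).
    """
    values = [pair_sim(a, b) for a, b in product(concepts_i, concepts_j)]
    if not values:
        return 0.0
    sim_max = max(values)  # SIM in the assumed reading
    if sim_max <= 0:
        return 0.0
    return sum(values) / (len(values) * sim_max)


def toy_pair_sim(a, b):
    """Hypothetical stand-in for the integrated concept similarity."""
    return 1.0 if a == b else 0.5


print(document_similarity(["C21", "C31"], ["C31", "C32"], toy_pair_sim))
```

Dividing by SIM keeps the result at or below 1, which makes scores from different document pairs easier to rank against each other.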

Framework

• Ontologies Building

• Documents Formalization

• Similarity Calculation

• User Interface


Framework

[Figure: framework of the ontology-based KMS. Components: Expert Documents, Project Documents, Documents Formalization, Ontologies Building, Ontology Library, Expert Concept Trees, Project Concept Trees, Similarity Calculation, Database, Result List, User Interface, Internet, Users.]


Evaluation

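• Two measures are used to verify our ontology-based KMS

$$Precision = \frac{|A \cap R|}{|A|} \times 100\%$$

$$Recall = \frac{|A \cap R|}{|R|} \times 100\%$$

where A is the set of results returned by the system and R is the set of relevant results.
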
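The two measures can be computed with a short Python sketch. The function name `precision_recall` and the expert identifiers are hypothetical; A is taken to be the set of experts returned by the system and R the set of experts judged relevant.

```python
def precision_recall(answers, relevant):
    """Return (precision, recall) in percent for answer set A and relevant set R."""
    answers, relevant = set(answers), set(relevant)
    hits = len(answers & relevant)  # |A ∩ R|
    precision = 100.0 * hits / len(answers) if answers else 0.0
    recall = 100.0 * hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four experts returned, three experts actually relevant.
p, r = precision_recall(["e1", "e2", "e3", "e4"], ["e2", "e4", "e7"])
print(f"precision = {p:.2f}%, recall = {r:.2f}%")
```

Empty sets are mapped to 0 here to avoid division by zero, which is a design choice of the sketch rather than part of the definitions.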

Evaluation

[Figure: Precision Comparison Chart. Series: E-based, N-based, Integrated. Horizontal axis: 1 to 5. Vertical axis: Precision, 0.00% to 40.00%.]


Evaluation

[Figure: Recall Comparison Chart. Series: E-based, N-based, Integrated. Horizontal axis: 1 to 5. Vertical axis: Recall, 0.00% to 60.00%.]


Conclusions

• An ontology-based method to match projects and domain experts is presented.

• The prototype system we developed contains four modules: Ontology building, Document formalization, Similarity calculation and User interface.


Conclusions

• We discuss node-based and edge-based approaches to computing the semantic similarity, and propose an integrated approach to calculating the semantic similarity between two documents.

• The experimental results show that our ontology-based KMS, applied to matching projects with domain experts, can reach better recall and precision.


Future Work

• As mentioned previously, only the simplest relation, the IS-A relation, is considered in our study.

• For a more complex ontology whose concepts are restricted by logic or axioms, considering only the hierarchical structure is not powerful enough to describe the real semantic meaning.


Future Work

• So future work will focus on the other kinds of relations that are used in ontology construction.

• In other words, computing the semantic similarity over various relations will be exciting and challenging work for us.


THANKS