A Framework for Ontology-Based Knowledge Management System
Jiangning Wu
Dalian University of Technology, China
Introduction
• Introduction
– Background
– Problems
– Solution
– Focus
– Contributions
Research Center of Knowledge Science & Technology, DUT
Background
• The goal of a general KMS is to provide the right knowledge to the right people at the right time and in the right format.
• Through KMSs, users can access and utilize the rich sources of data, information and knowledge stored in different forms.
Problems
• Traditional KMSs are built on existing data repositories and users’ needs.
• To discover knowledge, users submit queries to the system and receive results by keyword matching.
• But keyword-based systems cannot understand the meaning of data. They are inflexible and stifle knowledge creation.
Solution
• The emerging ontology-based KMSs can find the content-oriented knowledge that people really want.
• The domain ontology is powerful in knowledge representation and associated inference.
Focus
• We focus mainly on matching projects with domain experts.
• In project management, it is not easy to choose an appropriate domain expert for a given project unless the experts’ research areas and the contents of the projects are well understood.
Contributions
• Our contribution is to describe experts’ research areas and the contents of projects with separate ontologies built on the same standard subject category of China.
• The matching problem is thus transformed into calculating the semantic similarities between ontologies.
Contributions
• To calculate the similarity between documents, we propose an integrated method that combines the node-based and edge-based methods.
Ontology in KR
• Ontology in Knowledge Representation
– Ontology in General
– T.R. Gruber
– Why Ontology
– Our Ontology
Ontology
• Research on knowledge representation has been a focus of the AI and IS disciplines for a number of years.
• Much contemporary research extends the seminal work within the AI discipline, and research in ontology has been one of the beneficiaries.
Ontology
• Research in computational ontology has traditionally sought to develop structure for the purpose of knowledge subsumption.
• The goal of such research is to develop generic, reusable representations of domain ontology.
T.R. Gruber
• T.R. Gruber claimed: An ontology is an explicit specification of a conceptualization. The term is borrowed from philosophy, where an ontology is a systematic account of existence.
• For knowledge-based systems, what “exists” is exactly that which can be represented.
Ontology
• An ontology, in short, is an explicit description of a domain:
– concepts
– properties and attributes of concepts
– constraints on properties and attributes
– individuals (often, but not always)
• An ontology defines
– a common vocabulary
– a shared understanding
Why Ontology
• To share common understanding of the structure of information
– among people
– among software agents
• To enable reuse of domain knowledge
– to avoid “re-inventing the wheel”
– to introduce standards to allow interoperability
Why Ontology
• To make domain assumptions explicit
– easier to change domain assumptions (consider a genetics knowledge base)
– easier to understand and update legacy data
• To separate domain knowledge from the operational knowledge
– re-use domain and operational knowledge separately (e.g., configuration based on constraints)
Our Ontology
• The ontology is a collection of concepts and their relationships, and serves as a conceptualized vocabulary to describe an application domain.
• In our study, it is created with Protégé, which was developed at Stanford University.
Our Ontology
• The initial concepts in our ontology are drawn broadly from the standard subject category of China.
• To make the selected concepts better suited to the projects and domain experts concerned, we developed a tool called Concept Filler, which is simply an interface that helps domain experts assign proper concepts and weights manually.
Interface
[Figure: the Concept Filler interface]
Our Ontology
• When a concept is specified, it is also assigned a weight between 0 and 1 that indicates its importance.
• The relationships in an ontology are explicitly named, so they can reflect the context of the domain knowledge.
Relationships
• Many types of relationships can be found in ontology construction, such as the IS-A relation, Kind-of relation, Part-of relation, Substance-of relation, and so on.
• Since the IS-A (hyponym / hypernym) relation is the most common concern in ontology representation, it is the only kind of relation used in our research, for simplicity.
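A concept that carries a weight and a single IS-A link can be sketched as below. This is only an illustration: the class name `Concept`, its fields, and the sample subject names are hypothetical and are not taken from the ontology built in Protégé.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Concept:
    """A concept with an importance weight and an optional IS-A parent."""
    name: str
    weight: float                      # importance, expected in [0, 1]
    parent: Optional["Concept"] = None  # IS-A link to the more general concept

    def __post_init__(self):
        if not 0.0 <= self.weight <= 1.0:
            raise ValueError("weight must lie between 0 and 1")

    def is_a(self, other: "Concept") -> bool:
        """True if `other` is reachable by following IS-A links upward."""
        node = self.parent
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


# Hypothetical three-level hierarchy.
root = Concept("Computer Science", 1.0)
ai = Concept("Artificial Intelligence", 0.6, parent=root)
ml = Concept("Machine Learning", 0.8, parent=ai)
```

Keeping only a `parent` reference is enough here because IS-A is the only relation retained; other relation types would need named edges.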
Our Ontology
[Figure: Procedures in the Development of the Chinese Ontology. Diagram labels: Existing ontologies, Import ontologies, Determine domain, Requirement analysis, Formalization, Evaluation, Document, Update, Criteria, Documents, New domain, Mature ontologies, Semi-formal ontologies, Updated ontologies]
Matching Method
• Matching Method
– Node-based Method
– Edge-based Method
– Shortcomings
– Integrated Method
Considerations
• Calculating the similarity between concepts based on complex relationships is challenging.
• Unfortunately, no method so far can deal with this problem effectively.
• Since some similarity calculation methods have been developed for the simplest relation, the IS-A relation, only this kind of relation is retained in our study.
Node-based Method
• Resnik used information content to measure the similarity.
• His point is that the more information content two concepts share, the more similar they are.
Node-based Method
The similarity of two concepts c1 and c2 is
$$ sim(c_1, c_2) = \max_{c \in Sup(c_1, c_2)} [-\log p(c)] $$
Considering that many inherited concepts may have more than one sense, the above formula is modified as
$$ sim(t_1, t_2) = \max_{c_1 \in sen(t_1),\, c_2 \in sen(t_2)} [sim(c_1, c_2)] $$
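The node-based measure can be illustrated with a small sketch. The tree, the probabilities p(c), and the function names are hypothetical; Sup(c1, c2) is taken to be the set of concepts that subsume both c1 and c2, including the concepts themselves.

```python
import math

# Hypothetical IS-A tree: child -> parent (None marks the root).
PARENT = {
    "C11": None,
    "C21": "C11", "C22": "C11",
    "C31": "C21", "C32": "C21", "C33": "C22",
}

# Hypothetical occurrence probabilities p(c). A subsumer is at least as
# probable as anything below it, and the root has probability 1.
P = {"C11": 1.0, "C21": 0.5, "C22": 0.5, "C31": 0.25, "C32": 0.25, "C33": 0.5}


def subsumers(c):
    """Return c together with all of its IS-A ancestors."""
    found = set()
    while c is not None:
        found.add(c)
        c = PARENT[c]
    return found


def concept_similarity(c1, c2):
    """Largest information content -log p(c) over the shared subsumers."""
    shared = subsumers(c1) & subsumers(c2)
    return max(-math.log(P[c]) for c in shared)


def term_similarity(senses1, senses2):
    """Best concept similarity over all pairs of senses of two terms."""
    return max(concept_similarity(c1, c2) for c1 in senses1 for c2 in senses2)
```

With these numbers, C31 and C32 share the subsumer C21, so their similarity is -log 0.5, while C31 and C33 share only the root and get similarity 0.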
Edge-based Method
• Leacock and Chodorow took the shortest path length between concepts and converted this distance into a similarity measure.
$$ sim(t_1, t_2) = -\log\left[ \frac{\min_{c_1 \in sen(t_1),\, c_2 \in sen(t_2)} len(c_1, c_2)}{2 d_{max}} \right] $$
Here d_max is the maximum depth of the concept tree.
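A minimal sketch of the edge-based measure follows. The tree and function names are hypothetical. It assumes that path length is counted in nodes, so that a concept compared with itself has length 1 and the logarithm stays defined, and that d_max is the depth of the toy tree counted the same way.

```python
import math

# Hypothetical IS-A tree: child -> parent (None marks the root).
PARENT = {
    "C11": None,
    "C21": "C11", "C22": "C11",
    "C31": "C21", "C32": "C21", "C33": "C22",
}
D_MAX = 3  # depth of this toy tree, counted in nodes


def path_to_root(c):
    """List of concepts from c up to the root, c first."""
    path = []
    while c is not None:
        path.append(c)
        c = PARENT[c]
    return path


def path_length(c1, c2):
    """Number of nodes on the shortest IS-A path between c1 and c2."""
    steps_up = {c: i for i, c in enumerate(path_to_root(c2))}
    for i, c in enumerate(path_to_root(c1)):
        if c in steps_up:
            return i + steps_up[c] + 1  # edges on the path, plus one
    raise ValueError("concepts are not in the same tree")


def term_similarity(senses1, senses2, d_max=D_MAX):
    """-log of the shortest sense-to-sense path, scaled by 2 * d_max."""
    shortest = min(path_length(c1, c2) for c1 in senses1 for c2 in senses2)
    return -math.log(shortest / (2 * d_max))
```

Because only path length enters the formula, C31 and C32 receive exactly the same similarity to their common parent C21, whatever their weights are.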
Shortcomings
• Both the node-based and edge-based methods consider only two concepts in the same concept tree; they do not extend to two lists of concepts in different concept trees.
• However, when we describe different documents in the same domain using ontology structures, homogeneous but heteromorphic concept trees are often formed.
Shortcomings
• The matching problem to be solved here is calculating the similarity between two different concept trees, not between two concepts in the same tree.
• So we have to develop a new method that can calculate the similarities between two lists of concepts in different trees, so that the resulting similarity value shows how similar the documents are.
Shortcomings
• The node-based method does not take the distance between concepts into account.
• In the four-layer concept tree, if concepts C21, C31 and C36 have the same sense and equal frequency, the node-based method gives
sim(C21, C31) = sim(C21, C36)
Shortcomings
• However, concepts C21 and C31 are clearly more similar, since C31 is the direct inheritor of C21.
Shortcomings
[Figure: four-layer concept tree]
Layer 1: C11
Layer 2: C21, C22, C23
Layer 3: C31, C32, C33, C34, C35, C36
Layer 4: C41, C42, C43, C44
Shortcomings
• In contrast to the node-based method, the edge-based method only considers the relationships between concepts and ignores the weights of concepts.
• Concepts C31 and C32 are each connected to C21 by a single edge, so the edge-based method gives them the same similarity value with respect to C21.
Shortcomings
• But if C31 has a larger weight than C32, C31 is considered more important, and the similarity value between C31 and C21 should be greater.
Integrated Method
• Before the proposed method is applied, the documents related to projects and domain experts must first be formalized, which results in two vectors containing the concepts with their frequencies.
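The formalization step can be sketched as follows. This is a minimal illustration, assuming that each document has already been mapped to a list of ontology concept labels and that frequencies are normalised to relative frequencies so that they fall between 0 and 1. The function name `formalize` and the sample labels are hypothetical.

```python
from collections import Counter


def formalize(concept_mentions):
    """Turn a list of concept labels into a {concept: relative frequency} vector."""
    counts = Counter(concept_mentions)
    total = sum(counts.values())
    return {concept: n / total for concept, n in counts.items()}


# Hypothetical concept mentions extracted from one expert document
# and one project document.
expert_vector = formalize(
    ["knowledge management", "ontology", "ontology", "data mining"]
)
project_vector = formalize(["ontology", "information retrieval"])
```

The two resulting vectors are the inputs that a concept-to-concept similarity measure would then compare pair by pair.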
Integrated Method
• The similarity between c_is and c_jt
$$ sim(c_{is}, c_{jt}) = \log\left( \frac{w_{is}}{len(c_{is}, c_{jt}) + 1} + \frac{w_{jt}}{len(c_{is}, c_{jt}) + 1} \right) $$
• The modified similarity
$$ sim(c^{t}_{is}, c^{t}_{jt}) = \max_{c_{is} \in sen(t_{is}),\, c_{jt} \in sen(t_{jt})} [sim(c_{is}, c_{jt})] $$
Integrated Method
• The similarity between two documents
$$ sim(Doc(i), Doc(j)) = \frac{\sum_{s=1}^{m} \sum_{t=1}^{n} sim(c^{t}_{is}, c^{t}_{jt})}{SIM \times m \times n} $$
$$ SIM = \max [sim(c^{t}_{is}, c^{t}_{jt})] $$
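A pairwise similarity that uses both concept weights and path length, as the integrated method intends, might look like the sketch below. The exact combining form is an assumption made for illustration: each weight is divided by len + 1, the two terms are added, and the logarithm is taken. The tree, the weights, and the function names are hypothetical, and weights are assumed to be strictly positive.

```python
import math

# Hypothetical IS-A tree: child -> parent (None marks the root).
PARENT = {
    "C11": None,
    "C21": "C11", "C22": "C11",
    "C31": "C21", "C32": "C21", "C33": "C22",
}


def path_to_root(c):
    """List of concepts from c up to the root, c first."""
    path = []
    while c is not None:
        path.append(c)
        c = PARENT[c]
    return path


def path_length(c1, c2):
    """Number of edges on the shortest IS-A path between c1 and c2."""
    steps_up = {c: i for i, c in enumerate(path_to_root(c2))}
    for i, c in enumerate(path_to_root(c1)):
        if c in steps_up:
            return i + steps_up[c]
    raise ValueError("concepts are not in the same tree")


def pair_similarity(c1, w1, c2, w2):
    """Assumed form: log of the two weights, each damped by len + 1, summed."""
    damp = path_length(c1, c2) + 1  # the +1 keeps identical concepts finite
    return math.log(w1 / damp + w2 / damp)


# Same distance from C21, different weights: the heavier concept scores higher.
heavier = pair_similarity("C21", 0.5, "C31", 0.8)
lighter = pair_similarity("C21", 0.5, "C32", 0.4)

# Same weights, different distance: the closer concept scores higher.
closer = pair_similarity("C21", 0.5, "C31", 0.8)
farther = pair_similarity("C21", 0.5, "C33", 0.8)
```

Under this assumed form the score rises with either weight and falls with path length, which addresses the two shortcomings noted for the node-based and edge-based methods.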
Framework
• Ontologies Building
• Documents Formalization
• Similarity Calculation
• User Interface
Framework
[Figure: system framework. Components: Expert Documents, Project Documents, Documents Formalization, Ontologies Building, Ontology Library, Expert Concept Trees, Project Concept Trees, Similarity Calculation, Database, Result List, User Interface, Internet, Users]
Evaluation
• Two measures to verify our ontology-based KMS
$$ Precision = \frac{|A \cap R|}{|A|} \times 100\% $$
$$ Recall = \frac{|A \cap R|}{|R|} \times 100\% $$
Here A is the set of results returned by the system and R is the set of relevant results.
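The two measures can be computed as in the sketch below, where A is taken as the set of experts returned by the system and R as the set of experts judged relevant. The function name and the expert identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) as percentages for two collections of items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # |A intersect R|
    precision = 100.0 * hits / len(retrieved) if retrieved else 0.0
    recall = 100.0 * hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result list for one project: four experts returned,
# three experts judged relevant, two of them in common.
precision, recall = precision_recall(
    ["e1", "e2", "e3", "e4"],
    ["e2", "e4", "e5"],
)
```

Empty inputs return 0 rather than raising, which is a convenience choice for this sketch.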
Evaluation
[Figure: Precision Comparison Chart. Horizontal axis: 1 to 5; vertical axis: Precision, 0% to 40%; series: E-based, N-based, Integrated]
Evaluation
[Figure: Recall Comparison Chart. Horizontal axis: 1 to 5; vertical axis: Recall, 0% to 60%; series: E-based, N-based, Integrated]
Conclusions
• An ontology-based method to match projects and domain experts is presented.
• The prototype system we developed contains four modules: Ontologies Building, Documents Formalization, Similarity Calculation and User Interface.
Conclusions
• We discuss node-based and edge-based approaches to computing the semantic similarity, and propose an integrated approach to calculating the semantic similarity between two documents.
• The experimental results show that our ontology-based KMS, when matching projects with domain experts, achieves better recall and precision.
Future Work
• As mentioned previously, only the simplest relation, the IS-A relation, is considered in our study.
• When dealing with more complex ontologies whose concepts are restricted by logic or axioms, our method is not powerful enough to describe the real semantic meaning, because it considers only the hierarchical structure.
Future Work
• Future work will therefore focus on the other kinds of relations used in ontology construction.
• In other words, computing the semantic similarity over various relations will be exciting and challenging work for us.
THANKS