
NeOn: Lifecycle Support for Networked Ontologies

Integrated Project (IST-2005-027595)

Priority: IST-2004-2.4.7 – “Semantic-based knowledge and content systems”

D3.2.4 Context-sensitive Search of Ontologies

Deliverable Co-ordinator: Boštjan Pajntar

Deliverable Co-ordinating Institution: J. Stefan Institute (JSI)

Other Authors: Dunja Mladenić (JSI), Marko Grobelnik (JSI)

This deliverable provides an extension of the web application SearchPoint, which enhances ontology web search scenarios with the use of other ontologies that provide context for the search. Finding a good ranking is one of the main problems of ontology search engines. In SearchPoint we propose a solution that automatically generates topics related to a query and its results. These topics are then visualized in a panel which the user can interact with. Every point on this panel represents a weighted ranking. Every such ranking is aligned with the selected point, i.e. results semantically closer to the topics near the selected point get ranked higher. Effectively, the user can get a specifically tailored ranking of the results with a single click.

The context-sensitive search of ontologies as proposed in this deliverable provides one of the three core technologies for contextualization developed in NeOn WP3. Topics are generated via classification of results into a given ontology. Concepts that are the most represented get selected as topics. Since SearchPoint easily upgrades any textual search engine, it effectively contextualizes any textual general knowledge by re-ranking it in accordance with user selected importance of the concepts in the ontology.

Document Identifier:  NEON/2009/D3.2.4/v1.0      Date due:         August 31, 2009
Class Deliverable:    NEON EU-IST-2005-027595    Submission date:  August 31, 2009
Project start date:   March 1, 2006              Version:          V1.0
Project duration:     4 years                    State:            Final
                                                 Distribution:     Public

2006–2009 © Copyright lies with the respective authors and their institutions.

NeOn-project.org

NeOn Consortium

This document is a part of the NeOn research project funded by the IST Programme of the Commission of the European Communities by the grant number IST-2005-027595. The following partners are involved in the project:

Open University (OU) – Coordinator
Knowledge Media Institute – KMi
Berrill Building, Walton Hall
Milton Keynes, MK7 6AA
United Kingdom
Contact person: Martin Dzbor, Enrico Motta
E-mail address: {m.dzbor, e.motta} @open.ac.uk

Universität Karlsruhe – TH (UKARL)
Institut für Angewandte Informatik und Formale Beschreibungsverfahren – AIFB
Englerstrasse 11
D-76128 Karlsruhe, Germany
Contact person: Peter Haase
E-mail address: [email protected]

Universidad Politécnica de Madrid (UPM)
Campus de Montegancedo
28660 Boadilla del Monte
Spain
Contact person: Asunción Gómez Pérez
E-mail address: [email protected]

Software AG (SAG)
Uhlandstrasse 12
64297 Darmstadt
Germany
Contact person: Walter Waterfeld
E-mail address: walter.waterfeld@softwareag.com

Intelligent Software Components S.A. (ISOCO)
Calle de Pedro de Valdivia 10
28006 Madrid
Spain
Contact person: Jesús Contreras
E-mail address: jcontreras@isoco.com

Institut ‘Jožef Stefan’ (JSI)
Jamova 39
SI-1000 Ljubljana
Slovenia
Contact person: Marko Grobelnik
E-mail address: [email protected]

Institut National de Recherche en Informatique et en Automatique (INRIA)
ZIRST – 655 avenue de l'Europe
Montbonnot Saint Martin
38334 Saint-Ismier
France
Contact person: Jérôme Euzenat
E-mail address: jerome.euzenat@inrialpes.fr

University of Sheffield (USFD)
Dept. of Computer Science
Regent Court
211 Portobello street
S1 4DP Sheffield
United Kingdom
Contact person: Hamish Cunningham
E-mail address: hamish@dcs.shef.ac.uk

Universität Koblenz-Landau (UKO-LD)
Universitätsstrasse 1
56070 Koblenz
Germany
Contact person: Steffen Staab
E-mail address: staab@uni-koblenz.de

Consiglio Nazionale delle Ricerche (CNR)
Institute of cognitive sciences and technologies
Via S. Martino della Battaglia, 44 - 00185 Roma-Lazio, Italy
Contact person: Aldo Gangemi
E-mail address: aldo.gangemi@istc.cnr.it

Ontoprise GmbH. (ONTO)
Amalienbadstr. 36 (Raumfabrik 29)
76227 Karlsruhe
Germany
Contact person: Jürgen Angele
E-mail address: angele@ontoprise.de

Food and Agriculture Organization of the United Nations (FAO)
Viale delle Terme di Caracalla 1
00100 Rome
Italy
Contact person: Marta Iglesias
E-mail address: [email protected]

Atos Origin S.A. (ATOS)
Calle de Albarracín, 25
28037 Madrid
Spain
Contact person: Tomás Pariente Lobo
E-mail address: tomas.parientelobo@atosorigin.com

Laboratorios KIN, S.A. (KIN)
C/Ciudad de Granada, 123
08018 Barcelona
Spain
Contact person: Antonio López
E-mail address: [email protected]


D3.2.4 Context-sensitive Search of Ontologies Page 3 of 40

2006–2009 © Copyright lies with the respective authors and their institutions.

Work package participants

The following partners have taken an active part in the work leading to the elaboration of this document, even if they might not have directly contributed to the writing of this document or its parts: JSI, OU.

Change Log

Version  Date        Amended by              Changes
0.1      20-07-2009  Boštjan Pajntar         Overall structure of the report
0.2      01-08-2009  Boštjan Pajntar         Executive Summary, introduction
0.3      03-08-2009  Boštjan Pajntar         Began approach description chapter
0.4      15-08-2009  Dunja Mladenić          Chapter 2, Overall revision
0.5      28-08-2009  Boštjan Pajntar         Chapter 3
0.6      12-09-2009  Marko Grobelnik         Overall revision
0.7      28-09-2009  Boštjan Pajntar         Figures, Overall revision
0.8      15-10-2009  Christopher Buttenshaw  Final QA

Executive Summary

This report describes a software deliverable developed as an extension of the web application SearchPoint [Pajntar and Grobelnik, 2008]. The main goal of SearchPoint is to enhance search engines by allowing users to obtain multiple rankings of the results for each query. We achieve this by generating topics for the given query and its result set and visualizing these topics on a panel named “Ranking Space”. Each point in this ranking space maps to a specific ranking. For example, if a point is selected near one topic, results that are on that topic are ranked higher.

The topics can be generated by clustering the results, and we have implemented this method as a baseline for comparison. The more advanced method for generating topics is to classify the hits and the query into a selected ontology and then select the concepts with the most results to serve as topics. This yields a number of topics small enough to be understood by the user while retaining as much as possible of the domain covered by the current result set.

Topics must be visualized in an intelligent way. Since the selection of a point in between two topics promotes hits that cover either of them, it makes sense to visualize similar topics close together. In the baseline scenario, where the centroids of clusters serve as topics, they are visualized by drawing a complete weighted graph in which each node represents a topic and each edge is weighted by the similarity of the two nodes it connects. In the scenario of classifying and selecting the most prominent concepts from the ontology, we visualize the topics in accordance with the underlying structure of the ontology.

This is the last of the three core contextualizing technologies stemming from WP3. It provides a means of studying any non-structured, general knowledge in the context of a selected ontology. The other two core contextualizing technologies are complementary. First, the context can be provided by one networked ontology for another; this was implemented in OntoConto (D3.2.2, D4.5.2), consuming Alignment Server (D3.3.1, D3.3.2). Second, general background knowledge can provide the means for contextualizing an ontology, which was implemented in OntoAtlas (D3.7.1, D4.3.1, D3.7.2).

Table of Contents

1. Introduction
2. Approach Description
    2.1 System architecture
    2.2 Topic generation – Clustering
    2.3 Topic generation – Classification
    2.4 Visualization and ranking space
    2.5 Calculation of the ranking space
3. Example usage of the system
    3.1 Basic description of the functionalities
    3.2 Discussion on Usability of SearchPoint
    3.3 Showcase
4. Conclusion and future work
Appendix A
    A.1 The current WSDL file to the web service


List of Figures

Figure 1: Illustration of basic SearchPoint in action. For a query “ontology” ambiguous hits get returned. To see only the hits in the context of philosophy the red focus point is moved near the automatically generated topic “philosophy”. Hits returned are initially low ranked (99, 11, 68...) but all talk about philosophy, logic, Aristotle...

Figure 2: The same result page, only the focus point is moved between topics “metadata”, “owl”, “online”. The user immediately gets returned a list of hits about, e.g., semantic mark-up, information and computer science.

Figure 3: Architecture of SearchPoint

Figure 4: The hand cursor is above the "Philosophy" topic. The additional words are: exist, concerns, kinds, part, nature.

Figure 5: Classifier method for the query “ontology”. The ontology (taxonomy) used is DMOZ. The relevant concepts classified are: Philosophy, Knowledge Management, Social Sciences, Languages, Internet and Artificial Intelligence. The four main categories are Society, Reference, Science and Computers. Mouse over the concept “Internet” shows Top/Computers/Software/Internet. The concept Software was left out of the visualization in order to make it clearer.

Figure 6: Visualization for the query “semantic web”. It is immediately seen that this topic is mostly modelled with Computers and its descendants. Additionally, there is some Business and Knowledge Management linked to it.

Figure 7: Yahoo web search in the context of Dmoz.

Figure 8: Yahoo web search with clustering.

Figure 9: Yahoo web search in the context of EuroVoc.

Figure 10: Swoogle Ontology Search in the context of EuroVoc.

1. Introduction

Big search engines (e.g. www.google.com, www.yahoo.com) use a simple, effective and user-friendly method. A user must enter a query in the form of typed words or sentences, which constitutes the engine's input, at which point the engine returns a ranked result-set. Calculating the correct result-set is a well known problem with a well known solution. On the other hand, there is no apparent best way of calculating the ranking. A lot of resources are spent on calculating an optimal ranking and a lot of features are used for this process. There is continuous debate on how much the ranking should be personalized. Usually results are very efficient in the general web search setting. The rankings are at least initially calculated from the underlying graph of linked web pages. However, the ranking quickly becomes worse if there is no underlying structure. For example, corporate or specialized (image search, ontology search) search scenarios do not have a good solution for rankings.

Some web applications build on top of the usual search method. It is possible to identify subcategories of the result-set. These categories are in one way or another presented beside the hits and a user can reformulate his query by clicking on them. Examples of such sites are: www.vivisimo.com, www.clusty.com, www.kartoo.com, www.ujiko.com.

On mindset.research.yahoo.com another approach is presented. After the user receives the results of his query, he can tune the ranking by defining whether results should be more in the sense of "shopping" or of "researching". This web application gave us the initial idea for our approach.

The SearchPoint web application builds on the idea of re-ranking hits in accordance with the selected topics. We have extended the idea in the sense that topics are not predefined but dependent on the query and result set. There is also an arbitrary number of topics per query, which poses a challenge as to how the user can select his preferences.

The main work of this deliverable lies in extending SearchPoint with the possibility of adding any pre-trained OntoLight Classifier [Grobelnik et al., 2008] for the classification of hits into an ontology, which in turn provides the best concepts for the visualized topics (Section 2.3). Apart from the classification into ontologies, we also use clustering of results for topic generation (Section 2.2).

Since there are more than two topics, it is not trivial to position them on the ranking space. With two topics it was easy, since a simple slider bar sufficed for determining how much of each topic the user selects. Here, we must use the similarity of topics to draw a graph of topics. In this way, similar topics are visualized closer together than non-similar ones. This enables the user to select points between topics with a single click if he is interested in results featuring two similar topics (Section 2.4).

The remainder of this deliverable first describes the approach (Section 2), giving an overview of the architecture before describing topic generation and visualization techniques. Then we offer some discussion about the possible usage (Section 3.2) and showcase the system with screenshots (Section 3.3). In the end we discuss future work that will incorporate the current effort in the NeOn toolkit (Section 4).


2. Approach Description

SearchPoint is in essence a search engine add-on. This means it needs a search engine it can consume in order to provide additional functionality. The basic search functionality (input query, output ranked result-set) is unhindered. In the beginning a user has to provide a query, and a ranked result-set is returned. Besides this, topics of interest are calculated. How this is done will be explained later; for now, all we have to know is that these topics correspond to the search query and the top portion of the result-set.

Topics are placed on a plane as part of a graphical user interface (GUI) in such a manner that similar topics lie close together. A focus point is placed at the origin. This focus can be moved either by dragging it to a desired location or by clicking a point on the plane, in which case the focus moves there. Each position of this focus corresponds to one ranking.

For example, a user who clicks near one topic will get hits ordered mostly by the sense of that topic. If the focus is moved to a position between two topics, results that share similarity with both these topics will tend to be ranked higher. In truth, at any moment all the topics influence the ranking; however, the influence of a topic decreases with its distance from the focus point. For an illustration of usability see Figure 1 and Figure 2.
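The relationship between the focus position and the ranking can be illustrated with a small sketch. It is not the formula used by SearchPoint (the report treats the calculation of the ranking space in Section 2.5); it only assumes, for illustration, that each topic's influence is inversely proportional to its distance from the focus point and that every hit already carries a similarity score per topic. The function name `rerank`, the input shapes and the `eps` parameter are hypothetical.

```python
import math


def rerank(hits, topic_positions, focus, eps=1e-6):
    """Order hits for one focus-point position.

    hits: list of (hit_id, {topic: similarity}) pairs (hypothetical input shape).
    topic_positions: {topic: (x, y)} coordinates of the topics on the plane.
    focus: (x, y) position of the focus point.
    """
    # Assumed weighting: inverse distance, so nearer topics count for more.
    weights = {
        topic: 1.0 / (math.hypot(focus[0] - x, focus[1] - y) + eps)
        for topic, (x, y) in topic_positions.items()
    }
    total = sum(weights.values())
    # Every topic keeps a non-zero share of the total influence.
    weights = {topic: w / total for topic, w in weights.items()}

    def score(similarities):
        return sum(w * similarities.get(topic, 0.0) for topic, w in weights.items())

    ranked = sorted(hits, key=lambda hit: score(hit[1]), reverse=True)
    return [hit_id for hit_id, _ in ranked]


topic_positions = {"philosophy": (0.0, 1.0), "owl": (1.0, 0.0)}
hits = [
    ("hit-a", {"philosophy": 0.9, "owl": 0.1}),
    ("hit-b", {"philosophy": 0.1, "owl": 0.9}),
]
near_philosophy = rerank(hits, topic_positions, focus=(0.0, 0.9))
near_owl = rerank(hits, topic_positions, focus=(0.9, 0.0))
```

Moving the focus from the neighbourhood of one topic to the other flips the order of the two hits, while both topics contribute to the score at every position.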

This approach can be used with any search engine that provides textual results. However, the problem of general web search has been mostly solved, as the underlying graph of linked web sites offers a lot of information about the importance of a single node, i.e. a web page. On the other hand, our approach is very useful in search scenarios without an underlying graph structure. For example, corporate search engines must work on a relatively small site with a mostly tree-like structure, yet important content could be anywhere on this graph. Another example is that of specialized search scenarios, such as image search or ontology search.

In this deliverable, we have concentrated on ontology search scenarios. There are several ontology search engines; we tested our approach on swoogle.umbc.edu and on google.com search with the result type restricted to .owl or .rdf. This work will be integrated with Watson search [d'Aquin 2008] and the NeOn toolkit platform as part of the effort in WP4.

Figure 1: Illustration of basic SearchPoint in action. For a query “ontology” ambiguous hits get returned. To see only the hits in the context of philosophy the red focus point is moved near the automatically generated topic “philosophy”. Hits returned are initially low ranked (99, 11, 68...) but all talk about philosophy, logic, Aristotle...

Figure 2: The same result page, only the focus point is moved between topics “metadata”, “owl”, “online”. The user immediately gets returned a list of hits about, e.g., semantic mark-up, information and computer science.

2.1 System architecture

SearchPoint is a tool for searching. However, this is not the usual search we are accustomed to, for example, on the web. In fact, the usual search engine is merely one module in a longer chain of processes, which we will call modules.

The implementation of SearchPoint is modular, as this makes it very easy to change a single module, for example to change the search engine, adopt a new classifier, or provide a new distribution channel. The architecture of SearchPoint consists of the following modules chained into a pipeline (Figure 3):

1. Agent (usually a user) provides a query (usually in a GUI)

2. Search engine processes the query and returns a result-set of short textual documents – snippets

3. Automatic topic generation module (implemented as a web service)

4. Graph drawing and sub-graph extraction for visualization of the topics (implemented as a web service)

5. Graphical User Interface (GUI) for the visualization of the topics, focus point selection and rearranging the results dynamically.

Some comments on the modules:

In the current application, modules 3 and 4 are enveloped inside a single web service. This is in order to minimize the number of calls made; however, if needed, they could easily be separated. Next, modules 1 and 5 are components of the same GUI, provided in the form of a web application. It seems natural to provide the user with a single place for posing queries and for receiving and manipulating results. It is, however, very easy to change the distribution channel (e.g. to the NeOn Toolkit) or to separate them (for use in an even bigger solution).
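The modular pipeline described in this section can be sketched as a chain of interchangeable callables. This is only an illustration of the design idea, not the SearchPoint code: the function `run_pipeline`, its parameter names and the three stub modules are hypothetical, and the real modules 3 and 4 are web services rather than local functions.

```python
from typing import Callable, Dict, List, Tuple

Position = Tuple[float, float]


def run_pipeline(
    query: str,
    search: Callable[[str], List[str]],
    generate_topics: Callable[[str, List[str]], List[str]],
    layout_topics: Callable[[List[str]], Dict[str, Position]],
) -> Tuple[List[str], Dict[str, Position]]:
    """Chain modules 2 to 4; the agent (1) supplies the query, the GUI (5) consumes the result."""
    snippets = search(query)                    # module 2: search engine
    topics = generate_topics(query, snippets)   # module 3: topic generation
    positions = layout_topics(topics)           # module 4: graph drawing
    return snippets, positions


# Stub modules standing in for a real search engine and the web services.
def fake_search(query: str) -> List[str]:
    return [f"{query} in philosophy", f"{query} in owl metadata"]


def fake_topics(query: str, snippets: List[str]) -> List[str]:
    # Toy rule: use the last word of every snippet as a topic.
    return sorted({snippet.split()[-1] for snippet in snippets})


def fake_layout(topics: List[str]) -> Dict[str, Position]:
    # Toy layout: place topics on a horizontal line.
    return {topic: (float(index), 0.0) for index, topic in enumerate(topics)}


snippets, positions = run_pipeline("ontology", fake_search, fake_topics, fake_layout)
```

Because each module is passed in as a parameter, swapping the search engine or the topic generator means passing a different callable, which mirrors the reason given above for the modular design.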


Figure 3: Architecture of SearchPoint

2.2 Topic generation – Clustering

The most obvious way to automatically generate topics out of a corpus of documents is to cluster the available results into a predefined number of clusters and use a centroid or medoid of each cluster as an individual topic. The technical details of the implementation follow below.
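As a rough illustration of clustering results into a predefined number of clusters and using centroids as topics, the following sketch runs k-means with cosine similarity and several random restarts, the variant this section goes on to describe. It is a simplified stand-in and not the deployed web service: it works on small dense vectors, stops after a fixed number of iterations instead of testing for convergence, and picks the best restart by total similarity of documents to their own centroid rather than by the intra/inter similarity ratio. All function names are hypothetical.

```python
import math
import random


def normalize(vector):
    """Scale a vector to unit length so that a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def kmeans_cosine(vectors, k, iterations=20, seed=0):
    """One k-means run; returns (cluster index per document, centroids)."""
    rng = random.Random(seed)
    docs = [normalize(v) for v in vectors]
    centroids = rng.sample(docs, k)  # random initial centroids
    assignment = [0] * len(docs)
    for _ in range(iterations):
        # Assign every document to its most similar centroid.
        assignment = [max(range(k), key=lambda c: dot(d, centroids[c])) for d in docs]
        # Recompute each centroid as the normalized mean of its members.
        for c in range(k):
            members = [d for d, a in zip(docs, assignment) if a == c]
            if members:
                mean = [sum(column) / len(members) for column in zip(*members)]
                centroids[c] = normalize(mean)
    return assignment, centroids


def best_of_restarts(vectors, k, restarts=10):
    """Repeat k-means with different seeds and keep the most cohesive result."""
    docs = [normalize(v) for v in vectors]

    def cohesion(result):
        assignment, centroids = result
        return sum(dot(d, centroids[a]) for d, a in zip(docs, assignment))

    return max((kmeans_cosine(vectors, k, seed=s) for s in range(restarts)), key=cohesion)


vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
assignment, centroids = best_of_restarts(vectors, k=2)
```

Each resulting centroid is itself a vector over the vocabulary, which is what allows the highest-weighted words of a centroid to be used as the label of a topic.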

Document clustering (Steinbach et al., 2000) is based on a general data clustering algorithm adapted for textual data by representing each document as a word-vector, which for each word contains a weight proportional to the number of occurrences of the word (usually the TFIDF weight given in equation (2.1)).

$$d^{(i)} = TF(W_i, d) \cdot IDF(W_i), \quad \text{where} \quad IDF(W_i) = \log \frac{D}{DF(W_i)} \tag{2.1}$$

Where D is the number of documents; document frequency DF(W) is the number of documents the word W occurred in at least once; and TF(W,d) is the number of times word W occurred in document d. The exact formula used in different approaches may vary somewhat but the basic idea remains the same – namely, that the weighting is a measure of how frequently the given word occurs in the document at hand and of how common (or otherwise) the word is in an entire document collection.
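A minimal sketch of the weighting in equation (2.1) is given below. It assumes whitespace tokenization and lower-casing, which the report does not specify, and the function name `tfidf_vectors` is hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one sparse {word: weight} vector per document, following equation (2.1)."""
    tokenized = [document.lower().split() for document in documents]
    n_docs = len(tokenized)  # D in equation (2.1)

    # DF(W): number of documents in which the word occurs at least once.
    document_frequency = Counter()
    for tokens in tokenized:
        document_frequency.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        term_frequency = Counter(tokens)  # TF(W, d)
        vectors.append({
            word: count * math.log(n_docs / document_frequency[word])
            for word, count in term_frequency.items()
        })
    return vectors


vectors = tfidf_vectors(["ontology search", "ontology ontology owl"])
```

In this toy corpus the word "ontology" occurs in every document, so its IDF is log(2/2) = 0 and it receives zero weight, whereas "search" and "owl" each occur in one of the two documents and receive weight log 2.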

The similarity of two documents is commonly measured by the cosine-similarity between the word-vector representations of the documents (see equation (2.2)). The clustering algorithm groups documents based on their similarity, putting similar documents in the same group.

Several clustering algorithms can be used on TFIDF-represented documents. For our approach we have selected k-means because of its high speed, since the clustering must be done at run time when returning results and topics to the users.

The basic k-means clustering algorithm [Kanungo et al 2002] is one of the oldest and simplest clustering algorithms to be applied to text, and it still produces good results because of its fast execution and apparent suitability to textual data distributions. It involves randomly choosing k points to be the centroids of clusters, and grouping documents around centroids based on proximity with regard to a certain distance/similarity measure. We have chosen cosine similarity in our approach, because it is invariant to the length of each document. Then, centroids are iteratively recomputed for each cluster, and documents regrouped, until there is sufficiently little change in centroid positions. This algorithm depends heavily on the choice of k, which we specify as a parameter in our web service. We chose random initial positioning of centroids. The whole k-means algorithm is run ten times with different random positions of the initial centroids. Final cluster assignments are determined on the basis of the best intra-cluster to inter-cluster similarity ratio among the independent clustering results.

The calculated centroids are actually vectors in TFIDF space, so each centroid has a ranked list of words which are particularly prominent for the whole cluster. From this we derive the best word to represent the cluster, which is visualized in the GUI, and we also list several top words that the user can read with some interaction.

The web services for this method and other topic generation methods can be found at:

http://searchpoint.ijs.si/Classifier/WS_Classify.asmx

More information is given in the Appendix.

2.3 Topic generation – Classification

The clustering approach is very efficient in separating the documents into a selected number of topics. The main disadvantage, however, is the presentation of these topics to the user. The most important words found in the centroid, those that define and separate the topic the most, are not necessarily also the most informative to the user.

Therefore, we choose a different approach. Instead of automatically generating the topics through clustering, we classify each document into a preset list of topics that have been generated by a human and can therefore also be understood by a human.

Instead of classifying into an arbitrary list of topics, we can do much better with classification into a given ontology. Such a classification provides a deeper understanding of the available topics and can, besides being used as a query refinement tool, also be used to investigate the actual subtopics of a given interest of the user.

Any knowledge that is represented in a textual corpus, and is therefore searchable, can in this way be contextualized with the use of different ontologies. On the same data set, searched with the same query, different ontologies return different topics that the user can navigate through, providing different rankings of the hits. Each ranking can be considered as a different view into the whole corpus.
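The step of turning classification results into topics, i.e. selecting the concepts into which most results were classified, can be sketched as a simple count. The input shape (one list of concept names per hit) and the function name `select_topics` are assumptions made for illustration; the actual output format of the OntoLight classifier is not shown here.

```python
from collections import Counter


def select_topics(hit_concepts, max_topics=6):
    """Pick the concepts that the largest number of hits were classified into.

    hit_concepts: one list of concept names per hit (hypothetical classifier output).
    """
    counts = Counter()
    for concepts in hit_concepts:
        counts.update(set(concepts))  # count each concept at most once per hit
    return [concept for concept, _ in counts.most_common(max_topics)]


hit_concepts = [
    ["Philosophy", "Internet"],
    ["Philosophy"],
    ["Philosophy", "Internet"],
    ["Languages"],
]
topics = select_topics(hit_concepts, max_topics=2)
```

Limiting the number of selected concepts keeps the ranking space readable, at the cost of hiding concepts that only a few hits belong to.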


2.3.1 Text Classification Text classification can be applied when a set of predefined categories (classes), such as “arts, education, science”, are provided as well as a set of documents labelled with those categories. The task is to classify new (previously unseen) documents by assigning each document one or more content categories. This is usually performed by representing documents as word-vectors (usually referred to as the ‘bag-of-words’ representation) and using documents that have already been assigned the categories, to generate a model for assigning content categories to new documents. In the word-vector representation of a document, a vector of word frequencies is formed taking all the words occurring in all the documents (usually several thousands of words). The representation of a particular document contains many zeros, as most of the words from the collection do not occur in a particular document. The categories can be organized into an ontology, for example, the MeSH ontology for medical subject headings or the Yahoo! hierarchy of Web documents that can be seen as a topic ontology. Other applications of document categorization into hierarchies/taxonomies are of US patents, Web documents (McCallum et al., 1998; Mladenić, 1998; Mladenić and Grobelnik, 2003), and Reuters news articles (Kholer and Sahami, 1997).

Cosine similarity, which is commonly used in document clustering, can also be used for document classification as follows. Given a new document, the cosine similarity between it and all the labelled documents is used to find the k most similar documents (e.g., using the k-Nearest Neighbour algorithm (Mitchell, 1997)). The categories (topics) of these k documents are then used to assign categories to the new document. For documents d_i and d_j, the similarity is calculated as given in equation (2.2). Note that the cosine similarity between two identical documents is 1 and between two documents that share no words is 0.
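The classification scheme described above can be illustrated with a minimal sketch. It is not the implementation used in SearchPoint or OntoLight: the function names, the whitespace tokenizer and the similarity-weighted voting are assumptions made only for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Sparse word-frequency vector (word -> count); naive whitespace tokenizer."""
    return Counter(text.lower().split())


def cosine(d_i, d_j):
    """Cosine similarity of two sparse word vectors."""
    dot = sum(count * d_j.get(word, 0) for word, count in d_i.items())
    norm_i = math.sqrt(sum(c * c for c in d_i.values()))
    norm_j = math.sqrt(sum(c * c for c in d_j.values()))
    if norm_i == 0 or norm_j == 0:
        return 0.0
    return dot / (norm_i * norm_j)


def knn_categories(new_text, labelled, k=3):
    """Rank categories for new_text using its k most similar labelled documents.

    labelled is a list of (text, [categories]) pairs. Each neighbour votes for
    its categories with a weight equal to its similarity (an assumption; plain
    majority voting would work as well).
    """
    query = bag_of_words(new_text)
    scored = [(cosine(query, bag_of_words(text)), cats) for text, cats in labelled]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter()
    for similarity, cats in scored[:k]:
        for category in cats:
            votes[category] += similarity
    return [category for category, _ in votes.most_common()]
```

For example, with two labelled documents about fisheries and one about art, a query such as "fish in the ocean" is assigned the fisheries category first, because only the fisheries documents share words with it.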

$$\cos(d_i, d_j) = \frac{\sum_k d_{ik}\, d_{jk}}{\sqrt{\sum_l d_{il}^2 \; \sum_m d_{jm}^2}} \qquad (2.2)$$

2.3.2 OntoLight

OntoLight [Grobelnik et al., 2008] is a software suite which implements basic reasoning functionalities for contextualized ontologies. It is limited to light-weight ontologies which are grounded with appropriate text corpora. The representation and reasoning scales to the largest currently available ontologies, comprising up to one million concepts. In particular, OntoLight currently incorporates the following five ontologies: AgroVoc and ASFA (relevant for the Food and Agricultural Organization of the UN), EuroVoc (EU legislation), Cyc (common-sense knowledge) and DMoz (a WWW directory).

There are two basic reasoning mechanisms implemented in OntoLight. First, new textual instances without a known class can be classified into the selected ontology. Second, soft (probabilistic) mappings between a pair of selected ontologies can be computed, thus providing a contextual relationship between the ontologies.

OntoLight was used as a basic building block for extensions to OntoGen [Fortuna et al., 2006], where contextual mappings are used to improve semi-automatic construction of light-weight ontologies from text corpora. The same mechanism of contextual reasoning will be used to extend OntoGen to support simultaneous, collaborative development of an ontology. Soft mappings between grounded ontologies also complement methods for ontology alignment, where mappings are computed on the basis of common, background ontologies (as provided by Swoogle, for example) [Sabo et al. 2008]. The main functionality we cover is the contextualization of ontologies through generation of soft mappings between ontologies, thus enabling us to view concepts of one ontology through the perspective of another one. OntoLight also supports the scalability needed for large case studies – i.e. being able to deal with large ontologies such as AgroVoc and ASFA. To achieve this, the representation is constrained to a light-weight ontology model which covers targeted functionality needed in the case studies.

For the use of topic generation in SearchPoint we will be able to use any of the classifiers that are trained with OntoLight. OntoLight is a tool that easily transforms any grounded ontology into a classifier. We can use this classifier to find the concepts that are most relevant for the current query and use them to provide topics.

Since ontologies can be huge and we can actually visualize only so many distinct topics, we only choose those concepts that have the most documents classified into. Instead of just counting how many documents are classified into each concept we rather follow a different approach. When classifying a document into an ontology, OntoLight can actually provide document-concept similarities. In order not to miss any concept that actually covers many documents in the corpus but is not really the most important one for any, we rather use these similarities and for each concept we sum all the similarities of similar documents. Similar documents mean that we only take into account those similarities that are above a particular threshold. In this way we can get a ranked list of concepts which get to be used in the visualization and GUI.

2.4 Visualization and ranking space

Our main goal is to visualize automatically generated topics and position them on the screen in such a way that similar or related topics lay close together. Apart from this the user can select a focus point by dragging a red point that is also part of the GUI.

Once we achieve this, it is possible to calculate new rankings of the documents for each position of the focus point. Since each point on the panel maps to a specific ranking, we call it the ranking space.

2.4.1 Graph drawing

The input for the visualization is always a list of topics, which are presented as vectors. When using the clustering method we can select the desired numbers of topics-centroid, by selecting k in the k-means algorithm. In SearchPoint we leave this number as a parameter which is by default ten. When using OntoLight classifiers, we get a long list of ranked classes-concepts and we also select the desired top n (default ten).

While the visualization described below could be used for both clustering and classifier methods, we use it only for clustering. The changes for the classifier will be described in the next subsection.

To visualize these topics we need to position them. Our application of being able to select a point between topics also requires that similar topics lay close together. In both cases the topics are actually represented as vectors so we can calculate their cosine similarities.

In order to position or draw the topics, we model them as a full weighted graph. Each topic represents a node and similarity between two topics gives a weighted edge. In the classifier model we can also weight the nodes by the classifier score of the topics.

To draw a graph we use Fruchterman-Reingold (FR) algorithm [Fruchterman and Reingold 1991], which is very robust and for such a small graph of around ten nodes works in a fraction of a second. We restate some of the observations in [Pajntar 2006].

In this algorithm only two criteria are demanded for a good graph drawing:

• Connected nodes should be close.

• No two different vertices should be too close.

There is no penalization for edge crossings in this algorithm. Edge crossings are very important for a clear drawing of a graph in general; however, we mostly need the final positions of the nodes, and the graph structure is not very interesting since we have a full graph. There would also be a heavy computational penalty in calculating all the edge crossings (O(n^4)) in every iteration.

In every iteration of the algorithm we calculate all the attracting forces from connected vertices, and the repulsing forces from all the nodes. Since there is no scalar penalty for edge crossings, the result is a vector, which shows not just how far out of place the vertex is (vector size) but also in which direction it should move. The size of the actual move is limited by the current temperature, which is linearly decreased in every iteration.

This very simple algorithm provides excellent results. Its main advantage is its speed and robustness. Even today, years after its creation, it is still one of the most popular algorithms for graph drawing.
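The iteration described above can be sketched as follows. This is a simplified illustration rather than the SearchPoint code: scaling the attracting force by the edge weight, starting from positions on a circle, and the parameter names (`size`, `iterations`) are all assumptions made for the example.

```python
import math


def fruchterman_reingold(weights, iterations=100, size=1.0):
    """Lay out a full weighted graph inside a size x size frame.

    weights is an n x n symmetric matrix of topic similarities (edge weights).
    Returns a list of (x, y) positions, one per node.
    """
    n = len(weights)
    # Deterministic start: nodes evenly spaced on a circle around the centre.
    pos = [[size / 2 + size / 4 * math.cos(2 * math.pi * v / n),
            size / 2 + size / 4 * math.sin(2 * math.pi * v / n)] for v in range(n)]
    k = size / math.sqrt(n)            # ideal distance between two nodes
    start_temp = size / 10.0
    for it in range(iterations):
        temp = start_temp * (1.0 - it / iterations)   # linear cooling
        disp = [[0.0, 0.0] for _ in range(n)]
        for v in range(n):
            for u in range(n):
                if u == v:
                    continue
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                dist = math.hypot(dx, dy) or 1e-9
                repulse = k * k / dist                        # acts between all nodes
                attract = weights[v][u] * dist * dist / k     # weighted edge (assumption)
                force = (repulse - attract) / dist
                disp[v][0] += dx * force
                disp[v][1] += dy * force
        for v in range(n):
            length = math.hypot(disp[v][0], disp[v][1]) or 1e-9
            step = min(length, temp)   # the move is limited by the temperature
            for axis in (0, 1):
                moved = pos[v][axis] + disp[v][axis] / length * step
                pos[v][axis] = min(size, max(0.0, moved))     # stay inside the frame
    return [(x, y) for x, y in pos]
```

With three topics of which two are highly similar, the two similar topics end up close together while the third one is pushed away, which is the property the ranking space relies on.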

After we get positions we can visualize the topics on the screen. We also visualize the most important word or bigram (a common pair of words) on top of each topic. Since for the clustering method there is a possibility of a poor word on the top place, we also visualize some additional top ranking words in a tooltip on a mouse over (Figure 4).

Figure 4: The hand cursor is above "Philosophy" topic. The additional words are: exist, concerns, kinds, part, nature.

2.4.2 Sub Graph Extraction

In the classifier model we could easily adopt graph drawing to visualize the best topics-classes. However, we would like to retain some of the structure that is inherent in the ontology model. The problem is that we can only visualize a small number of topics for one query. We would like our solution to work on any ontology even on simple taxonomies. This is why we first extract a connected subgraph that contains all the relevant concepts and then modify it so it can be visualized.

From the given set of relevant concepts we must first extract a subgraph. Because we want everything to work even on taxonomies we must use a very basic relation from which we will extract this subgraph. Since the only relation in treelike taxonomy is the subsumption relation, in a case where the graph provided by the SubConceptOf relation is not connected, we connect every top concept in every connected part to a virtual node we name “Top”. The end result of this process is a connected tree, reaching all the relevant concepts that provide topics.

To visualize this we first need some sort of abstraction. After qualitative analysis we have decided to visualize three levels of this tree. We want to visualize as much as possible, however, more than three levels are too much information for the user and also negatively affect the reranking done with focus point selection.

For the first level we choose the root of the tree. This basically shows the user which part of the ontology models the current query and its results. For the third level we chose to visualize the computed relevant topics. In the ontology these concepts are not generally on the same level. However, we found it important to unify the visualization of the results of the classifying phase. For the second level we chose to visualize the actual second level below the root concept of the ontology. This informs the user of which part of the ontology the third level topics belong to, which is especially useful when the root topic is also the root of the ontology, or the virtual “Top” node.

The actual visualization is straightforward. We position the root at the origin and draw each consecutive level on concentric circles. Each third level node has equal space, while second level nodes get space proportional to the third level nodes they parent. The links between levels are also visualized and represent the ancestry relation between nodes.

Some visual aides have also been implemented in GUI. The actual score of third level topics is visualized as the size of nodes and for the second level nodes we sum the scores of all the descendant concepts in the ontology. Actual score from the classifier could be used for the visualization; however, we found it provides more information to visualize how much the whole subpart of the ontology models the current results in contrast to how much a single concept and its instances do. On mouse over a single topic the ancestor list of the concept is written in the tooltip. All of this can be seen on Figure 5, 6.
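The sub-graph extraction and three-level abstraction of Section 2.4.2 can be sketched as below. The data layout (a `parent` dictionary for the SubConceptOf relation and a score per relevant concept) and all function names are assumptions chosen for illustration; the actual prototype may organize this differently.

```python
VIRTUAL_TOP = "Top"


def path_to_top(concept, parent):
    """Follow the SubConceptOf relation upwards; return the path from the top concept down."""
    path = [concept]
    while parent.get(path[-1]) is not None:
        path.append(parent[path[-1]])
    path.reverse()
    return path


def three_level_tree(relevant_scores, parent):
    """Build the root / second level / third level abstraction.

    relevant_scores maps each relevant concept to its classifier score.
    parent maps a concept to its parent concept (None for top concepts).
    Returns (root, second_level), where second_level maps each second-level
    concept to its summed score and to the third-level topics below it.
    """
    paths = {c: path_to_top(c, parent) for c in relevant_scores}
    tops = {p[0] for p in paths.values()}
    if len(tops) > 1:
        # The extracted graph is not connected: join the parts under a virtual node.
        paths = {c: [VIRTUAL_TOP] + p for c, p in paths.items()}
        root = VIRTUAL_TOP
    else:
        root = next(iter(tops))
    second_level = {}
    for concept, path in paths.items():
        branch = path[1] if len(path) > 1 else path[0]
        node = second_level.setdefault(branch, {"score": 0.0, "topics": {}})
        node["score"] += relevant_scores[concept]      # sum of descendant scores
        node["topics"][concept] = {
            "score": relevant_scores[concept],
            "tooltip": "/".join(path),                 # ancestor list shown on mouse over
        }
    return root, second_level
```

Intermediate concepts between the second level and a relevant topic are kept only in the tooltip, in the same way that the concept Software is left out of the drawing in Figure 5.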


Figure 5: Classifier method for the query “ontology”. The ontology (taxonomy) used is DMOZ. The relevant concepts classified are: Philosophy, Knowledge Management, Social Sciences, Languages, Internet and Artificial Intelligence. The four main categories are Society, Reference, Science and Computers. Mouse over the concept “Internet” shows Top/Computers/Software/Internet. The concept Software was left out of the visualization in order to make it clearer.

Figure 6: Visualization for the query “semantic web”. Immediately it is seen that this topic is mostly modelled with Computers and its descendants. Additionally there is some business and Knowledge management linked to it.

2.5 Calculation of the ranking space

Once the topics are positioned and visualized and all the documents are equipped with similarities to each topic, we must calculate the ranking space. Each selected focus point gets calculated in runtime when the user actually moves it. When dragging, this happens tens of times per second so this must be done efficiently.

When the SearchPoint starts, we also want to maintain the original ordering that is provided by the search engine, as we do not want to hinder the basic functionality. Because of this, we add a virtual invisible node and position it in the origin, the same as the starting position of the focus point. We also add document-to-virtual-node similarities to documents in such a way that they observe the original ranking.

When the focus point moves, we must calculate a score for each document. We first define the position of the focus f(x, y), position of topics t_i(x, y), documents d_i, and Documents-Topics similarity matrix S(i, j). The score for each document is proportional to the similarity of each topic and inversely proportional to the distance to that topic.

The set T of all topics t_j also contains the invisible node that provides the original ranking. We take Euclidian distance for the dist(f, t_j) and we add a small ε in order to prevent division by zero. The hits get ordered and visualized by this scoring function.

$$\mathrm{score}(d_i) = \sum_{t_j \in T} \frac{S(i, j)}{\mathrm{dist}(f, t_j) + \varepsilon}$$
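The scoring function of Section 2.5 can be sketched in a few lines. The distances from the focus to the topics are computed once per focus move and reused for every document, which keeps the re-ranking cheap enough to run while dragging. Function and parameter names are assumptions, and the virtual-node similarities in the usage note below (one over the original rank) are only one possible choice that observes the original ranking.

```python
import math


def rank_documents(focus, topic_positions, similarities, eps=1e-6):
    """Re-rank documents for a given focus point.

    focus            -- (x, y) position of the focus point
    topic_positions  -- list of (x, y), including the virtual node at the origin
    similarities     -- similarities[i][j] is S(i, j) for document i and topic j
    Returns (order, scores): document indices sorted by descending score,
    and the score of each document.
    """
    # One distance per topic, shared by all documents.
    dists = [math.hypot(focus[0] - x, focus[1] - y) + eps
             for x, y in topic_positions]
    scores = [sum(s / d for s, d in zip(row, dists)) for row in similarities]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores
```

As a usage illustration, take a virtual node at (0, 0) and two topics at (1, 0) and (-1, 0), with three documents whose virtual-node similarities are 1, 1/2 and 1/3. With the focus at the origin the documents keep the search engine order; moving the focus onto the first topic puts the documents most similar to that topic on top.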


3. Example usage of the system

In this section we briefly describe the basic functionalities of the prototype (Section 3.1), then look more closely at the scenarios in which SearchPoint benefits the users (Section 3.2), and finally showcase one use case accompanied by screenshots (Section 3.3).

3.1 Basic description of the functionalities

The GUI consists of an input field where the user can provide the query. The user can submit a query in several ways: each method for topic extraction is connected with its own submit button. Apart from this, there is a result list and the ranking space, where the user is able to choose the focus point.

The search engine is selected by the URL of the web application. For the testing in this deliverable the following search engines have been used:

• Yahoo: http://searchpoint.ijs.si/

• Swoogle: http://searchpoint.ijs.si/swoogle/

• Google ontology search: http://searchpoint.ijs.si/googleowl/

After the user inputs the query and selects the method for topic extraction the result list is returned by the current search engine. At the beginning the results are ranked the same way as in the search engine. Each result item is visualized together with the original ranking. At the start this is therefore (1, 2, 3, 4…). The user can now inspect the topics for the current search and select any point on the ranking space or drag the focus point around the ranking space. The hits get reordered in real time, so the user is able to pinpoint a good selection for the focus easily, by just observing the quality of the displayed results.

There is also a history of the positions of the focus point. In case the user has found a good position and then moved the focus to see other rankings it is easily possible to navigate to a previous position by clicking on one of the three buttons of the history bar.

The history bar has three buttons:

• Back: for returning the focus to one step back

• Forward: to move it to the next position

• Origin: to clear the history and return to default ranking provided by the search engine.

The buttons are only visible when they are available. For example, the Forward button is not visible until Back has been clicked. No button is visible at the start, or after clicking the Origin button, since there is no history yet at that time.
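The behaviour of the history bar can be modelled with a small data structure. This is a hypothetical sketch, not the code of the web application: the class name, the method names, and the rule that the Origin button is shown whenever any history exists are assumptions.

```python
class FocusHistory:
    """Keeps the visited focus positions and the current place among them."""

    ORIGIN = (0.0, 0.0)   # default ranking provided by the search engine

    def __init__(self):
        self._positions = [self.ORIGIN]
        self._index = 0

    def move_to(self, position):
        """Record a new focus position; any 'forward' positions are discarded."""
        del self._positions[self._index + 1:]
        self._positions.append(position)
        self._index += 1
        return position

    def back(self):
        """Return the focus to the previous position, if there is one."""
        if self._index > 0:
            self._index -= 1
        return self._positions[self._index]

    def forward(self):
        """Move the focus to the next position, if there is one."""
        if self._index < len(self._positions) - 1:
            self._index += 1
        return self._positions[self._index]

    def origin(self):
        """Clear the history and return to the default ranking."""
        self._positions = [self.ORIGIN]
        self._index = 0
        return self.ORIGIN

    def visible_buttons(self):
        """Buttons are shown only when the corresponding action is available."""
        buttons = []
        if self._index > 0:
            buttons.append("Back")
        if self._index < len(self._positions) - 1:
            buttons.append("Forward")
        if len(self._positions) > 1:
            buttons.append("Origin")
        return buttons
```

After the first move of the focus point only Back and Origin are offered; Forward appears once Back has been used, and clicking Origin hides all three buttons again.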

3.2 Discussion on Usability of SearchPoint

SearchPoint provides several benefits for the user. Here we will discuss and give examples for many of them. However, first we would like to point out that the basic functionality of the underlying search engine is not hindered in any way. For example, if the search engine provides adequate top ranked results for the query, the user can immediately follow that information without interacting with SearchPoint. SearchPoint provides additional functionalities at no extra cost to the user.

3.2.1 Query disambiguation

The most obvious use of SearchPoint is query disambiguation. Users are accustomed to writing very short queries, usually consisting only of a single word. In the case this word is ambiguous as, for example: owl, ontology, jaguar, a4 or kiwi, the results of both or all the meanings get returned.

This can be a problem when results of one meaning dominate the search space, since in such a case it is very hard to find the few results of the other meaning(s). Big web search engines deal with this problem by ranking the results of the less presented meanings higher in order to present a variety of meanings in the first ten results. Another possibility for the user is to refine a query, providing some additional words that further separate the meanings. This approach is better than the former in the sense that after the refinement, the user actually has ten results he can choose from. However, as simple as it is to refine a query it is still tasking the user.

SearchPoint actually serves as semi automatic query refinement. The user is presented with topics, which usually contain the one meaning the user wants and the one click re-ranking of the hits to the selected topic, can be seen by the user as a redefinition of the query. Another advantage is that the user does not have to come up with the correct best word for the separation of the meanings, which can cause problems to a user new to the topic or to a non-native speaker. The user merely has to recognize the topic or even only recognize the results, by randomly moving the focus point.

3.2.2 Sub-Topic exploration

Even when the actual meaning of the query is not ambiguous there can be several subtopics related to the results. For an uninformed user the very summarization of the main subtopics provided by the topics in the ranking space can be beneficial. What is more, the user can select a subtopic and immediately get documents about it. When two subtopics are related, they are visualized close and the user can select the focus point in between to get documents touching both of the subtopics.

For an example of such a sub-topic exploration, we give query “password” with the clustering method returning the following sub-topics:

• “Password Manager”... for the documents about software for managing passwords

• “Administrator”... for tools to moderate the passwords

• “Encryption”... for the various algorithms connected with passwords

• “Recovery”... for tools when losing a password

• “Forgotten Password”... also dealing with lost passwords, but more in the context of procedure of what to do than tools.

• “Change Password”... for best practices when dealing with passwords

Even for the user who is familiar with the topic it is sometimes hard to find the best query to get just some subtopics in the results. SearchPoint is extremely useful when dealing with subtopic profiling as it is designed in order to enable the user to select several subtopics with a single click.

3.2.3 Ranking in a general corpus

The biggest advantage of SearchPoint is creation of the ranking space. In a general corpus, which is usually a bag of unrelated documents, the usual search engines fail to produce any ranking. When the corpus is extensive, returning a large number of results for a single query, this becomes a real problem. The visualized ranking is usually nothing more than a randomized order of all the relevant documents.

Most of the special searches have this problem. This is the reason we have chosen ontology search engines to showcase our solution. Another example would be a repository of images. We have an example at: http://searchpoint.ijs.si/photo12. This is a repository of textually annotated images provided by a company Photos12 that sells them. SearchPoint provides a useful service by providing ranking, sub-topic profiling and also by quickly finding images with two motifs which are presented as two topics in the ranking space.

3.3 Showcase

In the showcase we will demonstrate the contextualization power of SearchPoint. The main goal is to contextualize search over a given background knowledge with the use of automatic topic generation techniques and re-ranking possibilities of the ranking space.

The scenario will be that of a user tasked with finding the most suitable ontology to model the domain of fisheries. The user can first get a general overview of the domain by profiling the sub topics of the general web search engine results.

On searchpoint.ijs.si the Yahoo search engine is used. By querying for “fisheries” and selecting the classification method into Dmoz, the user can get a basic model of which concepts are connected with fisheries in the everyday context of the general public (Figure 7).

Next, the user can assess how well represented topics are in truth on the web. This can be achieved using a clustering method. As can be seen on Figure 8, several topics (Alaska, Lake) seem to be over represented. This can probably be explained by the great importance the fishing industry has for special regions. Science, the most represented topic from before, is much less prominent. This is mostly on account of the industry (products, processes, fisheries management) that was missing before.

For the same query, “fisheries”, there is an obvious context switch from the more scientifically oriented Dmoz to the more economical World Wide Web.

The last context in which the user can interpret the fisheries results is that of the legal vocabulary of the European Union (EuroVoc). As can be seen in Figure 9, there is a broad range of topics. There is a cluster of more environmental topics (Aquaculture, Fishing regulations and Fisheries policy), although more in a governing context than a scientific one. Industry is also well represented (Fishing industry, Fishery Product, Fishery Produce).

The user now has a firm understanding of the domain and of several ways of modelling it. On searchpoint.ijs.si/swoogle he can access the Swoogle ontology search engine. The third method is chosen for contextualization. When changing the search engine, topics also change (Figure 10). This is because the ontology search space is different from general web search. Topics are smaller, probably due to the somewhat lacking descriptions of the result ontologies. Also, the industrial concepts seem to dominate over environmental concepts in the ontology space.

The user can nevertheless search for the ontology more easily and with respect to the chosen context.

Figure 7: Yahoo web search in the context of Dmoz.

General topic of interest as provided in the Dmoz open directory. The topics most related to fishing mostly come from science (Agriculture, Environment, Biodiversity, Earth Sciences). There is also some notion of regional fisheries. Fisheries also seem connected to recreational fishing and there is an environmental concern in the society.

The focus point is moved to the centre of the science group and in the first eight results mostly institutions and science departments are ranked on top.


Figure 8: Yahoo web search with clustering.

Several prominent topics are extracted. There is a clear presence of economical topic (fisheries management, processing, products) and some more specific (Alaska, Lake).

The focus is moved towards Alaska topic, and there truly are many websites talking about Alaskan fisheries.

Figure 9: Yahoo web search in the context of EuroVoc.

Both industrial (Fishing Industry, Fishery Product, Fishery Produce) and environmental (Aquaculture, Fishing regulations, Fisheries policy) aspects are well presented. However, this time more in the context of a government.

The focus is moved to the environmental part of the panel near the Aquaculture topic. Websites ranging from environmental (124 Philippine environment laws, 17), environmentally aware business (145, 22) and fishing societies (49, 28) share the Aquaculture context.


Figure 10: Swoogle Ontology Search in the context of EuroVoc.

The concepts in the ontology search space show that most ontologies deal with business-oriented modelling. Also, the concepts are much smaller than before, because of the lacking descriptions provided by the Swoogle search engine.

4. Conclusion and future work

In this work we have extended a search engine add-on, SearchPoint, in order to enable searching within a context. Any general knowledge represented in a textual form can now be easily reranked by the user, selecting a focus point in the ranking space. Ranking space is created by visualizing topics stemming from clustering or classification into an ontology. Since the different choice of topic extraction provides different results for the user, this provides contextualization of general knowledge with the selection of the ranking space ontology.

Any textual background knowledge can be indexed and documents containing query words can easily be retrieved. However, when the corpus being searched is extensive, the problem of good ranking occurs. This problem has been solved for web-search mainly due to the underlying graph structure of the linking pages and due also to the redundancy of information on the web. This approach does not work on an arbitrary corpus of documents or other more specific search tasks where there is no underlying graph structure, and indeed different rankings are needed for different users and even for the same user when searching in a different context.

SearchPoint enables the user to get a continuum of rankings for one query which helps to disambiguate the query, provide a one-click query reformulation, or can even be used for subtopic profiling of the current result space. All this is provided by the standard SearchPoint. On top of this, it is possible for the user to exchange the ranking space ontology and, through this, change the context in which the ranking space is calculated. Only topics that are relevant to both the search results and the current context are visualized and create a special contextualized ranking space in which the user can rerank the results.

This is a finished working prototype available as a web application. This work will continue in WP4 where it will be considered how to best integrate it inside the NeOn Toolkit platform, to be most easily and intuitively used by the users. Thought will also be given as to how it will be integrated with Watson Ontology search engine, probably with key concepts (developed in WP4) and ontology similarities (developed in WP3) to help with the automatic topic extraction and actual ranking process.


Appendix A

Here we give some technical information about the SOAP/HTTP web service that performs the automatic topic extraction and positions the topics using graph drawing. The WSDL specification of this web service is available at http://searchpoint.ijs.si/Classifier/WS_Classify.asmx?WSDL and is also given in A.1.
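As a minimal sketch of how a client might inspect such a WSDL, the following Python snippet lists the operations of one port type. The embedded excerpt is a shortened copy of the WS_ClassifySoap port type from A.1 (input and output messages omitted); the helper name list_operations is hypothetical and not part of the service.

```python
import xml.etree.ElementTree as ET

# Namespace of WSDL 1.1 documents, as declared in the listing in A.1.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Shortened excerpt of the SOAP port type from A.1 (messages omitted).
WSDL_EXCERPT = """\
<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
                  targetNamespace="http://searchpoint.ijs.si/Classifier">
  <wsdl:portType name="WS_ClassifySoap">
    <wsdl:operation name="ClassifyKMeans" />
    <wsdl:operation name="ClassifyKMeansIn" />
    <wsdl:operation name="ClassifyDMoz" />
    <wsdl:operation name="ClassifyDMozContext" />
    <wsdl:operation name="ClassifyGY" />
    <wsdl:operation name="ClassifyEuroVoc" />
  </wsdl:portType>
</wsdl:definitions>
"""


def list_operations(wsdl_text, port_type):
    """Return the operation names declared by the given port type."""
    root = ET.fromstring(wsdl_text)
    for pt in root.findall("wsdl:portType", WSDL_NS):
        if pt.get("name") == port_type:
            return [op.get("name") for op in pt.findall("wsdl:operation", WSDL_NS)]
    return []


OPERATIONS = list_operations(WSDL_EXCERPT, "WS_ClassifySoap")
print(OPERATIONS)
```

The same function could be applied to the full WSDL text once it has been downloaded; the excerpt only keeps the example self-contained.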

The main methods are:

• ClassifyKMeans, which uses clustering for topic extraction and graph drawing to position the topics.

• ClassifyDMoz, which classifies into the DMOZ Open Directory to generate the topics and positions them by extracting the relevant sub-tree.

• ClassifyEuroVoc, which classifies into the EuroVoc terminology, extended with Acquis Communautaire legislation documents that provide grounding.
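The methods above can also be reached through the HTTP GET binding declared in the WSDL, which uses URL-encoded parameters and operation locations such as /ClassifyKMeans. The sketch below only builds such a request URL and does not contact the service. The base address is an assumption derived from the WSDL URL given above (the WSDL itself lists a localhost development address), and the parameter values are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the service endpoint is the WSDL URL without the "?WSDL" suffix.
BASE_URL = "http://searchpoint.ijs.si/Classifier/WS_Classify.asmx"


def build_get_url(operation, ds, query, num_hits, num_min_hits, num_categories):
    """Build an HTTP GET URL for one of the five-parameter operations.

    Parameter names (DS, Query, NumHits, NumMinHits, NumCategories) follow
    the HttpGet messages in the WSDL; ClassifyGY takes no NumMinHits and is
    therefore not covered by this helper.
    """
    params = {
        "DS": ds,
        "Query": query,
        "NumHits": num_hits,
        "NumMinHits": num_min_hits,
        "NumCategories": num_categories,
    }
    return f"{BASE_URL}/{operation}?{urlencode(params)}"


# Hypothetical example values.
url = build_get_url("ClassifyKMeans", "Watson", "wine ontology", 100, 10, 8)
print(url)
```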

The data is returned as a BowPartStruct structure (declared as BowPartStruc in the WSDL), which contains three substructures:

• The Node structure stores information about the topics: the relevant keywords for each cluster, or the place in the ontology for each concept. The corresponding element of BowPartStruc in the WSDL is Clusters.

• The Links structure stores the similarities between the topics.

• The Documents structure stores, for each result, a vector of similarities to each topic.
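To illustrate the three substructures, here is a minimal Python sketch that parses a response shaped like the BowPartStruc type in the WSDL (Clusters, Links, Documents). The sample XML and all its values are invented for illustration, and the class and function names are hypothetical. The WSDL declares dcSim only as a string, so its internal format is not assumed here and it is kept as raw text.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

NS = {"c": "http://searchpoint.ijs.si/Classifier"}

# Hypothetical sample response; every value below is made up.
SAMPLE = """\
<BowPartStruc xmlns="http://searchpoint.ijs.si/Classifier">
  <Clusters>
    <Cluster><Clusters_Id>0</Clusters_Id><Title>wine, grape</Title>
      <Color>red</Color><Quality>0.8</Quality><X>0.25</X><Y>0.75</Y></Cluster>
    <Cluster><Clusters_Id>1</Clusters_Id><Title>food, recipe</Title>
      <Color>blue</Color><Quality>0.6</Quality><X>0.5</X><Y>0.5</Y></Cluster>
  </Clusters>
  <Links>
    <Link><id>0</id><id1>0</id1><id2>1</id2><Quality>0.4</Quality></Link>
  </Links>
  <Documents>
    <Document><id>0</id><relevance>1.0</relevance><dcSim>0.9 0.1</dcSim></Document>
  </Documents>
</BowPartStruc>
"""


@dataclass
class Cluster:
    cluster_id: int
    title: str
    quality: float
    x: float
    y: float


@dataclass
class Link:
    link_id: int
    id1: int
    id2: int
    quality: float


@dataclass
class Document:
    doc_id: float
    relevance: float
    dc_sim: str  # raw string; its format is not specified in the WSDL


def _text(node, tag):
    """Text of a namespaced child element, or an empty string if absent."""
    return node.findtext(f"c:{tag}", default="", namespaces=NS)


def parse_bow_part(xml_text):
    """Split a BowPartStruc document into topics, links and documents."""
    root = ET.fromstring(xml_text)
    clusters = [
        Cluster(int(_text(c, "Clusters_Id")), _text(c, "Title"),
                float(_text(c, "Quality")), float(_text(c, "X")), float(_text(c, "Y")))
        for c in root.iterfind(".//c:Cluster", NS)
    ]
    links = [
        Link(int(_text(l, "id")), int(_text(l, "id1")),
             int(_text(l, "id2")), float(_text(l, "Quality")))
        for l in root.iterfind(".//c:Link", NS)
    ]
    documents = [
        Document(float(_text(d, "id")), float(_text(d, "relevance")), _text(d, "dcSim"))
        for d in root.iterfind(".//c:Document", NS)
    ]
    return clusters, links, documents


clusters, links, documents = parse_bow_part(SAMPLE)
print(len(clusters), len(links), len(documents))
```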

Several search engines can be called to obtain the actual results:

• Major web search engines: Google, Yahoo, Bing

• Newspapers: NYTimes, About.com, Mladina (Slovenian)

• Ontology search: Watson, Swoogle, GoogleOWL

• Specific search engines: EBay, Enron, CCA, Photo12

• Results of any other search engine can also be supplied:

o As serialized XML: String

o As a URL pointing to such an XML document: File
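On the wire, the data source is selected through the DS parameter, whose allowed values are given by the DataSource enumeration in the WSDL in A.1. The Python sketch below mirrors that enumeration so that a client could validate a value before sending a request. Note that the WSDL values differ from the names used above in places (Live and GoogleOnto, presumably corresponding to Bing and GoogleOWL); the class and helper names are hypothetical.

```python
from enum import Enum


class DataSource(Enum):
    """Values of the DataSource simple type as listed in the WSDL."""
    GOOGLE = "Google"
    YAHOO = "Yahoo"
    LIVE = "Live"
    ABOUT = "About"
    ABOUT_VIA_GOOGLE = "AboutViaGoogle"
    ABOUT_VIA_YAHOO = "AboutViaYahoo"
    EBAY = "EBay"
    NYTIMES = "NYTimes"
    MLADINA = "Mladina"
    WATSON = "Watson"
    ENRON = "Enron"
    CCA = "CCA"
    PHOTO12 = "Photo12"
    STRING = "String"   # results passed as serialized XML
    FILE = "File"       # results passed as a URL to such an XML document
    GOOGLE_ONTO = "GoogleOnto"
    SWOOGLE = "Swoogle"


def parse_data_source(name):
    """Return the matching DataSource or raise ValueError with the valid names."""
    try:
        return DataSource(name)
    except ValueError:
        valid = ", ".join(ds.value for ds in DataSource)
        raise ValueError(f"unknown data source {name!r}; expected one of: {valid}") from None


print(parse_data_source("Watson"))
```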

A.1 The current WSDL file of the web service

<?xml version="1.0" encoding="utf-8"?>

<wsdl:definitions  xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/" xmlns:tm="http://microsoft.com/wsdl/mime/textMatching/" xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" xmlns:mime="http://schemas.xmlsoap.org/wsdl/mime/"  xmlns:tns="http://searchpoint.ijs.si/Classifier" xmlns:s="http://www.w3.org/2001/XMLSchema" xmlns:soap12="http://schemas.xmlsoap.org/wsdl/soap12/" xmlns:http="http://schemas.xmlsoap.org/wsdl/http/" targetNamespace="http://searchpoint.ijs.si/Classifier" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"> 

  <wsdl:types> 

    <s:schema elementFormDefault="qualified" targetNamespace="http://searchpoint.ijs.si/Classifier"> 

      <s:import namespace="http://searchpoint.ijs.si/Classifier/BowPart.xsd" /> 

      <s:import schemaLocation="http://localhost:60107/Classifier/WS_Classify.asmx?schema=BowPart" namespace="http://searchpoint.ijs.si/Classifier/BowPart.xsd" />

      <s:element name="ClassifyKMeans">

        <s:complexType>

          <s:sequence>

            <s:element minOccurs="1" maxOccurs="1" name="DS" type="tns:DataSource" />

            <s:element minOccurs="0" maxOccurs="1" name="Query" type="s:string" />

            <s:element minOccurs="1" maxOccurs="1" name="NumHits" type="s:int" />

            <s:element minOccurs="1" maxOccurs="1" name="NumMinHits" type="s:int" />

            <s:element minOccurs="1" maxOccurs="1" name="NumCategories" type="s:int" />

          </s:sequence>

        </s:complexType>

      </s:element>

      <s:simpleType name="DataSource">

        <s:restriction base="s:string">

          <s:enumeration value="Google" />

          <s:enumeration value="Yahoo" />

          <s:enumeration value="Live" />

          <s:enumeration value="About" />

          <s:enumeration value="AboutViaGoogle" />

          <s:enumeration value="AboutViaYahoo" />

          <s:enumeration value="EBay" />

          <s:enumeration value="NYTimes" />

          <s:enumeration value="Mladina" />

          <s:enumeration value="Watson" />

          <s:enumeration value="Enron" />

          <s:enumeration value="CCA" />

          <s:enumeration value="Photo12" />

          <s:enumeration value="String" />

          <s:enumeration value="File" />

          <s:enumeration value="GoogleOnto" />

          <s:enumeration value="Swoogle" />

        </s:restriction>

      </s:simpleType>

      <s:element name="ClassifyKMeansResponse">

        <s:complexType>

          <s:sequence>

            <s:element minOccurs="0" maxOccurs="1" name="ClassifyKMeansResult">

              <s:complexType>

                <s:sequence>

                  <s:any namespace="http://searchpoint.ijs.si/Classifier/BowPart.xsd" />

                </s:sequence>

              </s:complexType>

            </s:element>

          </s:sequence> 

        </s:complexType> 

      </s:element> 

      <s:element name="ClassifyKMeansIn"> 

        <s:complexType> 

          <s:sequence> 

            <s:element minOccurs="1" maxOccurs="1" name="DS" type="tns:DataSource" /> 

            <s:element minOccurs="0" maxOccurs="1" name="Query" type="s:string" /> 

            <s:element minOccurs="1" maxOccurs="1" name="NumHits" type="s:int" /> 

            <s:element minOccurs="1" maxOccurs="1" name="NumMinHits" type="s:int" /> 

            <s:element minOccurs="1" maxOccurs="1" name="NumCategories" type="s:int" /> 

          </s:sequence> 

        </s:complexType> 

      </s:element> 

      <s:element name="ClassifyKMeansInResponse"> 

        <s:complexType> 

          <s:sequence> 

            <s:element  minOccurs="1"  maxOccurs="1"  name="ClassifyKMeansInResult" type="tns:BowPartStruc" /> 

          </s:sequence> 

        </s:complexType> 

      </s:element> 

      <s:complexType name="BowPartStruc"> 

        <s:sequence> 

          <s:element minOccurs="0" maxOccurs="1" name="Clusters" type="tns:ArrayOfCluster" /> 

          <s:element minOccurs="0" maxOccurs="1" name="Links" type="tns:ArrayOfLink" /> 

          <s:element minOccurs="0" maxOccurs="1" name="Documents" type="tns:ArrayOfDocument" /> 

        </s:sequence> 

      </s:complexType> 

      <s:complexType name="ArrayOfCluster"> 

        <s:sequence> 

          <s:element minOccurs="0" maxOccurs="unbounded" name="Cluster" type="tns:Cluster" /> 

        </s:sequence> 

      </s:complexType> 

      <s:complexType name="Cluster"> 

        <s:sequence> 

          <s:element minOccurs="1" maxOccurs="1" name="Clusters_Id" type="s:int" /> 

          <s:element minOccurs="0" maxOccurs="1" name="Title" type="s:string" /> 

          <s:element minOccurs="0" maxOccurs="1" name="Color" type="s:string" /> 

          <s:element minOccurs="1" maxOccurs="1" name="Quality" type="s:double" /> 

          <s:element minOccurs="1" maxOccurs="1" name="X" type="s:double" /> 

          <s:element minOccurs="1" maxOccurs="1" name="Y" type="s:double" /> 

        </s:sequence> 

      </s:complexType> 

      <s:complexType name="ArrayOfLink">

        <s:sequence>

          <s:element minOccurs="0" maxOccurs="unbounded" name="Link" type="tns:Link" />

        </s:sequence>

      </s:complexType>

      <s:complexType name="Link">

        <s:sequence>

          <s:element minOccurs="1" maxOccurs="1" name="id" type="s:int" />

          <s:element minOccurs="1" maxOccurs="1" name="id1" type="s:int" />

          <s:element minOccurs="1" maxOccurs="1" name="id2" type="s:int" />

          <s:element minOccurs="1" maxOccurs="1" name="Quality" type="s:double" />

        </s:sequence>

      </s:complexType>

      <s:complexType name="ArrayOfDocument">

        <s:sequence>

          <s:element minOccurs="0" maxOccurs="unbounded" name="Document" type="tns:Document" />

        </s:sequence>

      </s:complexType>

      <s:complexType name="Document">

        <s:sequence>

          <s:element minOccurs="1" maxOccurs="1" name="id" type="s:double" />

          <s:element minOccurs="1" maxOccurs="1" name="relevance" type="s:double" />

          <s:element minOccurs="0" maxOccurs="1" name="dcSim" type="s:string" />

        </s:sequence>

      </s:complexType>

      <s:element name="ClassifyDMoz">

        <s:complexType>

          <s:sequence>

            <s:element minOccurs="1" maxOccurs="1" name="DS" type="tns:DataSource" />

            <s:element minOccurs="0" maxOccurs="1" name="Query" type="s:string" />

            <s:element minOccurs="1" maxOccurs="1" name="NumHits" type="s:int" />

            <s:element minOccurs="1" maxOccurs="1" name="NumMinHits" type="s:int" />

            <s:element minOccurs="1" maxOccurs="1" name="NumCategories" type="s:int" />

          </s:sequence>

        </s:complexType>

      </s:element>

      <s:element name="ClassifyDMozResponse">

        <s:complexType>

          <s:sequence>

            <s:element minOccurs="0" maxOccurs="1" name="ClassifyDMozResult">

              <s:complexType>

                <s:sequence>

                  <s:any namespace="http://searchpoint.ijs.si/Classifier/BowPart.xsd" />

                </s:sequence> 

              </s:complexType> 

            </s:element> 

          </s:sequence> 

        </s:complexType> 

      </s:element> 

      <s:element name="ClassifyDMozContext"> 

        <s:complexType> 

          <s:sequence> 

            <s:element minOccurs="1" maxOccurs="1" name="DS" type="tns:DataSource" /> 

            <s:element minOccurs="0" maxOccurs="1" name="Query" type="s:string" /> 

            <s:element minOccurs="1" maxOccurs="1" name="NumHits" type="s:int" /> 

            <s:element minOccurs="1" maxOccurs="1" name="NumMinHits" type="s:int" /> 

            <s:element minOccurs="1" maxOccurs="1" name="NumCategories" type="s:int" /> 

          </s:sequence> 

        </s:complexType> 

      </s:element> 

      <s:element name="ClassifyDMozContextResponse"> 

        <s:complexType> 

          <s:sequence> 

            <s:element minOccurs="0" maxOccurs="1" name="ClassifyDMozContextResult"> 

              <s:complexType> 

                <s:sequence> 

                  <s:any namespace="http://searchpoint.ijs.si/Classifier/BowPart.xsd" /> 

                </s:sequence> 

              </s:complexType> 

            </s:element> 

          </s:sequence> 

        </s:complexType> 

      </s:element> 

      <s:element name="ClassifyGY"> 

        <s:complexType> 

          <s:sequence> 

            <s:element minOccurs="1" maxOccurs="1" name="DS" type="tns:DataSource" /> 

            <s:element minOccurs="0" maxOccurs="1" name="Query" type="s:string" /> 

            <s:element minOccurs="1" maxOccurs="1" name="NumHits" type="s:int" /> 

            <s:element minOccurs="1" maxOccurs="1" name="NumCategories" type="s:int" /> 

          </s:sequence> 

        </s:complexType> 

      </s:element> 

      <s:element name="ClassifyGYResponse"> 

        <s:complexType> 

          <s:sequence> 

            <s:element minOccurs="0" maxOccurs="1" name="ClassifyGYResult">

              <s:complexType>

                <s:sequence>

                  <s:any namespace="http://searchpoint.ijs.si/Classifier/BowPart.xsd" />

                </s:sequence>

              </s:complexType>

            </s:element>

          </s:sequence>

        </s:complexType>

      </s:element>

      <s:element name="ClassifyEuroVoc">

        <s:complexType>

          <s:sequence>

            <s:element minOccurs="1" maxOccurs="1" name="DS" type="tns:DataSource" />

            <s:element minOccurs="0" maxOccurs="1" name="Query" type="s:string" />

            <s:element minOccurs="1" maxOccurs="1" name="NumHits" type="s:int" />

            <s:element minOccurs="1" maxOccurs="1" name="NumMinHits" type="s:int" />

            <s:element minOccurs="1" maxOccurs="1" name="NumCategories" type="s:int" />

          </s:sequence>

        </s:complexType>

      </s:element>

      <s:element name="ClassifyEuroVocResponse">

        <s:complexType>

          <s:sequence>

            <s:element minOccurs="0" maxOccurs="1" name="ClassifyEuroVocResult">

              <s:complexType>

                <s:sequence>

                  <s:any namespace="http://searchpoint.ijs.si/Classifier/BowPart.xsd" />

                </s:sequence>

              </s:complexType>

            </s:element>

          </s:sequence>

        </s:complexType>

      </s:element>

      <s:element name="BowPart" nillable="true">

        <s:complexType>

          <s:sequence>

            <s:any namespace="http://searchpoint.ijs.si/Classifier/BowPart.xsd" />

          </s:sequence>

        </s:complexType>

      </s:element>

      <s:element name="BowPartStruc" type="tns:BowPartStruc" />

    </s:schema>

  </wsdl:types> 

  <wsdl:message name="ClassifyKMeansSoapIn"> 

    <wsdl:part name="parameters" element="tns:ClassifyKMeans" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyKMeansSoapOut"> 

    <wsdl:part name="parameters" element="tns:ClassifyKMeansResponse" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyKMeansInSoapIn"> 

    <wsdl:part name="parameters" element="tns:ClassifyKMeansIn" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyKMeansInSoapOut"> 

    <wsdl:part name="parameters" element="tns:ClassifyKMeansInResponse" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyDMozSoapIn"> 

    <wsdl:part name="parameters" element="tns:ClassifyDMoz" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyDMozSoapOut"> 

    <wsdl:part name="parameters" element="tns:ClassifyDMozResponse" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyDMozContextSoapIn"> 

    <wsdl:part name="parameters" element="tns:ClassifyDMozContext" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyDMozContextSoapOut"> 

    <wsdl:part name="parameters" element="tns:ClassifyDMozContextResponse" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyGYSoapIn"> 

    <wsdl:part name="parameters" element="tns:ClassifyGY" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyGYSoapOut"> 

    <wsdl:part name="parameters" element="tns:ClassifyGYResponse" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyEuroVocSoapIn"> 

    <wsdl:part name="parameters" element="tns:ClassifyEuroVoc" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyEuroVocSoapOut"> 

    <wsdl:part name="parameters" element="tns:ClassifyEuroVocResponse" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyKMeansHttpGetIn"> 

    <wsdl:part name="DS" type="s:string" /> 

    <wsdl:part name="Query" type="s:string" /> 

    <wsdl:part name="NumHits" type="s:string" /> 

    <wsdl:part name="NumMinHits" type="s:string" /> 

    <wsdl:part name="NumCategories" type="s:string" /> 

  </wsdl:message>

  <wsdl:message name="ClassifyKMeansHttpGetOut">

    <wsdl:part name="Body" element="tns:BowPart" />

  </wsdl:message>

  <wsdl:message name="ClassifyKMeansInHttpGetIn">

    <wsdl:part name="DS" type="s:string" />

    <wsdl:part name="Query" type="s:string" />

    <wsdl:part name="NumHits" type="s:string" />

    <wsdl:part name="NumMinHits" type="s:string" />

    <wsdl:part name="NumCategories" type="s:string" />

  </wsdl:message>

  <wsdl:message name="ClassifyKMeansInHttpGetOut">

    <wsdl:part name="Body" element="tns:BowPartStruc" />

  </wsdl:message>

  <wsdl:message name="ClassifyDMozHttpGetIn">

    <wsdl:part name="DS" type="s:string" />

    <wsdl:part name="Query" type="s:string" />

    <wsdl:part name="NumHits" type="s:string" />

    <wsdl:part name="NumMinHits" type="s:string" />

    <wsdl:part name="NumCategories" type="s:string" />

  </wsdl:message>

  <wsdl:message name="ClassifyDMozHttpGetOut">

    <wsdl:part name="Body" element="tns:BowPart" />

  </wsdl:message>

  <wsdl:message name="ClassifyDMozContextHttpGetIn">

    <wsdl:part name="DS" type="s:string" />

    <wsdl:part name="Query" type="s:string" />

    <wsdl:part name="NumHits" type="s:string" />

    <wsdl:part name="NumMinHits" type="s:string" />

    <wsdl:part name="NumCategories" type="s:string" />

  </wsdl:message>

  <wsdl:message name="ClassifyDMozContextHttpGetOut">

    <wsdl:part name="Body" element="tns:BowPart" />

  </wsdl:message>

  <wsdl:message name="ClassifyGYHttpGetIn">

    <wsdl:part name="DS" type="s:string" />

    <wsdl:part name="Query" type="s:string" />

    <wsdl:part name="NumHits" type="s:string" />

    <wsdl:part name="NumCategories" type="s:string" />

  </wsdl:message>

  <wsdl:message name="ClassifyGYHttpGetOut">

    <wsdl:part name="Body" element="tns:BowPart" />

  </wsdl:message>

  <wsdl:message name="ClassifyEuroVocHttpGetIn"> 

    <wsdl:part name="DS" type="s:string" /> 

    <wsdl:part name="Query" type="s:string" /> 

    <wsdl:part name="NumHits" type="s:string" /> 

    <wsdl:part name="NumMinHits" type="s:string" /> 

    <wsdl:part name="NumCategories" type="s:string" /> 

  </wsdl:message> 

  <wsdl:message name="ClassifyEuroVocHttpGetOut"> 

    <wsdl:part name="Body" element="tns:BowPart" /> 

  </wsdl:message> 

  <wsdl:portType name="WS_ClassifySoap"> 

    <wsdl:operation name="ClassifyKMeans"> 

      <wsdl:input message="tns:ClassifyKMeansSoapIn" /> 

      <wsdl:output message="tns:ClassifyKMeansSoapOut" /> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyKMeansIn"> 

      <wsdl:input message="tns:ClassifyKMeansInSoapIn" /> 

      <wsdl:output message="tns:ClassifyKMeansInSoapOut" /> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyDMoz"> 

      <wsdl:input message="tns:ClassifyDMozSoapIn" /> 

      <wsdl:output message="tns:ClassifyDMozSoapOut" /> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyDMozContext"> 

      <wsdl:input message="tns:ClassifyDMozContextSoapIn" /> 

      <wsdl:output message="tns:ClassifyDMozContextSoapOut" /> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyGY"> 

      <wsdl:input message="tns:ClassifyGYSoapIn" /> 

      <wsdl:output message="tns:ClassifyGYSoapOut" /> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyEuroVoc"> 

      <wsdl:input message="tns:ClassifyEuroVocSoapIn" /> 

      <wsdl:output message="tns:ClassifyEuroVocSoapOut" /> 

    </wsdl:operation> 

  </wsdl:portType> 

  <wsdl:portType name="WS_ClassifyHttpGet"> 

    <wsdl:operation name="ClassifyKMeans"> 

      <wsdl:input message="tns:ClassifyKMeansHttpGetIn" /> 

      <wsdl:output message="tns:ClassifyKMeansHttpGetOut" /> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyKMeansIn"> 

      <wsdl:input message="tns:ClassifyKMeansInHttpGetIn" /> 

      <wsdl:output message="tns:ClassifyKMeansInHttpGetOut" />

    </wsdl:operation>

    <wsdl:operation name="ClassifyDMoz">

      <wsdl:input message="tns:ClassifyDMozHttpGetIn" />

      <wsdl:output message="tns:ClassifyDMozHttpGetOut" />

    </wsdl:operation>

    <wsdl:operation name="ClassifyDMozContext">

      <wsdl:input message="tns:ClassifyDMozContextHttpGetIn" />

      <wsdl:output message="tns:ClassifyDMozContextHttpGetOut" />

    </wsdl:operation>

    <wsdl:operation name="ClassifyGY">

      <wsdl:input message="tns:ClassifyGYHttpGetIn" />

      <wsdl:output message="tns:ClassifyGYHttpGetOut" />

    </wsdl:operation>

    <wsdl:operation name="ClassifyEuroVoc">

      <wsdl:input message="tns:ClassifyEuroVocHttpGetIn" />

      <wsdl:output message="tns:ClassifyEuroVocHttpGetOut" />

    </wsdl:operation>

  </wsdl:portType>

  <wsdl:binding name="WS_ClassifySoap" type="tns:WS_ClassifySoap">

    <soap:binding transport="http://schemas.xmlsoap.org/soap/http" />

    <wsdl:operation name="ClassifyKMeans">

      <soap:operation soapAction="http://searchpoint.ijs.si/Classifier/ClassifyKMeans" style="document" />

      <wsdl:input>

        <soap:body use="literal" />

      </wsdl:input>

      <wsdl:output>

        <soap:body use="literal" />

      </wsdl:output>

    </wsdl:operation>

    <wsdl:operation name="ClassifyKMeansIn">

      <soap:operation soapAction="http://searchpoint.ijs.si/Classifier/ClassifyKMeansIn" style="document" />

      <wsdl:input>

        <soap:body use="literal" />

      </wsdl:input>

      <wsdl:output>

        <soap:body use="literal" />

      </wsdl:output>

    </wsdl:operation>

    <wsdl:operation name="ClassifyDMoz">

      <soap:operation soapAction="http://searchpoint.ijs.si/Classifier/ClassifyDMoz" style="document" />

      <wsdl:input>

        <soap:body use="literal" />

      </wsdl:input> 

      <wsdl:output> 

        <soap:body use="literal" /> 

      </wsdl:output> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyDMozContext"> 

      <soap:operation  soapAction="http://searchpoint.ijs.si/Classifier/ClassifyDMozContext" style="document" /> 

      <wsdl:input> 

        <soap:body use="literal" /> 

      </wsdl:input> 

      <wsdl:output> 

        <soap:body use="literal" /> 

      </wsdl:output> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyGY"> 

      <soap:operation soapAction="http://searchpoint.ijs.si/Classifier/ClassifyGY" style="document" /> 

      <wsdl:input> 

        <soap:body use="literal" /> 

      </wsdl:input> 

      <wsdl:output> 

        <soap:body use="literal" /> 

      </wsdl:output> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyEuroVoc"> 

      <soap:operation soapAction="http://searchpoint.ijs.si/Classifier/ClassifyEuroVoc" style="document" /> 

      <wsdl:input> 

        <soap:body use="literal" /> 

      </wsdl:input> 

      <wsdl:output> 

        <soap:body use="literal" /> 

      </wsdl:output> 

    </wsdl:operation> 

  </wsdl:binding> 

  <wsdl:binding name="WS_ClassifySoap12" type="tns:WS_ClassifySoap"> 

    <soap12:binding transport="http://schemas.xmlsoap.org/soap/http" /> 

    <wsdl:operation name="ClassifyKMeans"> 

      <soap12:operation  soapAction="http://searchpoint.ijs.si/Classifier/ClassifyKMeans"  style="document" /> 

      <wsdl:input> 

        <soap12:body use="literal" /> 

      </wsdl:input> 

      <wsdl:output> 

        <soap12:body use="literal" /> 

      </wsdl:output>

    </wsdl:operation>

    <wsdl:operation name="ClassifyKMeansIn">

      <soap12:operation soapAction="http://searchpoint.ijs.si/Classifier/ClassifyKMeansIn" style="document" />

      <wsdl:input>

        <soap12:body use="literal" />

      </wsdl:input>

      <wsdl:output>

        <soap12:body use="literal" />

      </wsdl:output>

    </wsdl:operation>

    <wsdl:operation name="ClassifyDMoz">

      <soap12:operation soapAction="http://searchpoint.ijs.si/Classifier/ClassifyDMoz" style="document" />

      <wsdl:input>

        <soap12:body use="literal" />

      </wsdl:input>

      <wsdl:output>

        <soap12:body use="literal" />

      </wsdl:output>

    </wsdl:operation>

    <wsdl:operation name="ClassifyDMozContext">

      <soap12:operation soapAction="http://searchpoint.ijs.si/Classifier/ClassifyDMozContext" style="document" />

      <wsdl:input>

        <soap12:body use="literal" />

      </wsdl:input>

      <wsdl:output>

        <soap12:body use="literal" />

      </wsdl:output>

    </wsdl:operation>

    <wsdl:operation name="ClassifyGY">

      <soap12:operation soapAction="http://searchpoint.ijs.si/Classifier/ClassifyGY" style="document" />

      <wsdl:input>

        <soap12:body use="literal" />

      </wsdl:input>

      <wsdl:output>

        <soap12:body use="literal" />

      </wsdl:output>

    </wsdl:operation>

    <wsdl:operation name="ClassifyEuroVoc">

      <soap12:operation soapAction="http://searchpoint.ijs.si/Classifier/ClassifyEuroVoc" style="document" />

      <wsdl:input>

        <soap12:body use="literal" /> 

      </wsdl:input> 

      <wsdl:output> 

        <soap12:body use="literal" /> 

      </wsdl:output> 

    </wsdl:operation> 

  </wsdl:binding> 

  <wsdl:binding name="WS_ClassifyHttpGet" type="tns:WS_ClassifyHttpGet"> 

    <http:binding verb="GET" /> 

    <wsdl:operation name="ClassifyKMeans"> 

      <http:operation location="/ClassifyKMeans" /> 

      <wsdl:input> 

        <http:urlEncoded /> 

      </wsdl:input> 

      <wsdl:output> 

        <mime:mimeXml part="Body" /> 

      </wsdl:output> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyKMeansIn"> 

      <http:operation location="/ClassifyKMeansIn" /> 

      <wsdl:input> 

        <http:urlEncoded /> 

      </wsdl:input> 

      <wsdl:output> 

        <mime:mimeXml part="Body" /> 

      </wsdl:output> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyDMoz"> 

      <http:operation location="/ClassifyDMoz" /> 

      <wsdl:input> 

        <http:urlEncoded /> 

      </wsdl:input> 

      <wsdl:output> 

        <mime:mimeXml part="Body" /> 

      </wsdl:output> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyDMozContext"> 

      <http:operation location="/ClassifyDMozContext" /> 

      <wsdl:input> 

        <http:urlEncoded /> 

      </wsdl:input> 

      <wsdl:output> 

        <mime:mimeXml part="Body" /> 

      </wsdl:output> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyGY"> 

      <http:operation location="/ClassifyGY" /> 

      <wsdl:input> 

        <http:urlEncoded /> 

      </wsdl:input> 

      <wsdl:output> 

        <mime:mimeXml part="Body" /> 

      </wsdl:output> 

    </wsdl:operation> 

    <wsdl:operation name="ClassifyEuroVoc"> 

      <http:operation location="/ClassifyEuroVoc" /> 

      <wsdl:input> 

        <http:urlEncoded /> 

      </wsdl:input> 

      <wsdl:output> 

        <mime:mimeXml part="Body" /> 

      </wsdl:output> 

    </wsdl:operation> 

  </wsdl:binding> 

  <wsdl:service name="WS_Classify"> 

    <wsdl:port name="WS_ClassifySoap" binding="tns:WS_ClassifySoap"> 

      <soap:address location="http://localhost:60107/Classifier/WS_Classify.asmx" /> 

    </wsdl:port> 

    <wsdl:port name="WS_ClassifySoap12" binding="tns:WS_ClassifySoap12"> 

      <soap12:address location="http://localhost:60107/Classifier/WS_Classify.asmx" /> 

    </wsdl:port> 

    <wsdl:port name="WS_ClassifyHttpGet" binding="tns:WS_ClassifyHttpGet"> 

      <http:address location="http://localhost:60107/Classifier/WS_Classify.asmx" /> 

    </wsdl:port> 

  </wsdl:service> 

</wsdl:definitions> 
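The HTTP GET binding in the listing above can be exercised without a SOAP toolkit. The following is a minimal sketch, assuming the service address and operation locations shown in the WSDL; the helper `build_request_url` and the parameter name `text` are hypothetical, since the message parts that define the real input parameters are not reproduced in this part of the listing.

```python
from urllib.parse import urlencode

# Service address and operation names taken from the WSDL listing.
SERVICE_ADDRESS = "http://localhost:60107/Classifier/WS_Classify.asmx"
OPERATIONS = {"ClassifyDMoz", "ClassifyDMozContext", "ClassifyGY", "ClassifyEuroVoc"}


def build_request_url(operation, params):
    """Return the HTTP GET URL for one operation of the WS_ClassifyHttpGet port.

    Hypothetical helper: it only assembles the URL and sends nothing.
    """
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    # <http:urlEncoded /> means the input parts are sent as name=value
    # pairs in the query string.
    query = urlencode(params)
    url = f"{SERVICE_ADDRESS}/{operation}"
    return f"{url}?{query}" if query else url


# "text" is an assumed parameter name, used for illustration only.
print(build_request_url("ClassifyDMoz", {"text": "semantic web"}))
```

The resulting URL can be fetched with any HTTP client; according to the `mime:mimeXml` output declaration, the response body is an XML document.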


References

[d'Aquin 2008] d'Aquin, M. Building Semantic Web Based Applications with Watson. 17th International World Wide Web Conference (WWW2008) Developers Track.

[Fortuna et al., 2005] B. Fortuna, M. Grobelnik, D. Mladenić: Semi-automatic Construction of Topic Ontology. Semantics, Web and Mining, Joint International Workshop, EWMF 2005 and KDO 2005, Porto, Portugal, October 3–7, 2005.

[Fruchterman and Reingold, 1991] T. M. J. Fruchterman and E. M. Reingold. Graph drawing by force-directed placement. Softw. Pract. Exper., 1991.

[Grobelnik and Mladenić, 2006] Grobelnik, M., Mladenić, D. (2006). Automated Knowledge Discovery in Advanced Knowledge Management, Journal of Knowledge Management.

[Grobelnik et al., 2008] Marko Grobelnik, Janez Brank, Blaž Fortuna, Igor Mozetič. Contextualizing Ontologies with OntoLight: A Pragmatic Approach, Informatica. 2008.

[Kanungo et al., 2002] T. Kanungo, D. M. Mount, N. Netanyahu, C. Piatko, R. Silverman, and A. Y. Wu (2002). An efficient k-means clustering algorithm: Analysis and implementation, IEEE Trans. Pattern Analysis and Machine Intelligence, 24, 881-892.

[Koller and Sahami, 1997] Koller, D., Sahami, M. (1997). Hierarchically classifying documents using very few words, Proceedings of the 14th International Conference on Machine Learning ICML-97, pp. 170-178, Morgan Kaufmann, San Francisco, CA.

[McCallum et al., 1998] McCallum, A., Rosenfeld, R., Mitchell, T., Ng, A. (1998). Improving Text Classification by Shrinkage in a Hierarchy of Classes, Proceedings of the 15th International Conference on Machine Learning ICML-98, Morgan Kaufmann, San Francisco, CA.

[Mitchell, 1997] Mitchell, T.M. (1997). Machine Learning. The McGraw-Hill Companies, Inc.

[Mladenić, 1998] Mladenić, D. (1998). Turning Yahoo into an Automatic Web-Page Classifier. Proc. 13th European Conference on Artificial Intelligence (ECAI'98, John Wiley & Sons), 473–474.

[Mladenić and Grobelnik, 2003] Mladenić, D., Grobelnik, M. (2003). Feature selection on hierarchy of web documents. Journal of Decision support systems, 35, 45-87.

[Pajntar, 2006] Pajntar, B. (2006). Overview of algorithms for graph drawing, In Proc. of Slovenian KDD Conference 2006, Oct. 2006.

[Pajntar and Grobelnik, 2008] Boštjan Pajntar, Marko Grobelnik. SearchPoint – a New Paradigm of Web Search. 17th International World Wide Web Conference (WWW2008) Developers Track.

[Sabou et al., 2006] M. Sabou, M. d'Aquin, and E. Motta: Using the Semantic Web as Background Knowledge for Ontology Mapping, In Proceedings of the International Workshop on Ontology Matching (OM-2006), collocated with ISWC-06.

[Steinbach et al., 2000] Steinbach, M., Karypis, G. and Kumar, V. (2000). A comparison of document clustering techniques. Proc. KDD Workshop on Text Mining. (eds. Grobelnik, M., Mladenić, D. and Milic-Frayling, N.), Boston, MA, USA, 109–110.

