Concept maps as the first step in an ontology construction method

Information Systems 38 (2013) 771–783
journal homepage: www.elsevier.com/locate/infosys
http://dx.doi.org/10.1016/j.is.2012.05.010
0306-4379/$ - see front matter © 2012 Elsevier Ltd.

Rodrigo Rizzi Starr (principal corresponding author; Tel.: +55 12 8142 4437), José Maria Parente de Oliveira (corresponding author; Tel.: +55 12 3947 6941)

Divisão de Ciência da Computação (IEC), Instituto Tecnológico de Aeronáutica, Praça Marechal Eduardo Gomes, 50, Vila das Acácias, 12228-900 São José dos Campos, SP, Brazil

Article info

Available online 30 May 2012

Keywords: Ontologies; Concept maps; Knowledge acquisition; Ontology engineering

Abstract

A method is proposed to be used as the first step in the ontology acquisition process. This method is based on the use of concept maps as a means of expression for the expert, followed by an application that assists the expert in detailing the structure of the knowledge represented in the map. This application analyses the concept map, taking into account the map topology and key words used by the expert. From this analysis a series of questions are presented to the expert that, when answered, reduce the map ambiguity and identify some common patterns in ontological representations, such as generalizations and mereologic relations. This information can be used by the knowledge engineer during further knowledge acquisition sessions or to direct the expert to a further improvement of the map. The method was tested by a group of volunteers, most of them engineers working at the aerospace sector, and the results suggest that both the use of concept mapping as well as the refining step are acceptable from the point of view of the end user, supporting the claim that this method is viable as an option to reduce some of the difficulties in large scale ontology construction.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

Currently, ontologies are receiving a lot of attention as useful knowledge representation structures in several fields of software engineering, such as the semantic web [1], semantic web services [2], knowledge management [3], knowledge-based systems (KBS), etc. Nevertheless, the more widespread use of ontologies is still hindered by the knowledge acquisition bottleneck: the effort committed to create and maintain an ontology is usually larger than the perceived benefits that a formalized ontology will bring, as argued in [4].

In his discussion of the problem, Hepp identifies three (interrelated) aspects of ontology creation that are mainly dependent on the knowledge acquisition process: the problems of (1) resource consumption, (2) ontology engineering lag versus conceptual dynamics and (3) communication between ontology creators and users.

The problem of resource consumption is just a restatement of the knowledge acquisition bottleneck (see footnote 2). Historically, knowledge acquisition has been one of the main difficulties since KBSs began to be developed [6,7]. This problem has been well described and studied but not yet satisfactorily solved.

Footnote 2: In this paper we consider ontology acquisition as a type of knowledge acquisition, since it is concerned only with (ideally) application independent declarative knowledge, or, in the CommonKADS terminology, is a type of domain knowledge [5].

The ontology engineering lag versus conceptual dynamics problem is related to the slow but constant changes in meaning that concepts usually undergo. This may create two problems: firstly, it may be the case that just after the ontology is ready to be deployed the concepts represented there have already changed a little. Secondly, in domains where the concepts change very fast it is possible that the cost of keeping the ontology updated is prohibitively high. One example presented by the author is obsolescence in the microprocessor manufacturing industry, where technologies evolve so fast that several new terminologies are created in about a year.

Finally, the communication between creators and users problem is related to the consensual aspect of ontology development, namely, does the ontology really represent the consensus of meaning in the community where it is to be used? The process of formalization of a knowledge domain may bring forward some previously unnoticed inconsistencies, or the restrictions of the representation language may force the "ontologist" to use constructions that are incomplete or strange for the user. This fact can be used, for example, to reinterpret the results shown in [8], where it is found that most users do not agree with the conceptualizations used in some upper level ontologies.

Historically, the resource consumption problem has been more widely studied, since it impacted the development (and maintainability) of KBS from the beginning. One widely accepted approach to ease this problem is the use of an intermediate representation language [9] to ensure that both the expert and the knowledge engineer share the same conceptual constructs. Some successful KBS development methodologies have been built around this approach, such as MOKA [10]. Nonetheless, the difficulty in modeling and representing expert knowledge still remains a big impediment to the more widespread development and deployment of ontologies.

An almost ideal solution to this problem would be the creation of a KBS that did the knowledge engineer's job of understanding the expert's constructs and automatically translated them to an unambiguous machine interpretable representation. This hypothetical system would automatically run the several refinement cycles usually associated with knowledge acquisition.

Of course, no such system exists yet, but some approaches are directed to this goal. One that is quite closely related to that ideal view is the use of natural language processing (NLP) techniques. The most successful way of using NLP in ontology acquisition is the automatic learning of ontologies from text [11]. This can be especially effective if the ontologies to be learned are essentially taxonomies and the text's structure is very simple. Another possible approach is the use of NLP to improve the communication between the expert and an ancillary tool. This could reduce the dependency of the process on a knowledge engineer. Other approaches try to create better human–machine interfaces to facilitate the definition of ontologies by experts (e.g. [12]).

The method proposed in this paper is based on the use of concept maps as an intermediate representation language, followed by the application of a tool that processes this map and, after questioning the human expert, generates a formal representation in OWL DL and also outputs a log with all the user answers. Even though this OWL representation usually will not be ready for deployment, it is able to capture more of the conceptual structure of the domain than a simple concept map. We claim that this pre-processing may ease the task of the knowledge engineer, especially in the context of knowledge acquisition for knowledge management systems. Under these circumstances, the ontologies used by most tasks can be simpler than ontologies used, for example, in KBSs. On the other hand, they must be more comprehensive, so as to describe most of the domains related to the business of an organization.

This paper is organized as follows. The next section presents a literature review on the topic of concept mapping for ontology construction and provides some background on concept maps. In Section 3, the proposed method and the tool are described in detail. Section 4 presents the experimental results on the usage of the method and tool by a group of experts. Concluding remarks and considerations on future work are presented in Section 5.

2. Ontology construction and concept maps

To address the knowledge acquisition problem, several methodologies were proposed. The use of methodologies brings the following benefits:

• It makes the process repeatable, in the sense that it is not so dependent on the project team.
• It avoids the known errors. In this sense, a methodology works as a very formal practice of knowledge management.
• It delivers a shared vision of the process to the several people involved: managers, team members, customers, stakeholders, etc.
• It also delivers a complete view of the project's life cycle. This eases decision making, management and the use of metrics.

According to Bussmann et al. [13], a methodology must be composed of the following parts:

• An (optional) definition of the problem space.
• A set of models that represent different aspects of the problem domain or the solution at different stages.
• A set of methods that transform instances of one model into another model.
• A set of procedural guidelines which define an order for the systematic application of the methodological steps.

Different methodologies focus on different aspects of the problem. Methodologies that are focused on KBS development tend to specify the knowledge acquisition and formalization phase in a very structured but also rigid way, considering, for example, the division of knowledge into task-specific and method-specific. Such division can be seen in CommonKADS [5], in MIKE [14], and in MOKA [10]. MOKA is an especially good example because, since the domain of the methodology is restricted to engineering design problems, the methodology itself includes pre-defined ontologies, called product and design models [15].

Other methodologies have a stronger focus on specific parts of the knowledge acquisition phase. For example, the Protégé methodology [16] is very focused on the acquisition of domain instances, in such a way that this could be done without assistance from a knowledge engineer [17]. The LAL methodology [18] has a strong focus on acquisition of the domain ontology, but coupled with a specific methodology for requirement analysis [19].

Finally, the On-To-Knowledge (OTK) methodology [20] aims at managing the complete cycle of development of ontologies specifically tailored for knowledge management. Nevertheless, this methodology does not specify in detail the knowledge acquisition process, but is more concerned with the higher-level activities, such as the ontology life-cycle and updating, the management of the ontology creation project, etc.

Most of the methodologies reviewed above do not focus on the specific problem of how to reduce the need for interaction between the knowledge engineer and the expert. As argued before, since comprehensive ontologies are necessary for knowledge management activities, this is a desirable direction to pursue. In Section 3, a method that is a step in this direction is proposed, and one of the tenets of this method is the use of concept maps as an informal expression language for the expert.

Concept maps were created as a tool to support knowledge construction and learning. They have been extensively validated in experiments involving people from school-age children to adults. They have also been suggested as a tool for use in corporate environments, to foster and improve communication among a team [21]. They are a graphical representation tool, composed of concepts, represented as boxes with text, and linking phrases, represented as oriented arcs with a label. Concepts are intended to represent "regularities perceived in events and objects in the world" [21] while linking phrases should form, together with the concepts they connect, assertions that are true in that knowledge domain.

Due to their intuitive nature and research supporting their effectiveness [21,22], concept maps have been studied in several ways as a knowledge acquisition tool. This section reviews some literature on this use.

Some recent work has focused on the use of concept maps as an intermediate language for creating and visualizing ontologies. In [23,24], the maps were used to allow several remotely located experts to collaborate in the creation of an ontology for nutrigenomics and genealogy of vegetable varieties. In this study, concept maps were considered as a successful tool to share knowledge between experts, to communicate consensus and to help the communication between expert and knowledge engineer. In a related application, concept maps were also used in a system to formalize knowledge in the field of peroxisomal pathways [25].

In [26], concept maps were used as the main representation method in a knowledge management system. The maps were created by experts and then loaded into the system, allowing heuristics and practical knowledge to be attached to technical documents, training manuals, etc., creating a rich network of annotations on knowledge that was already partially explicit. Yet, it was reported that this system had a few disadvantages, mainly because the effort to add and update knowledge to the system was considerable.

As a tool for interfacing with formal knowledge representation, concept maps are attractive because they share some structural resemblance with computational knowledge representation schemes. For example, semantic networks may be seen as hypergraphs [27], and the representation schemes used in semantic web technologies evolved from the study of these networks [28]. Concept maps are also hypergraphs [29], so that they may work as a visualization tool for knowledge representation. As an instance of this idea, in [30], a method to create and edit ontologies in OWL DL using concept maps as a graphical visualization of the ontology is shown. Nevertheless, in the proposed method it is still a responsibility of the ontology author to use the correct syntax.

A different approach was proposed in [31,32], where, besides representing the ontology directly, the system allows the user to filter out information that is not important, such as comment nodes or standard type information, such as inheritance from owl:Thing. Other heuristics were implemented, such as the transformation of URIs to shorter names, the use of color codes to signify when lists were collapsed to a node and so on. Also, instead of using OWL operator names directly, more "English-like" terms are used and some templates for common constructs are also provided.

The approaches described above make OWL ontologies easier to view and to edit, but they still require knowledge of OWL on the part of the user. In [33,34], a step beyond this approach was taken. In this work, the authors proposed an automatic tool to help convert concept maps into OWL ontologies. In the proposed process, several experts create, individually, concept maps representing the knowledge in a given domain. Then, a knowledge engineer uses a tool to support the conversion of the map to OWL and then uses another tool to align the ontologies.

The process proposed in the present work is an extension of this approach where, instead of a knowledge engineer using the support tool, the expert herself or himself uses a tool that generates questions about the concept map. After that, the tool generates a preliminary OWL ontology, together with a log of the user answers, this way capturing more of the structure of the domain than a standard concept map would be able to do.

This proposal is related to previous work on ways to allow an expert who is a layperson in knowledge representation to express her/his knowledge in a formal way. In [12,35], a method is described that allows experts, using a graphical representation, to update a knowledge base using generic classes and relationships previously defined by experts. This system needed some training on the part of the users (the authors report a one week training period) but the system has shown good results in a limited set of experiments.

In [36], a system that guides the experts in creating ontologies using a controlled natural language (Rabbit) is described. This system helps the user create the ontology by various means, such as finding errors when using the Rabbit language, keeping a list of classes that have been mentioned but not yet defined and following a formal ontology development methodology.

3. Proposed method

The method of ontology acquisition proposed in this paper consists of two parts. First, the human expert creates a concept map about a domain of interest, using free vocabulary and free structure for the concept map. Afterwards, this map is loaded into a support tool that tries to interpret it and questions the user about the meaning of some constructs in the map. As a result, the application is able to reduce the ambiguity level of the map and capture more of the ontological structure (i.e. the relationships between concepts) of the domain.

An example of ambiguity is a word or phrase that may represent a mereologic relationship, such as "has". In some contexts, this phrase can be understood as synonymous with "has parts", while in other contexts it cannot. When faced with this phrase, the application would question the user about the meaning of the relation. For example, suppose that two phrases are found in a concept map: "car has engine" and "car has color". The first represents a mereologic relationship, while the second does not.

Fig. 1 shows the complete process. One of the advantages of this method is that it can be "plugged into" several more complete or detailed ontology acquisition methodologies.

It can be seen from Fig. 1 that two training sessions are necessary for the user: one in concept mapping and ontologies and the other in the tool (where some basic concepts of description logic are also introduced). Overall, these sessions should not take long. In the context of the experiment described in Section 4, both sessions took less than 60 min.

With the user correctly trained, the first step in the process is the creation of focus questions to direct the concept map development. In the experiments, two focus questions were produced for each map. The goal of these questions is to guide the creation of the concept map and they can be further used as competence questions to evaluate the ontology generated by the knowledge engineer. After being created, the map is fed into the support tool, so that more details can be put into the map.

3.1. Support tool description

The main distinguishing feature of the above-mentioned process is the support tool. To test the acceptance of this method by the expert, a proof-of-concept implementation of the tool was prepared. Some relevant implementation details are discussed in this section.

Fig. 1. Description of the proposed method.

In this implementation, a question–answer user interaction paradigm was chosen. This kind of interaction has an advantage: it puts the user in a passive position in relation to the application. This makes the training process easier while also making the application development simpler, since the interaction is more controlled. With this model, the main flow of the application is as follows:

1. Read the concept map.
2. Use heuristics to generate suppositions about the meaning of structures in the concept map.
3. Present these suppositions as a list to the user. The user is allowed to mark a proposition as true or false.
4. Re-evaluate the suppositions in the light of this new information.
5. If there are any unanswered suppositions, then go back to 2.
6. Otherwise, save the result and exit.

The application is composed of two subsystems: a graphical user interface (GUI) subsystem and a subsystem responsible for evaluating the heuristics. The GUI can be seen in Fig. 2.

The GUI is composed of two main areas. In the larger one, the concept map currently loaded is shown, and in the second, smaller one, a list of suppositions made by the application is shown. Each supposition is composed of a text showing the supposition and a pair of yes/no buttons for the user to inform the application whether the supposition is valid. Some suppositions may require, instead of a yes/no input, the selection of one among several different options.

The inference subsystem is composed of three parts: a module for evaluating the heuristics (an inference engine), used to generate the suppositions about meanings of map elements or structures; a module to manage the generated suppositions and whether the user has set them as true or false; and a module implementing a simple truth maintenance system.
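The six-step main flow listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Supposition`, `run_session` and `relation_heuristic`, the callback `ask`, and the representation of a concept map as (source concept, linking phrase, target concept) triples are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Supposition:
    text: str                      # question shown to the user
    answer: Optional[bool] = None  # None means "not answered yet"


def run_session(triples, heuristics, ask):
    """Hypothetical driver: propose suppositions, collect answers, repeat."""
    known = {}  # question text -> Supposition
    while True:
        # Step 2: each heuristic proposes suppositions, seeing earlier answers.
        for heuristic in heuristics:
            for text in heuristic(triples, known):
                known.setdefault(text, Supposition(text))
        pending = [s for s in known.values() if s.answer is None]
        if not pending:
            return known  # step 6: nothing left to ask
        for supposition in pending:
            supposition.answer = ask(supposition.text)  # step 3
        # Steps 4-5: loop again so heuristics can react to the new answers.


def relation_heuristic(triples, known):
    """Fallback heuristic: every linking phrase is a candidate relation."""
    return [f"Is '{phrase}' an important relation?" for _, phrase, _ in triples]


cmap = [("car", "has", "engine"), ("car", "has", "color")]
result = run_session(cmap, [relation_heuristic], ask=lambda question: True)
```

In this sketch the `ask` callback stands in for the yes/no buttons of the GUI; the loop ends as soon as a pass over the heuristics produces no unanswered supposition.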

Fig. 2. Example of the application screen.

Fig. 3. Example of a candidate transitive relation found in a concept map.

The heuristic execution module allows the implementation of rules about the concept map, dealing with two characteristics of the map: its topology (or structure) and the words used in concepts or linking phrases. For example, the heuristic to find transitive properties searches for linking phrases with the same text, connecting several concepts so as to form a chain, as shown in Fig. 3. This heuristic uses only information about map topology. Several heuristics were implemented in this way to keep the application "domain-neutral", that is, to avoid the inclusion of a large vocabulary with complex natural language relationships as part of the application. This was done mainly due to the unavailability of such vocabulary. As an example, WordNet is a good source for this sort of relationship in English, but it is not nearly as comprehensive in other languages, such as Portuguese.
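A topology-only heuristic of the kind described for transitive properties might look like the following sketch. The function name `transitive_candidates` and the triple representation of the map are assumptions; the paper does not give the actual rule, which was written mostly in Prolog.

```python
from collections import defaultdict


def transitive_candidates(triples):
    """Return linking-phrase labels whose edges form a chain A -> B -> C."""
    edges_by_label = defaultdict(set)
    for source, label, target in triples:
        edges_by_label[label].add((source, target))

    candidates = []
    for label, edges in edges_by_label.items():
        sources = {source for source, _ in edges}
        # A chain exists when an edge ends at a concept from which another
        # edge with the same label starts (self-loops are not counted).
        if any(target in sources and source != target for source, target in edges):
            candidates.append(label)
    return sorted(candidates)


cmap = [
    ("pump", "feeds", "hydraulic line"),
    ("hydraulic line", "feeds", "actuator"),
    ("actuator", "has", "piston"),
]
chains = transitive_candidates(cmap)
```

Only the labels and the shape of the graph are inspected, so no vocabulary is needed, which matches the "domain-neutral" intent described above.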

On the other hand, an example of a heuristic that uses vocabulary information is the one used to identify generalizations. This heuristic searches the map for linking phrases that hint at a generalization relation, such as "is a", or "type of". To implement this text identification in the map, a dictionary approach was used, so that each heuristic that uses vocabulary has its own word/phrase list (see footnote 3).

Footnote 3: In fact, it is a list of regular expressions.

Each heuristic, when executed, produces some suppositions about the map. A supposition is a statement, such as "hydraulic line is a subclass of means of energy transmission". This supposition is presented to the end user in the form of a question, such as "Is hydraulic line a type of means of energy transmission?". After the user answers one of these questions, this new data is entered in the knowledge base, and the truth maintenance system checks which suppositions are still valid and whether there are any new suppositions that became possible due to this answer. In the example above, if the user answers "Yes" to the question, two other suppositions are automatically resolved: whether each of "hydraulic line" and "means of energy transmission" is a class. Since both are part of a generalization relationship, both are classes (or perhaps one of them is an individual, depending on the linking phrase label; this option is given to the user).
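The dictionary approach for generalizations, together with the rendering of a supposition as a question, can be sketched as follows. The regular expressions in `GENERALIZATION_PATTERNS` are hypothetical, since the paper does not list the patterns actually used, and the map is again assumed to be a list of (source, linking phrase, target) triples.

```python
import re

# Hypothetical word/phrase list for the generalization heuristic.
GENERALIZATION_PATTERNS = [
    re.compile(r"is an?", re.IGNORECASE),
    re.compile(r"(is )?(an? )?(type|kind) of", re.IGNORECASE),
]


def generalization_questions(triples):
    """Turn every linking phrase that hints at a generalization into a question."""
    questions = []
    for source, phrase, target in triples:
        if any(p.fullmatch(phrase.strip()) for p in GENERALIZATION_PATTERNS):
            questions.append(f"Is {source} a type of {target}?")
    return questions


cmap = [
    ("hydraulic line", "is a", "means of energy transmission"),
    ("car", "has", "engine"),
]
questions = generalization_questions(cmap)
```

Keeping one pattern list per heuristic, as the text describes, means a vocabulary for another language could be swapped in without touching the matching logic.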

Two activities are performed by the supposition management module: the management of each supposition's state and the presentation of these suppositions in the GUI. Each supposition may be in one of three states: not answered, answered or invalidated. The first two states occur, respectively, when the supposition has been created and displayed to the user and after the user has answered the question. The "invalidated" state is entered when an answer to another supposition makes the supposition invalid or determines that it is true. This is shown to the user as a grayed-out question. These suppositions are not automatically thrown away because, in case the user undoes a previous answer, the supposition may become valid again.
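The three supposition states and the reason for keeping invalidated suppositions can be made concrete with a small sketch. The class and method names below are assumptions for illustration and do not come from the paper.

```python
from enum import Enum


class State(Enum):
    NOT_ANSWERED = "not answered"
    ANSWERED = "answered"
    INVALIDATED = "invalidated"  # shown grayed out, but kept in the list


class ManagedSupposition:
    def __init__(self, text):
        self.text = text
        self.state = State.NOT_ANSWERED
        self.answer = None

    def set_answer(self, value):
        self.answer = value
        self.state = State.ANSWERED

    def invalidate(self):
        """Another answer made this question unnecessary."""
        if self.state is State.NOT_ANSWERED:
            self.state = State.INVALIDATED

    def revalidate(self):
        """The answer that invalidated this question was undone."""
        if self.state is State.INVALIDATED:
            self.state = State.NOT_ANSWERED


question = ManagedSupposition("Is 'is a' an important relation?")
question.invalidate()   # e.g. the user confirmed a generalization
grayed_out = question.state
question.revalidate()   # e.g. the user undid that confirmation
```

Because the object is only flagged rather than deleted, an undo needs no regeneration step: the question simply becomes active again.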

Finally, the truth maintenance module provides services to implement a knowledge base for executing the heuristics and the propositions. In implementation terms, most of this has been implemented using SWI-Prolog, with some parts implemented in Python, as the rest of the application (see footnote 4). This module is used both to implement the "undo" option for answered questions and to deactivate questions whose answers can be inferred from other answers. The implementation of this module is very simple, being based on three predicates that track the state of a supposition: it can be possible (that is, it is neither true nor false), true or false. With the addition of the "possible" state, suppositions can easily depend on others: for example, a linking phrase is a candidate relationship only if the predicate "relation" is "possible". If the user answers a question informing that the linking phrase represents a generalization relation, then a Prolog rule automatically infers that the relation predicate is false, therefore turning off the question that would ask if that was an important relation in the domain.
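The three-valued bookkeeping described for the truth maintenance module can be approximated in plain Python. This is a stand-in for the Prolog predicates, not a port of them: `TruthStore`, its method names and the single hard-coded rule are assumptions chosen to mirror the generalization example in the text.

```python
class TruthStore:
    """Facts are true, false, or (when absent) still 'possible'."""

    def __init__(self):
        self._facts = {}    # (predicate, subject) -> bool
        self._derived = {}  # answered key -> keys inferred from it

    def status(self, predicate, subject):
        value = self._facts.get((predicate, subject))
        if value is None:
            return "possible"
        return "true" if value else "false"

    def assert_answer(self, predicate, subject, value):
        key = (predicate, subject)
        self._facts[key] = value
        # Rule from the text: a confirmed generalization is not also asked
        # about as an ordinary relation.
        if predicate == "generalization" and value:
            inferred = ("relation", subject)
            self._facts[inferred] = False
            self._derived.setdefault(key, []).append(inferred)

    def retract(self, predicate, subject):
        """Undo an answer together with everything inferred from it."""
        key = (predicate, subject)
        self._facts.pop(key, None)
        for inferred in self._derived.pop(key, []):
            self._facts.pop(inferred, None)


store = TruthStore()
before = store.status("relation", "is a")
store.assert_answer("generalization", "is a", True)
after = store.status("relation", "is a")
store.retract("generalization", "is a")
undone = store.status("relation", "is a")
```

Recording which facts were inferred from which answer is what lets the "undo" option restore a question to the "possible" state without disturbing answers the user gave directly.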

Overall, 16 heuristics have been implemented in the system. Some of them were based on the work by Macedo [34] and Brilhante et al. [33], while others were directly based on the meaning of OWL DL operators. Some of these heuristics are explained in detail below. It should be noted that these heuristics are all shallow in relation to the concept map domain, that is, none of them makes use of domain specific vocabulary or constructs. This design decision was taken to allow the tool to be generic: it can be applied to any knowledge domain without the previous preparation or the construction of a high level ontology about that domain (as was done, for example, in [12]).

Footnote 4: The bridge between Python and SWI-Prolog was built using PySWIP (http://code.google.com/p/pyswip/).

Footnote 5: Natural Language Toolkit, a natural language processing library written in Python.

As will be seen from the titles of the subsections below, most heuristics map directly onto OWL constructs. So, for each question the user answers affirmatively, a fact is generated in the Prolog database with all the details that must be saved in the resulting OWL ontology. When saved, the text of the concept or linking phrase is converted to a "safe" format and the original text is saved as an rdfs:label property. While programming the heuristics, care was taken so that no heuristic would generate a construct that would create an OWL Full ontology.
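The conversion to a "safe" name with the original text kept as an rdfs:label can be sketched as below. The paper does not define the "safe" format or the serialization used, so the underscore-based rule in `safe_name` and the Turtle output of `class_as_turtle` are assumptions for illustration only.

```python
import re


def safe_name(text):
    """Hypothetical 'safe' format: runs of non-word characters become '_'."""
    name = re.sub(r"\W+", "_", text.strip()).strip("_")
    # Avoid empty names and names that start with a digit.
    if not name or name[0].isdigit():
        name = f"c_{name}"
    return name


def class_as_turtle(text, prefix=":"):
    """Emit a class declaration that keeps the original text as rdfs:label."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{prefix}{safe_name(text)} a owl:Class ;\n    rdfs:label "{escaped}" .'


snippet = class_as_turtle("means of energy transmission")
```

Storing the expert's wording in the label means the generated identifiers can be mechanical while the ontology still shows the text the expert actually wrote.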

3.1.1. Classes, individuals and relations

These heuristics are extremely simple and just suppose that every linking phrase is a candidate relation and that every concept that is connected to a linking phrase may be a class or an individual. They work mostly as a fallback if no other heuristic can generate a supposition about that element. The suppositions generated by these heuristics are presented to the user as the last ones.

As an example, the "Relations" heuristic generates a question of the form "Is [...] an important relation?", where "[...]" is changed to the linking phrase text.

It should be noted that any concept node that is disconnected from a linking phrase is ignored by the application. This decision was taken because, if that concept has no relationship with anything else in the domain, it is probably not useful.
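A sketch of these fallback heuristics, including the rule that disconnected concepts are ignored, is given below. The function name and the wording of the class/individual question are assumptions; only the "Is [...] an important relation?" form comes from the text.

```python
def fallback_questions(concepts, triples):
    """Ask about every linking phrase and every connected concept."""
    connected = {c for source, _, target in triples for c in (source, target)}
    questions = []
    # dict.fromkeys removes duplicate phrases while preserving their order.
    for phrase in dict.fromkeys(phrase for _, phrase, _ in triples):
        questions.append(f"Is '{phrase}' an important relation?")
    for concept in concepts:
        if concept in connected:  # disconnected concept nodes are skipped
            questions.append(f"Is '{concept}' a class or an individual?")
    return questions


concepts = ["car", "engine", "weather"]
cmap = [("car", "has", "engine")]
questions = fallback_questions(concepts, cmap)
```

Here "weather" produces no question because it takes part in no proposition of the map.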

3.1.2. Inverse properties

This heuristic searches the map for linking phrases that are oriented exactly opposite to each other, namely: the arrows point in reverse directions and every concept that is on the right side of one of them must be on the left side of the other, and vice versa. These linking phrases are considered possible inverse properties. It should be noted that complete agreement between the concepts on the left side of one relation and those on the right side of the other relation is necessary for this heuristic to work.
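A minimal sketch of this check follows, assuming a hypothetical representation in which each linking phrase carries the set of concepts pointing into it and the set of concepts it points to. The class and function names are illustrative and are not taken from the original tool, which was built on Python and Prolog.

```python
from itertools import combinations
from typing import NamedTuple


class LinkingPhrase(NamedTuple):
    """Hypothetical representation of a linking phrase in a concept map."""
    text: str
    sources: frozenset  # concepts with an arc pointing into the linking phrase
    targets: frozenset  # concepts the linking phrase points to


def inverse_candidates(phrases):
    """Return pairs of linking phrases that may be inverse properties.

    Two phrases qualify only if the sources of one are exactly the targets
    of the other and vice versa (the complete agreement the text requires).
    """
    pairs = []
    for a, b in combinations(phrases, 2):
        if a.sources == b.targets and a.targets == b.sources:
            pairs.append((a.text, b.text))
    return pairs


phrases = [
    LinkingPhrase("has part", frozenset({"airplane"}), frozenset({"wing", "engine"})),
    LinkingPhrase("is part of", frozenset({"wing", "engine"}), frozenset({"airplane"})),
    LinkingPhrase("is built by", frozenset({"airplane"}), frozenset({"manufacturer"})),
]
print(inverse_candidates(phrases))
```

Only the pair "has part" / "is part of" is reported here, because "is built by" has no linking phrase with exactly mirrored sides.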

3.1.3. Transitive properties

This heuristic searches for transitive relations, as previously explained in Fig. 3.

3.1.4. Enumerations

Some classes may be defined by enumeration. This heuristic looks for these cases by inspecting the text of the linking phrases and using a vocabulary of words that may signal an enumeration. For example, linking phrases like "one of" and "any of" would generate candidate enumerations.

3.1.5. Generalization and instantiation

According to the creators of concept maps, the relationship that is most easily represented in concept maps is the taxonomic relationship [21,37]. So, some heuristics that search for this kind of relationship were implemented. Basically, these heuristics check if the text of the linking phrase corresponds to a word in a vocabulary of words with a generalization meaning and if both sides of the linking phrase are candidates for classes. To account for the possibility of a linking phrase meaning an instantiation relation, a vocabulary for this case has also been created. Because these vocabularies overlap, sometimes the user may have to decide if a given relation represents a specialization, an instantiation or neither of them.

3.1.6. Noun phrases

This heuristic searches for concepts in the concept map that are noun phrases which may hint at a taxonomic relationship. So, for example, a phrase like "institutional committee" would generate the question "Is an institutional committee a type of committee?". To find these types of phrases a part-of-speech tagger was used, built with the NLTK5 [38].
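The following sketch shows how such a question could be derived once a concept label has been part-of-speech tagged. It is only an illustration: the tagger itself is left out, the simplified tags "ADJ" and "NOUN" are assumptions, and the head noun is taken to be the last word, which holds for English labels such as the example above but not for Portuguese, the language used in the experiments.

```python
def taxonomy_question(tagged_tokens):
    """Build a 'type of' question from a POS-tagged concept label.

    tagged_tokens: list of (word, tag) pairs; the tags 'ADJ' and 'NOUN'
    are an illustrative simplified tagset, not the one used in the paper.
    Returns None when the label is not a modifier(s) + head noun phrase.
    """
    if len(tagged_tokens) < 2:
        return None  # a single word has no modifier to strip
    *modifiers, (head, head_tag) = tagged_tokens
    if head_tag != "NOUN":
        return None
    if not all(tag in ("ADJ", "NOUN") for _, tag in modifiers):
        return None
    phrase = " ".join(word for word, _ in tagged_tokens)
    # Naive article choice based on the first letter only.
    article = "an" if phrase[0].lower() in "aeiou" else "a"
    return f"Is {article} {phrase} a type of {head}?"


print(taxonomy_question([("institutional", "ADJ"), ("committee", "NOUN")]))
```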

3.1.7. Functional properties

Functional properties are usually used to describe object attributes and are often used as part of more complex expressions to describe classes [39]. The heuristic used to find these properties is quite simple and uses a dictionary to find linking phrases that may denote a functional property, such as "has".

3.1.8. Different individuals

Since the unique name assumption is not valid in OWL, it is necessary to specify explicitly when individuals are in fact different. This heuristic treats this case. This is one of the heuristics that does not generate a yes or no question. Instead, it presents the user with a list of individuals and the user has to check those that are different among themselves.

3.1.9. Disjoint classes

This one is equivalent to the previous heuristic, except that it deals with classes instead of individuals.

3.1.10. Mereologic relationships

Mereologic relationships are also important relationships in concept maps, especially in some domains, such as technical ones [10]. This paper used the work by Rector et al. [40] as a guideline to implement the description of mereologic relations. The heuristic that searches for mereologic relations also uses a dictionary, but it differs from the aforementioned ones in that it does not show the user a yes/no question, but a list of concepts in the concept map that were found to be possible components of the concept being described. The user can select the ones that make sense. This approach, which differs from the general rule used with other relations, is used because, as a more detailed meaning is given to the relation, the expert may find that some of the concepts that she/he defined as "part of" another concept cannot, under this stricter interpretation, be considered parts.

A second heuristic has also been implemented that, after a mereologic relation has been declared, asks the user whether every individual composed of the same parts defined in that mereologic relation is also a member of the defined class. For example, if the user defines that "airplane have wings", the system would ask her/him something like "Should I consider that every individual that has wings is an airplane?".


3.1.11. Dictionaries and corpus

As mentioned in the description of some heuristics, dictionaries were used to identify some special kinds of relationships. By "dictionary" is meant a list of regular expressions matching possible linking phrases. The original vocabulary was created by analyzing existing concept maps. This was done mainly in English and the expressions were then translated into Portuguese. Afterwards, the vocabulary was expanded by analyzing some maps in Portuguese. The application has a feature that allows an advanced user to expand the vocabulary. This part of the application can be translated quite simply to other languages, provided that a collection of maps is available to create the vocabulary.
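As a rough illustration of such dictionaries, the sketch below stores one list of regular expressions per relation kind and returns every kind that matches a linking phrase. The paper does not list its actual expressions, so all patterns and names here are assumptions; they only reuse the example phrases quoted in the heuristic descriptions ("one of", "any of", "has", "part of").

```python
import re

# Illustrative vocabularies only: the paper does not list its actual
# regular expressions, so every pattern below is an assumption.
VOCABULARY = {
    "enumeration": [r"^(one|any) of$"],
    "generalization": [r"^is an?$", r"^(is )?an? (type|kind) of$"],
    "instantiation": [r"^is an?$", r"^is an? (instance|example) of$"],
    "functional": [r"^has$"],
    "mereologic": [r"^(is )?part of$", r"^(has|have) parts?$", r"^is composed of$"],
}

COMPILED = {
    kind: [re.compile(p, re.IGNORECASE) for p in patterns]
    for kind, patterns in VOCABULARY.items()
}


def candidate_kinds(linking_phrase):
    """Return every relation kind whose vocabulary matches the phrase."""
    text = " ".join(linking_phrase.split())  # normalize whitespace
    return sorted(
        kind
        for kind, patterns in COMPILED.items()
        if any(p.match(text) for p in patterns)
    )


for phrase in ("one of", "is a", "has", "is part of", "flies"):
    print(phrase, "->", candidate_kinds(phrase))
```

The phrase "is a" matches both the generalization and the instantiation vocabularies, which mirrors the overlap noted in Section 3.1.5: in that situation the decision is left to the user.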

For the heuristics that use part-of-speech tagging, the mac-morpho6 corpus was used to train the tagger. This part is more difficult to translate to other languages, mainly due to idiosyncrasies in the grammar rules of different languages. This can make the heuristics easier or more difficult to implement.

4. Experiments and results

To evaluate the method proposed above, an experiment was conducted with 24 people, to check their reaction to the method and test if the method would be accepted by experts working in a corporate environment.

The main questions that this experiment tries to answer are the following:

1. Will domain experts consider concept maps as an acceptable knowledge representation means?

2. Will domain experts consider the use of the conversion application acceptable?

3. Do domain experts consider that the process can be done in an unsupervised (that is, without the presence of a knowledge engineer) mode?

Also, some other impressions from the domain experts were measured. These are presented in Section 4.4.

4.1. Experimental protocol

The experiment was designed so that each participant did a complete iteration of the proposed method, as shown in Fig. 1. After that, each participant was asked to fill in an evaluation form, describing her/his opinions and impressions, both regarding the use of concept mapping and the use of the support application. Before the experiment, a short introduction to the goals of the experiment and the training necessary for the process were given. The time spent in each step of the process was measured to ensure uniformity.

Each interview was composed of basically six parts: (1) introduction to the experiment, (2) training in ontologies, concept mapping and basic OWL concepts, (3) focus question creation, (4) concept map creation, (5) support application usage and (6) answering the questionnaire.

6 http://www.nilc.icmc.usp.br/lacioweb/corpora.htm

Trainings. Regarding the trainings, short (maximum of 10 min each) introductions were given regarding ontologies, concept maps and OWL. The introduction on ontologies focused on explaining their usefulness, to give each participant a sense of purpose for the experiment. The focus of this explanation was on ontologies as applied to knowledge management activities. The explanation of concept maps focused on how to build them. Finally, the introduction to OWL explained briefly what a class, an individual and a relation are. This explanation was very short (the average training time was less than 5 min). The explanations were done verbally but were all based on a standard text.

Focus questions. The standard methodology to produce concept maps starts with the creation of a focus question, to guide the concept map creation and main theme [21, Appendix I]. From the authors' previous experience with people inexperienced in concept mapping, it was known that sometimes the first question is too specific and the expert is not able to start producing the map. So, heuristically it was decided that each expert should provide two focus questions, to provide a "fallback" question in case the expert could not advance in the map. The experiment was not designed to validate this heuristic but, as will be seen in Section 4.3, the results regarding concept map creation were positive.

In each interview, the concept map subject and the focus questions related to it were chosen by the interviewee. The subject was always a technical one that the interviewee was working on at the time, had worked on in the recent past or had studied at graduate level. In each experiment two focus questions were suggested: one of the form "What is …" and the other of the form "What is the importance of …",7 although if the interviewee did not feel comfortable with them they could be changed. For example, one of the maps was about "computational aeroacoustics". The questions created in this case were (1) "What is computational aeroacoustics?" and (2) "What is the importance of computational aeroacoustics in the aeronautical business?".

Concept map creation. Each interviewee had from 20 to 40 min to create the concept map. Even though the limit of 40 min may be somewhat restrictive, this was done to ensure the whole procedure would not take more than one and a half hours (this is generally in line with, for example, the experiment reported in [41]). To draw the maps, the tool CMapTools8 [42] was used. While creating the map the users were told to use only oriented arcs, because most heuristics depend on the orientation of the arcs. Before drawing the map, a short explanation on the tool usage was given (this explanation always took less than 5 min).

Support tool usage. After the map was finished, it was loaded into the support tool. A brief explanation on the support tool usage was given and the users were allowed to answer the questions. During the concept map generation step and the conversion tool usage step, interaction with the interviewee was reduced to a minimum, only to answer some basic questions about usage of any of the tools. Care was taken not to hint in any way at the creation of specific relationships in the concept map or at answering questions in the conversion tool in any specific way. After creating the concept map and answering the questions of the support application, the resulting ontology was shown and explained to the interviewee using Protege9 [43], with the sole goal of giving the expert a sense of purpose.10

7 The interviews and questions were all carried out in Portuguese, so in this paper translations are presented.

8 http://cmap.ihmc.us/.

Table 1. Questionnaire answered by each interviewee after the experiment.

No. | Question | Scale
1 | What was your level of knowledge about concept maps before the experiment? | L(4)
2 | Have you ever been to a knowledge acquisition interview before? | Yes/No
3 | Were the questions provided by the support application relevant? | L(4)+N
4 | Did these questions help detect concepts that were not present in the map on the first time? | L(4)
5 | Did the questions help detect relationships that were not in the map on the first time? | L(4)
6 | Did the questions generate some insight on the subject matter? | L(4)
7 | Did you feel yourself comfortable creating the concept map? | L(3)+N
8 | How do you evaluate the concept map generated with respect to the depth of the subject? | L(5)+N
9 | Which of the following concepts would you associate to the process of creating the concept map? Please select up to five concepts | *
10 | How would you evaluate the ease of use of the support application? | L(5)
11 | Did you feel yourself comfortable using the support application? | L(4)
12 | Do you believe it is possible to execute this process unaided (i.e. without an expert in ontologies or concept maps physically present)? | L(4)
13 | Which of the following concepts would you associate to the support application? Please select up to five concepts | *

L(n): Likert scale with n items; N: neutral option; *: word selection list.

Questionnaire. The questionnaire is composed of 13 main questions plus 3 more questions used to check if the explanations of ontologies, concept mapping and OWL were clear enough (this was used to ensure that a negative answer was not due to a poor explanation or understanding of the goals). In most of these questions, a Likert scale [21] was used, as shown in Table 1. Before the interviewee began answering the questionnaire, she/he was reminded that the answers should be as honest as possible and that she/he should not be concerned about giving negative answers.

In the table, the notation L(n) was used to indicate a Likert scale with n levels. So, L(4) denotes a scale that goes from "Most of" to "None" on question 3 and from "Uncomfortable" to "Extremely comfortable" on question 11. The notation N denotes that, besides the Likert scale, there was a neutral option, such as "No opinion". In most of the questions where Likert scales were used, an even number of options was chosen, to force the interviewee to take a side.

On questions 9 and 13, the interviewee was asked to choose up to 5 words from a list of 16 words. The goal of these questions was to identify the most common associations with concept maps and the support application.

Participants. The study involved 24 participants,11 all of whom answered invitations to participate in the experiment. It is acknowledged that this introduces a bias in the results, since those who answered the invitation already have a more positive view of the experiment than those who did not respond. Nevertheless, this bias is considered unavoidable. The interviewees' backgrounds were mostly in several kinds of engineering (18 out of 24). The other participants were two teachers, two systems analysts, one human resources analyst and one helicopter pilot. Out of this population, 18 worked in an aerospace company (17 engineers and the HR analyst) and were professional colleagues of one of the authors (Rodrigo Starr). All of them were on the same hierarchical level, so there is no reason to suspect that this would induce a bias. The other six were graduate students at ITA12 with several backgrounds and professions.

9 http://protege.stanford.edu/.

10 In fact, this was a change in the original protocol after the first two interviewees asked to see what they had produced. It was decided thereafter to include this as a standard procedure for the following interviews.

11 In fact, 25 interviews were done, but 1 had to be ignored because it was interrupted midway.

The median profile of the interviewees was 29.5 years of age, with 2 years of experience in the current area. Concerning education level, 12 had a master's degree and 1 had a Ph.D. Arguably, the experience time is too short for them to be considered experts, but more experienced people were not available for the experiment.

The sample size was considered in line with other studies in the knowledge acquisition area (such as [44,36]), given the typical difficulties in finding people with an adequate profile.

Data collection. Each interview was conducted without interruption. No previous preparation was given to the interviewee. Except for one instance, only one interview was carried out at a time. In this exceptional instance, two interviews were carried out at the same time. All interviews were done by one of the authors (Rodrigo Starr).

4.2. Results

The answers to the questionnaire are shown in Table 2, together with the one or two most frequently selected answers. It can be seen that, overall, most answers are concentrated on one side of the scale (i.e. with most users agreeing on a positive or negative answer).

The total time spent with training was always below 32 min (18 min on average), which is regarded as evidence of the method's ease of use. Also, it is important to note that some interviewees complained that the time to create the concept map was too short. Overall, most interviewees had only simple doubts during concept map creation and when answering the support application questions.

12 ITA is the Technological Institute of Aeronautics, the institution to which the authors are affiliated.

Table 2. Answers to the questionnaire (number of interviewees choosing each answer option; "-" marks options that do not exist for that question).

Q. | Question | 1 | 2 | 3 | 4 | 5 | 6 | Most common answers
3 | Were the questions provided by the support application relevant? | 18 | 5 | 1 | 0 | 0 | - | 1: Most
4 | Did these questions help detect concepts that were not in the map on the first time? | 1 | 4 | 8 | 11 | - | - | 4: No new concept was found; 3: From 1 to 3 new concepts were found
5 | Did the questions help detect relationships that were not initially in the map? | 1 | 3 | 9 | 11 | - | - | 4: No new relationship was found; 3: From 1 to 3 new relationships were found
6 | Did the questions generate some insight on the subject matter? | 7 | 10 | 3 | 4 | - | - | 2: Generated at least one insight; 1: Did not generate any insight
7 | Did you feel yourself comfortable creating the concept map? | 3 | 3 | 18 | 0 | - | - | 3: Comfortable: the map construction was pleasant
8 | How do you evaluate the generated map with respect to the subject's depth? | 5 | 4 | 2 | 11 | 2 | 0 | 4: Reasonably detailed: the map may be successfully used to explain subject details to a lay person
10 | How would you evaluate the ease of use of the support application? | 0 | 3 | 3 | 11 | 7 | - | 4: Easy: most of the application features were clear; 5: Very easy: the application use is intuitive
11 | Did you feel yourself comfortable using the support application? | 1 | 4 | 15 | 4 | - | - | 3: Comfortable
12 | Do you believe it is possible to execute this process unaided? | 6 | 12 | 5 | 1 | - | - | 2: Unaided, with an expert accessible by phone; 1: Completely unaided: a text description is enough

Q.: question number.

The questionnaire contained five more questions that were used to ensure that the explanations given were good enough and that the population was not biased by including too many people with a knowledge of concept maps. These responses are shown in Table 3. About 58% of the interviewees (14/24) had never heard about concept maps before and only 12% (3/24) had used them at least once. As for the explanations, almost all subjects considered them acceptable, especially those for concept maps and on the usage of the support tool. The explanation of the few description logic concepts was also considered good enough, though in this case the responses were not as strong as with concept maps.

4.3. Analysis

To evaluate the main hypotheses, the answers were grouped in two categories: positive and negative. Then a standard binomial test [45] was applied, considering a p-value always smaller than 5% (p < 0.05). All the hypotheses refer to a proportion of the population of experts. Since there is no previous expected value for the population's proportion, the test was used to find the minimum proportion respecting that p-value.
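One way to read this procedure is as a search for the largest proportion p0 such that a one-sided binomial test of "the true proportion is at most p0" is still rejected at the 5% level given the observed number of positive answers. The Python sketch below follows that reading; it is an assumption about how the test was applied, not the authors' code, and the function names are illustrative.

```python
from math import comb


def upper_tail(successes, n, p):
    """P(X >= successes) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(successes, n + 1))


def min_proportion(successes, n, alpha=0.05, tol=1e-6):
    """Largest p0 for which observing >= successes still has p-value < alpha.

    The upper tail grows with p, so bisection on p finds the boundary.
    """
    low, high = 0.0, 1.0
    while high - low > tol:
        mid = (low + high) / 2
        if upper_tail(successes, n, mid) < alpha:
            low = mid   # H0: p <= mid is rejected, so the bound can move up
        else:
            high = mid
    return low


# Positive answer counts used in the analysis of the hypotheses (n = 24).
for positives in (18, 21, 19, 6):
    print(positives, round(min_proportion(positives, 24), 3))
```

The printed bounds can be compared with the minimum proportions quoted for each hypothesis in the remainder of this section.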

Hypothesis 1. Concept maps are an acceptable knowledge representation means.

This hypothesis was evaluated by question 7 of the questionnaire. In this case, two results were considered: (1) only answers that confirmed the expert was comfortable building the map (answer number 3) and (2) answers indicating that the expert found usage of the map comfortable or as good as any other means of expression (answers numbers 2 and 3).

Table 3. Answers for questions used to ensure that the explanations were clear (number of interviewees choosing each answer option; "-" marks options that do not exist for that question).

Q. | Question | 1 | 2 | 3 | 4 | Most common answers
1 | What was your level of knowledge about concept maps before the experiment? | 14 | 7 | 3 | 0 | 1: None
2 | Have you ever been to a knowledge acquisition interview before? | 2 | 22 | - | - | 2: No
14 | After the provided explanation was it clear how to build concept maps? | 0 | 3 | 4 | 17 | 4: Yes; 3: Generally yes
15 | Were the concepts used by the support application clear after the explanation? | 0 | 3 | 7 | 14 | 4: Yes; 3: Generally yes
16 | After the provided explanation was it clear how to use the support application? | 0 | 1 | 12 | 11 | 3: After some minutes I had no problems using it; 4: Yes

Q.: question number.

For the first case, there were 18 out of 24 positive responses. With p < 0.05, it is possible to infer that at least 56% of the population considers concept maps a comfortable means of expression.

Considering the second case, 21 out of 24 experts considered concept maps as good as other means of expression. This allows the inference that at least 70% of the population considers concept maps an acceptable means of expression.

These results are interpreted as evidence that concept maps can be used as a tool to assist in knowledge acquisition. The relatively high acceptance rate in the second case indicates that, even when they are not the preferred method of expression, they are considered as good as other methods.

Hypothesis 2. Use of the support application is acceptable to the experts.

This hypothesis was evaluated by question 11 of the questionnaire. The answers 4 (extremely comfortable) and 3 (comfortable) were considered positive answers, while the other two were considered negative. Following the same procedure shown before, it is possible to infer that at least 61% of the population will consider the use of the support application acceptable (again, with p < 0.05).

Again, this result was interpreted as evidence that the proposed way of generating ontologies from concept maps will be generally accepted by experts. Another point, which was not measured directly but is worth mentioning, is that most of the negative responses to the support application were related to concept maps that did not trigger any "complex" heuristics, such as identification of mereologic or generalization relationships. This suggests that improvements to the heuristics would increase the acceptability rate.

Hypothesis 3. Experts consider that the process can be done without supervision.

Question 12 evaluated this hypothesis. Two results were considered: (1) the process could be done in a completely unsupervised way (answer number 1) and (2) the process could be done completely unsupervised or with an expert in the process available by phone (answer numbers 1 and 2).

For the first case (completely unsupervised), there were only 6 out of 24 positive responses. This gives an inferred proportion of the population of only 11% (p < 0.05) that would accept the process in a completely unsupervised way.

Considering the second case, where an expert on the process is available by phone, the inferred proportion rises to 56% (p < 0.05). It can be argued that this value is still low, although far better than the previous one.

From these results, it was concluded that the process cannot be done completely without supervision. Even the proportion that would accept the process with an expert available by phone was not considered high enough to accept the hypothesis (to the best of the authors' knowledge, there is no reference value available in the literature). Some experts mentioned that, after some more experience both with concept maps and the conversion tool, they would be able to work without supervision. This raises the possibility that a longer training could make the process more acceptable.

4.4. Additional comments

The responses to question 3 indicate that the questions produced by the support application were relevant to the interviewees. Twenty-three interviewees responded that at least some of the questions were relevant. This is an indication that the application was good enough to be useful for evaluating the method.

Questions 4 and 5 ask if the support tool helped detect new concepts or relations in the original concept map. The answers show that, in general, few new concepts or relations were detected while using the support tool. This is taken as evidence that the questions asked by the support application were too shallow with respect to the domain of the concept map, that is to say, they did not explore the concept map very well. This may be improved through better heuristics.

Interestingly, the answers to question 6 indicate that, for about half (13) of the interviewees, the use of the support application generated at least one insight on the concept map subject matter. This was interpreted as a result of the focus of the questions on ontological relationships, a focus that is unusual for an expert. This interpretation is in agreement with the theory that expert knowledge is more pragmatic than theoretical and that the knowledge acquisition activity must be considered as a modeling activity where the expert structures her/his knowledge, sometimes for the first time [46,47]. The description of the insight generated was not part of the questionnaire, but, as an example, one user reported that, while answering the support tool questions, he noticed some relationships between concepts that he had never thought about before.

In question 9 (not shown in Table 2), the interviewees were asked to select up to five words (from a list of 16) that they thought were related to the concept mapping activity. In this case, the most selected words were "Illustrative" (18 votes) and "Interesting" (16 votes). Also, most of the words selected (93 out of a total of 101 votes) had a positive connotation, indicating that most interviewees' feelings towards the task of concept map building were positive. Also, contrary to what was previously reported [21, p. 195], "Difficult" had only three votes, which was considered additional evidence that concept maps would be well accepted in this population.

Question 13 (also not shown in the table) asked the same question but about the support tool. The most voted words were "Useful", with 14 votes, and "Clarifying", with 12 votes. Nevertheless, in this case the proportion between negative and positive votes was different, with 56 votes for positive words and 21 votes for negative words.

4.4.1. Concept maps and ontologies quality

Throughout the interviews, the size and comprehensiveness of the generated concept maps and ontologies were quite variable. To better illustrate this point, Table 4 shows some data about the concept maps. In this table, the row labeled "Total" is the number of map elements (i.e. the number of concepts plus the number of linking phrases) and the row labeled "Connections" counts the number of arcs in the map (where both arcs going to and coming from a linking phrase were counted). The row "Difference" describes the difference between the number of connections and the number of elements. The negative number in the "Difference" row is because one of the maps was in fact a tree.

Table 4. Some descriptive statistics about the maps.

Feature | Min. | Max. | Average | Median | SD
Concepts | 12 | 43 | 22.79 | 20.5 | 8.36
Linking phrases | 5 | 69 | 20.75 | 18.5 | 13.10
Total | 17 | 112 | 43.54 | 38 | 20.62
Connections | 19 | 155 | 50.25 | 43 | 27.89
Difference | −1 | 43 | 6.71 | 5 | 8.51

SD: standard deviation.

One of the ways to analyze the map structure is counting the difference between the number of connections and the number of map elements. If the graph is a tree (i.e. a connected graph without cycles), this difference is always −1. Assuming that the graph is connected, each unit above −1 means the existence of a cycle, so a graph with a value of 1 for the difference must have two cycles. Taking into account that only one experiment did not produce a connected concept map, it can be seen that, in general, the maps did not have too many cycles (the average difference was 6.71, which corresponds to about 7.7 cycles per map). This is an indication that the concept maps generated had in general few details, because, according to the cognitive theory underlying concept maps [21], a map that represents the knowledge domain deeply should have several cycles.
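The relation between the "Difference" value and the number of cycles can be checked with a short sketch. The function below counts independent cycles as E − V + C (arcs, nodes, connected components), which equals the difference plus one for a connected map; the function name and the node/arc representation are assumptions made for this illustration, and arc direction is ignored.

```python
def independent_cycles(nodes, arcs):
    """Number of independent cycles, ignoring arc direction.

    For a graph with V nodes, E arcs and C connected components this is
    E - V + C, which reduces to (E - V) + 1 for a connected map.
    Nodes include both concepts and linking phrases.
    """
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in arcs:
        parent[find(a)] = find(b)  # merge the two components

    components = len({find(node) for node in nodes})
    return len(arcs) - len(nodes) + components


# A tree-shaped map: airplane -> has -> wing (difference = 2 - 3 = -1).
tree_nodes = ["airplane", "has", "wing"]
tree_arcs = [("airplane", "has"), ("has", "wing")]
print(independent_cycles(tree_nodes, tree_arcs))

# Adding a second linking phrase back to "airplane" closes one cycle.
cyclic_nodes = tree_nodes + ["is part of"]
cyclic_arcs = tree_arcs + [("wing", "is part of"), ("is part of", "airplane")]
print(independent_cycles(cyclic_nodes, cyclic_arcs))
```

Counting components explicitly keeps the formula valid for the one map in the experiment that was not connected.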

Table 5 shows some descriptive statistics on the generated ontologies. These ontologies are basic ontologies generated after the user responded to all questions posed by the support application.

Notwithstanding the fact that there is so far no domain-independent way to analyze the quality of an ontology [48], the size of the generated ontologies was in general small (with less than 11 classes or 18 individuals), reflecting in part the small size of the concept maps. Part of this can be attributed to the inexperience of the interviewees in creating concept maps, part of it to the relatively short time available to create the map and part to deficiencies inherent to the support application. The average number of generalization relations produced was very low, probably due to the inability of the support application to detect more generalization relations and the lack of experience of the interviewees with concept mapping.

Finally, an item that was not asked about directly but was brought up by several users during the interviews was the quality of the generated ontology. After visualizing the result (in Protege), several users considered it acceptable, but said that, after they had seen the end product and understood the process better, they would have included other information in the map. This was seen as evidence that it is possible to guide the expert to generate better content even before the first interaction with the knowledge engineer, given good enough tool support.

4.4.2. Number of questions generated by the support application

The number of questions generated by the application was not recorded as part of the protocol, but some comments on it are important in light of the experiments.

Table 5. Some descriptive statistics on the resulting ontologies.

OWL construct | Min. | Max. | Average | Median | SD
Classes | 1 | 36 | 15 | 13 | 9.65
Individuals | 0 | 34 | 10.88 | 10 | 10.96
Relations | 5 | 23 | 13.88 | 13 | 5.59
Generalizations | 0 | 16 | 4.67 | 2.5 | 5.35
Domains | 0 | 29 | 7.46 | 8 | 6.88
Ranges | 0 | 23 | 6.46 | 5.5 | 6.05

SD: standard deviation.


From the heuristics presented, it can be seen that the user will be presented with at least one question per concept and one question per linking phrase in the concept map. This number of questions can overwhelm the user in large maps. Nevertheless, answers to most of the other heuristics, if positive, produce a marked reduction in the number of questions. As an example, in the biggest concept map produced in the experiment, 161 questions were produced (for a map with 43 concepts and 69 linking phrases). After the first 22 questions were answered positively, the number of active questions (that is, questions that still make sense after the given answers) fell to 102.

Of course, the reduction in the number of questions presented to the expert depends on the quality of the heuristics and on the kind of relationships that the expert chose to express in the map. During the experiments, there were some cases of very successful matching of heuristics, where most of the linking phrases drawn by the user represented taxonomic or mereologic relationships. Also, there were cases where only the concept and relation heuristics were matched, and in these cases the user had to answer a lot of repetitive questions (this was in fact mentioned by some users).

5. Conclusion

This paper presented a method that can be used as the first step in an ontology acquisition methodology. The main contributions are:

• The use of conceptual mapping as the intermediate knowledge representation language.

• The use of a tool to process the generated conceptual map and question the expert with the goal of refining the ontological descriptions presented in the map, before the first iteration with the knowledge engineer.

To test the method, a proof of concept of the support application was built and a set of 24 volunteers tested the method, answering a questionnaire after completion. The results have shown that the following two hypotheses may be considered true, with p < 0.05:

1. Concept maps are an acceptable knowledge representation means for at least 70% of the experts.

2. The use of the proposed support application is acceptable by at least 61% of the experts.
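Claims of the form "acceptable for at least a given proportion of experts, with p < 0.05" can be examined with a one-sided test on a proportion. The sketch below uses an exact binomial test as an assumption, since this section does not state which test was applied. The sample size of 24 comes from the text, while the number of agreeing volunteers is a hypothetical value chosen only for illustration.

```python
from math import comb


def binomial_upper_tail(successes: int, n: int, p0: float) -> float:
    """P(X >= successes) for X ~ Binomial(n, p0)."""
    return sum(
        comb(n, k) * p0**k * (1.0 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )


def supports_proportion(successes: int, n: int, p0: float,
                        alpha: float = 0.05) -> bool:
    """True if H0: p <= p0 is rejected in favour of p > p0 at level alpha."""
    return binomial_upper_tail(successes, n, p0) < alpha


N_VOLUNTEERS = 24           # from the text
HYPOTHETICAL_AGREEING = 22  # illustrative only; not a figure from the paper

p_value = binomial_upper_tail(HYPOTHETICAL_AGREEING, N_VOLUNTEERS, 0.70)
print(f"one-sided p-value: {p_value:.4f}")
print("claim supported:",
      supports_proportion(HYPOTHETICAL_AGREEING, N_VOLUNTEERS, 0.70))
```

With a sample of this size, a normal approximation may give somewhat different p-values than the exact tail computed here.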

13 Versions of Wordnet in different languages can be found at http://multiwordnet.fbk.eu/english/home.php.

Notwithstanding these two positive results, the third hypothesis had to be rejected; that is, experts did not consider that the proposed process could be done completely without supervision. If this requirement is relaxed to allow support by phone during the process, the proportion of experts in agreement rises to 56%, which is a far better result but was still not considered enough to support the claim that the use of this process can reduce the participation of the knowledge engineer in the first stages of the knowledge acquisition process.

These results nevertheless confirmed the previous results that concept maps are a viable way for experts to interact with a knowledge acquisition tool and, more importantly, that the concept proposed for the support application is also viable. It is believed that further improvements in the support application and slightly longer training in concept maps and the support tool may increase the acceptability rate of the process as an unsupervised knowledge acquisition process. It is important to note that all the training given amounted to no more than 30 min.

Some research avenues opened by this work are the following:

1. Studying the interaction of experts with larger concept maps and ontologies, and improving the support tool to use previously defined ontologies together with the generated questions. This would facilitate, for example, the creation of several concept maps by different experts and the merging of these concept maps under the experts' supervision.

2. Testing this process step together with other ontology development methodologies.

3. Studying more heuristics for converting concept map representations into ontological constructs. The set used in this application may be considered a starting point.

Also, there is interest in improving the natural language processing capabilities of the tool. One of the more direct improvements would be to allow it to use vocabulary databases, such as the Princeton Wordnet and Wordnets in other languages (see footnote 13), to avoid term ambiguity and to suggest new relationships from synsets.

Another improvement would be the possibility of connecting the tool to existing ontologies so that it can suggest concepts or relations to the user, or ask the user about synonyms (and, with the answers, begin the alignment of the user ontology with the preexisting one). This feature would also allow the integration of upper level ontologies into the process.

Finally, it would be interesting to study how the application generalizes to other languages. The application's heuristics were built in Portuguese, and it is considered that most heuristics (14 of the 16) would be easily convertible to English, while the other two are convertible but would require more effort. Other languages may pose more specific problems, such as those that do not follow a subject–verb–object order.
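One way to picture the conversion of keyword-based heuristics between languages is to keep the keywords in a per-language table and leave the matching logic unchanged. The sketch below is only an illustration under that assumption. The keyword phrases, the relation labels and the function name are hypothetical and do not reproduce the 16 heuristics implemented in the tool.

```python
from typing import Dict, Optional

# Hypothetical keyword tables: these phrases are illustrative guesses,
# not the heuristics actually implemented in the support application.
KEYWORDS: Dict[str, Dict[str, str]] = {
    "pt": {
        "é um": "generalization",
        "é uma": "generalization",
        "é parte de": "mereology",
        "tem": "mereology",
    },
    "en": {
        "is a": "generalization",
        "is an": "generalization",
        "is part of": "mereology",
        "has": "mereology",
    },
}


def classify_linking_phrase(phrase: str, language: str) -> Optional[str]:
    """Return the candidate relation kind for a linking phrase, or None."""
    # Lowercase and collapse whitespace so that "  É  um " matches "é um".
    normalized = " ".join(phrase.lower().split())
    return KEYWORDS.get(language, {}).get(normalized)


print(classify_linking_phrase("É um", "pt"))        # generalization
print(classify_linking_phrase("is part of", "en"))  # mereology
print(classify_linking_phrase("drives", "en"))      # None
```

A table of this kind covers only heuristics that depend on fixed phrases. Heuristics that rely on word order, as in languages that do not follow a subject–verb–object order, would need changes to the matching logic itself.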

Acknowledgments

The authors would like to thank all those who participated in the experiments, both during debugging of the tool and during the data collection phase.

Also, the authors would like to thank Carina T. Rizzi for kindly revising the manuscript.


References

[1] T. Berners-Lee, J. Hendler, O. Lassila, The semantic web, Scientific American 284 (5) (2001) 28–37.

[2] D. Fensel, C. Bussler, Y. Ding, V. Kartseva, M. Klein, M. Korotkiy, B. Omelayenko, R. Siebes, Semantic web application areas, in: Proceedings of the Seventh International Workshop on Applications of Natural Language to Information Systems (NLDB 2002), Citeseer, Stockholm, Sweden, 2002.

[3] J. Davies, D. Fensel, F. van Harmelen (Eds.), Towards the Semantic Web: Ontology-Driven Knowledge Management, John Wiley & Sons, Ltd., Chichester, 2003.

[4] M. Hepp, Possible ontologies: how reality constrains the development of relevant ontologies, IEEE Internet Computing 11 (1) (2007) 90–96.

[5] G. Schreiber, H. Akkermans, A. Anjewierden, R. de Hoog, N. Shadbolt, B. Wielinga, Knowledge Engineering and Management: The CommonKADS Methodology, MIT Press, Cambridge, 1999.

[6] A.C. Scott, J.E. Clayton, E.L. Gibson, A Practical Guide to Knowledge Acquisition, Addison-Wesley Publishing Co., Inc., Boston, 1991.

[7] A. Hart, Knowledge Acquisition for Expert Systems, 2nd ed., McGraw-Hill, New York, 1992.

[8] J. Evermann, J. Fang, Evaluating ontologies: towards a cognitive measure of quality, Information Systems 35 (4) (2010) 391–403.

[9] K.M. Ford, J.M. Bradshaw, J.R. Adams-Webber, N.M. Agnew, Knowledge acquisition as a constructive modeling activity, International Journal of Intelligent Systems 8 (1993) 9–32.

[10] M. Stokes (Ed.), Managing Engineering Knowledge, ASME Press, New York, 2001.

[11] A. Zouaq, D. Gasevic, M. Hatala, Towards open ontology learning and filtering, Information Systems 36 (7) (2011) 1064–1081.

[12] P. Clark, J. Thompson, K. Barker, B. Porter, V. Chaudhri, A. Rodriguez, J. Thomere, S. Mishra, Y. Gil, P. Hayes, T. Reichherzer, Knowledge entry as the graphical assembly of components, in: Proceedings of the First International Conference on Knowledge Capture, ACM, 2001, pp. 22–29.

[13] S. Bussmann, N.R. Jennings, M. Wooldridge, Multiagent Systems for Manufacturing Control: A Design Methodology, Springer-Verlag, Berlin, 2004.

[14] R. Studer, V.R. Benjamins, D. Fensel, Knowledge engineering: principles and methods, Data & Knowledge Engineering 25 (1–2) (1998) 161–197.

[15] The MOKA Consortium, Coventry, MOKA User Guide (MOKA Modelling Language Core Definitions), June 2000.

[16] H. Eriksson, R.W. Fergerson, Y. Shahar, M.A. Musen, Automatic generation of ontology editors, in: Proceedings of the 12th Banff Knowledge Acquisition Workshop, 1999.

[17] N.F. Noy, W. Grosso, M.A. Musen, Knowledge-acquisition interfaces for domain experts: an empirical evaluation of Protégé-2000, in: Proceedings of the 12th International Conference on Software and Knowledge Engineering, Chicago, USA, July 5–7, 2000, Citeseer, 2000.

[18] K.K. Breitman, J.C.S. do Prado Leite, Ontology as a requirements engineering product, in: Proceedings of the 11th IEEE International Requirements Engineering Conference, IEEE Computer Society, Washington, 2003, pp. 309–319.

[19] C.H. Felicíssimo, L.F. da Silva, K. Breitman, J.C.S.P. Leite, Geração de ontologias subsidiada pela engenharia de requisitos, in: Anais do WER03 - Workshop em Engenharia de Requisitos, 2003, pp. 255–269.

[20] Y. Sure, R. Studer, A methodology for ontology-based knowledge management, in: Davies et al. (Eds.), Towards the Semantic Web: Ontology-Driven Knowledge Management, John Wiley & Sons, Ltd., Chichester, 2003, pp. 33–46 (Chapter 3).

[21] J.D. Novak, Learning, Creating and Using Knowledge: Concept Maps as Facilitative Tools in Schools and Corporations, Lawrence Erlbaum Associates, Mahwah, 1998.

[22] J.D. Novak, A.J. Cañas, The Theory Underlying Concept Maps and How to Construct and Use Them, Technical Report 2006-01 Rev 01-2008, Florida Institute for Human and Machine Cognition (IHMC), 2008.

[23] A.G. Castro, P. Rocca-Serra, R. Stevens, C. Taylor, K. Nashar, M.A. Ragan, S.-A. Sansone, The use of concept maps during knowledge elicitation in ontology development processes - the nutrigenomics use case, BMC Bioinformatics 7 (1) (2006) 267.

[24] A. Garcia, S.-A. Sansone, P. Rocca-Serra, C. Taylor, M.A. Ragan, The use of conceptual maps for two ontology developments: nutrigenomics, and a management system for genealogies, in: Eighth International Protégé Conference, 2005.

[25] A. Willemsen, G. Jansen, J. Komen, S. Van Hooff, H. Waterham, P. Brites, R. Wanders, A. van Kampen, Organization and integration of biomedical knowledge with concept maps for key peroxisomal pathways, Bioinformatics 24 (16) (2008) i21–i27.

[26] J.W. Coffey, R.R. Hoffman, A.J. Cañas, K.M. Ford, A concept map-based knowledge modeling approach to expert knowledge sharing, in: The IASTED International Conference on Information and Knowledge Sharing, ACTA Press, 2002.

[27] F. Lehmann, Semantic networks, Computers and Mathematics with Applications 23 (2–5) (1992) 1–50.

[28] D. Nardi, R.J. Brachman, An introduction to description logics, in: Baader et al. (Eds.), The Description Logic Handbook: Theory, Implementation and Applications, Cambridge University Press, Cambridge, 2002, pp. 5–44.

[29] R. Kremer, Constraint Graphs: A Concept Map Meta-Language, Ph.D. Thesis, The University of Calgary, 1997.

[30] H. Gómez-Gauchía, B. Díaz-Agudo, P. González-Calero, Two-layered approach to knowledge representation using conceptual maps description logic, in: Proceedings of the First International Conference on Concept Mapping, Universidad Pública de Navarra, Pamplona, 2004.

[31] P. Hayes, R. Saavedra, T. Reichherzer, A Collaborative Development Environment for Ontologies (CODE), in: Semantic Integration Workshop (ISWC), 2003.

[32] P. Hayes, T.C. Eskridge, M. Mehrotra, D. Bobrovnikoff, T. Reichherzer, R. Saavedra, COE: Tools for Collaborative Ontology Development and Reuse, Technical Report, Institute for Human & Machine Cognition, FL, 2005.

[33] V. Brilhante, G.T. Macedo, S.F. Macedo, Heuristic transformation of well-constructed conceptual maps into OWL preliminary domain ontologies, in: Proceedings of the Second Workshop on Ontologies and their Applications, CEUR-WS, Ribeirão Preto, 2006.

[34] G.T. de Macedo, Construção de ontologias de domínio a partir de mapas conceituais, Mestrado, Universidade Federal do Amazonas, March 2007.

[35] V. Chaudhri, K. Murray, J. Pacheco, P. Clark, B. Porter, P. Hayes, Graph-based acquisition of expressive knowledge, Engineering Knowledge in the Age of the Semantic Web 3257 (2004) 231–247.

[36] V. Dimitrova, R. Denaux, G. Hart, C. Dolbear, I. Holt, A.G. Cohn, Involving domain experts in authoring OWL ontologies, Lecture Notes in Computer Science 5318 (2008) 1–16.

[37] N. Derbentseva, F. Safayeni, A.A. Cañas, Experiments on the effects of map structure and concept quantification during concept map construction, in: Proceedings of the First International Conference on Concept Mapping, vol. 1, Universidad Pública de Navarra, Pamplona, 2004.

[38] S. Bird, E. Klein, E. Loper, Natural Language Processing with Python, O'Reilly & Associates Inc., 2009.

[39] A. Borgida, R.J. Brachman, Conceptual modeling with description logics, in: Baader et al. (Eds.), The Description Logic Handbook: Theory, Implementation and Applications, Cambridge University Press, Cambridge, 2002, pp. 359–381 (Chapter 10).

[40] A. Rector, C. Welty, N. Noy, E. Wallace, Simple Part-Whole Relations in OWL Ontologies, August 2005.

[41] T. Reichherzer, D. Leake, Understanding the role of structure in concept maps, in: Proceedings of the 28th Annual Conference of the Cognitive Science Society, Citeseer, Vancouver, Canada, 2006, pp. 2004–2009.

[42] A.J. Cañas, G. Hill, R. Carff, N. Suri, J. Lott, T. Eskridge, G. Gomez, M. Arroyo, R. Carvajal, CmapTools: a knowledge modeling and sharing environment, in: Concept Maps: Theory, Methodology, Technology. Proceedings of the First International Conference on Concept Mapping, vol. 1, Citeseer, 2004, pp. 125–133.

[43] M. Horridge, H. Knublauch, A. Rector, R. Stevens, C. Wroe, A Practical Guide to Building OWL Ontologies using the Protégé-OWL Plugin and CO-ODE Tools, University of Manchester, 1st ed., August 2004.

[44] N. Milton, D. Clarke, N. Shadbolt, Knowledge engineering and psychology: towards a closer relationship, International Journal of Human–Computer Studies 64 (12) (2006) 1214–1229.

[45] D.C. Montgomery, G.C. Runger, Applied Statistics and Probability for Scientists and Engineers, 3rd ed., John Wiley & Sons, 2003.

[46] B.R. Gaines, Modeling practical reasoning, International Journal of Intelligent Systems 8 (1993) 51–70.

[47] B.R. Gaines, M.L.G. Shaw, J.B. Woodward, Modeling as framework for knowledge acquisition methodologies and tools, International Journal of Intelligent Systems 8 (1993) 155–168.

[48] J. Brank, M. Grobelnik, D. Mladenic, A survey of ontology evaluation techniques, in: Proceedings of the Conference on Data Mining and Data Warehouses (SiKDD 2005), Citeseer, 2005, pp. 166–170.

