Ontology development and evaluation Dr. Alexandra I. Cristea.

Post on 18-Jan-2016


Why develop an ontology?

Shared understanding of information structure (between people or agents)
Enable reuse of domain knowledge
Make domain assumptions explicit
Separate domain knowledge from operational knowledge
Analyse the domain knowledge

Source: Ontology Development 101: A Guide to Creating Your First Ontology

What is an ontology? (reminder)

A way of encoding domain knowledge and linking it, which allows for reasoning with the data
Ontologies allow for data integration and inference, for automated query-answering and automated use of data
A formal explicit description of concepts in a domain of discourse (classes/concepts), concept properties (slots/roles/properties), and restrictions on slots (facets/role restrictions)

Developing an ontology in practice (reminder)

defining classes
arranging classes in a hierarchy (subclass/superclass)
defining properties/slots and their allowed values
filling in values for slots for instances
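The activities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `OntClass`, `Instance`, `all_slots` and `set_value`, and the in-memory representation itself, are hypothetical and do not come from any ontology tool.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OntClass:
    """A class (concept) with an optional superclass and named slots."""
    name: str
    parent: Optional["OntClass"] = None
    slots: dict = field(default_factory=dict)  # slot name -> allowed values

    def all_slots(self) -> dict:
        """Slots defined here plus those inherited from superclasses."""
        inherited = self.parent.all_slots() if self.parent else {}
        return {**inherited, **self.slots}


@dataclass
class Instance:
    """An individual of a class, with values filled in for its slots."""
    name: str
    cls: OntClass
    values: dict = field(default_factory=dict)

    def set_value(self, slot: str, value: str) -> None:
        allowed = self.cls.all_slots().get(slot)
        if allowed is None:
            raise KeyError(f"{self.cls.name} has no slot {slot!r}")
        if value not in allowed:
            raise ValueError(f"{value!r} is not allowed for {slot!r}")
        self.values[slot] = value


# Define classes, arrange them in a hierarchy, and attach slots.
wine = OntClass("Wine", slots={"hasBody": {"Light", "Medium", "Full"}})
red_wine = OntClass("RedWine", parent=wine, slots={"hasColor": {"Red"}})

# Fill in slot values for an instance (the instance name is made up).
bottle = Instance("ExampleRed", red_wine)
bottle.set_value("hasBody", "Light")
bottle.set_value("hasColor", "Red")
```

The subclass inherits `hasBody` from `Wine`, so both slots can be filled on the `RedWine` instance.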

Rules in ontology design: a simple knowledge engineering methodology

1. There is no one correct way to model a domain; there are always viable alternatives. The best solution depends on the application and the extensions anticipated.
2. Ontology development is an iterative process.
3. Concepts should be close to objects (physical/logical) and relationships in the domain of interest.
  – E.g., nouns (objects) or verbs (relationships) in sentences that describe the domain.

Ontology design steps

1. Determine the domain and scope of the ontology
2. Consider reusing existing ontologies
3. Enumerate important terms in the ontology
4. Define the classes and the class hierarchy
5. Define the properties of classes (slots)
6. Define the facets of the slots
7. Create instances

1. Domain and scope: general questions

What is the domain that the ontology will cover?
  – E.g., wine and food
What are we going to use the ontology for?
  – E.g., for applications that suggest good combinations of wines and food
For what types of questions should the information in the ontology provide answers?
  – See competency questions
Who will use and maintain the ontology?
  – E.g., users could be restaurant customers deciding which wine to order, so price is necessary
  – Maintainers may use a different language, so a mapping is necessary

1. Domain and scope: competency questions

A list of questions that a knowledge base based on the ontology should be able to answer
These questions will serve as the litmus test later:
  – Does the ontology contain enough information to answer these types of questions?
  – Do the answers require a particular level of detail or representation of a particular area?

1. Example competency questions

Which wine characteristics should I consider when choosing a wine?
Is Bordeaux a red or white wine?
Does Cabernet Sauvignon go well with seafood?
What is the best choice of wine for grilled meat?
Which characteristics of a wine affect its appropriateness for a dish?
Does a bouquet or body of a specific wine change with vintage year?
What were good vintages for Napa Zinfandel?

2. Reusing existing ontologies

May be a requirement if the system needs to interact with other applications
Many ontologies are already available
May require translation of the formalism used
Libraries: SWOOGLE; DAML Library; Ontolingua
Consider also domain-specific lists and classifications
  – e.g., lists of wine properties from commercial websites

3. Important terms

Write down a list of all terms we would like either to make statements about or to explain to a user
  – What are the terms we would like to talk about?
  – What properties do those terms have?
  – What would we like to say about those terms?

4. Classes and hierarchy: approaches

A top-down development process starts with the definition of the most general concepts in the domain, followed by specialization of those concepts.
A bottom-up development process starts with the definition of the most specific classes, the leaves of the hierarchy, with subsequent grouping of these classes into more general concepts.
A combination development process combines the top-down and bottom-up approaches: define the more salient concepts first and then generalize and specialize them appropriately.

4. Classes and hierarchy: approach examples

top-down
  – start with general concepts Wine, Food; specialize Wine into WhiteWine, RedWine, RoséWine, etc.
bottom-up
  – start with leaves of the hierarchy: Pauillac and Margaux wines; then, the common superclass: Medoc, which is a subclass of Bordeaux.
combination
  – might start with the top-level concept Wine, then specific concepts such as Margaux, then mid-level concepts such as Medoc, etc.

5. Properties (slots)

Classes alone won’t answer all competency questions (Step 1). Once we have defined some classes, we must describe the internal structure of concepts.
After selecting classes from the list of terms (Step 3), the remaining terms are likely to be properties of classes.
For each property, we must determine which class it describes. These become slots attached to classes.

5. Object property types

“intrinsic” properties (e.g., flavour of a wine)
“extrinsic” properties (e.g., a wine’s name and provenance area)
parts, for a structured object (e.g., courses of a meal)
relationships to other individuals (e.g., maker of wine, relationship between wine and winery)

6. Facets of properties/slots

value type (string, number, Boolean, enumerated, instance)
allowed values (e.g., name of wine is a String)
number of values (cardinality: exact, min, max)
Domain: classes to which the slot is attached (I)
Range: allowed classes for slots of type instance (O)
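Facets can be read as checks that a slot's values must pass. The sketch below is a hypothetical illustration, not a real ontology API: the `Facet` class and the `check_slot` function are assumed names, and only the value-type and cardinality facets are covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Facet:
    """Restrictions on one slot: value type plus min/max cardinality."""
    value_type: type
    min_card: int = 0
    max_card: Optional[int] = None  # None means unbounded


def check_slot(facet: Facet, values: list) -> list:
    """Return a list of facet violations (empty if the values are valid)."""
    problems = []
    if len(values) < facet.min_card:
        problems.append(f"needs at least {facet.min_card} value(s)")
    if facet.max_card is not None and len(values) > facet.max_card:
        problems.append(f"allows at most {facet.max_card} value(s)")
    for v in values:
        if not isinstance(v, facet.value_type):
            problems.append(f"{v!r} is not of type {facet.value_type.__name__}")
    return problems


# A wine has exactly one name, and the name is a string.
name_facet = Facet(value_type=str, min_card=1, max_card=1)
print(check_slot(name_facet, ["Chateau Morgon"]))  # no violations
print(check_slot(name_facet, []))                  # cardinality violation
print(check_slot(name_facet, ["A", 42]))           # cardinality and type violations
```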

6. Domain and Range of properties: design issues

If a list of classes defining a range or a domain of a slot includes a class and its subclass, remove the subclass.
If a list of classes defining a range or a domain of a slot contains all subclasses of a class A, but not class A, the range should contain only class A and not the subclasses.
If a list of classes defining a range or a domain of a slot contains all but a few subclasses of a class A, consider whether class A makes a more appropriate range definition.
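The first two rules are mechanical enough to sketch in code. The function below is an illustrative assumption, not a standard algorithm: `parent_of` is a hypothetical child-to-parent map, and it assumes single inheritance and no cycles. The third rule is a judgement call and is left to the modeller.

```python
def ancestors(cls, parent_of):
    """All strict superclasses of cls, following parent_of links upward."""
    result = []
    while cls in parent_of:
        cls = parent_of[cls]
        result.append(cls)
    return result


def simplify_range(classes, parent_of):
    """Apply the first two design rules to a domain or range list."""
    current = set(classes)
    changed = True
    while changed:
        changed = False
        # Rule 1: drop any class whose superclass is already listed.
        redundant = {c for c in current
                     if any(a in current for a in ancestors(c, parent_of))}
        if redundant:
            current -= redundant
            changed = True
        # Rule 2: if every direct subclass of A is listed, list A instead.
        for parent in set(parent_of.values()):
            children = {c for c, p in parent_of.items() if p == parent}
            if children <= current:
                current = (current - children) | {parent}
                changed = True
                break
    return sorted(current)


parent_of = {
    "WhiteWine": "Wine",
    "RedWine": "Wine",
    "RoseWine": "Wine",
    "Chardonnay": "WhiteWine",
}
print(simplify_range(["Wine", "RedWine"], parent_of))
print(simplify_range(["WhiteWine", "RedWine", "RoseWine"], parent_of))
```

Both example lists reduce to the single class `Wine`.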

7. Create instances

1) choosing a class
  – E.g., class BeaujolaisWine
2) creating an individual instance of that class
  – E.g., Chateau-Morgon-Beaujolais
3) filling in the property (slot) values
  – E.g., hasBody Light; hasColor Red; hasFlavour Delicate, etc.

Error detection in building ontologies


4. Classes and class hierarchy: correction

Is-a: hierarchical relations:
  – A subclass of a class represents a concept that is a “kind of” the concept that the superclass represents.
  – E.g., Chardonnay is a subclass of WhiteWine
  – Common mistake: including both the singular and the plural version of the same concept in the hierarchy, making the former a subclass of the latter (e.g., a class Wines with a class Wine as its subclass is an error).

4. Classes and class hierarchy: correction

Transitivity of hierarchical relations
  – If B is a subclass of A and C is a subclass of B, then C is a subclass of A
  – E.g., we define a class Wine, and then define a class WhiteWine as a subclass of Wine. Then we define a class Chardonnay as a subclass of WhiteWine.
  – Conclusion: Chardonnay is also a subclass of Wine.
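Transitivity means the full set of superclasses can be computed from the direct subclass-of links alone. A minimal sketch, assuming a hypothetical `direct` map from each class to the list of its direct superclasses:

```python
def superclasses(cls, direct):
    """All superclasses of cls implied by transitivity of subclass-of."""
    seen = set()
    stack = list(direct.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(direct.get(parent, ()))
    return seen


# Only the direct links are stored; Chardonnay -> Wine is derived.
direct = {"WhiteWine": ["Wine"], "Chardonnay": ["WhiteWine"]}
print(superclasses("Chardonnay", direct))
```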

4. Classes and class hierarchy: correction

Evolution of a class hierarchy: a subclass of one class now may become a subclass of another class later
  – E.g., Zinfandel wines used to be red. Therefore, Zinfandel was a subclass of the RedWine class. Now, some Zinfandels are also rosé.
  – Conclusion: We need to break the Zinfandel class into two classes, WhiteZinfandel and RedZinfandel, and classify them as subclasses of RoseWine and RedWine, respectively.

4. Classes and class hierarchy: correction

Classes and their names:
  – Classes represent concepts in the domain and not the words that denote these concepts.
    • e.g., renaming the class Shrimp to Prawn does not change the concept it represents
  – Synonyms for the same concept do not represent different classes.
  – Common mistake: having both classes above for the same concept (avoid if possible)

4. Classes and class hierarchy: correction

Avoiding class cycles:
  – There is a cycle in a hierarchy when some class A has a subclass B and at the same time B is a superclass of A.
  – Correction: creating such a cycle in a hierarchy amounts to declaring that the classes A and B are equivalent: all instances of A are instances of B and all instances of B are also instances of A.
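Class cycles can be found with a depth-first search over the subclass-of links. The sketch below is one possible approach, not a prescribed method; `find_cycle` and the `direct` map (class name to list of direct superclasses) are assumed names.

```python
def find_cycle(direct):
    """Return one cycle of subclass-of links as a list, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {}

    def visit(cls, path):
        colour[cls] = GREY
        path.append(cls)
        for parent in direct.get(cls, ()):
            state = colour.get(parent, WHITE)
            if state == GREY:
                # parent is already on the current path: a cycle is closed
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                found = visit(parent, path)
                if found:
                    return found
        path.pop()
        colour[cls] = BLACK
        return None

    for cls in list(direct):
        if colour.get(cls, WHITE) == WHITE:
            found = visit(cls, [])
            if found:
                return found
    return None


print(find_cycle({"A": ["B"], "B": ["A"]}))                       # ['A', 'B', 'A']
print(find_cycle({"Chardonnay": ["WhiteWine"], "WhiteWine": ["Wine"]}))  # None
```

A class listed as its own superclass is reported as the shortest possible cycle.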

4. Classes: corrections

Siblings in a class hierarchy:
  – All the siblings in the hierarchy (except for the ones at the root) must be at the same level of generality.
  – Common mistake: WhiteWine and Chardonnay being direct subclasses of the same class (should be avoided)

4. Classes: corrections

How many is too many and how few is too few?
  – If a class has only one direct subclass, there may be a modelling problem, or the ontology is not complete.
    • E.g., an only subclass CotesD’Or for RedBurgundy may be an error.
  – If there are more than a dozen (12) subclasses for a given class, then additional intermediate categories may be necessary.
    • E.g., a long list of wines under the class Wine, without any sub-hierarchy, may be an error.
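Both heuristics are easy to automate as warnings rather than hard errors. A minimal sketch, assuming a hypothetical child-to-parent map `parent_of`; the threshold of 12 is the slide's rule of thumb, and the class names `Wine0` to `Wine12` are placeholders.

```python
from collections import defaultdict


def fanout_warnings(parent_of, max_children=12):
    """Flag classes with exactly one, or more than max_children, direct subclasses."""
    children = defaultdict(list)
    for child, parent in parent_of.items():
        children[parent].append(child)
    warnings = []
    for parent, kids in sorted(children.items()):
        if len(kids) == 1:
            warnings.append(f"{parent}: only one direct subclass ({kids[0]})")
        elif len(kids) > max_children:
            warnings.append(f"{parent}: {len(kids)} direct subclasses")
    return warnings


# Thirteen wines directly under Wine, and a lone subclass under RedBurgundy.
parent_of = {f"Wine{i}": "Wine" for i in range(13)}
parent_of["CotesDOr"] = "RedBurgundy"
for line in fanout_warnings(parent_of):
    print(line)
```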

4. Classes: other issues

Multiple inheritance
  – This is allowed in most systems, but beware that all slots and facets are also inherited
  – E.g., Port is both a RedWine and a DessertWine, so it will inherit:
    • hasSugarLevel Sweet from DessertWine
    • containsTannin from RedWine
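Python's own class mechanism gives a rough analogy for how slots are inherited from several superclasses. This is only an analogy: programming-language classes are not ontology classes, and the attribute names below are assumed renderings of the slots in the example.

```python
class Wine:
    pass


class RedWine(Wine):
    contains_tannin = True       # slot value defined on RedWine


class DessertWine(Wine):
    has_sugar_level = "Sweet"    # slot value defined on DessertWine


class Port(RedWine, DessertWine):
    """Inherits the slots of both superclasses."""


print(Port.contains_tannin, Port.has_sugar_level)
print([c.__name__ for c in Port.__mro__])
```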

4. Classes: corrections

New class? (or not)
  – Subclasses of a class usually:
    1. have additional properties that the superclass does not have, or
    2. have different restrictions from those of the superclass, or
    3. participate in different relationships than the superclass
  – However: classes in terminological hierarchies do not have to introduce new properties

4. Classes: corrections

New class or property (slot) value?
  – If the concepts with different slot values become restrictions for different slots in other classes, then we should create a new class for the distinction. Otherwise, we represent the distinction in a slot value. (e.g., RedMerlot and WhiteMerlot, if they have different properties)
  – If a distinction is important in the domain and we think of the objects with different values for the distinction as different kinds of objects, then we should create a new class for the distinction.
  – A class to which an individual instance belongs should not change often. (e.g., chilledWine should be a property, not a class)

4. Classes: corrections

New class or instance?
  – Individual instances are the most specific concepts represented in a knowledge base. (e.g., individual wine bottles)
  – If concepts form a natural hierarchy, then we should represent them as classes. (e.g., BourgogneRegion is a class, as it has subclasses or instances such as CotesD’Or)
  – However: abstract classes have no instances.

4. Classes: corrections

Limiting scope
  – The ontology should not contain all the possible information about the domain: you do not need to specialize (or generalize) more than you need for your application (at most one extra level each way). (e.g., the ontology has no FavouriteWine term)
Disjoint subclasses
  – Many systems allow us to specify explicitly that several classes are disjoint.
  – Common mistake: defining a class as a subclass of two disjoint classes. Such a class should not exist!
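A disjointness declaration can be checked against the hierarchy by looking for any class whose superclasses include both members of a disjoint pair. The sketch below is illustrative; the function names, the `direct` map and the class `OddWine` are hypothetical.

```python
def all_superclasses(cls, direct):
    """Direct and inherited superclasses of cls."""
    seen, stack = set(), list(direct.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(direct.get(parent, ()))
    return seen


def disjointness_violations(direct, disjoint_pairs):
    """Classes that are subclasses of both members of a disjoint pair."""
    violations = []
    for cls in sorted(direct):
        supers = all_superclasses(cls, direct)
        for a, b in disjoint_pairs:
            if a in supers and b in supers:
                violations.append((cls, a, b))
    return violations


direct = {
    "RedWine": ["Wine"],
    "WhiteWine": ["Wine"],
    "OddWine": ["RedWine", "WhiteWine"],  # hypothetical modelling error
}
disjoint_pairs = [("RedWine", "WhiteWine")]
print(disjointness_violations(direct, disjoint_pairs))
```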

5. Defining properties

Inverse properties/slots/relations
  – Storing the information “in both directions” is redundant. However, it can be convenient to have both pieces of information explicitly available. A knowledge-acquisition system could automatically fill in the value for the inverse relation, to keep the knowledge base consistent.
  – e.g., hasMaker (from a wine to its winery) versus produces (from a winery to its wines)
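Automatic filling of the inverse relation can be sketched as follows. The knowledge-base layout (a dictionary keyed by subject and property), the function `assert_fact`, the property names and the winery name `ChateauMorgon` are all assumptions made for illustration.

```python
# Each property is paired with its inverse (names are illustrative).
INVERSE = {"hasMaker": "produces", "produces": "hasMaker"}


def assert_fact(kb, subject, prop, obj):
    """Store a fact and, if the property has an inverse, the inverse fact too."""
    kb.setdefault((subject, prop), set()).add(obj)
    inverse = INVERSE.get(prop)
    if inverse:
        kb.setdefault((obj, inverse), set()).add(subject)


kb = {}
assert_fact(kb, "ChateauMorgonBeaujolais", "hasMaker", "ChateauMorgon")
print(kb[("ChateauMorgon", "produces")])
```

Only one direction is entered by hand; the other is derived, so the two can never disagree.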

5. Defining properties

Default property (slot) values
  – If a particular slot value is the same for most instances of a class, we can define it as the default value for the slot. When a new instance of a class containing this slot is created, the system fills in the default value automatically. It can then be changed to any other allowed value.
  – e.g., hasSugarLevel Sweet as the default for DessertWine instances
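Default values behave like pre-filled slot values that an instance may override. A minimal sketch under assumed names (`DEFAULTS`, `new_instance`, and the instance names are hypothetical):

```python
# Default slot values per class (illustrative).
DEFAULTS = {"DessertWine": {"hasSugarLevel": "Sweet"}}


def new_instance(name, cls, **overrides):
    """Create an instance, filling in class defaults unless overridden."""
    values = dict(DEFAULTS.get(cls, {}))
    values.update(overrides)
    return {"name": name, "class": cls, **values}


typical = new_instance("ExamplePort", "DessertWine")
unusual = new_instance("ExampleOffDry", "DessertWine", hasSugarLevel="OffDry")
print(typical["hasSugarLevel"], unusual["hasSugarLevel"])
```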

Previously: ontology building and error correction
Next: Names; formal criteria to evaluate ontologies

What’s in a name?

Define a naming convention for classes and slots and adhere to it.
Capitalisation and delimiters (space, underscore, dash)
  – use consistent capitalisation for concept names
  – use lower case for property names
Singular or plural (a class Wine actually represents all wines)
Prefix and suffix conventions
  – Prefix has, suffix Of: e.g., hasMaker; makerOf
Other naming conventions
  – Do not add “class”, “property”, “slot” to concept names.
  – Avoid abbreviations.
  – Use similar conventions for subclasses (not Red and WhiteWine)
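A naming convention is only useful if it is enforced, and a small checker can do that. The patterns below encode one possible convention (capitalised names for classes, an initial lower-case letter for properties, no delimiters); they are an assumption, not the only valid choice, and the substring test for banned words is deliberately crude.

```python
import re

CLASS_NAME = re.compile(r"^[A-Z][A-Za-z0-9]*$")     # e.g. WhiteWine
PROPERTY_NAME = re.compile(r"^[a-z][A-Za-z0-9]*$")  # e.g. hasMaker
BANNED_WORDS = ("class", "property", "slot")


def naming_problems(name, kind):
    """Return a list of convention violations for a class or property name."""
    problems = []
    pattern = CLASS_NAME if kind == "class" else PROPERTY_NAME
    if not pattern.match(name):
        problems.append("wrong capitalisation or delimiter")
    if any(word in name.lower() for word in BANNED_WORDS):
        problems.append("contains a redundant word such as 'class'")
    return problems


print(naming_problems("WhiteWine", "class"))
print(naming_problems("white_wine", "class"))
print(naming_problems("WineClass", "class"))
print(naming_problems("hasMaker", "property"))
```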

(Formal) criteria to evaluate ontologies


Evaluation criteria

Consistency
Completeness
Conciseness
Expandability
Sensitiveness

Consistency

Whether it is possible to obtain contradictory conclusions from valid input definitions
Consistent if and only if these 3 conditions hold:
  – There is no contradiction between the formal definition and the real world
  – There is no contradiction between the informal definition and the real world
  – The formal and informal definitions have the same meaning (internal consistency)

Example: internal consistency & individual inconsistency

Informal: “The days in the week are: house, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday”
Formal:
  <owl:Class rdf:ID="Week"/>
  <Week rdf:ID="house" />
  <Week rdf:ID="Tuesday" />
  …
  <Week rdf:ID="Sunday" />


Internal consistency: informal and formal definitions have the same meaning


Individual inconsistency: contradiction between real world and formal/informal definition

Completeness

An ontology is complete if and only if:
  – All that is supposed to be in the ontology is explicitly stated in it, or can be inferred.
  – Each definition is complete.
To prove incompleteness, we prove the incompleteness of an individual definition

Conciseness

An ontology is concise if:
  – It doesn’t store any unnecessary definitions;
  – Explicit redundancies between definitions of terms don’t exist;
  – Redundancies cannot be inferred.

Expandability

The effort required to add new definitions to an ontology (or more knowledge to its definitions) without altering the set of well-defined properties already present (guaranteed).

Sensitiveness

How small changes in a definition alter the set of well-defined properties already present (guaranteed).

Semantic inconsistency errors

Incorrect semantic classification:
  – classifying a concept as a subclass of a concept to which it does not belong
    • e.g., Dog as a subclass of House
  – or placing an instance under the wrong class
    • e.g., Pluto under House

Inconsistency: circularity errors

A class is defined as a specialisation or generalisation of itself
Distance: 0 (a class with itself), 1, …, n

Inconsistency: partition errors

Partitions: define concept classifications in a disjoint and/or complete manner.
Mistakes:
  – Subclass partition with common classes
  – Subclass partition with common instances
  – Exhaustive subclass partition with common classes
  – Exhaustive subclass partition with common instances
  – Exhaustive subclass partition with external instances

Subclass partition with common classes

When a class is a subclass of more than one class of the partition

e.g., Dog and Cat as subclasses of Mammal, and Doberman as a subclass of both classes

Subclass partition with common instances

When one or several instances belong to more than one subclass of the partition

e.g., Dog and Cat as subclasses of Mammal, and Pluto as an instance of both classes

Exhaustive subclass partition with common classes

e.g., Odd and Even as an exhaustive subclass partition of Number, if a class Prime is a subclass of both classes

Exhaustive subclass partition with common instances

When one or several instances belong to more than one subclass of the exhaustive partition

e.g., Odd and Even as an exhaustive subclass partition of Number, if a number Three is an instance of both classes

Exhaustive subclass partition with external instances

When, having defined an exhaustive subclass partition of the base class, there are one or more instances of the base class that do not belong to any class of the partition

e.g., Odd and Even as an exhaustive subclass partition of Number, where the instance Three of Number belongs to neither Odd nor Even
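The instance-level partition errors can be detected with simple set operations. The sketch below is an illustration under assumptions: `instances_of` is a hypothetical map from class name to the set of its instances, and the entry for the base class is taken to list every instance of the base class, including those of its subclasses.

```python
def partition_errors(instances_of, base, parts, exhaustive=False):
    """Report common and (for exhaustive partitions) external instances."""
    errors = []
    covered = set()
    for i, a in enumerate(parts):
        covered |= instances_of.get(a, set())
        for b in parts[i + 1:]:
            common = instances_of.get(a, set()) & instances_of.get(b, set())
            for inst in sorted(common):
                errors.append(f"common instance {inst} in {a} and {b}")
    if exhaustive:
        for inst in sorted(instances_of.get(base, set()) - covered):
            errors.append(f"external instance {inst} of {base}")
    return errors


# Three is wrongly an instance of both Odd and Even.
overlapping = {"Number": {"Three", "Four"}, "Odd": {"Three"}, "Even": {"Three", "Four"}}
print(partition_errors(overlapping, "Number", ["Odd", "Even"], exhaustive=True))

# Three is an instance of Number but of neither subclass.
uncovered = {"Number": {"Three"}, "Odd": set(), "Even": set()}
print(partition_errors(uncovered, "Number", ["Odd", "Even"], exhaustive=True))
```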

Detecting Incompleteness

Check completeness of the class hierarchy (impreciseness or over-specification)
Check completeness of the domains and ranges of properties (impreciseness or over-specification)
Check completeness of classes (properties missing, different classes with the same definition, etc.)

Example: incomplete concept classification

Class MusicalInstruments is defined considering only the subclasses StringInstruments and WindInstruments (overlooking, e.g., percussion instruments)

Partition errors

When the definition of a partition between a set of classes is omitted.
Subclass partition omission: e.g., defining Dog and Cat as subclasses of Mammal, but forgetting to state that they form a subclass partition of Mammal (disjoint classes)
Exhaustive subclass partition omission: defining a partition of a class and omitting the fact that it is exhaustive (e.g., Odd and Even as subclasses of Number, without specifying exhaustiveness)

Detecting Redundancy

Grammatical redundancy errors: where there is more than one definition of the hierarchical relation
  – Redundancies of subclass-of relations: more than one subclass-of relation between the same classes
  – Redundancies of instance-of relations: direct or indirect repetition (the latter being an instance of a class and also of its subclass)
Identical formal definition of classes
Identical formal definition of instances
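Redundant subclass-of relations can be detected by checking whether a direct link is repeated or is already implied by another path. A minimal sketch, with assumed names (`redundant_links`, and `direct` as a map from class to its list of direct superclasses):

```python
def redundant_links(direct):
    """Direct subclass-of links that are repeated or implied by another path."""
    def reachable(start):
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for parent in direct.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    redundant = []
    for cls in sorted(direct):
        parents = direct[cls]
        for parent in sorted(set(parents)):
            repeated = parents.count(parent) > 1                 # direct repetition
            indirect = any(parent in reachable(other)            # indirect repetition
                           for other in set(parents) - {parent})
            if repeated or indirect:
                redundant.append((cls, parent))
    return redundant


# Chardonnay -> Wine is already implied via WhiteWine.
direct = {"Chardonnay": ["WhiteWine", "Wine"], "WhiteWine": ["Wine"]}
print(redundant_links(direct))
```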

Conclusions

Ontology-development methodology for declarative frame-based systems (see, e.g., Generic UM techniques)
Complex issues of defining class hierarchies and properties of classes and instances
There is no single correct ontology for any domain
However: this does not mean that errors cannot be detected!