
From thesauri to rich ontologies: The AGROVOC case

Boris Lauser, Food and Agriculture Organization (FAO), Rome, Italy. [email protected], www.fao.org. DELOS Workshop, Lund, Sweden, June 23, 2004.
Page 1: From thesauri to rich ontologies: The AGROVOC case


From thesauri to rich ontologies: The AGROVOC case

Boris Lauser
Food and Agriculture Organization (FAO)
Rome, Italy
[email protected], www.fao.org

DELOS Workshop
Lund, Sweden
June 23, 2004

Page 2

The problem

• AI and Semantic Web applications need full-fledged ontologies that support reasoning

• Constructing such ontologies is expensive

• Existing KOS (Knowledge Organization Systems), both large and small, do not provide the full set of precise concept relationships needed for reasoning, yet they represent much intellectual capital

• How can this intellectual capital be put to use in constructing full-fledged ontologies?

• Specifically: From AGROVOC to a full-fledged Food and Agriculture Ontology

Page 3

Some applications of a Food and Agriculture Ontology

• Advice on crops and crop management (fertilization, irrigation)

• Advice on pest management

• Tracking contaminants through the food chain

• Advice on safe food processing

• Computing nutrition labels

• Advice on healthy eating

• Improved searching

Page 4

AGROVOC relationships compared with more differentiated relationships of a Food and Agriculture Ontology

Page 5

AGROVOC: undifferentiated hierarchical relationships

milk
     NT cow milk
     NT milk fat
cows
     NT cow milk
Cheddar cheese
     BT cow milk

Food and Agriculture Ontology: differentiated relationships

milk
     <includesSpecific> cow milk
     <containsSubstance> milk fat
cows
     <hasComponent> cow milk*
Cheddar cheese
     <madeFrom> cow milk

Rule 1

Part X <mayContainSubstance> Substance Y
     IF Animal W <hasComponent> Part X
     AND Animal W <ingests> Substance Y

Rule 2

Food Z <containsSubstance> Substance Y
     IF Food Z <madeFrom> Part X
     AND Part X <containsSubstance> Substance Y
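To make the two rules concrete, here is a minimal sketch of how they could be applied to a small set of statements. It assumes statements are stored as (subject, relationship, object) triples; the `ingests` fact and the function name `apply_rules` are hypothetical, and the entity-type checks (Animal, Part, Food) are left out for brevity.

```python
# Triples are (subject, relationship, object). The facts are illustrative only.
facts = {
    ("cows", "hasComponent", "cow milk"),
    ("cows", "ingests", "pesticide residue"),   # hypothetical example fact
    ("Cheddar cheese", "madeFrom", "cow milk"),
    ("cow milk", "containsSubstance", "milk fat"),
}


def apply_rules(facts):
    """Return the triples that Rule 1 and Rule 2 derive from `facts` in one pass."""
    derived = set()
    for animal, rel, part in facts:
        if rel != "hasComponent":
            continue
        # Rule 1: the animal has the part and ingests a substance.
        for subject, rel2, substance in facts:
            if subject == animal and rel2 == "ingests":
                derived.add((part, "mayContainSubstance", substance))
    for food, rel, part in facts:
        if rel != "madeFrom":
            continue
        # Rule 2: the food is made from a part that contains a substance.
        for subject, rel2, substance in facts:
            if subject == part and rel2 == "containsSubstance":
                derived.add((food, "containsSubstance", substance))
    return derived - facts


new_facts = apply_rules(facts)
for triple in sorted(new_facts):
    print(triple)
```

Neither rule could be evaluated on the undifferentiated NT/BT statements, because the conditions depend on knowing which specific relationship holds.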

Page 6

From AGROVOC to FA Ontology

1) Define the FA Ontology structure

2) Fill in values from AGROVOC to the extent possible

3) Edit manually with computer assistance, using the rules-as-you-go approach and an ontology editor:

• make existing information more precise

• add new information

Page 7

Define ontology structure: Overall model

Page 8

[Diagram: overall model. Boxes: Concept; Lexicalization/Term; String; Note; Relationship. Link labels: designated by; manifested as; annotation relationship. Also shown: relationships between concepts, relationships between terms, relationships between strings, relationships between relationships. Other information: language/culture, subvocabulary/scope, audience, type, etc.]

Page 9

Define ontology structure: Relationship types

Page 10

Isa

Relationship                   Inverse relationship

X <includesSpecific> Y         Y <isa> X
X <inheritsTo> Y               Y <inheritsFrom> X

Page 11

Holonymy / meronymy (the generic whole-part relationship)

Relationship                   Inverse relationship

X <containsSubstance> Y        Y <substanceContainedIn> X
X <hasIngredient> Y            Y <ingredientOf> X
X <madeFrom> Y                 Y <usedToMake> X
X <yieldsPortion> Y            Y <portionOf> X
X <spatiallyIncludes> Y        Y <spatiallyIncludedIn> X
X <hasComponent> Y             Y <componentOf> X
X <includesSubprocess> Y       Y <subprocessOf> X
X <hasMember> Y                Y <memberOf> X

Page 12

Further relationship examples

Relationship                   Inverse relationship

X <causes> Y                   Y <causedBy> X
X <instrumentFor> Y            Y <performedByInstrument> X
X <processFor> Y               Y <usesProcess> X
X <beneficialFor> Y            Y <benefitsFrom> X
X <treatmentFor> Y             Y <treatedWith> X
X <harmfulFor> Y               Y <harmedBy> X
X <hasPest> Y                  Y <afflicts> X
X <growsIn> Y                  Y <growthEnvironmentFor> X
X <hasProperty> Y              Y <propertyOf> X
X <hasSymptom> Y               Y <indicates> X
X <similarTo> Y                Y <similarTo> X
X <oppositeTo> Y               Y <oppositeTo> X
X <hasPhase> Y                 Y <phaseOf> X
X <ingests> Y                  Y <ingestedBy> X
X <madeFrom> Y                 Y <usedToMake> X
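Because every relationship type is paired with an inverse, one statement implies a second one. The following is a minimal sketch of that bookkeeping, assuming statements are (subject, relationship, object) triples. The dictionary holds only a few rows from the tables above, and the names `INVERSE` and `with_inverses` are hypothetical.

```python
# Relationship -> inverse, copied from a few rows of the tables above.
INVERSE = {
    "includesSpecific": "isa",
    "containsSubstance": "substanceContainedIn",
    "hasComponent": "componentOf",
    "madeFrom": "usedToMake",
    "hasPest": "afflicts",
    "ingests": "ingestedBy",
    "similarTo": "similarTo",      # symmetric: its own inverse
    "oppositeTo": "oppositeTo",    # symmetric: its own inverse
}
# Make the lookup work in both directions.
INVERSE.update({inv: rel for rel, inv in list(INVERSE.items())})


def with_inverses(triples):
    """Return `triples` plus the inverse statement of each one."""
    result = set(triples)
    for subject, rel, obj in triples:
        result.add((obj, INVERSE[rel], subject))
    return result


stored = {
    ("milk", "includesSpecific", "cow milk"),
    ("Cheddar cheese", "madeFrom", "cow milk"),
}
for triple in sorted(with_inverses(stored)):
    print(triple)
```

With such a table an editor only has to enter one direction of each relationship.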

Page 13

Fill in values from AGROVOC

• Fill in values from AGROVOC to the extent possible

• Arrange in structured sequence (to the extent possible based on the information in AGROVOC) to facilitate editing. (The editor can deal with similar problems at the same time.)

Page 14

Undifferentiated relationships from AGROVOC

milk NT cow milk
milk NT goat milk
milk NT buffalo milk
milk NT milk fat
milk RT milk protein
milk RT lactose
cows RT cow milk
goats RT goat milk
ewes RT ewe milk
goat milk RT goat cheese
ewe milk RT ewe cheese
acid soils BT chemical soil types
acrisols BT genetic soil types
alkaline soils BT chemical soil types
alluvial soils BT lithological soil types
chemical soil types BT soil types
Cichorium BT Asteraceae
Cichorium endivia BT Cichorium
Cichorium intybus BT Cichorium
Cichorium intybus RT coffee substitutes
Cichorium intybus RT root vegetables
blood NT blood protein
blood NT blood lipids

Edited relationships (empty at this stage)

Page 15

Edit manually with computer assistance

• Use the rules-as-you-go approach and good ontology editing software that handles large ontologies efficiently

• make existing information more precise

• add new information

Assumption:

Entity types of concepts are known from AGROVOC or other sources (Langual, UMLS, WordNet); for example

milk fat is a Substance

Asteraceae is a taxon

The editor may need to determine the entity type

Page 16

The rules-as-you-go approach: Exploit patterns to automate the conversion process

Example

1. An editor has determined that
   milk NT cow milk should become milk <includesSpecific> cow milk

2. She recognizes that this is an example of the general pattern
   milk NT * milk → milk <includesSpecific> * milk (where * is the wildcard character)

3. Given this pattern, the system can derive automatically
   milk NT goat milk should become milk <includesSpecific> goat milk

Result:

Page 17

Undifferentiated relationships from AGROVOC (unchanged from page 14)

Edited relationships

milk <includesSpecific> cow milk
milk <includesSpecific> goat milk
milk <includesSpecific> buffalo milk
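The wildcard pattern from page 16 can be sketched as follows. This is an illustration, not the software used in the project: it assumes statements are (term, relation, term) tuples and uses shell-style wildcard matching from Python's standard `fnmatch` module. The dictionary layout and the name `apply_pattern` are hypothetical.

```python
from fnmatch import fnmatchcase

# A few thesaurus statements from the AGROVOC column: (term, relation, term).
thesaurus = [
    ("milk", "NT", "cow milk"),
    ("milk", "NT", "goat milk"),
    ("milk", "NT", "buffalo milk"),
    ("milk", "NT", "milk fat"),
    ("milk", "RT", "lactose"),
]

# Pattern recorded by the editor: milk NT * milk -> milk <includesSpecific> * milk
pattern = {"subject": "milk", "relation": "NT", "object": "* milk",
           "becomes": "includesSpecific"}


def apply_pattern(pattern, statements):
    """Convert every statement that matches the pattern; leave the rest alone."""
    converted, remaining = [], []
    for subject, relation, obj in statements:
        if (fnmatchcase(subject, pattern["subject"])
                and relation == pattern["relation"]
                and fnmatchcase(obj, pattern["object"])):
            converted.append((subject, pattern["becomes"], obj))
        else:
            remaining.append((subject, relation, obj))
    return converted, remaining


converted, remaining = apply_pattern(pattern, thesaurus)
```

In this sketch `milk NT milk fat` stays in `remaining` because "milk fat" does not end in " milk", which matches the result shown above: only the three kinds of milk are converted.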

Page 18

The rules-as-you-go approach: Exploit patterns to automate the conversion process

1. Editor: milk NT milk fat → milk <containsSubstance> milk fat

2. Pattern: Substance NT/RT Substance → Substance <containsSubstance> Substance

3. Therefore: milk RT milk protein → milk <containsSubstance> milk protein

Result:

Page 19

Undifferentiated relationships from AGROVOC (unchanged from page 14)

Edited relationships

milk <includesSpecific> cow milk
milk <includesSpecific> goat milk
milk <includesSpecific> buffalo milk
milk <containsSubstance> milk fat
milk <containsSubstance> milk protein
milk <containsSubstance> lactose
goat milk <containsSubstance> goat cheese
ewe milk <containsSubstance> ewe cheese
blood <containsSubstance> blood protein
blood <containsSubstance> blood lipids
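The entity-type pattern from page 18 differs from a wildcard pattern in that it matches on the types of the two terms rather than on their spelling. A minimal sketch, assuming entity types are available as a simple lookup (the assumption stated on page 15); the table contents and the names `ENTITY_TYPE`, `TYPE_PATTERN` and `convert_by_type` are hypothetical.

```python
# Entity types are assumed to be known; this small table is illustrative only.
ENTITY_TYPE = {
    "milk": "Substance", "milk fat": "Substance", "milk protein": "Substance",
    "lactose": "Substance", "blood": "Substance", "blood protein": "Substance",
    "cow milk": "Substance", "cows": "Animal",
}

# (subject type, thesaurus relations, object type, ontology relationship)
TYPE_PATTERN = ("Substance", {"NT", "RT"}, "Substance", "containsSubstance")


def convert_by_type(statements, pattern=TYPE_PATTERN):
    """Convert the statements whose entity types and relation fit the pattern."""
    subj_type, relations, obj_type, target = pattern
    out = []
    for subject, relation, obj in statements:
        if (ENTITY_TYPE.get(subject) == subj_type
                and relation in relations
                and ENTITY_TYPE.get(obj) == obj_type):
            out.append((subject, target, obj))
    return out


# Statements not yet converted by an earlier, more specific pattern.
pending = [
    ("milk", "NT", "milk fat"),
    ("milk", "RT", "milk protein"),
    ("milk", "RT", "lactose"),
    ("blood", "NT", "blood protein"),
    ("cows", "RT", "cow milk"),
]
result = convert_by_type(pending)
```

A type pattern is broad: in this sketch `milk NT cow milk` would also match, so it is run only on statements that a more specific pattern has not already converted. Results such as `goat milk <containsSubstance> goat cheese` in the list above show why automatically created instances still need to be presented to the editor (page 32).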

Page 20

The rules-as-you-go approach: Exploit patterns to automate the conversion process

1. Editor: cows RT cow milk → cows <hasComponent> cow milk

2. Pattern: Animal RT BodyPart → Animal <hasComponent> BodyPart

3. Therefore: goats RT goat milk → goats <hasComponent> goat milk

Result:

Page 21

Undifferentiated relationships from AGROVOC (unchanged from page 14)

Edited relationships

milk <includesSpecific> cow milk
milk <includesSpecific> goat milk
milk <includesSpecific> buffalo milk
milk <containsSubstance> milk fat
milk <containsSubstance> milk protein
milk <containsSubstance> lactose
cows <hasComponent> cow milk
goats <hasComponent> goat milk
ewes <hasComponent> ewe milk
goat milk <containsSubstance> goat cheese
ewe milk <containsSubstance> ewe cheese
blood <containsSubstance> blood protein
blood <containsSubstance> blood lipids

Page 22

The rules-as-you-go approach: Exploit patterns to automate the conversion process

1. Editor: acid soils BT chemical soil types → acid soils <isa> chemical soil types

2. Pattern: X BT * type* → X <isa> * type*

3. Therefore: acrisols BT genetic soil types → acrisols <isa> genetic soil types

Result:

Page 23

Undifferentiated relationships from AGROVOC (unchanged from page 14)

Edited relationships

milk <includesSpecific> cow milk
milk <includesSpecific> goat milk
milk <includesSpecific> buffalo milk
milk <containsSubstance> milk fat
milk <containsSubstance> milk protein
milk <containsSubstance> lactose
cows <hasComponent> cow milk
goats <hasComponent> goat milk
ewes <hasComponent> ewe milk
goat milk <containsSubstance> goat cheese
ewe milk <containsSubstance> ewe cheese
acid soils <isa> chemical soil types
acrisols <isa> genetic soil types
alkaline soils <isa> chemical soil types
alluvial soils <isa> lithological soil types
chemical soil types <isa> soil types
blood <containsSubstance> blood protein
blood <containsSubstance> blood lipids

Page 24

The rules-as-you-go approach: Exploit patterns to automate the conversion process

1. Editor: Cichorium BT Asteraceae → Cichorium <isa> Asteraceae

2. Pattern: Taxon BT Taxon → Taxon <isa> Taxon

3. Therefore: Cichorium endivia BT Cichorium → Cichorium endivia <isa> Cichorium

Result:

Page 25

Undifferentiated relationships from AGROVOC (unchanged from page 14)

Edited relationships

milk <includesSpecific> cow milk
milk <includesSpecific> goat milk
milk <includesSpecific> buffalo milk
milk <containsSubstance> milk fat
milk <containsSubstance> milk protein
milk <containsSubstance> lactose
cows <hasComponent> cow milk
goats <hasComponent> goat milk
ewes <hasComponent> ewe milk
goat milk <containsSubstance> goat cheese
ewe milk <containsSubstance> ewe cheese
acid soils <isa> chemical soil types
acrisols <isa> genetic soil types
alkaline soils <isa> chemical soil types
alluvial soils <isa> lithological soil types
chemical soil types <isa> soil types
Cichorium <isa> Asteraceae
Cichorium endivia <isa> Cichorium
Cichorium intybus <isa> Cichorium
blood <containsSubstance> blood protein
blood <containsSubstance> blood lipids

Page 26

The rules-as-you-go approach: Exploit patterns to automate the conversion process

1. Editor: Cichorium intybus RT coffee substitutes → Cichorium intybus <usedToMake> coffee substitutes

2. Pattern: Taxon RT FoodProduct → Taxon <usedToMake> FoodProduct

3. Therefore: Cichorium intybus RT root vegetables → Cichorium intybus <usedToMake> root vegetables

Result:

Page 27

Undifferentiated relationships from AGROVOC (unchanged from page 14)

Edited relationships

milk <includesSpecific> cow milk
milk <includesSpecific> goat milk
milk <includesSpecific> buffalo milk
milk <containsSubstance> milk fat
milk <containsSubstance> milk protein
milk <containsSubstance> lactose
cows <hasComponent> cow milk
goats <hasComponent> goat milk
ewes <hasComponent> ewe milk
goat milk <containsSubstance> goat cheese
ewe milk <containsSubstance> ewe cheese
acid soils <isa> chemical soil types
acrisols <isa> genetic soil types
alkaline soils <isa> chemical soil types
alluvial soils <isa> lithological soil types
chemical soil types <isa> soil types
Cichorium <isa> Asteraceae
Cichorium endivia <isa> Cichorium
Cichorium intybus <isa> Cichorium
Cichorium intybus <usedToMake> coffee substitutes
Cichorium intybus <usedToMake> root vegetables
blood <containsSubstance> blood protein
blood <containsSubstance> blood lipids

Page 28

The rules-as-you-go approach: Discussion

Main idea: Formulate constraints to assist the editor

• Ontology may have many relationship types, perhaps > 100

• Constraints limit the relationship types that are possible in a specific case; show the editor only these

• If the constraints limit possible relationship types to 1, conversion is automatic

• Constraints may depend on the thesaurus being converted

Page 29

Constraints

Thesaurus relationships    Possible ontology relationships

NT / BT                    <hasMember> | <memberOf>
                           <includesSpecific> | <isa>
                           <hasComponent> | <componentOf>
                           <spatiallyIncludes> | <spatiallyIncludedIn>
                           etc.

RT                         <similarTo> | <similarTo>
                           <growsIn> | <EnvironmentForGrowing>
                           <treatmentFor> | <treatedWith>
                           <hasMember> | <memberOf>
                           <hasComponent> | <componentOf>
                           <madeFrom> | <usedToMake>
                           etc.

Page 30

Constraints

Thesaurus relationships
+ entity types or values           Possible ontology relationships

milk NT * milk                     milk <includesSpecific> * milk
Substance NT Substance             Substance <containsSubstance> Substance
X BT * type*                       X <isa> * type*
Taxon BT Taxon                     Taxon <isa> Taxon
GeogrEntity BT GeogrEntity         GeogrEntity <spatiallyIncludedIn> GeogrEntity
BodyPart BT BodyPart               BodyPart <isComponentOf> BodyPart
ChemSubstance BT ChemSubstance     ChemSubstance <isa> ChemSubstance

Page 31

Constraints

Thesaurus relationships
+ entity types or values           Possible ontology relationships

Substance RT Substance             Substance <containsSubstance> Substance
                                   Substance <containedInSubstance> Substance
                                   Substance <usedToMake> Substance
                                   Substance <madeFrom> Substance
LivingOrganism RT BodyPart         LivingOrganism <hasComponent> BodyPart
Taxon RT FoodProduct               Taxon <usedToMake> FoodProduct
GeogrEntity RT GeogrGrouping       GeogrEntity <isMemberOf> GeogrGrouping
Process RT Object                  Process <performedByInstrument> Object
                                   Process <affects> Object
ChemSubstance RT Function          ChemSubstance <usedFor> Function
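The main idea from page 28, that constraints narrow the choice and a single remaining candidate makes the conversion automatic, can be sketched as a lookup. This is an illustration under the assumption that constraints are keyed by (subject entity type, thesaurus relationship, object entity type); the entries are taken from the tables above, and the names `CONSTRAINTS` and `propose` are hypothetical.

```python
# (subject entity type, thesaurus relation, object entity type) -> candidates.
# Entries are copied from the constraint tables above.
CONSTRAINTS = {
    ("Taxon", "BT", "Taxon"): ["isa"],
    ("Taxon", "RT", "FoodProduct"): ["usedToMake"],
    ("LivingOrganism", "RT", "BodyPart"): ["hasComponent"],
    ("Process", "RT", "Object"): ["performedByInstrument", "affects"],
    ("Substance", "RT", "Substance"): [
        "containsSubstance", "containedInSubstance", "usedToMake", "madeFrom"],
}


def propose(subject_type, relation, object_type):
    """Return (decision, candidates) for one thesaurus statement."""
    candidates = CONSTRAINTS.get((subject_type, relation, object_type), [])
    if len(candidates) == 1:
        return "automatic", candidates      # only one type possible
    if candidates:
        return "menu", candidates           # editor picks from a short menu
    return "unconstrained", candidates      # no constraint recorded yet


print(propose("Taxon", "BT", "Taxon"))
print(propose("Substance", "RT", "Substance"))
```

In the "menu" case the editor sees only the listed candidates instead of the full set of relationship types, which may exceed 100.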

Page 32

Checking by editor

• Relationship instances created by the editor by selecting from a constraint-generated menu are final

• Relationship instances created automatically must be presented to the editor

• If the editor determines that the relationship instances are almost always correct, she checks a box "accept without checking"
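The three checking rules above can be read as a simple routing decision. A minimal sketch, assuming each relationship instance records which pattern produced it and whether the editor or the system created it; the `Pattern` class, its field names and the function `route` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    name: str
    accept_without_checking: bool = False   # the editor's check box


def route(instances):
    """Split (pattern, triple, created_by) records into final and to-review."""
    final, review = [], []
    for pattern, triple, created_by in instances:
        if created_by == "editor":
            final.append(triple)            # chosen from the menu: final
        elif pattern is not None and pattern.accept_without_checking:
            final.append(triple)            # trusted pattern: accepted unchecked
        else:
            review.append(triple)           # automatic: show to the editor
    return final, review


taxon_isa = Pattern("Taxon BT Taxon", accept_without_checking=True)
substance = Pattern("Substance NT/RT Substance")
final, review = route([
    (None, ("milk", "includesSpecific", "cow milk"), "editor"),
    (taxon_isa, ("Cichorium endivia", "isa", "Cichorium"), "system"),
    (substance, ("goat milk", "containsSubstance", "goat cheese"), "system"),
])
```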

Page 33

Overall conversion process

• One master editor must go through the file from start to finish, processing the relationship instances, creating patterns, and creating new relationship types as needed

• Assistant editors can apply the patterns.

• In the first pass, the master editor should deal with the easy cases.

• Deal with the remaining cases later. Groups of similar relationship instances can be seen more easily in a smaller set.

Page 34

Adding new relationship types and new relationship instances

• AGROVOC does not contain all relationship types or relationship instances for AI applications

• Need to add data. For example:

Organism X <hasPest> Organism Y

ChemSubstance X <actsAgainst> Organism Y

Organism X <actsAgainst> Organism Y

Plant X  <growsIn>  Environment Y

FoodProduct X <suitableFor> Diet Y

Page 35

Conclusion

The rules-as-you-go approach is a realistic method for developing a rich ontology from an existing thesaurus

Full paper: Reengineering Thesauri for New Applications: the AGROVOC Example

Journal of Digital Information, Volume 4 Issue 4

http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Soergel/

Page 36

References

• For questions and discussion contact

Boris [email protected]

Dagobert [email protected]

• AOS: Agricultural Ontology Service Project: http://www.fao.org/agris/aos

• AGMES: http://www.fao.org/agris/agmes

