Issues in the visualization and navigation of biomedical knowledge

Date post: 28-Dec-2019
Olivier Bodenreider, Lister Hill National Center for Biomedical Communications, Bethesda, Maryland - USA. Issues in the visualization and navigation of biomedical knowledge. New Jersey Institute of Technology, Computer Science Department. April 29, 2002
Transcript
Page 1: Issues in the visualization and navigation of biomedical knowledge

Olivier Bodenreider

Lister Hill National Center for Biomedical Communications
Bethesda, Maryland - USA

New Jersey Institute of Technology
Computer Science Department

April 29, 2002

Page 2

Outline

◆ Biomedical terminologies as a source of biomedical knowledge
◆ Structural perspective on the Metathesaurus
◆ Visualizing biomedical knowledge
◆ From structure to semantics
  ● Inherit relationships
  ● Path between two concepts
  ● Limitations

Page 3

Biomedical terminologies

Page 4

Biomedical knowledge organization

Semantic Spaces

Terminologies: Medical Subject Headings, International Classification of Diseases, SNOMED, […]

Ontologies: Cyc, WordNet, Digital Anatomist, […]

UMLS

Page 5

Biomedical terminologies

◆ Core vocabularies
  ● anatomy (UWDA, Neuronames)
  ● drugs (First DataBank, Micromedex)
  ● medical devices (UMD, SPN)
◆ Several perspectives
  ● clinical terms (SNOMED, CTV3)
  ● information sciences (MeSH, CRISP)
  ● administrative terminologies (ICD-9-CM, CPT-4)
  ● standards (HL7, LOINC)

Page 6

Biomedical terminologies (cont’d)

◆ Specialized vocabularies
  ● nursing (NIC, NOC, NANDA, Omaha, PCDS)
  ● dentistry (CDT)
  ● oncology (PDQ)
  ● psychiatry (DSM, APA)
  ● adverse reactions (COSTART, WHO ART)
  ● primary care (ICPC)
◆ Knowledge bases (AI/Rheum, DXplain, QMR)

Page 7

UMLS

◆ Two-level structure
  ● Semantic Network
    ■ 134 Semantic Types (STs)
    ■ Relationships among STs
  ● Metathesaurus
    ■ 800,000 concepts
    ■ Inter-concept relationships
  ● Link = categorization
    ■ Often isa
    ■ Rarely is an instance of

[Diagram: a Concept in the Metathesaurus linked by categorization to a Semantic Type in the Semantic Network]

Page 8

[Diagram: the concept Heart in the Metathesaurus, shown with other concepts (Esophagus, Left Phrenic Nerve, Heart Valves, Fetal Heart, Mediastinum, Saccular Viscus, Angina Pectoris, Cardiotonic Agents, Tissue Donors) and with Semantic Types from the Semantic Network (Anatomical Structure; Fully Formed Anatomical Structure; Embryonic Structure; Body Part, Organ or Organ Component; Pharmacologic Substance; Disease or Syndrome; Population Group). Numbers shown in the figure: 22, 225, 97, 4, 12, 9, 31.]

Page 9

Addison’s disease

◆ Addison's disease is a rare endocrine disorder
◆ Addison's disease occurs when the adrenal glands do not produce enough of the hormone cortisol
◆ For this reason, the disease is sometimes called chronic adrenal insufficiency, or hypocortisolism

Page 10

Structural perspective on the Metathesaurus

Page 11

Hierarchy

◆ Hierarchical relationships
  ● Taxonomy (isa)
  ● Meronomy (part of)
◆ Partial ordering
  ● [Reflexivity]
  ● Antisymmetry
  ● Transitivity
◆ Inheritance
◆ Reasoning

[Diagram: hierarchy rooted at Physiologic Function, with Organism Function, Organ or Tissue Function, Cell Function, Molecular Function, Mental Process, Genetic Function]
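The partial-ordering properties listed on this slide can be checked mechanically. The sketch below is illustrative only: the `PARENTS` fragment is an assumed reading of the diagram, and the function names are hypothetical. It treats isa as "equal to, or reachable through parent links", then tests reflexivity, antisymmetry and transitivity by brute force. Transitivity holds by construction with this definition; antisymmetry is the property that can actually fail.

```python
from itertools import product

# Illustrative child -> parents fragment (assumed reading of the diagram).
PARENTS = {
    "Physiologic Function": set(),
    "Organism Function": {"Physiologic Function"},
    "Mental Process": {"Organism Function"},
}


def ancestors(node, parents):
    """Collect every node reachable from `node` by following parent links."""
    seen, stack = set(), list(parents.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def isa(a, b, parents):
    """Reflexive, transitive reading of the hierarchy: a isa b."""
    return a == b or b in ancestors(a, parents)


def is_partial_order(parents):
    """Brute-force check of the three partial-order properties."""
    nodes = list(parents)
    reflexive = all(isa(a, a, parents) for a in nodes)
    antisymmetric = all(
        a == b or not (isa(a, b, parents) and isa(b, a, parents))
        for a, b in product(nodes, repeat=2)
    )
    transitive = all(
        isa(a, c, parents) or not (isa(a, b, parents) and isa(b, c, parents))
        for a, b, c in product(nodes, repeat=3)
    )
    return reflexive and antisymmetric and transitive
```

A two-node loop such as `{"A": {"B"}, "B": {"A"}}` fails the antisymmetry test, which is the theoretical problem raised later in the talk.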

Page 12

Principles of subsumption

Aneurysm (aneurysm)
Aortic Aneurysm (an. of the aorta)
Aortic Aneurysm, Thoracic (an. of the thoracic aorta)
Thoracoabdominal Aortic Aneurysm (an. of the thoracic aorta and abdominal aorta)

Aortic Aneurysm isa Aneurysm: introduction of a specializing criterion
Aortic Aneurysm, Thoracic isa Aortic Aneurysm: partitive refinement of a concept element
Thoracoabdominal Aortic Aneurysm isa Aortic Aneurysm, Thoracic: conjunctive coordination

[Bernauer, AMIA 1994]

Page 13

Hierarchies in source vocabularies

◆ Structure
  ● Single tree
  ● Polyhierarchical (multiple parents)
◆ Relationships
  ● Usually implicit
  ● May be other than isa or part of
    ■ E.g., Thesaurus relationships

Page 14

AD in medical vocabularies: Contexts

[Diagram: the hierarchies containing Addison’s Disease in SNOMED and in MeSH. Nodes shown: Diseases, Endocrine Diseases, Immunologic Diseases, Autoimmune Diseases, Adrenal Gland Diseases, Adrenal Gland Hypofunction, Addison’s Disease]

Page 15

AD in UMLS: Contexts

[Diagram: Endocrine Diseases, Adrenal Gland Diseases, Adrenal Cortex Diseases, Hypoadrenalism, Adrenal Gland Hypofunction, Adrenal cortical hypofunction, Addison’s Disease]

Page 16

AD in UMLS: SNOMED context

[Diagram: same nodes as on page 15, with the SNOMED context shown]

Page 17

AD in UMLS: MeSH context

[Diagram: same nodes as on page 15, with the MeSH context shown]

Page 18

AD in UMLS: Read Codes context

[Diagram: same nodes as on page 15, with the Read Codes context shown]

Page 19

AD in UMLS: AOD Thes. context

[Diagram: same nodes as on page 15, with the AOD Thesaurus context shown]

Page 20

[Diagram: combined graph around Addison’s Disease. Nodes shown: Endocrine Diseases, Adrenal Gland Diseases, Adrenal Cortex Diseases, Hypoadrenalism, Adrenal Gland Hypofunction, Adrenal cortical hypofunction, Addison’s Disease, Adrenal Cortex Dysfunction, Adrenal Dysfunction, Addison’s disease due to autoimmunity, Secondary hypocortisolism, Other disorders of adrenal gland, Disorders of other endocrine gland, Adrenal Glands, Adrenal Cortex, Endocrine System, Endocrine Glands, Abdominal organ Diseases]

Page 21

Hierarchies in source vocabularies

◆ Often task-driven rather than based on principles
◆ Usually suitable for information retrieval
  ● Better recall
  ● Precision may not be crucial
◆ Not necessarily suitable for reasoning
◆ But expected to be consistent structurally

Page 22

AD in UMLS: Contexts

◆ Multiple tree structures combined into a graph structure
◆ Directed acyclic graph (DAG)

[Diagram: several trees over the nodes A to H combined into a single graph over the same nodes]
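Combining several source trees into one graph amounts to taking the union of their parent links. The following is a minimal sketch under stated assumptions: each tree is represented as a child -> parent mapping, the letters and edges are hypothetical (the exact edges of the diagram are not recoverable here), and `merge_trees` is an illustrative name.

```python
from collections import defaultdict


def merge_trees(trees):
    """Merge child -> parent trees into one child -> {parents} graph."""
    graph = defaultdict(set)
    for tree in trees:
        for child, parent in tree.items():
            graph[child].add(parent)
            graph.setdefault(parent, set())  # make sure roots appear as nodes
    return dict(graph)


# Two hypothetical source hierarchies that share some nodes.
tree_1 = {"B": "A", "D": "B", "E": "B"}
tree_2 = {"C": "A", "E": "C", "F": "C"}

merged = merge_trees([tree_1, tree_2])
```

In `merged`, node `E` has two parents (`B` and `C`), so the result is no longer a tree. It is a DAG only as long as no source contributes a link that closes a loop.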

Page 23

Actually, there are some cycles

[Diagram: a cycle among Disinfectant soap, Disinfectants, Disinfectants and Cleansers, Anti-infective Agents, Germicidal soap]

Page 24

Issues with cycles

◆ Theoretical
  ● Violate the antisymmetry property of partial ordering relations
◆ Practical
  ● Loops in graph traversal
  ● Impossible to perform transitive reduction

[Diagram: graph over the nodes A, B, D, E, G, H]
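One common way to detect the cycles discussed here is a depth-first search that marks nodes as "in progress" while their ancestors are being explored. This is a sketch, not the method used in the talk: `find_cycle` is a hypothetical helper, the graph is a child -> parents mapping, and the recursive form would need rewriting for very deep hierarchies.

```python
def find_cycle(parents):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, in progress, finished
    color = {node: WHITE for node in parents}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for parent in parents.get(node, ()):
            state = color.get(parent, WHITE)
            if state == GREY:
                # Reached a node that is still on the current path: a loop.
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(parents):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

For example, `find_cycle({"A": {"B"}, "B": {"C"}, "C": {"A"}})` reports the loop through `A`, `B` and `C`, while a plain chain returns `None`.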

Page 25

Cycle due to underspecification

◆ Specified and underspecified terms
  ● May appear at different levels in a source hierarchy
  ● Are clustered into the same concept (same meaning)

[Diagram: the terms Fever, Fever of unknown origin, and Fever, unspecified as they appear in the MeSH and ICD-10 hierarchies]

Page 26

Other causes

◆ Compound terms
◆ Metadata
  ● HYDROCELE, Hydrocele
◆ Classes and members
  ● Purines, Purine
◆ Organizational conventions
  ● Acid + Base Salt + Water
◆ Idiopathic
  ● Wrong relationships
  ● Use of non-hierarchical relationships in “hierarchies”

[Bodenreider, AMIA 2001]

[Diagram: Nausea, Nausea and Vomiting, Vomiting]

Page 27

Visualizing biomedical knowledge

Page 28

Visualizing biomedical knowledge

◆ Objectives
  ● Make knowledge navigable by users
  ● Make knowledge available to applications
◆ Common issues
  ● Reduce complexity
  ● Provide consistent views across the domain
  ● Extend views to fit specific needs

UMLS Semantic Navigator: umlsks.nlm.nih.gov → Resources → Semantic Navigator

Page 29

UMLS Semantic Navigator

◆ Visualize semantic locality
◆ Features
  ● All relationships presented simultaneously
    ■ Metathesaurus relationships
    ■ Semantic network relationships
  ● Hierarchical relationships presented graphically
  ● Dynamic and navigable
  ● Transitive reduction

[Diagram: transitive reduction illustrated on the concepts Ca, Cb, Cc]
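Transitive reduction removes a direct link when the same ancestor is already reachable through another parent. The sketch below assumes an acyclic child -> parents mapping and an assumed reading of the diagram in which Cc has both Cb and Ca as parents and Cb has Ca as parent; the function names are illustrative, not those of the Semantic Navigator.

```python
def ancestors(node, parents):
    """All nodes reachable from `node` by following parent links."""
    seen, stack = set(), list(parents.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def transitive_reduction(parents):
    """Drop every parent that is also an ancestor of another direct parent."""
    reduced = {}
    for child, direct in parents.items():
        redundant = {
            p
            for p in direct
            for q in direct
            if q != p and p in ancestors(q, parents)
        }
        reduced[child] = set(direct) - redundant
    return reduced


# Assumed reading of the slide: Cc -> Cb -> Ca, plus a shortcut Cc -> Ca.
graph = {"Ca": set(), "Cb": {"Ca"}, "Cc": {"Cb", "Ca"}}
reduced = transitive_reduction(graph)
```

After reduction, `Cc` keeps only `Cb` as parent. On a graph with a cycle this procedure can discard every parent of a node, which is one way to see why cycles make transitive reduction ill-defined.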

Page 30

UMLS Semantic Navigator: Concepts

Page 31

UMLS Semantic Navigator: Concepts

Page 32

UMLS Semantic Navigator: Concepts

Page 33

UMLS Semantic Navigator: Concepts

Page 34

UMLS Semantic Navigator: Relationships

[Diagram: the concepts Addison’s Disease and Adrenal Cortex, categorized by the semantic types Disease or Syndrome and Body Part, Organ or Organ Component. Label shown in the figure: 12]

Page 35

Visualization: Research directions

◆ Selecting relevant co-occurring concepts
  ● Based on the relative frequency
  ● Compared to symbolic relationships
◆ Visualizing the paths between two concepts
  ● Display polyhierarchy
  ● Graph theory algorithms
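As one example of a graph-theory algorithm for finding a path between two concepts, a breadth-first search returns a shortest path in number of links. This is a sketch only: the talk does not say which algorithm is used, the edges below are hypothetical links between concept names that appear in the Addison's disease examples, and `shortest_path` is an illustrative name.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                if neighbor == goal:
                    # Walk the back-pointers from goal to start.
                    path = [goal]
                    while previous[path[-1]] is not None:
                        path.append(previous[path[-1]])
                    return path[::-1]
                queue.append(neighbor)
    return None


# Hypothetical child -> parents links, for illustration only.
links = {
    "Addison's Disease": ["Adrenal Gland Hypofunction"],
    "Adrenal Gland Hypofunction": ["Adrenal Gland Diseases"],
    "Adrenal Gland Diseases": ["Endocrine Diseases"],
}

path = shortest_path(links, "Addison's Disease", "Endocrine Diseases")
```

In a polyhierarchy several such paths may exist; enumerating all of them rather than the shortest one would be a natural extension.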

Page 36

Object-oriented model: Specifications

◆ Knowledge-oriented
◆ Simple
◆ High-level methods
◆ Independent
  ● From the UMLS relational format
  ● From back-end implementation
◆ Extendable

Page 37

Design choices

◆ Object-oriented model
  ● Extensible through derivation
◆ Limited number of classes
  ● Not as comprehensive as the UMLS itself
◆ Methods
  ● Accessor methods (properties)
  ● High-level methods

Page 38

4 major classes

[Diagram: four classes, one each for concepts (Concept 1, Concept 2), semantic types (Semantic Type 1, Semantic Type 2), the Metathesaurus, and the Semantic Network. The class names are not recoverable from the extraction.]

Page 39

Class for concepts (class name not recoverable from the extraction)

◆ Properties
  ● Unique identifier (CUI)
  ● List of synonymous terms [in a given source/language]
  ● List of definitions
  ● List of sources
  ● Set of related concepts (instances of the concept class)
  ● Set of semantic types (instances of the semantic type class)
  ● Total frequency of co-occurrence in MEDLINE

Page 40

Class for concepts, continued (class name not recoverable from the extraction)

◆ Methods provided for convenience
  ● anc1 = par ∪ bro
◆ Higher-level methods
  ● sibx: Extended siblings
  ● par_tr: Transitive reduction

[Diagram: concepts Ca, Cb, Cc linked by broader and parent relationships]
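The properties and the convenience method listed for the concept class could be modeled roughly as follows. This is a hedged sketch, not the talk's actual implementation: the class and attribute names are hypothetical, related concepts are stored as CUIs rather than object instances to keep the example small, `bro` is assumed to mean "broader concepts", and the CUIs are placeholders. The higher-level methods `sibx` and `par_tr` are left out because the slides do not define them precisely.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Concept:
    """Hypothetical model of a Metathesaurus concept."""

    cui: str                                             # unique identifier
    terms: List[str] = field(default_factory=list)       # synonymous terms
    definitions: List[str] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)
    semantic_types: List[str] = field(default_factory=list)
    cooccurrence_frequency: int = 0                      # total in MEDLINE
    par: Set[str] = field(default_factory=set)           # CUIs of parents
    bro: Set[str] = field(default_factory=set)           # CUIs of broader concepts

    @property
    def anc1(self) -> Set[str]:
        """First-level ancestors: anc1 = par ∪ bro."""
        return self.par | self.bro


# Placeholder identifiers, not real CUIs.
concept = Concept(
    cui="C0000001",
    terms=["Addison's disease"],
    par={"C0000002"},
    bro={"C0000002", "C0000003"},
)
```

Here `concept.anc1` is the union of the two sets, so a concept that is both a parent and a broader concept appears only once.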

Page 41

Architecture

[Diagram: three layers. Back-end: local UMLS database and Knowledge Source Server. Mediator classes: database mediator class and KSS API. Application: O-O model UMLS classes.]
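The layering in this architecture, where the object model talks to a mediator rather than to a specific back-end, can be sketched with a small interface. Everything named here is an assumption made for illustration: `Mediator`, `InMemoryMediator` and `ConceptView` are hypothetical, and the in-memory rows merely stand in for a local database or a remote server.

```python
from typing import List, Protocol, Set, Tuple


class Mediator(Protocol):
    """What the object model needs from any back-end (hypothetical interface)."""

    def parents(self, cui: str) -> Set[str]: ...


class InMemoryMediator:
    """Stand-in for a database mediator; rows are (child CUI, parent CUI)."""

    def __init__(self, rows: List[Tuple[str, str]]) -> None:
        self._rows = rows

    def parents(self, cui: str) -> Set[str]:
        return {parent for child, parent in self._rows if child == cui}


class ConceptView:
    """Application-facing object that never touches the back-end directly."""

    def __init__(self, cui: str, mediator: Mediator) -> None:
        self.cui = cui
        self._mediator = mediator

    @property
    def par(self) -> Set[str]:
        return self._mediator.parents(self.cui)


# Placeholder identifiers, not real CUIs.
mediator = InMemoryMediator([("C0000001", "C0000002"), ("C0000001", "C0000003")])
view = ConceptView("C0000001", mediator)
```

Swapping `InMemoryMediator` for a class that queries a database or a web API would leave `ConceptView` unchanged, which is the independence from the back-end that the specifications ask for.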

Page 42

From structure to semantics

1. Inherit relationships

Page 43

Semantic Network

◆ Semantic types (134)
  ● tree structure
  ● 2 major hierarchies
    ■ Entity
      – Physical Object
      – Conceptual Entity
    ■ Event
      – Activity
      – Phenomenon or Process

Page 44

Semantic Network

◆ Semantic network relationships (54)
  ● hierarchical (isa = is a kind of)
    ■ among types
      – Animal isa Organism
      – Enzyme isa Biologically Active Substance
    ■ among relations
      – treats isa affects
  ● non-hierarchical
    ■ Sign or Symptom diagnoses Pathologic Function
    ■ Pharmacologic Substance treats Pathologic Function

Page 45

“Biologic Function” hierarchy (isa)

Biologic Function
  Physiologic Function
    Organism Function
      Mental Process
    Organ or Tissue Function
    Cell Function
    Molecular Function
      Genetic Function
  Pathologic Function
    Disease or Syndrome
      Mental or Behavioral Dysfunction
      Neoplastic Process
    Cell or Molecular Dysfunction
    Experimental Model of Disease

Page 46

Associative (non-isa) relationships

[Diagram: associative relationships among semantic types. Semantic types shown: Anatomical Structure; Embryonic Structure; Anatomical Abnormality; Congenital Abnormality; Acquired Abnormality; Fully Formed Anatomical Structure; Body System; Body Part, Organ or Organ Component; Tissue; Cell; Cell Component; Gene or Genome; Body Space or Junction; Body Location or Region; Body Substance; Organism; Organism Attribute; Finding; Laboratory or Test Result; Sign or Symptom; Biologic Function; Physiologic Function; Pathologic Function; Injury or Poisoning. Relationship labels shown: part of, conceptual part of, property of, contains, produces, evaluation of, process of, adjacent to, location of, disrupts, co-occurs with.]

Page 47

Role

◆ A relationship between 2 STs is a possible link between 2 concepts that have been assigned to those STs
  ● The relationship may or may not hold at the concept level
  ● Other relationships may apply at the concept level
◆ A child ST inherits properties from its parents (isa relationships)

Page 48

Applications

◆ To help qualify inter-concept relationships
  ● using the relationships defined between their semantic types in the semantic network
◆ To strengthen the structure of the Metathesaurus
  ● a relationship between 2 concepts should be consistent with the relationships defined between their semantic types in the semantic network
◆ Semantic interpretation
  ● finding semantic relationships between concepts in text

Page 49

[Diagram: in the Metathesaurus, the concept Adrenal cortex is categorized as Body Part, Organ or Organ Component, and the concept Adrenal cortical hypofunction as Disease or Syndrome. In the Semantic Network, Body Part, Organ or Organ Component isa Fully Formed Anatomical Structure, and Disease or Syndrome isa Pathologic Function isa Biologic Function. A has location link is shown at the semantic type level and again between the two concepts.]
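The inference illustrated by this diagram, where a relationship defined between two high-level semantic types is inherited by their descendants and then proposed for a pair of concepts, can be sketched as below. The isa links are those named in the diagram; attaching has location to the pair (Biologic Function, Fully Formed Anatomical Structure) is an assumed reading of the figure, and the function names are hypothetical.

```python
# isa links between semantic types, as named in the diagram.
ST_PARENT = {
    "Disease or Syndrome": "Pathologic Function",
    "Pathologic Function": "Biologic Function",
    "Body Part, Organ or Organ Component": "Fully Formed Anatomical Structure",
}

# Assumed reading of the figure: where the relationship is defined.
ST_RELATIONS = {
    ("Biologic Function", "Fully Formed Anatomical Structure"): {"has location"},
}


def lineage(semantic_type):
    """The type itself followed by its ancestors, nearest first."""
    chain = [semantic_type]
    while chain[-1] in ST_PARENT:
        chain.append(ST_PARENT[chain[-1]])
    return chain


def possible_relations(st_a, st_b):
    """Relationships that st_a and st_b inherit from their ancestors."""
    found = set()
    for a in lineage(st_a):
        for b in lineage(st_b):
            found |= ST_RELATIONS.get((a, b), set())
    return found


candidates = possible_relations(
    "Disease or Syndrome", "Body Part, Organ or Organ Component"
)
```

With these inputs the only candidate is `has location`, which is the link suggested between Adrenal cortical hypofunction and Adrenal cortex. Whether it actually holds at the concept level still has to be verified.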

Page 50: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Experiment

◆ 3764 concepts related to Heart
◆ 6894 pairs of related concepts
  ● A relation can be inferred unambiguously from the Semantic Network (65%)
  ● Multiple semantic links possible (22%)
  ● Violation of the Semantic Network (13%)
    ■ Wrong inter-concept relationship
    ■ Wrong categorization
    ■ Both

[McCray & al. (in press)]
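One plausible reading of the three outcomes above is that each pair of related concepts is classified by how many Semantic Network relations are possible between the semantic types of the two concepts. The sketch below assumes exactly that; the function names and the toy input are hypothetical, and the actual procedure used in the experiment may differ.

```python
from collections import Counter


def classify_pair(allowed_relations):
    """Classify one concept pair from the set of Semantic Network
    relations possible between the semantic types of its two concepts."""
    n = len(allowed_relations)
    if n == 0:
        return "violation"      # no relation is licensed by the network
    if n == 1:
        return "unambiguous"    # exactly one relation can be inferred
    return "multiple"           # several semantic links are possible


def summarize(pairs):
    """Return the share of each outcome over an iterable of relation sets."""
    counts = Counter(classify_pair(allowed) for allowed in pairs)
    total = sum(counts.values())
    labels = ("unambiguous", "multiple", "violation")
    if total == 0:
        return {label: 0.0 for label in labels}
    return {label: counts[label] / total for label in labels}
```

A violation only signals that something is wrong; deciding whether the inter-concept relationship, the categorization, or both are at fault still requires inspection.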

Page 51: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science

From structure to semantics

2. Find a path between two concepts

Page 52: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Co-occurrence Overview

◆ Co-occurrence between MeSH descriptors in MEDLINE citations
◆ 8 M pairs of co-occurring concepts
◆ Implicit semantics
◆ The UMLS provides knowledge for helping make this relationship explicit
  ● Corresponding symbolic knowledge (Metathesaurus)
  ● Categorization (Semantic Network)

Page 53: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Co-occurrence Example

[Figure: “Addison's disease” and “Cortisol” are linked by co-occurrence (frequency = 20). The two concepts are also connected through “Adrenal cortical hypofunction” and “Adrenal gland” by links labeled isa, location of, and produces.]

Page 54: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Co-occurrence Methods

◆ Based on Metathesaurus relationships
  ● Does “Cortisol” belong to the family of “Addison’s disease”?
◆ Based on Semantic Network relationships
  ● What is the relationship between the semantic types of “Cortisol” and “Addison’s disease”?

[Figure: “Addison's disease” and “Cortisol”, linked by co-occurrence (frequency = 20)]
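The first method can be sketched as a membership test against the "family" of a concept. The slides do not define the family precisely, so the sketch below assumes it consists of ancestors (up to two levels), direct descendants, siblings, and otherwise related concepts. The `PARENTS` and `OTHER_RELATED` tables are hypothetical fragments, not Metathesaurus data, and all function names are invented here.

```python
# Hypothetical fragment of a concept hierarchy (concept -> set of parents).
PARENTS = {
    "Addison's disease": {"Adrenal cortical hypofunction"},
    "Adrenal cortical hypofunction": {"Endocrine Diseases"},
    "Cushing Syndrome": {"Endocrine Diseases"},
}

# Hypothetical non-hierarchical ("other") relationships.
OTHER_RELATED = {
    "Addison's disease": {"Hyponatremia"},
}


def children_of(concept):
    """Concepts that list the given concept among their parents."""
    return {c for c, parents in PARENTS.items() if concept in parents}


def walk(start, step, depth):
    """Collect concepts reachable from start in at most `depth` steps."""
    seen, frontier = set(), {start}
    for _ in range(depth):
        frontier = {n for c in frontier for n in step(c)} - seen - {start}
        seen |= frontier
    return seen


def family(concept, up=2, down=1):
    """Assumed family: ancestors, descendants, siblings, other related."""
    ancestors = walk(concept, lambda c: PARENTS.get(c, set()), up)
    descendants = walk(concept, children_of, down)
    siblings = {s for p in PARENTS.get(concept, set())
                for s in children_of(p)} - {concept}
    other = OTHER_RELATED.get(concept, set())
    return ancestors | descendants | siblings | other


def in_family(candidate, concept):
    """Does `candidate` belong to the family of `concept`?"""
    return candidate in family(concept)
```

With this toy data, `in_family("Cortisol", "Addison's disease")` is false: the co-occurrence is not explained by any recorded symbolic relationship, which is the situation the second method tries to address through semantic types.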

Page 55: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Family relationships

[Figure: the family of a concept C, with related nodes labeled A1, A2, U, S, and D1]

Page 56: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science

Does “Cortisol” belong to the family of “Addison’s disease”?

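[Figure, Metathesaurus panel: the family of “Addison's disease” (AD Family) contains Hyponatremia, Tuberculosis Addison's Disease, Cushing Syndrome, and Endocrine Diseases, with family links labeled RO, DES1, SIBX, and ANC2. Semantic groups shown: Chemicals & Drugs, Disorders. “Addison's disease” and “Cortisol” are linked by co-occurrence (frequency = 20), marked with a question mark.]

[Figure, Semantic Network panel: semantic types Pharmacologic Substance, Hormone, and Steroid on one side and Disease or Syndrome on the other. Candidate relationships shown: affected by, caused by, complicated by, produces, diagnosed by, presented by, treated by.]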

What is the relationship between the semantic types of “Cortisol” and “Addison’s disease”?

Page 57: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Co-occurrence Results

◆ Family
  ● Only 6% of the relationships between co-occurring concepts correspond to symbolic relationships recorded in the Metathesaurus
◆ Semantic types
  ● The semantics of the relationship often remains ambiguous

Page 58: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Bioinformatics Overview

◆ Association in LocusLink between a Gene / Gene product and
  ● Phenotype
  ● Molecular function
  ● Biological process
  ● Cellular component
◆ Explicit relationship
◆ Most concepts present in the UMLS
◆ Is the relationship present in the UMLS?

Page 59: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Bioinformatics Example

[Figure: “Dystrophin” and “Cytoskeleton”, linked by a LocusLink association]

Page 60: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Bioinformatics Relationships explored

Hierarchical, Associative, Co-occurrence

Page 61: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Bioinformatics Example

[Figure: “Dystrophin” and “Cytoskeleton”, linked by a LocusLink association and by a UMLS relationship (Co-occurrence, Freq=83483)]

Page 62: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Bioinformatics Results

◆ 70% of LocusLink associations supported by some kind of relationship in the UMLS
◆ Many UMLS relationships supporting LocusLink associations are co-occurrence relationships
◆ Variation per domain
  ● Phenotype 64%
  ● Molecular function 83%
  ● Biological process 60%
  ● Cellular component 70%

Page 63: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science

From structure to semantics

3. Limitations

Page 64: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Insufficient

◆ Identifying vs. solving problems
  ● Cycles
    ■ Which edge to remove?
    ■ Underlying issues
  ● Inconsistency between Semantic Network and Metathesaurus relationships
    ■ Wrong Metathesaurus relationship
    ■ Wrong / missing Semantic Network relationship
    ■ Wrong categorization

[Figure: graph with nodes A, B, D, E, G, and H, with question marks on the edges under scrutiny]
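The gap between identifying and solving can be made concrete with a cycle detector. The sketch below uses a standard depth-first search over a directed graph; the function name and the example edges are hypothetical and are not taken from the figure. It reports the edges of one cycle, every one of which is a candidate for removal, but structure alone cannot say which one is wrong.

```python
def find_cycle(graph):
    """Return the nodes of one directed cycle (first node repeated at the
    end), or None if the graph is acyclic.

    graph maps each node to a list of the nodes it points to.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:      # back edge: nxt is on the current path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


def candidate_edges(cycle):
    """Every edge on the cycle is a candidate for removal."""
    return list(zip(cycle, cycle[1:]))
```

For a hypothetical graph `{"A": ["B"], "B": ["D", "E"], "D": ["G"], "E": ["H"], "G": ["B"]}`, the detector returns the cycle through B, D, and G, and `candidate_edges` lists its three edges without ranking them.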

Page 65: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Insufficient, but…

◆ Identifying problems is already important
◆ Possible uses
  ● Retrospectively: to focus the work of human editors of the UMLS
  ● Prospectively: structural constraints could be used as filters integrated into the UMLS editing environment
◆ Additional clues are sometimes available
  ● Redundancy
  ● Linguistic features

Page 66: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


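Structure + redundancy (1)

[Figure, panel “Redundancy”: the link between A and B is asserted more than once, as PAR and as RB (PAR + RB, PAR or RB). Panel “Democracy”: A and B with the counts 5 and 1.]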

Page 67: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Structure + linguistic features (2)

Input: “Hemofiltration in digoxin overdose”

Syntactic analysis: “Hemofiltration” (noun phrase), “in” (preposition), “digoxin overdose” (noun phrase)

Mapping to UMLS:
  Hemofiltration: Therapeutic or Preventive Procedure
  Digoxin overdose: Disease or Syndrome

Semantic rules: treats

Semantic network relationships:
  Ther. or Prev. Procedure affects Disease or Syndrome
  Ther. or Prev. Procedure complicates Disease or Syndrome
  Ther. or Prev. Procedure has_result Disease or Syndrome
  Ther. or Prev. Procedure treats Disease or Syndrome
  Ther. or Prev. Procedure prevents Disease or Syndrome

Select matching rule: treats

Semantic interpretation: Hemofiltration treats Digoxin overdose
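The pipeline on this slide can be sketched as a lookup followed by a consistency check. This is a minimal illustration under stated assumptions: the three tables are tiny hypothetical stand-ins for the UMLS mapping, the semantic rules, and the Semantic Network, the syntactic analysis is assumed to have already split the phrase, and the function name is invented here.

```python
# Hypothetical mapping from phrases to UMLS semantic types.
SEMANTIC_TYPES = {
    "hemofiltration": "Therapeutic or Preventive Procedure",
    "digoxin overdose": "Disease or Syndrome",
}

# Hypothetical semantic rules:
# (type of head phrase, preposition, type of modifier phrase) -> relation.
SEMANTIC_RULES = {
    ("Therapeutic or Preventive Procedure", "in", "Disease or Syndrome"): "treats",
}

# Semantic Network relationships listed on the slide.
SN_RELATIONS = {
    ("Therapeutic or Preventive Procedure", "Disease or Syndrome"):
        {"affects", "complicates", "has_result", "treats", "prevents"},
}


def interpret(head, preposition, modifier):
    """Return (head, relation, modifier), or None if no rule matches
    or the proposed relation is not licensed by the Semantic Network."""
    st_head = SEMANTIC_TYPES.get(head.lower())
    st_mod = SEMANTIC_TYPES.get(modifier.lower())
    if st_head is None or st_mod is None:
        return None  # one of the phrases could not be mapped
    relation = SEMANTIC_RULES.get((st_head, preposition.lower(), st_mod))
    if relation in SN_RELATIONS.get((st_head, st_mod), set()):
        return (head, relation, modifier)
    return None
```

The linguistic rule proposes a single relation, and the Semantic Network acts as a filter: of the five relationships allowed between the two semantic types, only the one matching the rule is kept.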

Page 68: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science

Conclusions

Page 69: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science


Conclusions

◆ Good knowledge resources should be structurally sound
◆ Identifying structural abnormalities may help identify semantic problems
◆ Like syntax, structure alone does not ensure that the semantics is correct
◆ Close to approaches based on description logics

Page 70: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science

[Photo credit: Karlton Jackson / NLM]

Page 71: Issues in the visualization and navigation of …2002/04/29  · Issues in the visualization and navigation of biomedical knowledge New Jersey Institute of Technology Computer Science

Contact: olivier@nlm.nih.gov

Olivier Bodenreider

Lister Hill National Center for Biomedical Communications
Bethesda, Maryland - USA

