Post on 03-Jan-2016
Standards and Terminologies
S. Trent Rosenbloom, MD MPH
Associate Professor and Vice Chair
Departments of Biomedical Informatics, Internal Medicine and Pediatrics
Vanderbilt University Medical Center
September 21, 2015
Department of Biomedical Informatics
BMIF 6300
Standards and Terminologies
Standards are principles and rules designed to ensure that methods used and products created reliably and consistently conform to expectations
Software standards detail:
▪ minimum set of functions provided
▪ methods used to achieve those functions
▪ formatting of the data structure
Standards and Terminologies
▪ minimum set of functions provided: “Content standards”
▪ methods used to achieve those functions: “Functional standards”
▪ formatting of the data structure:
“Syntactic standards” ← Messaging, e.g., HL7 XML
“Semantic standards” ← Terminology, e.g., SNOMED CT
Terminologies can provide formal and machine-computable representations of knowledge and data
Such representation can facilitate interoperability, dissemination, decision support, research
Terminologies
Terminologies are formal representations of entities and their interrelationships. Embodied as concepts, terms, linkages
▪ Concepts are the cognitive representation of entities or meanings
▪ Terms are evocative words or phrases
▪ Linkages are explicitly defined relationships
Terminologies
Concept: ischemic injury and necrosis of heart muscle cells resulting from absent or diminished blood flow in a coronary artery
Terms:
▪ Myocardial Infarction
▪ Heart Attack
Linkages:
▪ is_a Disease of the Heart
▪ has_severity Severities
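The concept, term, and linkage triad can be pictured as a small data structure. The sketch below is illustrative only: the class names, the field layout, and the identifier "C_MI" are assumptions made for this example and do not come from any terminology standard.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Linkage:
    """An explicitly defined relationship from one concept to another."""
    relation: str  # e.g. "is_a", "has_severity"
    target: str    # name of the related concept


@dataclass
class Concept:
    """A meaning, kept separate from the words used to evoke it."""
    identifier: str  # hypothetical identifier, invented for this sketch
    definition: str
    terms: list[str] = field(default_factory=list)
    linkages: list[Linkage] = field(default_factory=list)


mi = Concept(
    identifier="C_MI",
    definition=(
        "ischemic injury and necrosis of heart muscle cells resulting from "
        "absent or diminished blood flow in a coronary artery"
    ),
    terms=["Myocardial Infarction", "Heart Attack"],
    linkages=[
        Linkage("is_a", "Disease of the Heart"),
        Linkage("has_severity", "Severities"),
    ],
)


def parents(concept: Concept) -> list[str]:
    """Return the targets of all is_a linkages of a concept."""
    return [link.target for link in concept.linkages if link.relation == "is_a"]
```

Both terms point at the same Concept object, so renaming or adding a term never changes the meaning that the linkages attach to.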
Terminologies
Morning Star
Evening Star
The second planet from the sun, having an average radius of 6,052 kilometers (3,761 miles), a mass 0.815 times that of Earth, and a sidereal period of revolution about the sun of 224.7 days at a mean distance of approximately 108.2 million kilometers (67.2 million miles).
Conceptual Experience
Representative Terms
Venus
Adapted from Campbell, ‘Representing thoughts, words, and things in the UMLS’, 1998.
Physical Entity
Venus
Mercury
Earth
Jupiter
Saturn
Neptune
Planets of the Solar System
inside outside
Concept: Myocardial Infarction
CUI: C0027051
Semantic Type: Disease or Syndrome
Entity: Gross necrosis of the myocardium, as a result of interruption of the blood supply to the area. (Dorland, 27th ed)
Representative Terms (synonyms):
▪ Myocardial Infarction
▪ Attack coronary
▪ Cardiac infarction
▪ Heart attack
▪ Infarction of heart
▪ MI
▪ MI - Myocardial infarction
▪ Myocardial Infarct
▪ Myocardial infarction (disorder)
▪ Myocardial infarction syndrome
▪ myocardium; infarction
More Specific Concepts (children):
▪ Acute myocardial infarction
▪ Old myocardial infarction
▪ Microinfarct of heart
▪ True posterior wall infarction
▪ Aborted myocardial infarction
▪ Other specified anterior myocardial infarction
▪ Silent myocardial infarction
▪ Subsequent myocardial infarction
▪ Postoperative myocardial infarction
▪ First myocardial infarction
▪ Myocardial infarction with complication
▪ Non-Q wave myocardial infarction
Adapted from the UMLS Metathesaurus.
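The many-terms-to-one-concept pattern in this record can be sketched as a lookup table. This is a toy index built only from the synonyms and CUI shown on the slide. The function names and the normalization rule (lowercase, collapse whitespace) are assumptions for illustration and are not how the UMLS tools themselves normalize strings.

```python
CUI_MI = "C0027051"  # CUI shown on the slide for Myocardial Infarction

# Synonyms copied from the slide.
SYNONYMS = [
    "Myocardial Infarction",
    "Attack coronary",
    "Cardiac infarction",
    "Heart attack",
    "Infarction of heart",
    "MI",
    "MI - Myocardial infarction",
    "Myocardial Infarct",
    "Myocardial infarction (disorder)",
    "Myocardial infarction syndrome",
    "myocardium; infarction",
]


def normalize(term: str) -> str:
    """Assumed normalization: lowercase and collapse runs of whitespace."""
    return " ".join(term.lower().split())


# Every surface term resolves to the same concept identifier.
TERM_INDEX = {normalize(term): CUI_MI for term in SYNONYMS}


def lookup(term: str):
    """Return the CUI for a term, or None if the term is not indexed."""
    return TERM_INDEX.get(normalize(term))
```

Because the index is keyed on terms but valued on the concept identifier, "Heart attack" and "MI" are interchangeable for retrieval even though they share no words.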
There are a lot of terminologies
In 2003, the National Committee on Vital and Health Statistics (NCVHS) recommended a subset of existing terminologies as:
“uniform data standards for patient medical record information (PMRI) and the electronic exchange of such information”
Terminologies
PMRI standards:
▪ SNOMED CT (as licensed by the National Library of Medicine), for the exchange, aggregation, and analysis of patient medical information
▪ Logical Observation Identifiers Names and Codes (LOINC), for the representation of individual laboratory tests
▪ Federal Drug Terminologies:
  - RxNorm
  - the representations of the mechanism of action and physiologic effect of drugs from NDF-RT
  - ingredient name, manufactured dosage form, and package type from the FDA
UMLS (Unified Medical Language System)
The UMLS is a terminology collection:
▪ Concepts are unique
▪ No formal relationships among concepts are present, per se
Using the UMLS:
▪ Semantics and relationships from source terminologies are lost (or implied)
▪ May mix up different levels of detail from different terminologies
▪ Can lose the link with the source terminology, which can hinder maintenance
Terminology [old] History
▪ Classification scheme for the London Bills of Mortality - 16th century
▪ John Graunt’s refinement - middle of the 17th century
▪ International Classification of Diseases (ICD) - first adopted in Paris in 1900
▪ Multi-axial Standardized Nomenclature of Diseases (SND) - 1928
▪ Standardized Nomenclature of Diseases and Operations (SNDO) - 1933
Terminology History
“Modern era for clinical descriptions” With SND and SNDO
▪ Multiaxial: users could model complex concepts by constructing them from more primitive building blocks
▪ Designed to classify diseases based on:
  - Etiology
  - Manifestations
  - Relationships between them
Terminology Desiderata
▪ Statement of purpose, scope, and comprehensiveness
▪ Complete coverage of domain specific content
▪ Use of concepts rather than terms, phrases and words (concept orientation)
▪ Concepts do not change with time, view or use (concept consistency)
▪ Concepts must evolve with change in knowledge
▪ Concepts identified through nonsense identifiers (context-free identifier)
▪ Representation of concept context consistently from multiple hierarchies
▪ Concepts have single explicit formal definitions
▪ Support for multiple levels of concept detail
▪ Absence of or methods to identify duplication, ambiguity, and synonymy
▪ Integration with other terminologies
▪ Mapping to administrative terminologies
Adapted from Cimino, ‘Desiderata for controlled medical vocabularies in the twenty-first century’, 1998.
Coverage is achieved in one of two ways:
▪ Post-coordination - complex concepts at different levels of detail are composed as needed from fundamental concepts (e.g., ‘chest pain’ composed from the concepts ‘chest’ and ‘pain’ when needed)
▪ Pre-coordination - all levels of detail are modeled with distinct concepts (e.g., ‘chest pain’, ‘substernal chest pain’, and ‘crushing substernal chest pain’ are all in the terminology)
Completeness is measured by coverage:
▪ coverage is calculated as the proportion of the concepts needed in a domain that a terminology can represent
▪ multiple studies have found that post-coordinated terminologies generally have better coverage than pre-coordinated terminologies
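The coverage calculation can be sketched with the chest pain example. Everything here is a toy assumption: the list of needed phrases, the two small vocabularies, and above all the word-by-word composition rule, which is a crude stand-in for the formal composition rules a real post-coordinated terminology would define.

```python
def covered_precoordinated(phrase, terminology):
    """Covered only if the whole phrase is itself a modeled concept."""
    return phrase in terminology


def covered_postcoordinated(phrase, primitives):
    """Covered if every word is a fundamental concept (simplified rule)."""
    words = phrase.split()
    return bool(words) and all(word in primitives for word in words)


def coverage(phrases, is_covered):
    """Proportion of needed phrases that the terminology can represent."""
    if not phrases:
        raise ValueError("coverage is undefined for an empty phrase list")
    return sum(1 for phrase in phrases if is_covered(phrase)) / len(phrases)


# Hypothetical set of concepts a clinician needs to record.
needed = [
    "chest pain",
    "substernal chest pain",
    "crushing substernal chest pain",
    "crushing chest pain",  # not modeled in the pre-coordinated toy list
]

pre_terms = {"chest pain", "substernal chest pain", "crushing substernal chest pain"}
post_primitives = {"chest", "pain", "substernal", "crushing"}

pre_cov = coverage(needed, lambda p: covered_precoordinated(p, pre_terms))
post_cov = coverage(needed, lambda p: covered_postcoordinated(p, post_primitives))
```

With these inputs the pre-coordinated list covers three of the four phrases (0.75), while the four primitives cover all of them (1.0), which is the direction of the effect the studies report.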
Post-coordination versus Pre-coordination
Select One Flavor
Select One Topping
Select One Cone
…or…Select One Favorite
Post-Coordination
▪ Flexible
▪ Wide choice
▪ Rules implied
▪ Explicit relationships
▪ Inefficient
▪ Permits inappropriate combinations
Pre-Coordination
▪ No flexibility
▪ Limited choice
▪ Asserted knowledge
▪ Implied relationships
▪ Efficient
▪ Only appropriate combinations
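The ice cream analogy can be written out as a short sketch. The menu items below are invented for illustration and do not appear on the slides. Post-coordination is modeled as the full cross product of the three independent choices, and pre-coordination as a short fixed list of approved favorites.

```python
from itertools import product

# Hypothetical menu; none of these values come from the slides.
FLAVORS = ["vanilla", "chocolate", "lemon sorbet"]
TOPPINGS = ["sprinkles", "hot fudge", "none"]
CONES = ["sugar cone", "cup"]

# Post-coordination: any flavor with any topping in any cone.
post_coordinated = list(product(FLAVORS, TOPPINGS, CONES))

# Pre-coordination: only combinations someone asserted in advance.
PRE_COORDINATED = [
    ("vanilla", "hot fudge", "cup"),
    ("chocolate", "sprinkles", "sugar cone"),
]


def is_offered_precoordinated(choice):
    """A pre-coordinated menu accepts only its listed favorites."""
    return choice in PRE_COORDINATED
```

The post-coordinated menu offers 18 combinations from 8 building blocks, including ones nobody vetted, while the pre-coordinated menu offers exactly the 2 that were asserted to make sense.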
Consequences of post-coordination:
▪ Inefficient post-coordination: “too cumbersome for complex problem entry”
▪ Nonsensical Concepts
▪ Concept duplication
1. D5-46210 01 Acute appendicitis, NOS
2. G-A231 01 Acute + D5-46100 01 Appendicitis, NOS
3. M-41000 01 Acute inflammation, NOS + G-CO06 01 In + T-59200 01 Appendix, NOS
4. G-A231 01 Acute + M-40000 01 inflammation, NOS + G-CO06 01 In + T-59200 01 Appendix, NOS
Table. Duplication due to compositionality: four ways to compose ‘Appendicitis’ in SNOMED, from the CANON Group
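One way to detect this kind of duplication is to expand every composed expression into its most primitive codes and compare the results. The sketch below uses the codes exactly as they appear in the table. The expansion rules are assumptions inferred from the table itself, not taken from SNOMED documentation, and reducing an expression to an unordered set of codes ignores the structure a real normal form would preserve.

```python
# Assumed definitions of the composite codes, inferred from the table.
EXPANSIONS = {
    "D5-46210": ["G-A231", "D5-46100"],           # Acute appendicitis
    "D5-46100": ["M-40000", "G-CO06", "T-59200"],  # Appendicitis
    "M-41000": ["G-A231", "M-40000"],              # Acute inflammation
}


def normal_form(codes):
    """Recursively expand composite codes into a set of primitive codes."""
    primitives = set()
    for code in codes:
        if code in EXPANSIONS:
            primitives |= normal_form(EXPANSIONS[code])
        else:
            primitives.add(code)
    return frozenset(primitives)


# The four compositions listed in the table.
COMPOSITIONS = [
    ["D5-46210"],
    ["G-A231", "D5-46100"],
    ["M-41000", "G-CO06", "T-59200"],
    ["G-A231", "M-40000", "G-CO06", "T-59200"],
]

distinct_meanings = {normal_form(expression) for expression in COMPOSITIONS}
```

All four expressions reduce to the same four primitive codes, so distinct_meanings holds a single entry: four different encodings, one meaning.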
Rigorous development may produce terminologies unusable by healthcare providers for routine clinical tasks.
Rector: tension between clinical usability and meticulous knowledge representation mirrors the conflict -
▪ human users require flexible, expressive terminologies that model common colloquial phrases
▪ computer programs are generally designed to process formally defined concepts having rigidly defined interrelationships.
Rector’s six tasks for terminologies:
1) support efficient data entry and query formulation
2) record and archive clinical information
3) support sharing and reuse of clinical information
4) infer and suggest knowledge according to decision support algorithms
5) support terminology maintenance
6) create a natural language output from manual structured input
Generally a set of flexible, user-friendly, colloquial terms displayed via computer programs.
Use assertional medical knowledge to support efficiency, size and focus
Have been used for problem list entry, clinical documentation, provider order entry
Rosenbloom ST, et al. Interface terminologies: facilitating direct entry of clinical data into electronic health record systems. J Am Med Inform Assoc. 2006 May-Jun;13(3):277-88.
Interface Terminologies
SNOMED CT 2012AA in the UMLS contains:
▪ over 311,000 unique concepts, 10% are diagnoses
▪ almost 800,000 descriptions
▪ approximately 1,360,000 links
S-CT
Terminology Subsets
SNOMED CT CORE subset in the UMLS contains:
▪ about 5,000 unique concepts, primarily diagnoses
▪ covers 95% of diagnoses recorded from 7 healthcare sites (Beth Israel Deaconess Medical Center, Intermountain Healthcare, Kaiser Permanente, Mayo Clinic, Nebraska University Medical Center, Regenstrief, Hong Kong Hospital Authority)
S-CT
CORE
Fung KW, Rosenbloom ST, et al. Testing Three Problem List Terminologies in a simulated data entry environment. AMIA Annu Symp Proc. 2011:445-54.
Terminology Subsets
SNOMED CT VA-KP subset in the UMLS contains:
▪ about 17,000 unique concepts
▪ primarily contains precoordinated concepts from the “Clinical Finding” hierarchy
S-CT
CORE
VA-KP
Terminology Subsets
Institutional subsets / supersets
▪ can be created locally as SNOMED CT extensions
▪ can be created locally without regard to SNOMED CT
▪ may or may not follow standard formalisms
S-CT
CORE
VA-KP
local
Terminology Subsets
Figure: concept counts for the overlapping S-CT, CORE, VA-KP, and CCPSS sets (individual counts are not recoverable from the transcript).
* Statistics thanks to Lina Sulieman
Terminology Subsets