Post on 26-Mar-2015

XML in Healthcare and the Semantic Web

Jonathan Borden, M.D.
Center for Brain and Cranial Diseases
St. Vincent Health System, Erie PA
Invited Expert, W3C Web Ontology Working Group
Chair, ASTM E31.28 Electronic Healthcare Records

The Goal

Answer questions like:

“Of all the patients I operated on for brain tumors between 1996-2000, matching severity of pathology and matching clinical status and who have the “P53” mutation, did PCV chemotherapy improve the cure rate at five years?”

Healthcare: The current situation

A disaster:
1.1 trillion $/year in the USA
30-40% overhead
mostly paper based
highly proprietary commercial systems
tens of thousands of people die each year due to poor information/errors
Most of the information is rendered useless

Strategies

Define open standards
Capture information in an electronic form
Reduce errors related to information
Define distributed, web enabled, query models

Tactics

XML, schemas, query model
Semantic Web/URI graphs
Data analysis based on actual population rather than small, potentially biased, samples

Google for biomedical information

Why XML?

Widely implemented with excellent open source tools

Life of data is longer than life of application

Data driven, platform independent
Formal schema and query models

Reinventing medical informatics

Get the data format right and the rest will follow

Structured information has been the holy grail of medical informatics for the last 30+ years

XML is the culmination of 30+ years of work in structured information

Time to do something

XML Briefly

Simplification of SGML … markup language for the web

<element> content </element>
<element attribute="value">
  <child-element another="123"/>
</element>

XML and Infosets

<patient>
  <person.name>
    <given>James</given>
    <given>Steven</given>
    <family>Smith</family>
    <suffix>3rd</suffix>
  </person.name>
</patient>

startElement("patient")
  startElement("person.name")
    startElement("given")
      characters("James")
    ...
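The event stream on this slide can be reproduced with any SAX parser. The following is a minimal sketch using Python's standard `xml.sax` module; the `EventLogger` class and `sax_events` function are names invented for this illustration, and the sample document is the slide's fragment with closing tags added.

```python
import xml.sax

# Sample document adapted from the slide (closing tags added).
DOC = """<patient>
  <person.name>
    <given>James</given><given>Steven</given>
    <family>Smith</family><suffix>3rd</suffix>
  </person.name>
</patient>"""


class EventLogger(xml.sax.ContentHandler):
    """Record each SAX callback as a readable string."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(f'startElement("{name}")')

    def endElement(self, name):
        self.events.append(f'endElement("{name}")')

    def characters(self, content):
        # Skip whitespace-only text between elements.
        if content.strip():
            self.events.append(f'characters("{content}")')


def sax_events(document):
    """Parse the document and return the list of recorded events."""
    handler = EventLogger()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    for event in sax_events(DOC):
        print(event)
```

The same handler works for any source that can emit these callbacks, which is the point of treating the Infoset, rather than the angle-bracket text, as the data model.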

Regular Expressions

Pattern matching: "*TATA*"
bp ::= 'G' | 'T' | 'A' | 'C'
tata ::= bp*, 'T', 'A', 'T', 'A', bp*
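The grammar above translates directly into a conventional regular expression. A small sketch in Python follows; the function name `has_tata` is made up for this example.

```python
import re

# bp   ::= 'G' | 'T' | 'A' | 'C'
# tata ::= bp*, 'T', 'A', 'T', 'A', bp*
TATA = re.compile(r"[GTAC]*TATA[GTAC]*")


def has_tata(sequence):
    """Return True if the whole sequence matches the tata production."""
    return TATA.fullmatch(sequence) is not None


if __name__ == "__main__":
    print(has_tata("GGCTATAAAGC"))
    print(has_tata("GGCCGG"))
```

Using `fullmatch` rather than `search` keeps the check faithful to the grammar: characters outside the four-letter alphabet cause the match to fail.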

XML DTD

<!ELEMENT foo (bar*)>
<!ELEMENT bar (baz?)>
<!ATTLIST bar bop CDATA #IMPLIED>
<!ELEMENT baz (#PCDATA)>

Tree Regular Expressions

element foo {
  element bar {
    attribute bop[int]
    element baz {'xxx'}
  }
}

<foo>
  <bar bop="23">
    <baz>xxx</baz>
  </bar>
</foo>
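A tree regular expression can be read as a set of mutually recursive matchers, one per element pattern. The sketch below hand-codes such matchers in Python for the `foo`/`bar`/`baz` pattern. It is an illustration only: the cardinalities (`bar*`, `baz?`) are borrowed from the DTD slide, `bop` is assumed to be required, and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_int(text):
    """True if text is a lexical integer such as '23' or '-4'."""
    try:
        int(text)
        return True
    except (TypeError, ValueError):
        return False


def match_baz(elem):
    # element baz { 'xxx' }
    return elem.tag == "baz" and len(elem) == 0 and (elem.text or "").strip() == "xxx"


def match_bar(elem):
    # element bar { attribute bop[int], element baz? }
    children = list(elem)
    return (
        elem.tag == "bar"
        and set(elem.attrib) == {"bop"}
        and is_int(elem.get("bop"))
        and len(children) <= 1
        and all(match_baz(child) for child in children)
    )


def match_foo(elem):
    # element foo { element bar* }
    return elem.tag == "foo" and all(match_bar(child) for child in elem)


def matches(document):
    """Check a serialized document against the foo pattern."""
    return match_foo(ET.fromstring(document))


if __name__ == "__main__":
    print(matches('<foo><bar bop="23"><baz>xxx</baz></bar></foo>'))
```

Unlike the DTD, the pattern constrains the attribute's datatype and the text content, which is where tree regular expressions go beyond DTD validation.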

ASTM E2182/E2183

XML DTDs for Healthcare
Emphasize Human Readability
Flexibility
Openhealth reference implementation http://www.openhealth.org/ASTM

Compatible with HL7 CDA

ASTM Healthcare DTDs

clinical.header compatible with HL7 CDA

clinical.body specific to document type
  operative.report
  radiology.report
  discharge.summary
  etc.

ASTM E31.28 Clinical Header

ch.person.type = person.name, id*, addr*
ch.organization.type = organization.name?, id*, addr*
clinical.header =
  element clinical.header {
    ch.attrib,
    id*,
    version.number?,
    confidentiality.code*,
    patient.encounter?,
    authenticator*,
    legal.authenticator*,
    intended.recipient*,
    originator?,
    originating.organization?,
    transcriptionist?,
    provider+,
    service.actor*,
    patient,
    events?,
    codes?,
    related.document*
  }

ASTM E31.28 Clinical Header

service.actor =
  element service.actor {
    ch.attrib,
    xlink.attrib?,
    (person.name|organization.name),
    id*,
    addr*,
    type.code?,
    function?,
    date.time?
  }

provider =
  element provider {
    ch.attrib,
    ch.actor.type,
    function?
  }

ASTM E31.28 Clinical Header

patient.encounter =
  element patient.encounter {
    ch.attrib,
    (id? & practice.setting? & date.time? & location)
  }

service.target.model = ch.actor.type & birth.date? & gender?

patient =
  element patient {
    ch.attrib,
    xlink.attrib?,
    service.target.model
  }

Encounter

<encounter>
  <patient>…</patient>
  <provider>…</provider>
  <date.time>…</date.time>
  <location> … </location>
  <encounter.id>…</encounter.id>
</encounter>

XML examples

<person>
  <person.name>
    <prefix>Ms.</prefix>
    <given>Susan</given>
    <given>Samantha</given>
    <family>Jones</family>
  </person.name>
  <id type="SSN">000-11-2233</id>
</person>

XML examples

<patient>
  <person.name> … </person.name>
  <id authority="New England Medical Center">000112233</id>
</patient>
<provider>
  <person.name>
    <prefix>Dr.</prefix>
    <given>Amanda</given>
    <family>Smith</family>
  </person.name>
</provider>

Using XML to generate reports

Browser form
ASTM E2182 XML format
XSLT transform for display in browser
XSL-FO transform for printable form (e.g. PDF)

ASTM Opnote: Header (1/3)

<operative.report xmlns="http://www.openhealth.org/ASTM/operative.report">
  <clinical.header xmlns="http://www.openhealth.org/ASTM/clinical.header">
    <id>5556666</id>
    <patient.encounter>
      <id>ENC-11111</id>
      <practice.setting>Operation</practice.setting>
      <date.time>2000-10-15</date.time>
      <location>New England Medical Center</location>
    </patient.encounter>
    <provider>
      <person.name>
        <prefix type="title">Dr.</prefix>
        <given>Jonathan</given>
        <given>Alan</given>
        <family>Borden</family>
        <suffix type="degree">M.D.</suffix>
      </person.name>
      ...

ASTM Opnote: Header (2/3)

      …
      <id type="license" authority="MA">12345</id>
      <addr type="office">
        <house.number>750</house.number>
        <street>Washington Street</street>
        <city>Boston</city>
        <state>MA</state>
        <zip>02111</zip>
        <uri type="email">mailto:jonathan@openhealth.org</uri>
        <telephone>617-636-7587</telephone>
      </addr>
      <type.code>Attending</type.code>
      <function>Surgeon</function>
    </provider>
    ...

ASTM Opnote: Header (3/3)

    …
    <patient>
      <person.name>
        <given>John</given>
        <given type="MI">Q</given>
        <family>Doe</family>
        <suffix>Jr.</suffix>
      </person.name>
      <id type="patient.identifier" authority="NEMC">111223344</id>
      <id type="SSN" authority="SSA">111-22-3344</id>
      <birth.date>1955-10-21</birth.date>
    </patient>
    <codes>
      <coded.value code.system="CPT">63051</coded.value>
      <coded.value code.system="CPT">69990</coded.value>
      <coded.value code.system="ICD9">XXX.21</coded.value>
    </codes>
  </clinical.header>

ASTM Opnote: Body

  <clinical.body>
    <preoperative.diagnosis>Right Frontal Brain Tumor</preoperative.diagnosis>
    <postoperative.diagnosis>same, probable Astrocytoma</postoperative.diagnosis>
    <procedure>Right Frontal Craniotomy for Excision of Brain Tumor</procedure>
    <anesthesia>GETA</anesthesia>
    <indications>
      <p>The patient presents with severe headaches and blurred vision. An MRI demonstrates a large cystic irregularly shaped mass within the right frontal lobe.</p>
    </indications>
    <description>
      <p>The patient had application of the external fiducial markers and was brought down to the MRI suite where a head MRI was obtained using the frameless stereotactic (3D) protocol. The image set was transferred using the DICOM protocol ... </p>
    </description>
    <estimated.blood.loss>100cc</estimated.blood.loss>
    <patient.condition>Stable, extubated</patient.condition>
    <disposition>SICU</disposition>
  </clinical.body>
</operative.report>
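Because the report is namespaced XML, software can pull out the coded and structured parts with a few path expressions. The following sketch uses Python's standard `xml.etree.ElementTree` on an abbreviated copy of the operative report; the `summarize` function and the `op`/`ch` prefixes are choices made for this example, not part of the ASTM specification.

```python
import xml.etree.ElementTree as ET

# Prefixes are local to this script; the URIs are the ones used on the slides.
NS = {
    "op": "http://www.openhealth.org/ASTM/operative.report",
    "ch": "http://www.openhealth.org/ASTM/clinical.header",
}

# Abbreviated copy of the operative report shown on the slides.
REPORT = """<operative.report xmlns="http://www.openhealth.org/ASTM/operative.report">
  <clinical.header xmlns="http://www.openhealth.org/ASTM/clinical.header">
    <patient>
      <person.name><given>John</given><family>Doe</family></person.name>
    </patient>
    <codes>
      <coded.value code.system="CPT">63051</coded.value>
      <coded.value code.system="CPT">69990</coded.value>
    </codes>
  </clinical.header>
  <clinical.body>
    <procedure>Right Frontal Craniotomy for Excision of Brain Tumor</procedure>
  </clinical.body>
</operative.report>"""


def summarize(report_xml):
    """Extract the patient family name, CPT codes and procedure text."""
    root = ET.fromstring(report_xml)
    family = root.findtext(
        "ch:clinical.header/ch:patient/ch:person.name/ch:family", namespaces=NS
    )
    cpt = [
        code.text
        for code in root.iterfind(".//ch:coded.value[@code.system='CPT']", NS)
    ]
    procedure = root.findtext("op:clinical.body/op:procedure", namespaces=NS)
    return {"family": family, "cpt": cpt, "procedure": procedure}


if __name__ == "__main__":
    print(summarize(REPORT))
```

Note that the header elements live in the clinical.header namespace while the body elements inherit the operative.report namespace, so the two parts need different prefixes in the paths.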

How it works

[Diagram: Browser, Apache, XSLT, Servlet engine, xml:db, RDF]

Form generation

Form.xml

Defaults.xml

Formgen.xsl

XML + XSLT => XHTML

Workflow

Form created
Transform into ASTM XML format
XHTML editing (opnote-edit.xsl)
Sign finished product
Render as XHTML for viewing, printing
email to Medical Records and Billing

Workflow

[Diagram: generate, edit, sign, Billing, repository]

Document analysis

Like gene sequences, it turns out that …
Medical documentation is highly repetitive
With ‘hot spots’ of unique information
Schema defines template filled with values
Easily expanded into HTML for human consumption
Easily analyzed by software

Document analysis

Integrating binary formats

MIME <-> XMTP
HL7 V2
X12 EDI
DICOM

Internet Telemedicine

The OceanMed project, 1998
Merchant vessel, e-mail access via satellite gateway
Digital camera
Web based physician access

XMTP

[Diagram: Ship, Gateway, XMTP (MIME -> XML -> XSLT -> HTML), SMTP, HTML]

XMTP Consult

36 year old male has itchy rash for 6 days

Hydrocortisone cream 1% to affected area t.i.d.

reply

How it works

Messages arrive in MIME format
MIME SAX parser ‘converts’ to XML by SAX events
XMTP employs XML object model
  *not necessarily* serialization format -> grove processing

XMTP

From: joe.patient@home.com
To: sue.doctor@openhealth.org
Content-type: multipart/related; charset=iso-8859-1
---------
startDocument()
  startElement("MIME")
    startElement("From")
      characters("joe.patient@home.com")
    endElement("From")
    startElement("Content-Type", attribute("charset", "iso-8859-1"))
      characters("multipart/related")
    endElement("Content-Type")
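The idea of a MIME parser that reports SAX events can be sketched with Python's standard `email` and `xml.sax` modules. This is an illustration under stated assumptions, not the XMTP implementation: the `mime_to_xml` function is hypothetical, the sample message uses `text/plain` rather than `multipart/related` to keep the payload simple, and header parameters are split naively on `;`.

```python
import io
from email import message_from_string
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl

MESSAGE = (
    "From: joe.patient@home.com\n"
    "To: sue.doctor@openhealth.org\n"
    "Content-Type: text/plain; charset=iso-8859-1\n"
    "\n"
    "36 year old male has itchy rash for 6 days\n"
)


def mime_to_xml(raw):
    """Replay a MIME message as SAX events and return the serialized XML."""
    msg = message_from_string(raw)
    out = io.StringIO()
    # XMLGenerator is just one ContentHandler; any other handler would do.
    sax = XMLGenerator(out, encoding="utf-8")
    sax.startDocument()
    sax.startElement("MIME", AttributesImpl({}))
    for name, value in msg.items():
        # Split "type/subtype; param=value" into text plus attributes.
        main, *params = [part.strip() for part in value.split(";")]
        attrs = dict(p.split("=", 1) for p in params if "=" in p)
        sax.startElement(name, AttributesImpl(attrs))
        sax.characters(main)
        sax.endElement(name)
    sax.startElement("Body", AttributesImpl({}))
    sax.characters(msg.get_payload().strip())
    sax.endElement("Body")
    sax.endElement("MIME")
    sax.endDocument()
    return out.getvalue()


if __name__ == "__main__":
    print(mime_to_xml(MESSAGE))
```

Swapping `XMLGenerator` for an XSLT engine's input handler would feed the message to a stylesheet without ever writing the intermediate XML text, which is what "not necessarily serialization format" suggests.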

The XMTP/MIME grove

Content-type: text/plain
From: joe@whereever.org
To: sue@example.com

Hi Sue! See you in Boston, Joe

<MIME>
  <Content-Type>text/plain</Content-Type>
  <From>joe@whereever.org</From>
  <To>sue@example.com</To>
  <Body>Hi Sue! See you in Boston, Joe</Body>
</MIME>

The HL7 Grove

Non-XML syntax => XML Infoset

MSH|PAT|Jones^James^Stephen^3rd|

startElement("patient")
  startElement("person.name")
    startElement("family")
      characters("Jones")
    endElement("family")
    …
  endElement("person.name")
endElement("patient")
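The mapping from delimited HL7 text to an XML tree can be illustrated in a few lines of Python. This sketch handles only the simplified segment shown on the slide; the component order (family, given, given, suffix) is read off that example, the function name is hypothetical, and real HL7 V2 parsing involves considerably more (segment definitions, escape sequences, repetition).

```python
import xml.etree.ElementTree as ET

# Component order assumed from the slide: family^given^given^suffix.
NAME_PARTS = ("family", "given", "given", "suffix")


def hl7_name_to_xml(segment):
    """Map the caret-delimited name field of a pipe-delimited segment to XML."""
    fields = segment.strip().split("|")
    name_field = next(field for field in fields if "^" in field)
    patient = ET.Element("patient")
    name = ET.SubElement(patient, "person.name")
    for tag, value in zip(NAME_PARTS, name_field.split("^")):
        if value:
            ET.SubElement(name, tag).text = value
    return ET.tostring(patient, encoding="unicode")


if __name__ == "__main__":
    print(hl7_name_to_xml("MSH|PAT|Jones^James^Stephen^3rd|"))
```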

Simple building blocks

XML parsers
XSLT transform engines
HTTP clients and servers

From syntax to semantics

Layer 1: syntax
  XML defines syntactic constraints on text
  other specs define syntactic constraints on binary data
Layer 2: datatypes
  integers define mapping from lexical space to value space
  "10" base 10 -> 10, "10" base 2 -> 2
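The lexical-to-value mapping for integers is exactly what a language's integer parser does. In Python, for instance (the wrapper name `to_value` is only for illustration):

```python
def to_value(lexical, base):
    """Map a lexical form (a string of digits) to its integer value."""
    return int(lexical, base)


if __name__ == "__main__":
    print(to_value("10", 10))  # the string "10" read as base 10
    print(to_value("10", 2))   # the same string read as base 2
```

The same characters denote different values under different datatypes, and different lexical forms ("10" and "0010" in base 10) can denote the same value, which is why the datatype layer sits above raw syntax.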

The shape of information

syntax -> structure = semantics

[Diagram: pattern matching transform of the sequence “…..TATA…..” into a structure with nodes gene, tata, snp, snp]

Semantics

Layer 3: hierarchy of classes
  the set of individuals of a given datatype or object type defines a class
Ontology: a description of a collection of classes, their properties and the relationships between them

Healthcare Ontology

RDF in Healthcare

<rdf:Description about="…/patient/12345">
  <lab:HIV>positive</lab:HIV>
  <lab:CD4>100</lab:CD4>
</rdf:Description>

<path:Biopsy about="…/patient/12345">
  <path:description>The brain demonstrates areas of PML including viral inclusion bodies</path:description>
</path:Biopsy>

RDF is...

A standard syntax to represent (edge labeled) directed graphs in XML

DLG: Semantic Networks

[Diagram: semantic network. Nodes: vertebrate, mammal, bird, canary, ostrich, freddie, hugo, heart, spine, hair, wings, fly, walk, doesn’t fly, yellow. Edge labels: isa, has, can]

Semantic Networks

A way to represent natural language circa 1970s

A format for organizing statements in a way that can be queried by computers

Semantic Networks

“Can freddy fly?”
“Does hugo have wings?”
“Does freddy have a spine?”
“Of all the canaries, how many live in cages?”
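Yes/no questions of this kind can be answered by walking the `isa` links and letting the most specific statement win. The sketch below is in Python; the edges are an assumption reconstructed from the classic vertebrate/bird network, since the exact wiring of the slide's diagram is not recoverable from the text, and the names `ISA`, `FACTS` and `ask` are invented here.

```python
# Edges assumed from the classic network sketched on the slide.
ISA = {
    "mammal": "vertebrate", "bird": "vertebrate",
    "canary": "bird", "ostrich": "bird",
    "freddie": "canary", "hugo": "ostrich",
}
FACTS = {
    "vertebrate": {("has", "heart"), ("has", "spine")},
    "mammal": {("has", "hair")},
    "bird": {("has", "wings"), ("can", "fly")},
    "canary": {("is", "yellow")},
    "ostrich": {("can", "walk"), ("cannot", "fly")},
}


def ancestors(node):
    """Yield node and everything reachable by following isa edges."""
    while node is not None:
        yield node
        node = ISA.get(node)


def ask(node, relation, value):
    """Answer a yes/no question; the most specific statement wins."""
    for concept in ancestors(node):
        facts = FACTS.get(concept, set())
        if relation == "can" and ("cannot", value) in facts:
            return False
        if (relation, value) in facts:
            return True
    return False


if __name__ == "__main__":
    print(ask("freddie", "can", "fly"))
    print(ask("hugo", "has", "wings"))
    print(ask("freddie", "has", "spine"))
```

The ostrich case shows why order matters: "doesn't fly" on ostrich overrides "can fly" inherited from bird.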

RDF N-triples syntax

Subject predicate object .

ex:Freddy rdf:type ex:Canary .

ex:Canary rdfs:subClassOf ex:Bird .

ex:Freddy ex:color "Yellow" .

[Diagram: graph with nodes Bird, Canary, Freddie, yellow and edge label isa]

RDF/XML syntax

<rdf:Description rdf:ID="Freddy">
  <rdf:type rdf:resource="#Canary"/>
  <ex:color>Yellow</ex:color>
</rdf:Description>

<rdf:Description rdf:ID="Canary">
</rdf:Description>

RDF/XML syntax: typed

<ex:Canary rdf:ID="Freddy">

<ex:color>Yellow</ex:color>

</ex:Canary>

Semantic analysis

“Of all the patients I operated on for brain tumors between 1996-2000, matching severity of pathology and matching clinical status and who have the “P53” mutation, did PCV chemotherapy improve the cure rate at five years?”

Web Ontology Language (OWL)

Problem (restated): "Tell me what wines I should buy to serve with each course of the following menu. And, by the way, I don't like Sauterne."

OWL is a language for defining Web ontologies and their associated knowledge bases.

Ontologies

Ontology is a term borrowed from philosophy that refers to the science of describing the kinds of entities in the world and how they are related. In OWL, an ontology is a set of definitions of classes and properties, and constraints on the way those classes and properties can be employed.

OWL

includes
  taxonomic relations between classes
  datatype properties, descriptions of attributes of elements of classes
  object properties, descriptions of relations between elements of classes

Datatype properties and object properties are collectively the properties of a class.

Simple Named Classes
class, subClassOf

Root classes: Every individual in the OWL world is a member of owl:Thing.

In the sample wines domain, we create three root classes: Winery, Region, and ConsumableThing.

<owl:Class rdf:ID="Winery"/>
<owl:Class rdf:ID="Region"/>
<owl:Class rdf:ID="ConsumableThing"/>

Simple Named Classes
class, subClassOf

<owl:Class rdf:ID="PotableLiquid">
  <rdfs:subClassOf rdf:resource="#ConsumableThing" />
</owl:Class>

<owl:Class rdf:ID="Wine">
  <rdfs:subClassOf rdf:resource="#PotableLiquid"/>
  <rdfs:label xml:lang="en">wine</rdfs:label>
  <rdfs:label xml:lang="fr">vin</rdfs:label>
  ...
</owl:Class>
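Since `subClassOf` is transitive, a Wine is also a ConsumableThing even though no statement says so directly. The sketch below reads the class declarations with Python's standard `xml.etree.ElementTree` and follows the links. It is a toy: it handles only named classes with `rdf:ID` and `rdf:resource` references in a single file, the `rdf:RDF` wrapper is added so the fragment parses, and the function names are invented. A real application would use an RDF parser and a reasoner.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

ONTOLOGY = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="ConsumableThing"/>
  <owl:Class rdf:ID="PotableLiquid">
    <rdfs:subClassOf rdf:resource="#ConsumableThing"/>
  </owl:Class>
  <owl:Class rdf:ID="Wine">
    <rdfs:subClassOf rdf:resource="#PotableLiquid"/>
  </owl:Class>
</rdf:RDF>"""


def direct_superclasses(xml_text):
    """Map each named class to the classes it directly specializes."""
    parents = {}
    for cls in ET.fromstring(xml_text).iter(f"{{{OWL}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        if name is None:
            continue
        supers = parents.setdefault(name, set())
        for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
            resource = sub.get(f"{{{RDF}}}resource")
            if resource:
                supers.add(resource.lstrip("#"))
    return parents


def all_superclasses(name, parents):
    """Follow subClassOf transitively."""
    found, stack = set(), [name]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


if __name__ == "__main__":
    print(sorted(all_superclasses("Wine", direct_superclasses(ONTOLOGY))))
```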

Defining individuals

<Region rdf:ID="CentralCoastRegion" />

is identical to

<owl:Thing rdf:ID="CentralCoastRegion" />
<owl:Thing rdf:about="#CentralCoastRegion">
  <rdf:type rdf:resource="#Region"/>
</owl:Thing>

Grapes

<owl:Class rdf:ID="Grape" />

<owl:Class rdf:ID="WineGrape">
  <rdfs:subClassOf rdf:resource="#Grape"/>
</owl:Class>

<WineGrape rdf:ID="CabernetSauvignonGrape" />

Simple properties

Object Properties

<owl:ObjectProperty rdf:ID="madeFromGrape">
  <rdfs:domain rdf:resource="#Wine"/>
  <rdfs:range rdf:resource="#WineGrape"/>
</owl:ObjectProperty>

Property hierarchy

<owl:Class rdf:ID="WineDescriptor" />

<owl:Class rdf:ID="WineColor">
  <rdfs:subClassOf rdf:resource="#WineDescriptor" />
  ...
</owl:Class>

<owl:ObjectProperty rdf:ID="hasWineDescriptor">
  <rdfs:domain rdf:resource="#Wine" />
  <rdfs:range rdf:resource="#WineDescriptor" />
</owl:ObjectProperty>

<owl:ObjectProperty rdf:ID="hasColor">
  <rdfs:subPropertyOf rdf:resource="#hasWineDescriptor" />
  <rdfs:range rdf:resource="#WineColor" />
</owl:ObjectProperty>

Domain and range

<owl:ObjectProperty rdf:ID="locatedIn">

...

<rdfs:domain rdf:resource="http://www.w3.org/2002/07/owl#Thing" />

<rdfs:range rdf:resource="#Region" />

</owl:ObjectProperty>

Restrictions

<owl:Class rdf:ID="Wine">
  <rdfs:subClassOf rdf:resource="#PotableLiquid"/>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#madeFromGrape"/>
      <owl:minCardinality>1</owl:minCardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#locatedIn"/>
      <owl:minCardinality>1</owl:minCardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
  ...
</owl:Class>

Vintages

<owl:Class rdf:ID="Vintage">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#vintageOf"/>
      <owl:minCardinality>1</owl:minCardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>

<owl:ObjectProperty rdf:ID="vintageOf">
  <rdfs:domain rdf:resource="#Vintage" />
  <rdfs:range rdf:resource="#Wine" />
</owl:ObjectProperty>

Datatype properties

<owl:Class rdf:ID="WineYear" />

<owl:DatatypeProperty rdf:ID="yearValue">
  <rdfs:domain rdf:resource="#WineYear" />
  <rdfs:range rdf:resource="&dt;wineYear"/>
</owl:DatatypeProperty>

&dt;wineYear ::= integer > 1700

Properties of individuals

<CaliforniaRegion rdf:ID="SantaCruzMountainsRegion" />

<Winery rdf:ID="SantaCruzMountainVineyard" />

<CabernetSauvignon rdf:ID="SantaCruzMountainVineyardCabernetSauvignon" >

<locatedIn rdf:resource="#SantaCruzMountainsRegion"/>

<hasMaker rdf:resource="#SantaCruzMountainVineyard" />

</CabernetSauvignon>

Ontology mapping

sameClassAs
sameIndividualAs
samePropertyAs

<owl:Class rdf:ID="TexasThings">
  <owl:sameClassAs>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#locatedIn" />
      <owl:allValuesFrom rdf:resource="#TexasRegion" />
    </owl:Restriction>
  </owl:sameClassAs>
</owl:Class>

Complex constructs

Description Logic
  unionOf
  intersectionOf
  complementOf
  oneOf
  disjointWith

Healthcare DL ontologies

OpenGALEN http://www.opengalen.org
  Open terminology
  French Ministry of Health CCAM
SNOMED http://www.snomed.org
  Closed DL terminology

Simplified Healthcare Ontology

<owl:Class rdf:ID="Provider">
  <rdfs:subClassOf rdf:resource="#Person"/>
</owl:Class>

Simplified Healthcare Ontology

Healthcare Ontology

Putting it all together

Biomedical information has many vocabularies - each in its own namespace

genetics “Bio ML”
pathology “SNOMED”
surgery “CPT”
medicine “ICD”
radiology “DICOM”

Putting it all together

[Diagram: Electronic medical record. Labels: genes, diagnoses, drugs, procedures, genetics, MRI, Path-specimen, person, Gene:p53, Left temporal tumor, SNOMED:glioblastoma]

OWL across schemas

Assimilating disparate information

[Diagram labels: glioblastoma, p53.1, ..., Ring enhancing, enhancing astrocytoma, p53]

UMLS next generation

Ontologies exposed as OWL on web
Cross references exposed as OWL on web
Enables searching for and reasoning about terms relating to each other
Enables searching for and reasoning about terms from multiple terminologies

Semantic analysis

[Diagram: repository of instances linked to Class and Property nodes. Edge labels: type, subClass, domain]

Queries: several views

Regular expression pattern matching
Query as universal/existential quantification (FOPL)
Query as DL classification

First Order Predicate Logic

(for-all ?pat
  (exists ?surgeon (last-name ?surgeon "Borden"))
  (exists ?procedure
    (craniotomy ?procedure)
    (patient ?procedure ?pat)
    (surgeon ?procedure ?surgeon)
    (between (date ?procedure) "1996" "2000")
    (sequence ?procedure "p53")
    ...
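Read operationally, the existential quantifiers become a search for at least one procedure record satisfying every conjunct. The Python sketch below shows that reading over a handful of made-up records; the `Procedure` fields, the sample data and the `matching_patients` function are all hypothetical, and the elided remainder of the query is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Procedure:
    kind: str
    patient: str
    surgeon_last_name: str
    year: int
    mutations: frozenset


# Hypothetical records; real data would come from the repository.
PROCEDURES = [
    Procedure("craniotomy", "pat-1", "Borden", 1997, frozenset({"p53"})),
    Procedure("craniotomy", "pat-2", "Borden", 2001, frozenset({"p53"})),
    Procedure("craniotomy", "pat-3", "Smith", 1998, frozenset()),
]


def matching_patients(procedures, surgeon="Borden", start=1996, end=2000, gene="p53"):
    """Patients for whom some procedure satisfies every conjunct."""
    patients = {p.patient for p in procedures}
    return sorted(
        pat
        for pat in patients
        if any(
            p.kind == "craniotomy"
            and p.patient == pat
            and p.surgeon_last_name == surgeon
            and start <= p.year <= end
            and gene in p.mutations
            for p in procedures
        )
    )


if __name__ == "__main__":
    print(matching_patients(PROCEDURES))
```

Each `any(...)` corresponds to an `exists`, and the outer comprehension ranges over the patients the `for-all` variable is bound to.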

Future directions

The technology is here …
ASTM E31.28 http://www.astm.org
Define schemas and ontologies
Standardize data formats
Collect data
just do it!

jonathan@openhealth.org

Contact Information

Jonathan Borden, M.D.
Center for Brain and Cranial Diseases
St. Vincent Health System
311 W. 24th Street
Erie, PA, 16505

www.openhealth.org/ASTM
www.openhealth.org/opnote (demo)
www.w3.org/2001/sw/WebOnt
www.jonathanborden-md.com

jonathan@openhealth.org