Populating the i2b2 Database With Heterogeneous EMR Data: A Semantic Network Approach

Sebastian Mate a, Thomas Bürkle a, Felix Köpcke a, Bernhard Breil b, Bernd Wullich c, Martin Dugas b, Hans-Ulrich Prokosch a,d, Thomas Ganslandt d

a Chair of Medical Informatics, University Erlangen-Nuremberg, Erlangen, Germany
b Institute of Medical Informatics, University of Münster, Münster, Germany
c Department of Urology, Erlangen University Hospital, Erlangen, Germany
d Center for Medical Information and Communication, Erlangen University Hospital, Erlangen, Germany

28.09.2011, Oslo
MIE2011 full paper

  • Sebastian Mate // Chair of Medical Informatics // University of Erlangen-Nuremberg [MIE2011]

    Introduction: Replacing Double Data Entry With “Single Source” – The DPKK Example

    [Figure, upper part: DPKK Partner A and DPKK Partner B each record values such as PSA in both their EMR and the DPKK documentation; original DPKK database; DPKK researcher.]

    [Figure, lower part: DPKK Partner A and DPKK Partner B; EMR databases in Erlangen and Münster; NEW! ETL; NEW! i2b2 system; DPKK researcher.]

  • Introduction: Developing the Data Integration Approach

    Problem 1: How to align and merge data from independent sources?

    EMR in Münster: AGFA ORBIS
    EMR in Erlangen: Siemens Soarian

    ▪ Different data structures (different vendors)
    ▪ Similar information (full prostate cancer documentation in EMR)

  • Introduction: Developing the Data Integration Approach

    Problem 2: How to efficiently access and process the huge piles of data elements?

    Erlangen as an example: > 42,000 data elements (including data element versioning)

    Data:

    ▪ Structured
    ▪ Uncoded

  • Methods: The “Ontology Layer”

    [Figure: layered architecture. Tools / Data Access Layer (COGNOS, Talend, JDBC, ETL, SQL); Data / Storage Layer (Oracle, SQL Server, MySQL); Ontology / Knowledge Base Layer.]

  • Methods: Approach Overview

    ▪ Express the target dataset with a target ontology.
    ▪ Express the source system(s) with one or more source ontologies.
    ▪ Create a mapping ontology to connect concepts from the target ontology to concepts from the source ontology.
    ▪ Some concepts can be mapped directly, others require transformations and/or data filtering, which can be expressed with intermediate nodes.

    [Figure: target dataset and target ontology (Gleason3, PSA); mapping ontology (ADD node); source ontology (Gleason1, Gleason2, PSA_value) and source system.]
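The mapping idea on this slide can be sketched in Python with one direct mapping and one intermediate node. The concept names (PSA, PSA_value, Gleason1 to Gleason3) come from the slide's figure; the class layout, the `resolve` function, the wiring of the ADD node, and the sample values are assumptions for illustration and not the authors' OWL model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union


@dataclass(frozen=True)
class Direct:
    """A target concept that maps one-to-one onto a source concept."""
    source: str


@dataclass(frozen=True)
class Node:
    """An intermediate mapping node that combines two operands."""
    op: str
    operands: Tuple[Union["Node", Direct], ...]


# Operations an intermediate node may perform (only ADD is sketched here).
OPS: Dict[str, Callable[[float, float], float]] = {"ADD": lambda a, b: a + b}

# Hypothetical mapping ontology: target concept -> how to obtain it.
MAPPING = {
    "PSA": Direct("PSA_value"),
    "Gleason3": Node("ADD", (Direct("Gleason1"), Direct("Gleason2"))),
}


def resolve(mapping, record):
    """Compute a target value from one source record."""
    if isinstance(mapping, Direct):
        return record[mapping.source]
    left, right = (resolve(m, record) for m in mapping.operands)
    return OPS[mapping.op](left, right)


# Made-up source record, for illustration only.
source_record = {"PSA_value": 4.2, "Gleason1": 3, "Gleason2": 4}
target_record = {t: resolve(m, source_record) for t, m in MAPPING.items()}
print(target_record)
```

Keeping the mapping as data rather than as hand-written extraction code is what would allow it to be inspected, edited, and reused independently of the source system.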

  • Methods: Approach Overview

    ▪ To allow complex data transformations, nodes can be cascaded into “expression trees”.
    ▪ OWL-to-SQL translation: For each mapping node, an SQL statement that performs the operation expressed by the node can be constructed automatically.

    [Figure: target ontology, mapping ontology and source ontology. An expression tree connects the target node to the source concepts A to E through the mapping nodes IF, GREATER, ADD and SUBTR; the labels t=1 to t=4 mark the processing order.]
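A cascaded expression tree and its processing order can be sketched as follows. The node names (IF, GREATER, ADD, SUBTR) and leaves A to E come from the slide, but the exact shape of the tree is not recoverable from the transcript, so `TREE` below is a hypothetical arrangement and the input values are made up.

```python
from typing import List, Tuple, Union

# A leaf is a concept name; an inner node is (operator, operand, ...).
Expr = Union[str, Tuple]

# Hypothetical tree: IF A > B THEN C + D ELSE D - E
TREE: Expr = ("IF", ("GREATER", "A", "B"), ("ADD", "C", "D"), ("SUBTR", "D", "E"))


def processing_order(expr: Expr, order: Union[List[str], None] = None) -> List[str]:
    """Post-order walk: every node is scheduled after all of its operands."""
    if order is None:
        order = []
    if isinstance(expr, tuple):
        for operand in expr[1:]:
            processing_order(operand, order)
        order.append(expr[0])
    return order


def evaluate(expr: Expr, values: dict):
    """Evaluate the tree bottom-up against a dictionary of leaf values."""
    if isinstance(expr, str):
        return values[expr]
    op, *operands = expr
    args = [evaluate(o, values) for o in operands]
    if op == "ADD":
        return args[0] + args[1]
    if op == "SUBTR":
        return args[0] - args[1]
    if op == "GREATER":
        return args[0] > args[1]
    if op == "IF":
        return args[1] if args[0] else args[2]
    raise ValueError(f"unknown operator: {op}")


values = {"A": 5, "B": 2, "C": 1, "D": 10, "E": 4}  # made-up inputs
steps = processing_order(TREE)
result = evaluate(TREE, values)
print(steps, result)
```

The post-order walk yields one step per mapping node, which corresponds to the t=1 to t=4 labels: a node can only be processed once the intermediate results of its operands exist.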

  • Methods: OWL to SQL Translation

    [Figure: target ontology, mapping ontology and source ontology, with the expression tree (leaves A to E; mapping nodes IF, GREATER, ADD and SUBTR; processing steps t=1 to t=4).]

    SQL statement constructed for a mapping node (here an addition):

    INSERT INTO TEMPTABLE(E, A, V)
    SELECT OP1.E, 'Sum…', OP1.V + OP2.V
    FROM (SELECT E, A, V FROM …) OP1
    FULL OUTER JOIN (SELECT E, A, V FROM …) OP2
      ON OP1.E = OP2.E
    WHERE OP1.V IS NOT NULL AND …

    Annotations on the statement: “How to process nodes” and “How to access data”.
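A minimal sketch of how such a statement might be assembled for an ADD node, run here against an in-memory SQLite database rather than Oracle. Several details are assumptions: the columns E, A, V are read as entity, attribute and value; the elided part of the WHERE clause is filled with a NOT NULL check on the second operand; a plain JOIN replaces the slide's FULL OUTER JOIN (older SQLite versions lack it, and with both NOT NULL filters the result is the same); the `SOURCE` table, the function names and the data are invented.

```python
import sqlite3


def operand_sql(attribute: str) -> str:
    """How to access data: a sub-select for one source concept (hypothetical table)."""
    return f"SELECT E, A, V FROM SOURCE WHERE A = '{attribute}'"


def add_node_sql(left_sql: str, right_sql: str, label: str) -> str:
    """How to process a node: build the INSERT for an ADD node from two operand queries."""
    safe_label = label.replace("'", "''")  # escape quotes in the literal
    return (
        "INSERT INTO TEMPTABLE (E, A, V) "
        f"SELECT OP1.E, '{safe_label}', OP1.V + OP2.V "
        f"FROM ({left_sql}) OP1 "
        f"JOIN ({right_sql}) OP2 ON OP1.E = OP2.E "
        "WHERE OP1.V IS NOT NULL AND OP2.V IS NOT NULL"
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE SOURCE (E INTEGER, A TEXT, V REAL);
    CREATE TABLE TEMPTABLE (E INTEGER, A TEXT, V REAL);
    INSERT INTO SOURCE VALUES
        (1, 'Gleason1', 3), (1, 'Gleason2', 4),
        (2, 'Gleason1', 4), (2, 'Gleason2', NULL);
    """
)

statement = add_node_sql(operand_sql("Gleason1"), operand_sql("Gleason2"), "Sum")
conn.execute(statement)
rows = conn.execute("SELECT E, A, V FROM TEMPTABLE ORDER BY E").fetchall()
print(rows)
```

Because each node writes its result into a temporary table in the same E, A, V shape, the output of one node can serve as the operand query of the next node in the processing order.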

  • Results: Developed Prototypical Tools

    [Figure: the approach overview diagram (target dataset, target ontology with Gleason3 and PSA, mapping ontology with an ADD node, source ontology with Gleason1, Gleason2 and PSA_value, source system), labelled with the tools OntoExport, OntoEdit and QuickMapp and with i2b2.]

  • Results: Ontology Mapping for DPKK

    Results in the Münster and Erlangen Mapping

    ▪ Described i2b2 as a target system in OWL (incl. DPKK dataset)
    ▪ Created source ontologies for both EMRs (Soarian and ORBIS)
    ▪ Expressed many types of Oracle database operations inside the ontology (as illustrated with the blue “cloud”)
    ▪ Preliminary mapping results for Erlangen and Münster (from a total of 166 concepts):

  • Discussion: Limitations and How to Tackle Them: TODOs

    ▪ Re-implementation of the current proof-of-concept
    ▪ Client/server architecture to improve administrative aspects
    ▪ Integration with other data integration tools (e.g. Talend OpenStudio) to simplify access to multiple databases
    ▪ Expression of complex and abstract relationships between data elements, e.g. between data elements at different hierarchy levels
    ▪ Allow certain calculations between data elements from multiple forms or different source systems (=> handling of temporal aspects)
    ▪ Strong desire to export EMR data to other systems besides i2b2: allow description of arbitrary target systems and database models

  • Discussion: Benefits

    ▪ Abstraction of data records and database operations to single ontology nodes
    ▪ Simplifies data identification, extraction and transformation from source systems (e.g. EMRs)
    ▪ Allows re-use of created “ETL knowledge” (compared to SQL code)
    ▪ OWL format enables easy linkage to medical ontologies (SNOMED, NCIt, …)
    ▪ Adds the missing “easy-to-use” ETL part to i2b2
    ▪ Comprehensive solution covers
      1. Access to heterogeneous data
      2. “Future-proof” (ontology-based) data processing
      3. Interactive translational querying of clinical data (i2b2)

    => Step forward in implementing “secondary use / single source”

  • Thank you for your attention!

    Sebastian Mate would like to thank the “Arbeitskreis Software-Qualität und -Fortbildung e. V.” (ASQF) and the commission of computer science of the University of Erlangen who have awarded this work with the 500€ ASQF diploma prize. Thank you!

  • Backup slides
