Clinical-Genomics HL7 SIG 1
Clinical-Genomics HL7 SIG: The Tissue Typing Use Case
Amnon Shabo (1), Shosh Israel (2), Guy Karlebach (1)
(1) IBM Research Lab in Haifa, (2) Hadassah University Hospital
Presented by Amnon Shabo
SHAMAN = Secured Health and Medical Access Network
IMR = Integrated Medical Records Middleware
In collaboration with the Hadassah University Hospital in Jerusalem
Haifa Labs: Integration of multiple sources of data; transformation to standards; full-text indexing
Watson/Yorktown Labs: Processing of personal genomic and proteomic data
Types of Genomic Data
• DNA sequences
• Personal SNPs (Single Nucleotide Polymorphisms)
• Programmatic / manual annotation (e.g., SNP combination x could possibly lead to mutation y)
• Gene expression levels
• Proteomic (proteins translated w/SNPs)
The Case for Clinical-Genomics
• Clinical-Genomics: the use of information obtained from DNA sequencing, patterns of gene expression & resulting proteins for healthcare purposes
• Personalized Medicine
– Detect sensitivities/allergies beforehand
– Drug selection by clinicians
• Pharmacogenomics
– Improve drug development based on clinical-genomics correlations
– Personal customization of drugs
• Preventive Care
Gene Expression in Cancer
• Differences between normal tissue vs. premalignant lesion vs. neoplastic tissue
– markers of diagnostic value
– targets for drug research
– evolution of cancer
• Differences between responders vs. non-responders for a standard therapy
• Development of drug resistance
• Correlation of gene expression patterns with presentation or evolution:
– long vs. short survivors
– metastatic vs. non-metastatic
– clinical or pathological grades
Differential Display
• Differences between the banding patterns of cDNA from tumor tissue and normal tissue on a polyacrylamide gel can point to a protein that could potentially be the target of a therapeutic antibody.
• DNA microarrays are also employed to examine the genetic expression of thousands of potential antigens and determine which are present in abnormal (tumor) tissue but not in normal tissue.
Using Databases
• Vast databases of genetic information contribute to genomic research
• Search for potential antigens can be as easy as an online search
• HLA Database example (part of IMGT, the international ImMunoGeneTics project):
http://www.ebi.ac.uk/imgt/hla/
Clinical-Genomics Interrelations
Bi-directional relationships:
• Genomics → Clinical
– Personal SNPs could be interpreted as mutations and thus indicate possible diseases/sensitivities
• Clinical → Genomics
– Patient & family history leads to genetic testing order
– Crosschecking of genomics results
SNPs Interpretation
• SNPs as known mutations (might imply the development of diseases)
• Unknown SNPs:
– in significant segments of the gene (possibly imply individual differences)
– in gene segments that translate to inactive parts of the proteins (thought to be insignificant)
• SNPs as normal polymorphisms
CG Uses: From Clinical to Forensic
These pictures show paternity casework autorads (autoradiographs): the left picture shows a case of paternity exclusion and the right one a case of paternity inclusion.
Taken from the site of Genelex, a company which offers, among other genomic services, paternity testing (see http://www.genelex.com/).
Variety of Methods
STR (short tandem repeats)
STRs are short sequences that are easy to detect, and their specific pattern of repetitions could identify a gene without needing to sequence the entire gene.
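The idea of reading a repetition pattern without sequencing a whole gene can be sketched in a few lines. This is only an illustration: the motif `GATA`, the fragment, and the function name `longest_tandem_repeat` are hypothetical and do not come from the slides or from any typing kit.

```python
import re

def longest_tandem_repeat(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    motif = motif.upper()
    # A run is one or more consecutive copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)

# Hypothetical fragment: flanking bases around six copies of GATA.
fragment = "CCTA" + "GATA" * 6 + "TTGC"
repeat_count = longest_tandem_repeat(fragment, "GATA")
```

Two samples that differ in `repeat_count` at the same locus would show different patterns, which is the property the slide relies on.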
HL7 Specs for Clinical-Genomics
• Create a DIM for Clinical-Genomics
• Derive R-MIMs and message types
• Clinical-Genomic Documents (CDA L3!)
• Review / utilize the following emerging bio-informatics standards
– BSML (Bioinformatic Sequence Markup Language)
– MAGE-ML (Microarray and Gene Expression Markup Language)
Problem: These standards are not necessarily patient-based.
BSML: Sequencing Markup
<Sequence id="_2" db-source="GMS" length="51" representation="raw" molecule="dna" topology="linear" alignment-sequence="_">
  <Feature-tables>
    <Feature-table>
      <Feature title="gms:sequence">
        <Interval-loc startpos="1" endpos="51" />
      </Feature>
      <Feature title="gms:new_fragment">
        <Interval-loc startpos="1" endpos="51" />
      </Feature>
      <Feature title="gms:annotation" value="possible somatic mutation cell line #4 end-11thxml" />
      <Feature title="/gms:new_fragment" />
      <Feature title="/gms:sequence" />
    </Feature-table>
  </Feature-tables>
  <Seq-data>AGGAATCAGAAAGGACACTCTGGACTTCAGCCAACAGGATACCTGAGCTGA</Seq-data>
</Sequence>
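A consumer of such a fragment might read it roughly as follows. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the fragment is an abridged copy of the slide's example, and the helper name `read_sequence` and the returned dictionary layout are assumptions made for illustration, not part of BSML.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the fragment shown on the slide.
BSML_FRAGMENT = """
<Sequence id="_2" db-source="GMS" length="51" representation="raw"
          molecule="dna" topology="linear">
  <Feature-tables>
    <Feature-table>
      <Feature title="gms:sequence">
        <Interval-loc startpos="1" endpos="51" />
      </Feature>
      <Feature title="gms:annotation"
               value="possible somatic mutation cell line #4 end-11thxml" />
    </Feature-table>
  </Feature-tables>
  <Seq-data>AGGAATCAGAAAGGACACTCTGGACTTCAGCCAACAGGATACCTGAGCTGA</Seq-data>
</Sequence>
"""

def read_sequence(xml_text):
    """Extract the bases and features, and check the declared length."""
    root = ET.fromstring(xml_text)
    # Strip any whitespace or line breaks inside the sequence data.
    bases = "".join(root.findtext("Seq-data", default="").split())
    declared = int(root.get("length"))
    features = [(f.get("title"), f.get("value")) for f in root.iter("Feature")]
    return {
        "bases": bases,
        "declared_length": declared,
        "length_ok": len(bases) == declared,
        "features": features,
    }

record = read_sequence(BSML_FRAGMENT)
```

Comparing the `length` attribute with the actual number of bases is a cheap sanity check before any annotation is attached to a patient record.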
MAGE-ML: Gene Expression
• Gene Description:
<reporter id="1051_g_at">
  <rep_des V="Source: Human melanoma antigen recognized by T-cells (MART-1) mRNA." />
</reporter>
• Gene Expression Levels:
<reporter id="32847_at" accession="U48959">
  <NormalizedIntensity value="0.235" />
  <Control value="230.972" />
  <Raw value="54.3" />
  <T-testPValue value="no replicates" />
  <PresentAbsentCall value="A" />
</reporter>
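The expression-level element above could be read into a typed record along these lines. This is a sketch under assumptions: the function name `read_reporter` and the field names are illustrative, and the observation that raw divided by control roughly reproduces the normalized intensity holds for these particular numbers only; the slides do not state how the normalization was computed.

```python
import xml.etree.ElementTree as ET

REPORTER = """
<reporter id="32847_at" accession="U48959">
  <NormalizedIntensity value="0.235" />
  <Control value="230.972" />
  <Raw value="54.3" />
  <T-testPValue value="no replicates" />
  <PresentAbsentCall value="A" />
</reporter>
"""

def read_reporter(xml_text):
    """Turn one reporter element into a dictionary of typed values."""
    root = ET.fromstring(xml_text)
    values = {child.tag: child.get("value") for child in root}
    return {
        "id": root.get("id"),
        "accession": root.get("accession"),
        "normalized": float(values["NormalizedIntensity"]),
        "control": float(values["Control"]),
        "raw": float(values["Raw"]),
        # Free text in this example ("no replicates"), so kept as a string.
        "p_value": values["T-testPValue"],
        "call": values["PresentAbsentCall"],
    }

reporter = read_reporter(REPORTER)
# Observation for this example only: raw / control is close to the normalized value.
ratio = reporter["raw"] / reporter["control"]
```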
Analogy to Imaging Integration
HL7-DICOM relationship:
                      IMAGING               GENOMICS
Existing standards    DICOM                 BSML; MAGE; I3C efforts
Mass data             Pixels                Sequences; Gene-Expression; Proteins
Summarized data       Radiologist-Report    Genomicist-Report
Current Experiments at IBM Research
• A clinical point of view
– Bone-marrow transplantation center in Israel
  • Donor-recipient matching: tissue typing
  • Reporting to international BMT registry
• A research point of view
– Research center in Canada
  • Focusing on heart & lung diseases
  • Trying to find clinical-genomic interrelations
  • Using clinical data from patient records compared with healthy people
  • Using genomic data, mainly gene expression levels and proteins
Collaboration with Hadassah
• Information exchange
– Report to international registries (IBMTR)
• Standardization
– Transform to HL7-CDA documents (L.13)
• Indexing
– Index all data including semi-structured data
• Annotation
– Integrating the personal genomic data
• Visualization
– Visualizing the integrated BMT documents
…agctgaa…SNPs
The BMT Procedure
Pre-BMT
– Matching a donor or autologous transplant
– Conditioning
  • Irradiation
  • Chemotherapy
  • GVHD (Graft vs. Host Disease) prophylaxis
BMT
– Substance donated
  • Bone marrow
  • Peripheral blood stem cells
  • Cord blood stem cells
  • Donor lymphocytes
– Transplant
Post-BMT
– Control of GVHD and other complications
– Hematopoietic reconstitution
– Engraftment and chimerism
New Trends in BMT
Mini-allografts (mini-transplantations)
• Immunosuppression instead of total conditioning (destroying the entire immune system)
• Infusing donor lymphocytes to attack tumors, cancerous cells, autoimmune artifacts and infectious pathogens
• Stopping the donor lymphocytes, using ‘suicide genes’, once they are done with the source of the patient's disease, so that they won’t attack the patient's normal cells
• Striking a balance between the 2 immune systems
The HLA-Typing Use Case
• HLA = Human Leukocyte Antigens; they determine the personal fingerprint distinguishing between self and non-self
• HLA-typing methods have moved from serology (antibodies) to molecular (DNA) methods and, recently, to DNA sequencing, yielding higher levels of typing resolution
• Common Triggers: donor-recipient matching, familial relationships, disease association
Donor Matching
• HLA (Human Leukocyte Antigens)
– HLA typing
– DNA typing
– About 6 important loci, each can have dozens of different antigens (alleles)
– Haplotype – common set of antigens
• Relatives versus unrelated donation
• Donor banks
• Search engines
• Lack of donors for minorities
HLA Alleles in the Family
Differences in Antigens
Allelic polymorphism is concentrated in the peptide (antigen) binding site:
Class I: variable exons 2, 3, 4
Class II: variable exon 2
The HLA-Typing Triggers
• Donor-Recipient Matching
– Bone-marrow transplant
  • Full match (identical twin)
  • Avoid GVHD and promote GVM
  • Precise and personal match rather than full match
– Organ transplant (cross-match: antibodies)
  • Living donor: also HLA typing before transplant
  • Select the best treatment for the individual patient-donor matching
  • HLA-typing is done for post-transplant info.
• Forensic Scenarios
– Paternity disputes
– Crime suspects
(HLA is one component of known genetic markers)
Personal Rather than Full Match
Personal match could be beneficial to new trends in BMT:
• HLA-A & B versus C:
– When there is a match in HLA-A & B:
– Mismatch in HLA-C might promote GVL (Graft vs. Leukemia)
• Mini-transplants:
– Avoid full match (even when an identical twin is available)
Data of Interest
• Class I allele sequences (all cells):
– HLA-A
– HLA-B
– HLA-C
• Class II allele sequences (certain cells from the immune system):
– HLA-DR (most important)
– HLA-DQ (the contribution is not proven, but it can verify the DR match since there is strong linkage)
– HLA-DP (usually not typed)
• Might sequence only the polymorphic segments (e.g., exon 2 in class II and exons 2-4 in class I; each exon is about 300 nucleotides), because SNPs in other segments are not important to the matching
New Naming Convention
• Letter designates the membrane locus
• Full allele name: eight digits
– First 2 digits defining the allele family and where possible corresponding to the serological family
– Third and fourth digits describing coding variation
– Fifth and sixth digits describing synonymous variation
– Seventh and eighth digits describing variation in introns
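The digit-pair layout above can be made concrete with a small parser. This is a sketch only: the `AlleleName` type, the field names, and the regular expression are assumptions for illustration, and names with odd digit counts or letter suffixes (such as the `N` that appears in some kit outputs) are deliberately out of scope.

```python
import re
from typing import NamedTuple, Optional

class AlleleName(NamedTuple):
    locus: str                  # e.g. "A", "B", "DRB1"
    family: str                 # digits 1-2: allele family
    coding: Optional[str]       # digits 3-4: coding variation
    synonymous: Optional[str]   # digits 5-6: synonymous variation
    intron: Optional[str]       # digits 7-8: variation in introns

# Optional "HLA-" prefix, a locus, an asterisk, then one to four digit pairs.
_NAME = re.compile(r"^(?:HLA-)?([A-Z]+[0-9]*)\*((?:[0-9]{2}){1,4})$")

def parse_allele(name: str) -> AlleleName:
    """Split an allele name into its locus and digit-pair fields."""
    match = _NAME.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized allele name: {name!r}")
    locus, digits = match.groups()
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    pairs += [None] * (4 - len(pairs))  # fields not reported at this resolution
    return AlleleName(locus, *pairs)

allele = parse_allele("DRB1*110101")
```

For `DRB1*110101` this gives family `11`, coding variation `01`, synonymous variation `01`, and no intron field.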
Sequencing Data Example: Generic Meta Data
– Local Names: DRB1*110101
– IMGT/HLA No: HLA00756
– Class: II
– Assigned: 01-AUG-1989
– Last Aligned: 17-OCT-2002
– Component Entries: AF029281, AJ297587
– Cell Sequence Derived From: 34A2, FPAF
– Known Ethnic Origin of Cells: Caucasoid
– Length: 801 bps
Sequencing Data Example: DRB1*110101
[Figure: entry from IMGT-HLA SEQUENCE DATABASE.htm with the SNPs marked]
Sequencing Data Example: SNP-Resulted Protein Sequence
[Figure: protein sequence from IMGT-HLA SEQUENCE DATABASE.htm]
Sequencing Data Example: DRB1*110401
[Figure: entry from IMGT-HLA SEQUENCE DATABASE2.htm with the SNP marked]
Sequencing Data Example: SNP-Resulted Protein Sequence
[Figure: protein sequence from IMGT-HLA SEQUENCE DATABASE2.htm]
Testing Kit Output Example
- Sample ID          - Kit Name
- Name               - Kit Lot Number
- Ethnic Group       - Kit Expires
- Donor/Patient      - DNA Extraction
- Purpose of Test    - DNA Quality
- Test Date          - DNA Concentration
- Test By            - Review Date
- Comments           - Reviewed By
Serology Results: HLA A:  B:  C:  DR:  DQ:
Positive Lanes:
Kit-specific data
Tissue Typing Report
- Recipient
- Subject
- Specific Alleles
- Record Number
- Molecular Sample
- Date
- Disease
- Patient Result
- Specific Alleles
- Possible combinations
- Siblings
- Unrelated Donors
Search for Unrelated Donor
• Banks of potential donors (volunteers)
• Each donor was tested only for HLA Class I
• When a patient needs a donor:
– The transplant facility searches the donor banks to find a donor (direct access to the donor banks' databases)
– The search is based on Class I matching
– If appropriate donors are found, the searching transplant facility initiates a request to the respective donor banks, asking for Class II typing
– Each approached donor bank forwards the request to the tissue typing lab where the DNA samples reside
– Class II matching results are returned to the searching facility, and the donor with the best match in both Class I & II is approached
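The two-stage search described above can be sketched as follows. Everything here is hypothetical: the `Donor` type, the scoring by simple count of shared alleles, the `shortlist_size` parameter, and the sample data are illustrative assumptions, not the matching rules used by any donor bank or transplant center.

```python
from dataclasses import dataclass, field

@dataclass
class Donor:
    donor_id: str
    class_i: dict                                  # locus -> set of alleles
    class_ii: dict = field(default_factory=dict)   # empty until the lab types it

def count_matches(patient: dict, donor: dict) -> int:
    """Count alleles shared between patient and donor, locus by locus."""
    return sum(len(patient[locus] & donor.get(locus, set())) for locus in patient)

def search(patient_i, patient_ii, bank, request_class_ii_typing, shortlist_size=3):
    # Stage 1: donor banks hold Class I typing only, so rank on Class I.
    ranked = sorted(bank, key=lambda d: count_matches(patient_i, d.class_i), reverse=True)
    shortlist = ranked[:shortlist_size]
    # Stage 2: ask the tissue typing lab for Class II typing of shortlisted donors.
    for donor in shortlist:
        donor.class_ii = request_class_ii_typing(donor.donor_id)
    # Choose the donor with the best combined Class I and Class II match.
    return max(
        shortlist,
        key=lambda d: count_matches(patient_i, d.class_i) + count_matches(patient_ii, d.class_ii),
        default=None,
    )

# Hypothetical patient, donor bank and lab results.
patient_i = {"A": {"A*02", "A*24"}, "B": {"B*07", "B*35"}}
patient_ii = {"DRB1": {"DRB1*0402", "DRB1*0408"}}
bank = [
    Donor("D1", {"A": {"A*02", "A*24"}, "B": {"B*07", "B*35"}}),
    Donor("D2", {"A": {"A*02", "A*24"}, "B": {"B*07", "B*35"}}),
    Donor("D3", {"A": {"A*01", "A*03"}, "B": {"B*08", "B*44"}}),
]
lab = {
    "D1": {"DRB1": {"DRB1*0402", "DRB1*0414"}},
    "D2": {"DRB1": {"DRB1*0402", "DRB1*0408"}},
    "D3": {"DRB1": {"DRB1*0101", "DRB1*0301"}},
}
best = search(patient_i, patient_ii, bank, lambda donor_id: lab[donor_id], shortlist_size=2)
```

Deferring Class II typing to the shortlist mirrors the slide's point that the more detailed typing is requested only for donors who already match on Class I.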
Search for Unrelated Donor (flow)
• Transplant Center (TC) searches the donor banks for donors, using the patient's Class I HLA
• Donor banks return the Class I matching donors
• TC chooses potential donors and sends a request for HLA Class II typing to the donor bank
• Tissue typing lab performs the Class II typing
• Class II matching donors are returned
• TC chooses the best donor
Genomic Data in Clinical Docs
• A DNA Testing Device – raw DNA sequences
• Reports from service units, e.g., tissue typing, should answer questions such as patient-donor matching, fatherhood, etc.
• Embedding annotated results received from a DNA lab in a CDA document
• Linking genomic annotations and clinical data (external links?)
Matching Option Notations
• Different notations for coarse-grain results:
– Possibilities from the A24 antigen family could be represented differently by different kits testing the same patient DNA:
  • A*2402101-06/08-11N/13-15/17/18/20-23/25-36N
  • A*2402101-06/08-11N/13-15/17/18/20-23/25-31
– Pair combinations (inherited alleles):
  • DRB1*0402 AND DRB1*0408
    or
    DRB1*0404/44 AND DRB1*0414
Kit A: exact combination
Kit B: two possible combinations
Report Example – Unrelated Donors
The Patient
Unrelated Donor 1
Unrelated Donor 2
Unrelated Donor 3
Class I vs. Class II Antigens
• A 4-digit resolution level is common in class II antigens, as they were discovered more recently
• It is desirable that class I antigens be reported in 4 digits as well, as they are more crucial to BMT success
• 4-digit reporting requires molecular and sequencing procedures
• 4-digit reporting is still not common in class I
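Why the resolution level matters for matching can be shown with a tiny comparison. The helper names `truncate` and `same_at_resolution` are hypothetical, and the two DRB1 alleles are used only as convenient examples of names that share their first two digits.

```python
def truncate(allele: str, digits: int) -> str:
    """Keep only the first `digits` digits after the asterisk."""
    locus, _, number = allele.partition("*")
    return f"{locus}*{number[:digits]}"

def same_at_resolution(a: str, b: str, digits: int) -> bool:
    """Report whether two allele names agree at the given digit resolution."""
    return truncate(a, digits) == truncate(b, digits)

# Indistinguishable at 2 digits, different at 4 digits.
low = same_at_resolution("DRB1*0402", "DRB1*0408", 2)
high = same_at_resolution("DRB1*0402", "DRB1*0408", 4)
```

A donor and patient reported only at 2 digits would look matched here, while 4-digit reporting reveals the difference.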
Clinical-Genomic Data in CDA?
• What should go into a clinical document (extent of detail)?
• Programmatic and manual annotation at different levels?
• The users of such integrated documents: clinicians? genomicists? patients? Medico-ethical issues!
• HL7-Association semantics that represent the interrelations of clinical-genomics
First Attempts using CDA…
• GMS
– Genetic Messaging System
– From the computational biology center in IBM Watson
– Example: integrating the genomic annotation and analysis of the personal DNA sequences into the clinical document (CDA format)
<levelone>
  <clinical_document_header>
    <!--header structures per CDA-->
  </clinical_document_header>
  <body>
    <!--clinical content per CDA-->
    <!--GMS merges genomic data here-->
    <gms:dna sequence="2" base="802" locus="1">
      <gms:annotation>possible somatic mutation cell line #4 end-11th</gms:annotation>
      AGGAATCAGAAAGGACACTCTGGACTTCAGCCAACAGGATACCTGAGCTGA...
      <gms:automated_annotation />
    </gms:dna>
  </body>
</levelone>
CDA L1
And the Work Just Begins…
• Use Cases in Detail & Taxonomy
• High-Level CG Model and HL7-DIM
• Messages
• Documents
• Prototyping info. Exchange using specs