HL7 Clinical Genomics and Structured Documents
Work Groups
CDA Implementation Guide: Genetic Testing Report
DRAFT PROPOSAL
Amnon Shabo (Shvo), [email protected]
HL7 Clinical Genomics WG: Co-chair and Modeling Facilitator
HL7 Structured Documents WG: CDA R2 Co-editor, CCD Implementation Guide Co-editor
Haifa Research Lab
The HL7 Clinical Genomics SIG
Mission: to enable the standard use of patient-related genetic data such as DNA sequence variations and gene expression levels, for healthcare purposes (‘personalized medicine’) as well as for clinical trials & research
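HL7 Clinical Genomics - A bridge standard…
[Diagram: HL7 Clinical Genomics bridges genomic data standards and clinical data standards.]
Genomic Data: MAGE, BSML, PSI, GenBank, HUGO, SwissProt
Clinical Data: HL7, DICOM, X12, SNOMED, ICD, LOINC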
How to Handle Raw and Mass Data
Could we learn from the imaging integration effort?
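IMAGING
- Existing standards: DICOM
- Mass and noisy data: Pixels
- Summary, interpretation, narrative, etc.: Radiologist Report
GENOMICS
- Existing standards: BSML; MAGE-ML; ...
- Mass and noisy data: Bio-sequences; Pathways; Gene Expression
- Summary, interpretation, narrative, etc.: Geneticist Report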
HL7 Clinical Genomics v3 Static Models
[Diagram. Models: Family History; Genetic Loci; Genetic Locus; Constrained GeneticVariation (Implementation Topic); Constrained Gene Expression (Implementation Topic); Phenotype (utilizing the HL7 Clinical Statement). Connected domains: RCRIM, LAB, other domains, CDA IG. Arrow labels: Utilize, Reference. Legend: Normative, DSTU, Comments.]
0..* associatedObservation, typeCode*: <= COMP (component)
0..* associatedProperty, typeCode*: <= DRIV (derivedFrom2)
0..* polypeptide, typeCode*: <= DRIV (derivedFrom5)
SEQUENCES & PROTEOMICS
0..* expression, typeCode*: <= COMP (component1)
0..* sequenceVariation, typeCode*: <= COMP (component3)
IndividualAllele
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  negationInd: BL [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  value: CD [0..1] (allele code, drawn from HUGO-HGVS or OMIM)
  methodCode: SET<CE> CWE [0..*]
GeneticLocus
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  code: CE CWE [0..1] (e.g., ALLELIC, NON_ALLELIC)
  text: ED [0..1]
  effectiveTime: IVL<TS> [0..1]
  confidentialityCode: SET<CE> CWE [0..*] <= Confidentiality
  uncertaintyCode: CE CNE [0..1] <= Uncertainty
  value: CD [0..1] (identifying a gene through GenBank GeneID with an optional translation to HUGO name)
  methodCode: SET<CE> CWE [0..*]
0..* individualAllele, typeCode*: <= COMP (component1)
Sequence
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  code: CD CWE [1..1] (the sequence standard code, e.g., BSML)
  text: ED [0..1] (sequence's annotations)
  effectiveTime: GTS [0..1]
  uncertaintyCode: CE CNE [0..1] <= Uncertainty
  value: ED [1..1] (the actual sequence)
  interpretationCode: SET<CE> CWE [0..*] <= ObservationInterpretation
  methodCode: SET<CE> CWE [0..*] (the sequencing method)
Expression
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  code: CE CWE [1..1] (the standard's code, e.g., MAGE-ML identifier)
  negationInd: BL [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  uncertaintyCode: CE CNE [0..1] <= Uncertainty
  value: ED [1..1] (the actual gene or protein expression levels)
  interpretationCode: SET<CE> CWE [0..*] <= ObservationInterpretation
  methodCode: SET<CE> CWE [0..*]
Polypeptide
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  value: CD [0..1] (protein code, drawn from SwissProt, PDB, PIR, HUPO, etc.)
  methodCode: SET<CE> CWE [0..*]
DeterminantPeptides
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  value: CD [0..1] (peptide code, drawn from reference databases like those used in the Polypeptide class)
  methodCode: SET<CE> CWE [0..*]
Constraint: GeneExpression.value
Constrained to a restricted MAGE-ML schema, specified separately.
Note: A related allele that is on a different locus, and has interrelation with the source allele, e.g., translocated duplicates of the gene.
0..* clinicalPhenotype, typeCode*: <= PERT (pertinentInformation)
ExternalObservedClinicalPhenotype
  classCode*: <= OBS
  moodCode*: <= EVN
  id*: II [1..1] (the unique id of an external observation residing outside of the instance)
  code: CD CWE [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
Note: An external observation is preferably a valid observation instance existing in any other HL7-compliant instance, e.g., a document or a message. Use the id attribute of this class to point to the unique instance identifier of that observation.
Note: A phenotype which has been actually observed in the patient, represented internally in this model.
Note: This is a computed outcome, i.e., the lab does not test for the actual protein, but secondary processes populate this class with the translational protein.
SequenceVariation
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  code: CD CWE [0..1]
  negationInd: BL [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  value: ANY [0..1] (the variation itself, expressed with recognized notation like 269T>C or markup like BSML, or drawn from an external reference like OMIM or dbSNP)
  interpretationCode: SET<CE> CWE [0..*] <= ObservationInterpretation
  methodCode: SET<CE> CWE [0..*]
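The SequenceVariation value attribute may carry a variation in a recognized notation such as 269T>C. The following is a minimal sketch of reading that one form, assuming a plain position/reference/alternate substitution pattern. The names `Substitution` and `parse_substitution` are hypothetical and not part of the HL7 model, and real variation nomenclature covers many more cases than this pattern does.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches only simple substitutions such as "269T>C":
# a position, a reference base, ">", and an alternate base.
_SUBSTITUTION = re.compile(r"^(\d+)([ACGTU])>([ACGTU])$")


@dataclass(frozen=True)
class Substitution:
    """Hypothetical parsed form of a simple substitution."""
    position: int
    reference: str
    alternate: str


def parse_substitution(text: str) -> Optional[Substitution]:
    """Return a Substitution for strings like '269T>C', or None if the form is not recognized."""
    match = _SUBSTITUTION.match(text.strip().upper())
    if match is None:
        return None
    position, reference, alternate = match.groups()
    return Substitution(int(position), reference, alternate)
```

Anything the pattern does not recognize returns None, so a caller could keep such values as opaque text instead of rejecting them.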
KnownClinicalPhenotype
  classCode*: <= OBS
  moodCode*: <= DEF
  code: CD CWE [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  uncertaintyCode: CE CNE [0..1] <= ActUncertainty
  value: ANY [0..1]
Note: These phenotypes are not the actual (observed) phenotypes for the patient; rather, they are the scientifically known phenotypes of the source genomic observation (e.g., known risks of a mutation or known responsiveness to a medication).
Note: Code: COPY_NUMBER, ZYGOSITY, DOMINANCY, GENE_FAMILY, etc. For example, if code = COPY_NUMBER, then the value is of type INT and holds the number of copies of this gene or allele.
0..* clinicalPhenotype, typeCode*: <= PERT (pertinentInformation)
EXPRESSION DATA
SEQUENCE VARIATIONS
Polypeptide
Note: The Expression class refers to both gene and protein expression levels. It is an encapsulating class that allows the encapsulation of raw expression data in its value attribute.
0..* sequence, typeCode*: <= COMP (component2)
0..* clinicalPhenotype, typeCode*: <= PERT (pertinentInformation)
0..* clinicalPhenotype, typeCode*: <= PERT (pertinentInformation)
Note: The code attribute indicates in what molecule the variation occurs, i.e., DNA, RNA or Protein.
0..* expression, typeCode*: <= COMP (component5)
Note: Use the associations to the shadow classes when the data set type (e.g., expression) is not at deeper levels (e.g., allelic level) and needs to be associated directly with the locus (e.g., the expression level is the translational result of both alleles).
0..* associatedObservation, typeCode*: <= COMP (component2)
0..1 associatedObservation, typeCode*: <= COMP (component4)
Note: This recursive association enables the association of an RNA sequence derived from a DNA sequence and a polypeptide sequence derived from the RNA sequence.
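The recursive association on Sequence can be pictured as each sequence optionally pointing at the sequence it was derived from. The sketch below is only an in-memory illustration under that assumption. `SequenceObs`, its fields, and `derivation_chain` are hypothetical names, and the short sequences are made-up sample data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SequenceObs:
    """Hypothetical stand-in for the Sequence class; molecule is 'DNA', 'RNA' or 'Protein'."""
    molecule: str
    value: str
    derived_from: Optional["SequenceObs"] = None  # the recursive association


def derivation_chain(seq: SequenceObs) -> List[str]:
    """Walk the recursive association back to the originating sequence."""
    chain: List[str] = []
    current: Optional[SequenceObs] = seq
    while current is not None:
        chain.append(current.molecule)
        current = current.derived_from
    return chain


# Made-up sample data: a DNA fragment, its RNA transcript, and the translated peptide.
dna = SequenceObs("DNA", "ATGGCC")
rna = SequenceObs("RNA", "AUGGCC", derived_from=dna)
polypeptide = SequenceObs("Protein", "MA", derived_from=rna)
```

Calling `derivation_chain(polypeptide)` walks from the polypeptide through the RNA back to the DNA sequence.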
0..* clinicalPhenotype, typeCode*: <= PERT (pertinentInformation)
Note: This class is a placeholder for a specific locus on the genome, that is, a position of a particular given sequence in the subject’s genome or linkage map. Note that the semantics of the locus (e.g., gene, marker, variation, etc.) is defined by data assigned in the code & value attributes of this class, and also by placing additional data relating to this locus into the classes associated with this class, like Sequence, Expression, etc.
Note: The term 'Individual Allele' doesn't necessarily refer to a known variant of the gene/locus; rather, it refers to the individual patient data regarding the gene/locus and might well contain personal variations with unknown significance.
AssociatedObservation
  classCode*: <= OBS
  moodCode*: <= EVN
  id: SET<II> [0..*]
  code: CD CWE [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  value: ANY [0..1]
  methodCode: SET<CE> CWE [0..*]
Note: The code attribute could hold codes like NORMALIZED_INTENSITY, P_VALUE, etc. The value attribute is populated based on the selected code, and its data type is then set up accordingly during instance creation.
Note: The code attribute could hold codes like TYPE, POSITION.GENOME, LENGTH, REFERENCE, REGION, etc. The value attribute is populated based on the selected code, and its data type is then set up accordingly during instance creation. Here are a few examples:
If code = TYPE, then value is of type CV and holds one of the following: SNP (tagSNP), INSERTION, DELETION, TRANSLOCATION, etc.
If code = POSITION, then value is of type INT and holds the actual numeric value representing the variation position along the gene.
If code = LENGTH, then value is of type INT and holds the actual numeric value representing the variation length.
If code = POSITION.GENE, then value is of type CV and is one of the following codes: INTRON, EXON, UTR, PROMOTER, etc.
If code = POSITION.GENOME, then value is of type CV and is one of the following codes: NORMAL_LOCUS, ECTOPIC, TRANSLOCATION, etc.
If code = REFERENCE, then value is of type CD and holds the reference gene identifier drawn from a reference database like GenBank.
The full description of the allowed vocabularies for codes and their respective values can be found in the specification.
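The code-to-value-type rule in the note above lends itself to a lookup table. The following is a minimal sketch of such a check, limited to the codes listed in the note. The function name `check_property` and the mapping of HL7 data types onto Python types (INT to `int`, coded types to a plain `str` code) are assumptions made for illustration; the actual vocabularies are defined in the specification.

```python
from typing import Any, Dict

# Expected HL7 data type name for each property code, as listed in the note.
EXPECTED_VALUE_TYPE: Dict[str, str] = {
    "TYPE": "CV",
    "POSITION": "INT",
    "LENGTH": "INT",
    "POSITION.GENE": "CV",
    "POSITION.GENOME": "CV",
    "REFERENCE": "CD",
}

# Assumption for this sketch: INT is carried as a Python int, coded types as a str code.
PYTHON_TYPE = {"INT": int, "CV": str, "CD": str}


def check_property(code: str, value: Any) -> str:
    """Return the HL7 type expected for `code`, raising if `value` does not fit it."""
    try:
        hl7_type = EXPECTED_VALUE_TYPE[code]
    except KeyError:
        raise ValueError(f"unknown property code: {code}") from None
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, PYTHON_TYPE[hl7_type]):
        raise TypeError(f"{code} expects a value of type {hl7_type}")
    return hl7_type
```

For example, `check_property("LENGTH", 3)` returns `"INT"`, while `check_property("POSITION", "12")` raises a `TypeError` because a string was supplied where INT is expected.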
AssociatedObservation
Note: Code: CLASSIFICATION, etc. For example, if code = CLASSIFICATION, then the value is of type CV and holds either KNOWN or NOVEL.
0..* geneticLocus, typeCode*: <= REFR (reference)
Note: A related gene that is on a different locus, and still has significant interrelation with the source gene (similar to the recursive association of an IndividualAllele).
ClinicalPhenotype
  classCode*: <= ORGANIZER
  moodCode*: <= EVN
0..* observedClinicalPhenotype, typeCode*: <= COMP (component1)
0..* knownClinicalPhenotype, typeCode*: <= COMP (component2)
0..* externalObservedClinicalPhenotype, typeCode*: <= COMP (component3)
Constraint: ClinicalPhenotype
At least one of the target acts of the three component act relationships should be populated, since this is just a wrapper class.
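The ClinicalPhenotype constraint (at least one of the three component relationships populated) is simple to check mechanically. The following is a minimal sketch, assuming a hypothetical in-memory representation in which each relationship is a list of target identifiers. The field and method names are illustrative, not taken from the model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClinicalPhenotype:
    """Hypothetical in-memory form of the wrapper class; each list holds component targets."""
    observed: List[str] = field(default_factory=list)
    known: List[str] = field(default_factory=list)
    external_observed: List[str] = field(default_factory=list)

    def satisfies_constraint(self) -> bool:
        """True when at least one of the three component relationships is populated."""
        return bool(self.observed or self.known or self.external_observed)
```

An empty wrapper fails the check, and a wrapper with any single component, such as one known phenotype, passes it.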
Note:
- code should indicate the type of source, e.g., OMIM
- text could contain pieces from research papers
- value could contain a phenotype code if known (e.g., if it’s a disease, then the disease code)
0..1 identifiedEntity, typeCode*: <= SBJ; contextControlCode: CS CNE [0..1] <= ContextControl "OP" (subject)
0..* individualAllele, typeCode*: <= REFR (reference)
ObservedClinicalPhenotype
Note: This CMET might be replaced with the Clinical Statement Shared Model for richer expressivity, when that model is approved (currently in ballot).
Constraint: Sequence.value
Constrained to a restricted BSML content model, specified in a separate schema.
0..* sequence, typeCode*: <= COMP (component4)
0..* sequenceVariation, typeCode*: <= COMP (component3)
AssociatedProperty
  classCode*: <= OBS
  moodCode*: <= EVN
  code: CD CWE [0..1]
  text: ED [0..1]
  value: ANY [0..1]
0..* associatedProperty, typeCode*: <= DRIV (derivedFrom1)
0..* associatedObservation, typeCode*: <= COMP (component)
0..* associatedProperty, typeCode*: <= DRIV (derivedFrom)
0..* associatedProperty, typeCode*: <= DRIV (derivedFrom1)
0..* associatedObservation, typeCode*: <= COMP (component)
0..* sequenceVariation, typeCode*: <= DRIV (derivedFrom3)
0..* sequence, typeCode*: <= DRIV (derivedFrom2)
0..* determinantPeptides, typeCode*: <= DRIV (derivedFrom4)
0..* determinantPeptides, typeCode*: <= DRIV (derivedFrom)
0..* clinicalPhenotype, typeCode*: <= PERT (pertinentInformation)
0..* clinicalPhenotype, typeCode*: <= PERT (pertinentInformation)
0..* associatedProperty, typeCode*: <= DRIV (derivedFrom)
GeneticLoci
  classCode*: <= OBS
  moodCode*: <= EVN
  id: SET<II> [0..*]
  code: CD CWE [0..1]
  effectiveTime: GTS [0..1]
  value: ANY [0..1]
0..* geneticLoci, typeCode*: <= COMP (componentOf)
0..* clinicalPhenotype, typeCode*: <= PERT (pertinentInformation)
0..* geneticLoci, typeCode*: <= COMP (componentOf)
0..* geneticLoci, typeCode*: <= COMP (componentOf)
0..* polypeptide, typeCode*: <= DRIV (derivedFrom1)
0..* polypeptide, typeCode*: <= DRIV (derivedFrom2)
Note: Use this class to indicate a set of genetic loci to which this locus belongs. The loci set could be a haplotype, a genetic profile and so forth. Use the id attribute to point to the GeneticLoci instance if available. The other attributes serve as a minimal data set about the loci group.
PHENOTYPES
Note: Any observation that is related to the variation but is not an inherent part of the variation observation (the latter should be represented in the AssociatedProperty class). For example, the zygosity of the variation.
Note: Use this class to point to a variation group to which this variation belongs. For example, a SNP haplotype.
Note: Any observation that is related to the sequence but is not an inherent part of the sequence observation (the latter should be represented in the AssociatedProperty class). For example, splicing alternatives.
Note: Key peptides in the protein that determine its function.
Note: There could be zero to many IndividualAllele objects in a specific instance. A typical case would be an allele pair, one on the paternal chromosome and one on the maternal chromosome.
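The 0..* cardinality between a locus and its IndividualAllele objects could be sketched as a list that may be empty, with the typical allele pair as two entries. This is only an illustration. The `origin` field and the placeholder gene identifier are assumptions added for the example and are not attributes of the model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IndividualAllele:
    value: Optional[str] = None   # allele code, if one is known
    origin: Optional[str] = None  # hypothetical field: "paternal" or "maternal"


@dataclass
class GeneticLocus:
    value: str  # gene identifier
    alleles: List[IndividualAllele] = field(default_factory=list)  # 0..* individualAllele


# The typical case: an allele pair, one per parental chromosome.
locus = GeneticLocus(
    value="example-gene-id",  # placeholder, not a real identifier
    alleles=[IndividualAllele(origin="paternal"), IndividualAllele(origin="maternal")],
)
```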
Note: Use this class to show an allele haplotype like in HLA.
Note: Any observation related to the expression assay that is not an inherent part of the expression observation.
Note: Use this class for inherent data about the locus, e.g., chromosome no.
IdentifiedEntity
  classCode*: <= IDENT
  id: SET<II> [0..*]
  code: CE CWE [0..1] <= RoleCode
Note: Use this role to identify a different subject (e.g., healthy tissue, virus, etc.) than the one propagated from the wrapping message or payload (e.g., GeneticLoci).
ScopingEntity
  classCode*: <= LIV
  determinerCode*: <= INSTANCE
  id: SET<II> [0..*]
  code: CE CWE [0..1] <= EntityCode
0..* assignedEntity, typeCode*: <= PRF; contextControlCode: CS CNE [0..1] <= ContextControl "OP" (performer)
0..* performer
0..* performer1
0..* performer2
0..* performer1
0..* performer2
Genetic Locus (POCG_RM000010)
The entry point to the GeneticLocus model is any locus on the genome.
Constraint: Expression.value
Constrained to a restricted MAGE-ML content model, specified in a separate schema.
CMET: (ASSIGNED) R_AssignedEntity [universal] (COCT_MT090000)
0..1 scopedRoleName
CMET: (ACT) A_SupportingClinicalInformation [universal] (COCT_MT200000)
The GeneticLocus Model - Focal Areas:
- The Locus and its Alleles
- Sequence Variations
- Expression Data
- Sequence and Proteomics
- Clinical Phenotypes
The Underlying Paradigm: Encapsulate & Bubble-up
Components: Genomic Data Sources; Clinical Practices; EHR System; Decision Support Applications; Knowledge (KBs, ontologies, registries, reference DBs, papers, etc.)
Message flows:
- HL7 CG messages with mainly encapsulating HL7 objects
- HL7 CG messages with encapsulated data associated with HL7 clinical objects (phenotypes)
Bubble up the most clinically significant raw genomic data into specialized HL7 objects and link them with clinical data from the patient EHR (the challenge…)
Encapsulation by predefined & constrained bioinformatics schemas
Bubbling-up is done continuously by specialized DS applications
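One way to picture the encapsulate and bubble-up paradigm is as a filter that promotes selected items from an encapsulated raw payload into separate objects linked to known phenotypes. The sketch below assumes a heavily simplified payload (a list of variation strings instead of a constrained bioinformatics schema) and a dictionary standing in for the knowledge sources. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EncapsulatedSequence:
    """Hypothetical encapsulating object: raw data kept whole, here as a list of variant strings."""
    raw_variants: List[str]


@dataclass
class BubbledUpVariation:
    """Hypothetical bubbled-up object linking one variation to a known phenotype."""
    variation: str
    known_phenotype: str


def bubble_up(encapsulated: EncapsulatedSequence,
              knowledge: Dict[str, str]) -> List[BubbledUpVariation]:
    """Promote only the raw variants that the knowledge source marks as clinically significant."""
    return [
        BubbledUpVariation(variant, knowledge[variant])
        for variant in encapsulated.raw_variants
        if variant in knowledge
    ]
```

Because the raw payload stays intact, `bubble_up` could be re-run whenever the knowledge source changes, which is one reading of bubbling-up being done continuously.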
The GeneticLocus Model
Entry Point: GeneticLocus
[Diagram overlay labels, arranged from genotype to phenotype: Individual Allele; Related Allele; Bio Sequence; Sequence Variation (SNP, Mutation, Polymorphism, etc.); Variation Attributes; Polypeptide; Determinant; Polypeptide; Expression Data; Expression Attributes; Clinical Phenotype. Legend: Encapsulating Obj.; Bubbled-up Obj.]
The GeneticVariation Model
0..* associatedObservation, typeCode*: <= ActRelationshipType (default=COMP); contextConductionInd: BL [0..1] "TRUE" (sourceOf)
0..* associatedProperty, typeCode*: <= DRIV; contextConductionInd: BL [0..1] "TRUE" (derivedFrom)
0..* sequenceVariation, typeCode*: <= COMP; contextConductionInd: BL [0..1] "TRUE" (component1)
IndividualAllele
  classCode*: <= SEQVAR
  moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
  id: II [0..1]
  negationInd: BL [0..1]
  title: ED [0..1]
  text: ED [0..1]
  statusCode: CS CNE [0..1] <= ActStatus
  effectiveTime*: GTS [1..1]
  reasonCode: SET<CE> CWE [0..*] <= ActReason
  value: CD CWE [0..1] <= C:
  interpretationCode: SET<CE> CWE [0..*] <= GeneticObservationInterpretation
GeneticLocus
  classCode*: <= LOC
  moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
  id: II [0..1]
  code: CE CWE [0..1] (default=Gene)
  negationInd: BL [0..1]
  title: ED [0..1]
  text: ED [0..1]
  statusCode: CS CNE [0..1] <= ActStatus
  effectiveTime*: IVL<TS> [1..1]
  confidentialityCode: SET<CE> CWE [0..*] <= Confidentiality
  reasonCode: SET<CE> CWE [0..*] <= GeneticActReason
  value*: ANY [1..1]
  interpretationCode: SET<CE> CWE [0..*] <= GeneticObservationInterpretation
  methodCode*: SET<CE> CWE [1..1]
0..* individualAllele, typeCode*: <= COMP; contextConductionInd: BL [0..1] "TRUE" (component2)
Sequence
  classCode*: <= SEQ
  moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
  id: II [0..1]
  code: CD CWE [1..1] (the type of sequence: observed, reference, etc.)
  text: ED [0..1] (sequence's annotations)
  effectiveTime: GTS [0..1]
  reasonCode: SET<CE> CWE [0..*] <= ActReason
  value: ED [1..1] (the actual sequence in a recognized bioinformatics content model, such as BSML)
  interpretationCode: SET<CE> CWE [0..*] <= GeneticObservationInterpretation
  methodCode: SET<CE> CWE [0..*] (the sequencing method)
HL7 Clinical Genomics SIG
Document: Genotype Topic - The GeneticVariation Model
Rev: POCG_RM000011.v9
Date: November 18, 2007
Facilitator: Amnon Shabo (Shvo), IBM Research in Haifa, [email protected]
Note: A related allele that is at a different locus, and has interrelation with the source allele, e.g., translocated duplicates of a gene.
0..* phenotype, typeCode*: <= PERT; contextConductionInd: BL [0..1] "TRUE" (pertinentInformation)
SequenceVariation
  classCode*: <= SEQVAR
  moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
  id: II [0..1]
  code: CD CWE [0..1]
  negationInd: BL [0..1]
  title: ED [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  value: ANY [0..1]
  interpretationCode: SET<CE> CWE [0..*] <= GeneticObservationInterpretation
  methodCode: SET<CE> CWE [0..*]
Note: Code: COPY_NUMBER, ZYGOSITY, DOMINANCY, GENE_FAMILY, etc. For example, if code = COPY_NUMBER, then the value is of type INT and holds the number of copies of this gene or allele.
0..* sequence, typeCode*: <= COMP (component2)
0..* phenotype, typeCode*: <= PERT; contextConductionInd: BL [0..1] "TRUE" (pertinentInformation)
0..* phenotype, typeCode*: <= PERT (pertinentInformation)
Note: The code attribute indicates in what molecule the variation occurs, i.e., DNA, RNA or Protein.
Note: Use the associations to the shadow classes when the variation and/or the sequence data are not at the allelic level.
0..* associatedObservation, typeCode*: <= ActRelationshipType (default=COMP); contextConductionInd: BL [0..1] "TRUE" (sourceOf)
0..1 associatedObservation, typeCode*: <= ActRelationshipType (default=COMP); contextConductionInd: BL [0..1] "TRUE" (sourceOf)
Note: This recursive association enables the association of an RNA sequence derived from a DNA sequence and a polypeptide sequence derived from the RNA sequence.
0..* phenotype, typeCode*: <= PERT; contextConductionInd: BL [0..1] "TRUE" (pertinentInformation)
Note: This class is a placeholder for specifying a locus on the genome, i.e., a position of a particular given sequence in the subject’s genome. Note that the semantics of the locus (e.g., gene) is defined by data assigned in the code & value attributes of this class, and also by placing additional data relating to this locus into the classes (and CMETs) associated with this class.
Note: The term 'Individual Allele' doesn't necessarily refer to a known variant of the gene/locus; rather, it refers to the individual patient data regarding the gene/locus, and might contain personal variations with unknown significance at the effective time of this observation.
AssociatedObservationclassCode*: <= GENmoodCode*: <=x_ActMoodDefEvnRqoPrmsPrp (default=EVN)id: SET<II> [0..*]code*: CD CWE [1..1]text: ED [0..1]effectiveTime*: GTS [1..1]value: ANY [0..1]methodCode: SET<CE> CWE [0..*]
Note: The code attribute could hold codes like TYPE, POSITION.GENOME, LENGTH, REFERENCE, REGION, etc. The value attribute is populated based on the selected code, and its data type is set accordingly during instance creation. Here are a few examples:
• If code = TYPE, then the value is of type CV and holds one of the following: SNP (tagSNP), INSERTION, DELETION, TRANSLOCATION, etc.
• If code = POSITION, then the value is of type INT and holds the actual numeric value representing the variation position along the gene.
• If code = LENGTH, then the value is of type INT and holds the actual numeric value representing the variation length.
• If code = POSITION.GENE, then the value is of type CV and is one of the following codes: INTRON, EXON, UTR, PROMOTER, etc.
• If code = POSITION.GENOME, then the value is of type CV and is one of the following codes: NORMAL_LOCUS, ECTOPIC, TRANSLOCATION, etc.
• If code = REFERENCE, then the value is of type CD and holds the reference gene identifier drawn from a reference database like GenBank.
More details about the vocabularies for codes and their respective values can be found in the specification.
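The code/value pairing in the note above can be pictured as a lookup from the code to the expected data type of the value. The following is only a minimal sketch: the table is transcribed from the examples in the note, and the names EXPECTED_VALUE_TYPE and check_value_type are hypothetical, not part of the HL7 specification.

```python
# Hypothetical lookup table: property code -> expected HL7 v3 data type of `value`.
# Transcribed from the examples in the note; not an official vocabulary binding.
EXPECTED_VALUE_TYPE = {
    "TYPE": "CV",
    "POSITION": "INT",
    "LENGTH": "INT",
    "POSITION.GENE": "CV",
    "POSITION.GENOME": "CV",
    "REFERENCE": "CD",
}


def check_value_type(code, value_type):
    """Compare a value's data type with the one the note gives for `code`.

    Returns True on a match, False on a conflict, and None when the note
    gives no example for the code (e.g. REGION).
    """
    expected = EXPECTED_VALUE_TYPE.get(code)
    if expected is None:
        return None
    return expected == value_type
```

For instance, check_value_type("LENGTH", "INT") agrees with the note, while a TYPE property carrying an INT value would be flagged as a conflict.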
Note: Code: CLASSIFICATION, etc. For example, if code = CLASSIFICATION, then the value is of type CV and is holding either KNOWN or NOVEL.
reference
0..* geneticLocus
typeCode*: <= REFR
contextConductionInd: BL [0..1] "TRUE"
Note: A related locus that has significant interrelation with the source locus and is not part of this loci set represented in this instance.
reference
0..* individualAllele
typeCode*: <= REFR
contextConductionInd: BL [0..1] "TRUE"
Constrained to a restricted BSML content model, specified in a separate schema.
Constraint: Sequence.value
0..* sequence
typeCode*: <= COMP
contextConductionInd: BL [0..1] "TRUE"
component4
0..* sequenceVariation
typeCode*: <= COMP
contextConductionInd: BL [0..1] "TRUE"
component3
AssociatedProperty
  classCode*: <= GEN
  moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
  code*: CD CWE [1..1]
  text: ED [0..1]
  value: ANY [0..1]
0..* associatedProperty
typeCode*: <= DRIV
contextConductionInd: BL [0..1] "TRUE"
derivedFrom
0..* associatedProperty
typeCode*: <= DRIV
contextConductionInd: BL [0..1] "TRUE"
derivedFrom
0..* associatedProperty
typeCode*: <= DRIV
contextConductionInd: BL [0..1] "TRUE"
derivedFrom1
0..* associatedObservation
typeCode*: <= ActRelationshipType (default=COMP)
contextConductionInd: BL [0..1] "TRUE"
sourceOf
0..* sequenceVariation
typeCode*: <= DRIV
contextConductionInd: BL [0..1] "TRUE"
derivedFrom3
derivedFrom2
0..* sequence
typeCode*: <= DRIV
contextConductionInd: BL [0..1] "TRUE"
Note: Any observation that is related to the variation but is not an inherent part of the variation observation (the latter should be represented in the AssociatedProperty class). For example, the zygosity of the variation.
Note: Any observation that is related to the sequence but is not an inherent part of the sequence observation, e.g., splicing alternatives. Note that inherent characteristics of the sequence should be represented in the AssociatedProperty class.
Note: There could be zero to many IndividualAllele objects in a specific instance. A typical case would be an allele pair, one on the paternal chromosome and one on the maternal chromosome.
Note: Use this class for inherent data about the locus, e.g., chromosome no.
0..* phenotype
typeCode*: <= PERT
pertinentInformation
Note: An internal CMET used to represent clinical phenotypes, both observed in the patient and known in the scientific literature.
CMET: (ORGANIZER) A_Phenotype
[universal](POCG_MT000030UV)
Holds the variation expressed with a recognized notation like 269T>C or a markup like BSML or drawn from an external reference like OMIM or dbSNP. Data type should be set accordingly.
Constraint: value
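A substitution written in the 269T>C style can be read as a position, a reference base and a variant base. The sketch below is only an illustration of that reading: the helper name parse_substitution is hypothetical, and it handles nothing beyond this simple DNA substitution form (no prefixes, ranges, insertions or deletions).

```python
import re

# Hypothetical helper: recognizes only "<position><ref base>><variant base>",
# as in the slide's example 269T>C. It is not a full HGVS parser.
SUBSTITUTION = re.compile(r"^(\d+)([ACGT])>([ACGT])$")


def parse_substitution(notation):
    """Return (position, reference_base, variant_base), or None for any other form."""
    match = SUBSTITUTION.match(notation)
    if match is None:
        return None
    position, reference_base, variant_base = match.groups()
    return int(position), reference_base, variant_base
```

A value that does not fit this pattern (a BSML fragment, an OMIM or dbSNP identifier) returns None here and would need a different data type, as the constraint says.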
If code = "Gene", value data type shall be set to CD and contain a code identifying a gene through GenBank GeneID, HUGO name, OMIM ID or any other internationally recognized identification of genes. If the locus is not a gene then the data type should be set to the appropriate type, e.g., ST for locus notation like “10q24.32”.
Constraint: value
GeneticLoci
  classCode*: <= LOC
  moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
  id: SET<II> [0..*]
  code*: CD CWE [1..1] <= GeneticVariation
  negationInd: BL [0..1]
  title: ED [0..1]
  text: ED [0..1]
  statusCode*: CS CNE [1..1] <= ActStatus
  effectiveTime*: GTS [1..1]
  confidentialityCode: SET<CE> CWE [1..1] <= Confidentiality
  reasonCode: SET<CE> CWE [0..*]
  interpretationCode: SET<CE> CWE [0..*] <= GeneticObservationInterpretation
  methodCode: SET<CE> CWE [0..*] <= ObservationMethod
0..* geneticLocus
typeCode*: <= COMP
component1
0..* assignedEntity
typeCode*: <= AUT
contextControlCode: CS CNE [0..1] "OP"
author
0..* assignedEntity
typeCode*: <= VRF
contextControlCode: CS CNE [0..1] "OP"
verifier
0..* assignedEntity
typeCode*: <= PRF
contextControlCode: CS CNE [0..1] "OP"
performer
CMET: (ASSIGNED) R_AssignedEntity
[universal](COCT_MT090000UV)
0..1 roleName
GeneticDocument
  classCode*: <= DOC
  moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
  id: SET<II> [0..1]
  code*: CD CWE [1..1] <= DocumentType
  title: ED [0..1]
  text: ED [0..1]
  statusCode*: CS CNE [1..1] <= ActStatus
  effectiveTime*: GTS [1..1]
  confidentialityCode: SET<CE> CWE [0..*] <= Confidentiality
  setId: II [0..1]
0..* geneticDocument
typeCode*: <= DOC
contextConductionInd: BL [0..1] "TRUE"
documentation
relatedDocument
0..* geneticDocument
typeCode*: <= x_ActRelationshipDocument
contextConductionInd: BL [0..1] "TRUE"
seperatableInd: BL [0..1]
Note: Use the separation indicator to indicate when a document should not be separated from its associated document (like in the EGFR-KRAS2 use case from HPCGG).
Note: There are two ways to refer to a clinical document:
1. Populate the id attribute with the document id
2. Place the entire CDA instance within the text attribute
The other attributes in this class are essential data about the document, and they are repeated in the document instance itself. This repetition is meant to ease the parsing process.
0..* associatedObservation
typeCode*: <= ActRelationshipType (default=COMP)
contextConductionInd: BL [0..1] "TRUE"
sourceOf
CMET: (ORGANIZER) A_Phenotype
[universal](POCG_MT000030UV)
0..* phenotype
typeCode*: <= PERT
contextConductionInd: BL [0..1] "TRUE"
pertinentInformation
GeneticVariation (POCG_RM000011UV)
The entry point to the combined Genetic Loci/Locus model that represents genetic variations data.
0..1 identifiedEntity
typeCode*: <= SBJ
contextControlCode: CS CNE [0..1] "OP"
subject
IdentifiedEntity
  classCode*: <= IDENT
  id: SET<II> [0..*]
  code: CE CWE [0..1] <= RoleCode
ScopingEntity
  classCode*: <= LIV
  determinerCode*: <= INSTANCE
  id: SET<II> [0..*]
  code: CE CWE [0..1] <= EntityCode
0..1 performer
If interpretationCode is assigned a value, reasonCode shall also be assigned a value to set the context for the interpretation semantics.
Constraint: GeneticLoci.reasonCode&interpretationCode
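This constraint amounts to a simple implication: an interpretationCode without any reasonCode is invalid, and every other combination is allowed. A minimal sketch, assuming both attributes arrive as plain Python collections of codes (the function name and the sample code values are hypothetical):

```python
def satisfies_reason_constraint(interpretation_codes, reason_codes):
    """Check the GeneticLoci rule: interpretationCode present implies reasonCode present.

    Both arguments are collections of codes; empty means the attribute is not populated.
    """
    if not interpretation_codes:
        # Nothing to put into context, so reasonCode stays optional.
        return True
    return bool(reason_codes)
```

For example, satisfies_reason_constraint(["INTERP-1"], []) fails the check, while an instance with neither attribute populated passes.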
AssociatedProperty
0..* associatedProperty
typeCode*: <= DRIV
derivedFrom
Genetic Loci
Genetic Locus
Individual Allele
Sequence Variation
Sequence (observed or reference)
Point to CDA Documents
participants
Associated data (vocab. controlled)
CDA IG for Genetic Testing Report
• Design principles:
  • Follow existing report formats commonly used in healthcare & research
  • Emphasis on interpretations & recommendations
  • Provide inline & detailed (generic) information on tests performed
• Interpretation:
  • Utilize patterns of ‘genotype-phenotype’ associations in the HL7 v3 Clinical Genomics and implement them as templates in this IG
• Reference HL7 Clinical Genomics instances (most likely constrained)
  • Place holders of raw data (evidences) and for structured family history
• Section outline:
  • Content sections: Genetic Variations, Gene Expression, others
  • Sub-sections in each content section:
    • Specimen
    • Findings
    • Interpretations
    • Recommendations
    • Test Information
    • Family History
• Open the draft outline
Technical Issues
• Design & register genotype-phenotype templates
  • Somewhat similar to the CCD templates for “Allergies, Adverse Reactions, Alerts”, where the ‘agent’ is the genomic entity/observation and the reaction is the phenotypic information
  • Note that in CCD the relationship is fixed to “MFST”, while in genomics we’ll have a variety of codes representing various ‘genotype-phenotype’ relationships
• Enable associating a genotype to phenotypes in several places across the document (reference an observation)
• Links to HL7 v3 Clinical Genomics instances
  • Similar to referencing images in CDA Diagnostic Report IG
Cause of allergy
Allergen is manifested by…
Manifestation of the allergy
Referencing a DICOM Object
The End
• Thank you for your attention…
• Questions?