Post on 13-Dec-2015

Using Exons to Define Isoforms in PRO

Timothy Danford
Novartis Institutes for Biomedical Research

PRO / AlzForum Kickoff Meeting
Oct. 4, 2011

Genes vs. Proteins

• Gene
• Transcript
• Exon
• Locus
• Allele
• Variant
– SNP
– Indel
– Rearrangement
• Motif

• Protein
• Isoform
• Variant
• Domain
• Site
• Complex
• Motif
• Fragment

Can we join the worlds of PRO and of Genes at a finer-grained level than that of “full sequence”?

Isoforms in PRO Today

PRO v23 (10/2/2011)

[Term]
id: PR:000010173
name: microtubule-associated protein tau
def: "A protein that is a translation product of the MAPT gene or a 1:1 ortholog thereof." [PRO:DNx]
comment: Category=gene. Flag=automatic.
synonym: "MAPT" EXACT PRO-short-label []
synonym: "neurofibrillary tangle protein" EXACT []
synonym: "paired helical filament-tau" EXACT []
synonym: "PHF-tau" EXACT []
synonym: "MAPTL" RELATED []
synonym: "Mtapt" RELATED []
synonym: "MTBT1" RELATED []
synonym: "TAU" RELATED []
is_a: PR:000000001 ! protein

Isoforms in PRO Today

PRO v23 (10/2/2011)

[Term]
id: PR:000026993
name: microtubule-associated protein tau isoform Fetal-tau
def: "A microtubule-associated protein tau that is a translation product of some mRNA giving rise to a protein with the amino acid sequence represented by UniProtKB:P10636-2 or a 1:1 ortholog thereof." [PRO:DAN]
comment: Category=sequence.
synonym: "Fetal-tau" EXACT []
is_a: PR:000010173 ! microtubule-associated protein tau

Digression: Visual Notation

Tau Isoforms Share Functionally-relevant Exons

Fetal Tau

Adult Tau

Slide: Gwen Wong (AlzForum), Image: http://www.med.upenn.edu/cndr/TauSynuclein.shtml

“Conserved Protein Domains in Tau Suggest Functional Differences between Protein Isoforms”

What Questions Could We Ask of PRO + Genomic Data?

• Which isoform corresponds to which transcript(s)?
• Which isoforms share a common feature?
– common exons?
– common domains? (pfam, interpro, etc.)
• Which “normal” protein isoforms overlap with SNPs or other genetic variants?
– How do protein sites line up to sites on the gene?
• How do mouse and human proteins correspond?

“Which isoform corresponds with which transcript(s)?”

Transcript Variant: This variant (4) lacks six internal coding exons, as compared to variant 6. The reading frame is not affected, and the resulting isoform (4) has identical N- and C-termini but lacks five segments, as compared to isoform 6.
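The statement that skipping exons leaves the reading frame unaffected can be checked mechanically. The sketch below is one way to do it, not a description of how RefSeq or PRO does it: it groups the skipped exons into contiguous runs and requires each run to remove a multiple of three coding nucleotides. The function names and the exon lengths in the usage note are hypothetical, and the check assumes the skipped exons are entirely coding.

```python
def contiguous_runs(exon_numbers):
    """Group exon numbers into runs of consecutive integers."""
    runs, current = [], []
    for n in sorted(exon_numbers):
        if current and n != current[-1] + 1:
            runs.append(current)
            current = []
        current.append(n)
    if current:
        runs.append(current)
    return runs


def skipping_preserves_frame(coding_lengths, skipped):
    """Return True if every contiguous run of skipped exons removes a
    multiple of 3 coding nucleotides.

    coding_lengths: dict mapping exon number -> coding nucleotides in that exon
    skipped: iterable of exon numbers absent from the shorter transcript
    """
    return all(
        sum(coding_lengths[n] for n in run) % 3 == 0
        for run in contiguous_runs(skipped)
    )
```

With made-up lengths `{1: 100, 2: 87, 3: 87, 4: 50, 5: 40, 6: 93}`, skipping exons 2, 3 and 6 passes the check (runs of 174 and 93 nucleotides), while skipping exon 4 alone does not (50 nucleotides). Checking each run separately, rather than only the total, matches the description of an isoform that keeps identical termini and simply lacks segments.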

Define Exons as Parts-of-Proteins

Define classes of isoforms by has_part and lacks_part relations to particular exons
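To make the has_part / lacks_part idea concrete, here is a minimal sketch of an isoform class as a pair of exon sets, with a membership test and a shared-exon query. It is an illustration only, not PRO's actual representation: `IsoformClass`, `members`, `shared_exons`, the isoform names and the exon sets are all hypothetical and were not taken from PRO or RefSeq.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IsoformClass:
    """A class of isoforms defined by exons they must and must not contain."""
    name: str
    has_part: frozenset
    lacks_part: frozenset

    def matches(self, exons):
        exons = set(exons)
        # Every required exon is present and no excluded exon is present.
        return self.has_part <= exons and not (self.lacks_part & exons)


# Hypothetical exon content per isoform (placeholder data for illustration).
ISOFORM_EXONS = {
    "isoform_X": {1, 4, 5, 7, 9, 11, 12, 13},
    "isoform_Y": {1, 2, 3, 4, 5, 7, 9, 10, 11, 12, 13},
}


def members(isoform_class, isoform_exons):
    """Names of the isoforms that satisfy the class definition."""
    return sorted(
        name for name, exons in isoform_exons.items()
        if isoform_class.matches(exons)
    )


def shared_exons(a, b, isoform_exons):
    """Exons common to two isoforms."""
    return sorted(isoform_exons[a] & isoform_exons[b])
```

For example, `members(IsoformClass("lacks exon 2", frozenset({1}), frozenset({2})), ISOFORM_EXONS)` selects only `isoform_X`, and `shared_exons("isoform_X", "isoform_Y", ISOFORM_EXONS)` answers the "common exons?" question for that pair.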

Integrate Existing Isoforms

How is “MAPT Exon 2” defined?

• Take the “exon” definition from SO:0000147
– “A region of the transcript sequence within a gene which is not removed from the primary RNA transcript by RNA splicing.”
• Exon number defined relative to the full-length or “canonical” transcript
– “An exon that corresponds (aligns) to the second of 13 exons in the full-length MAPT transcript...”
• Define the part of the protein derived from this portion of the transcript…
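The last step, going from an exon in the transcript to the part of the protein derived from it, could be sketched as a coordinate calculation. This is a simplified illustration under stated assumptions rather than the procedure used for PRO: `exon_residue_ranges` is a hypothetical helper, coordinates are 0-based and half-open on the spliced transcript, and a residue whose codon spans an exon junction is attributed to both exons.

```python
def exon_residue_ranges(exon_lengths, cds_start, cds_end):
    """Map each exon to the protein residues it (partly) encodes.

    exon_lengths: exon lengths in nucleotides, in transcript order
    cds_start, cds_end: coding region on the spliced transcript,
        0-based and half-open (assumed convention)
    Returns {exon_number: (first_residue, last_residue)} with 1-based,
    inclusive residue numbers, or None for exons with no coding bases.
    """
    ranges = {}
    offset = 0
    for number, length in enumerate(exon_lengths, start=1):
        start, end = offset, offset + length
        offset = end
        # Clip the exon to the coding region.
        lo, hi = max(start, cds_start), min(end, cds_end)
        if lo >= hi:
            ranges[number] = None
            continue
        first = (lo - cds_start) // 3 + 1
        last = (hi - 1 - cds_start) // 3 + 1
        ranges[number] = (first, last)
    return ranges
```

With toy exon lengths `[10, 9, 11]` and a coding region from 4 to 28, the three exons map to residues 1-2, 3-5 and 6-8. A real mapping would also need to handle the stop codon and alignment against the canonical transcript, which this sketch leaves out.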
