1
Bio-Trac 40 (Protein Bioinformatics)
October 9, 2008
Zhang-Zhi Hu, M.D.
Research Associate Professor
Protein Information Resource, Department of Biochemistry and Molecular & Cellular Biology
Georgetown University Medical Center
Lab
2
http://www.geneontology.org/
3
GO term (GO:0006366): mRNA transcription from RNA polymerase II promoter
Leaf node
GO search and display tool
4
Human p53 – GO annotation (UniProtKB:P04637)
GO:0006289:nucleotide-excision repair [PMID:7663514; evidence:IMP]
http://pir.georgetown.edu/cgi-bin/ipcEntry?id=P04637
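An annotation line like the p53 example above carries four pieces of information: the GO identifier, the term name, the supporting PubMed reference, and the evidence code (IMP stands for "Inferred from Mutant Phenotype"). The following is a minimal sketch of splitting such a line into fields. The `GoAnnotation` type and `parse_annotation` function are hypothetical names introduced for illustration, and the pattern assumes exactly the layout shown on the slide, not any official annotation file format.

```python
import re
from typing import NamedTuple


class GoAnnotation(NamedTuple):
    """One GO annotation, as laid out on the slide (hypothetical container)."""
    go_id: str
    term: str
    pmid: str
    evidence: str


# Assumption: lines look like "GO:nnnnnnn:term name [PMID:n; evidence:CODE]".
# Other layouts are not handled by this sketch.
_ANNOTATION_RE = re.compile(
    r"^(GO:\d{7}):\s*(.+?)\s*\[PMID:(\d+);\s*evidence:(\w+)\]$"
)


def parse_annotation(line: str) -> GoAnnotation:
    """Split one slide-style annotation line into its four fields."""
    match = _ANNOTATION_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized annotation line: {line!r}")
    return GoAnnotation(*match.groups())


record = parse_annotation(
    "GO:0006289:nucleotide-excision repair [PMID:7663514; evidence:IMP]"
)
print(record.go_id, record.term, record.pmid, record.evidence)
```

Keeping the evidence code as its own field makes it easy to filter annotations later, for example to keep only experimentally supported ones.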
5
• Scientific basis of GO: trained experts use experimental observations from the literature to associate GO terms with gene products, that is, to annotate the entities represented in gene/protein databases
• Enables data integration across databases and makes the data available to semantic search
GO annotation of gene products
~46 Human, mouse, plant, worm, yeast …
http://www.geneontology.org/GO.current.annotations.shtml
6
http://pir.georgetown.edu/pirwww/search/idmapping.shtml
ID Mapping
Information matrix
Functional profiling
Batch gene/protein retrieval and profiling
Enter ID, gi #
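Conceptually, batch ID mapping is a table lookup that also reports which input identifiers could not be mapped. The sketch below illustrates that idea only; it does not call the PIR service. The `map_ids` function and the in-memory `GENE_TO_UNIPROT` table are assumptions made for this example. The single real pair in the table is Entrez Gene 7157 and UniProtKB P04637 (human p53); the second input ID is a deliberately fake placeholder.

```python
from typing import Dict, Iterable, List, Tuple

# Toy lookup table (assumption). A real table would be loaded from a
# mapping file or a mapping service, not hard-coded.
GENE_TO_UNIPROT: Dict[str, str] = {"7157": "P04637"}  # human p53


def map_ids(
    ids: Iterable[str], table: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Map each input ID through the table; collect IDs with no entry."""
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for raw in ids:
        key = raw.strip()
        if not key:
            continue  # skip blank lines in a pasted batch
        if key in table:
            mapped[key] = table[key]
        else:
            unmapped.append(key)
    return mapped, unmapped


# "0000000" is a made-up ID used to show the unmapped case.
mapped, unmapped = map_ids(["7157", " 0000000 ", ""], GENE_TO_UNIPROT)
print(mapped)
print(unmapped)
```

Returning the unmapped IDs separately, rather than silently dropping them, lets the user check whether an input list was only partially covered.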
7
http://pir.georgetown.edu/cgi-bin/batch_iprox.pl?search=1&data=gu1
57174917850711715797235198411477462622744401032898754976182615009683854123464924723020
Entrez Gene list
http://pir.georgetown.edu/pirwww/search/idmapping.shtml
UniProt Accession/ID
Batch retrieval
http://pir.georgetown.edu/pirwww/search/batch.shtml
8
GO Slim
http://www.geneontology.org/GO.slims.shtml
(http://www.geneontology.org/)
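A GO slim is a reduced set of high-level GO terms; a specific annotation is summarized by walking up the ontology until a slim term is reached. The sketch below shows one way to do this on a tiny hand-written graph. The `PARENTS` table is a simplified excerpt with intermediate terms and additional parents omitted, the `SLIM` set is hypothetical, and `map_to_slim` is a name introduced for this example.

```python
from typing import Dict, List, Set

# Simplified child -> parents edges (toy excerpt, not the full ontology).
PARENTS: Dict[str, List[str]] = {
    "GO:0006289": ["GO:0006281"],  # nucleotide-excision repair -> DNA repair
    "GO:0006281": ["GO:0006259"],  # DNA repair -> DNA metabolic process
    "GO:0006259": ["GO:0008150"],  # ... -> biological_process (steps omitted)
}

# Hypothetical slim containing a single high-level term.
SLIM: Set[str] = {"GO:0006259"}


def map_to_slim(
    term: str, parents: Dict[str, List[str]], slim: Set[str]
) -> Set[str]:
    """Return the nearest slim terms reachable by walking up from `term`."""
    found: Set[str] = set()
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in slim:
            found.add(current)
            continue  # stop climbing along this path at the first slim term
        stack.extend(parents.get(current, []))
    return found


print(map_to_slim("GO:0006289", PARENTS, SLIM))
```

Because GO is a directed acyclic graph rather than a tree, a term can reach more than one slim term, which is why the function returns a set.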
9
KEGG Metabolic & Regulatory Pathways
KEGG is a suite of databases and associated software that integrates current knowledge of molecular interaction networks, genes and proteins, and chemical compounds and reactions.
(http://www.genome.ad.jp/kegg/pathway.html)
Transforming Growth Factor (TGF) beta signaling
10
BioCarta Cellular Pathways
(http://www.biocarta.com/index.asp)
Transforming Growth Factor (TGF) beta signaling [Homo sapiens]
11
Transforming Growth Factor (TGF) beta signaling [Homo sapiens]
Event -> REACT_6879.1: Activated type I receptor phosphorylates R-SMAD directly [Homo sapiens]
Object -> REACT_7364.1: Phospho-R-SMAD [cytosol]
Event -> REACT_6760.1: Phospho-R-SMAD forms a complex with CO-SMAD [Homo sapiens]
Object -> REACT_7344.1: Phospho-R-SMAD:CO-SMAD complex [cytosol]
Event -> REACT_6726.1: The phospho-R-SMAD:CO-SMAD transfers to the nucleus
Object -> REACT_7382.2: Phospho-R-SMAD:CO-SMAD complex [nucleoplasm]
……
(http://reactome.org/cgi-bin/eventbrowser?DB=gk_current&FOCUS_SPECIES=Homo%20sapiens&ID=170834&)
Reactome: events and objects (including modified forms and complex)
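The Reactome listing above alternates events (reactions) with the objects they produce, each tagged with a REACT identifier. The sketch below parses the first three entries of that listing into (kind, identifier, label) tuples. The `parse_steps` function and the regular expression are assumptions written for the text layout on this slide; they are not part of any Reactome tool or API.

```python
import re
from typing import List, Tuple

# First three entries from the slide, run together as in the source text.
PATHWAY_TEXT = (
    "Event ->REACT_6879.1: Activated type I receptor phosphorylates R-SMAD "
    "directly [Homo sapiens] Object -> REACT_7364.1: Phospho-R-SMAD [cytosol]"
    "Event -> REACT_6760.1: Phospho-R-SMAD forms a complex with CO-SMAD "
    "[Homo sapiens]"
)

# Each entry starts with "Event ->" or "Object ->" and runs until the next
# such marker or the end of the text (assumed layout).
STEP_RE = re.compile(
    r"(Event|Object)\s*->\s*(REACT_\d+\.\d+):\s*(.*?)\s*"
    r"(?=(?:Event|Object)\s*->|$)",
    re.S,
)


def parse_steps(text: str) -> List[Tuple[str, str, str]]:
    """Return (kind, stable_id, label) for every entry found in `text`."""
    return [tuple(groups) for groups in STEP_RE.findall(text)]


steps = parse_steps(PATHWAY_TEXT)
for kind, stable_id, label in steps:
    print(f"{kind:6} {stable_id}  {label}")
```

Separating events from objects this way makes the distinction on the slide explicit: modified forms and complexes are objects in their own right, each with its own identifier and compartment.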
12
PID
Transforming Growth Factor beta signaling
13
Reactome PID
~26 proteins in PID are not defined in Reactome, while only 2 proteins in Reactome are not defined in PID
Transforming Growth Factor (TGF) beta signaling
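A coverage comparison like the one above reduces to set differences between the protein lists of the same pathway in two databases. The sketch below shows the operation on made-up data. The `compare_coverage` function and the `prot_A` to `prot_D` labels are hypothetical placeholders and do not reflect the actual contents of PID or Reactome.

```python
from typing import Dict, Set


def compare_coverage(a: Set[str], b: Set[str]) -> Dict[str, Set[str]]:
    """Split two protein sets into unique-to-a, unique-to-b, and shared."""
    return {"only_a": a - b, "only_b": b - a, "shared": a & b}


# Placeholder protein labels (not real database contents).
pid_proteins = {"prot_A", "prot_B", "prot_C"}
reactome_proteins = {"prot_B", "prot_D"}

result = compare_coverage(pid_proteins, reactome_proteins)
print({name: sorted(members) for name, members in result.items()})
```

In practice both lists would first need to be mapped to a common identifier type, since a mismatch in identifiers would otherwise show up as a false difference.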
14
iProXpress: Integrative analysis of proteomic and gene expression data
Data
Information
Knowledge
MS spectrum
Peptide ident.
Protein ident.
Function, Pathway, Family
Categorize, Statistics, Association
http://pir.georgetown.edu/iproxpress/
15
iProXpress – Pathway Profiling
• Protein information matrix: extensive annotations including protein name, family classification, function, protein-protein interaction, pathway…
• Functional profiling: iterative categorization, sorting, cross-dataset comparison, coupled with manual examination.
[Figure: KEGG pathway profiles of the ER and Mit data sets]
• Organelle proteome data sets
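The categorization and cross-dataset comparison described in the bullets above can be sketched as counting, for each data set, how many proteins fall into each pathway, and then lining the counts up side by side. This is only an illustration of the idea, not the iProXpress implementation. The `profile` and `compare_profiles` functions, the protein labels, and the pathway assignments are all hypothetical.

```python
from collections import Counter
from typing import Dict, List


def profile(dataset: Dict[str, List[str]]) -> Counter:
    """Count proteins per pathway; `dataset` maps protein ID -> pathways."""
    counts: Counter = Counter()
    for pathways in dataset.values():
        counts.update(set(pathways))  # count each protein once per pathway
    return counts


def compare_profiles(profiles: Dict[str, Counter]) -> Dict[str, Dict[str, int]]:
    """Build a pathway -> {data set name -> protein count} table."""
    categories = sorted({c for p in profiles.values() for c in p})
    return {
        c: {name: p.get(c, 0) for name, p in profiles.items()}
        for c in categories
    }


# Made-up organelle data sets (placeholder proteins and assignments).
er = {
    "prot_1": ["Purine metabolism"],
    "prot_2": ["Purine metabolism", "Glycolysis"],
}
mit = {"prot_3": ["Glycolysis"]}

table = compare_profiles({"ER": profile(er), "Mit": profile(mit)})
for pathway, counts in table.items():
    print(pathway, counts)
```

A table in this shape can be sorted by any column, which supports the iterative sorting and manual examination mentioned above.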
16
Purine metabolic pathway
Ribonucleoside diphosphate reductase subunit M2 (RRM2)
DNA synthesis, DNA repair
[Figure: purine pathway diagram. Labels: EC 1.17.4.1 (shown twice); ATP X dATP; ADP, dADP; dGTP X GTP; dGDP, GDP]