A network-based representation of protein fold space
Spencer Bliven
Qualifying Examination 6/6/2011
Overview
1. Background & Motivation
2. Preliminary Research
3. Proposed Future Research
Fold Space
What protein folds are possible?
Discrete or Continuous? Both? Neither?
What portion of fold space is utilized by nature?
Long-debated questions. Why?
Understanding of structure-function relationship
Protein design/engineering
Protein evolution
Classification
Previous Work
Orengo, Flores, Taylor, Thornton. Protein Eng (1993) vol. 6 (5) pp. 485-500
Holm and Sander. J Mol Biol (1993) vol. 233 (1) pp. 123-38
Holm and Sander. Science (1996) vol. 273 (5275) pp. 595-603
Shindyalov and Bourne. Proteins (2000) vol. 38 (3) pp. 247-60
Hou, Sims, Zhang, Kim. PNAS (2003) vol. 100 (5) pp. 2386-90
Taylor. Curr Opin Struct Biol (2007) vol. 17 (3) pp. 354-61
Sadreyev et al. Curr Opin Struct Biol (2009) vol. 19 (3) pp. 321-8
α
α+β
β
α/β
Why can we do better?
More structures
Sampling of globular folds “saturated”
Few novel folds being discovered
Geometric arguments for saturation of small protein folds
Recent all-vs-all computation
Cluster sequences to 40% identity
17,852 representatives (updated weekly)
189 million FATCAT rigid-body alignments
73503
http://www.rcsb.org/pdb/statistics/contentGrowthChart.do?content=total&seqid=100 (accessed 5/31/2011)
Structural Similarity Graph
Nodes: PDB chains, non-redundant to 40%
Edges: FATCAT-rigid alignments
“Significant” edges: p<0.001, Length > 25, Coverage > 50
Hierarchically cluster to reduce complexity in visualization
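The graph construction above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the talk: the `Alignment` record layout, the function names, and the chain identifiers are hypothetical, and coverage is assumed to be a percentage.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """One pairwise alignment record (hypothetical layout, not FATCAT's output format)."""
    chain_a: str
    chain_b: str
    p_value: float
    length: int       # number of aligned residues
    coverage: float   # assumed to be a percentage (0-100)


def is_significant(aln, max_p=0.001, min_length=25, min_coverage=50.0):
    """Apply the three cutoffs listed on the slide."""
    return (aln.p_value < max_p
            and aln.length > min_length
            and aln.coverage > min_coverage)


def build_graph(alignments):
    """Return an undirected adjacency map containing only significant edges."""
    graph = defaultdict(set)
    for aln in alignments:
        if aln.chain_a != aln.chain_b and is_significant(aln):
            graph[aln.chain_a].add(aln.chain_b)
            graph[aln.chain_b].add(aln.chain_a)
    return dict(graph)


# Made-up records for illustration only.
toy = [
    Alignment("1ABC.A", "2DEF.A", 1e-5, 120, 80.0),  # passes all cutoffs
    Alignment("1ABC.A", "3GHI.B", 0.05, 90, 70.0),   # fails the p-value cutoff
    Alignment("2DEF.A", "3GHI.B", 1e-4, 20, 90.0),   # fails the length cutoff
]
graph = build_graph(toy)
```

Chains with no significant alignment never appear as keys, so singleton nodes would need to be added separately if they should show up in a visualization.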
Classes: α, β, α/β, α+β, Multi, Membrane, Small
Agreement with SCOP
Class p<10^-6
Fold p<10^-7
Superfamily p<10^-10
Continuity
Grishin. J Struct Biol (2001) vol. 134 (2-3) pp. 167-85
Skolnick claims ≤ 7 intermediates between any proteins
We observe network diameter=15
Can find interesting paths
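Both the diameter and the paths mentioned above can be computed with breadth-first search on the unweighted similarity graph. The sketch below assumes the graph is stored as an adjacency map from node to set of neighbours; the function names are illustrative, and the diameter here is taken within connected components, which is an assumption about how disconnected pairs are treated.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def shortest_path(graph, source, target):
    """Return one shortest path as a list of nodes, or None if unreachable."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in graph.get(node, ()):
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return None


def diameter(graph):
    """Longest shortest-path length found within any connected component."""
    return max(max(bfs_distances(graph, n).values()) for n in graph)


# Toy chain A-B-C-D, for illustration only.
toy_graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
path = shortest_path(toy_graph, "A", "D")
intermediates = len(path) - 2  # nodes strictly between the endpoints
```

A path of length d has d - 1 intermediates, which is worth keeping in mind when setting a diameter next to a count of intermediates.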
C4
C5
C6
C7
Symmetry
Beta Propellers
Symmetry
Functionally important
Protein evolution (e.g. beta-trefoil)
DNA binding
Allosteric regulation
Cooperativity
Widespread (~20% of proteins)
Focus of algorithmic work
FGF-1 (Lee & Blaber. PNAS 2011)
TATA Binding Protein (1TGH)
Hemoglobin (4HHB)
Cross-class example
3GP6.A: PagP, modifies lipid A; f.4.1 (transmembrane beta-barrel)
1KT6.A: Retinol-binding protein; b.60.1 (Lipocalins)
Summary of Preliminary Research
Calculated all-vs-all alignment
Prlić A, Bliven S, Rose PW, Bluhm WF, Bizon C, Godzik A, Bourne PE. Pre-calculated protein structure alignments at the RCSB PDB website. Bioinformatics (2010) vol. 26 (23) pp. 2983-2985
Built network of significant alignments
Approximately matches SCOP classifications
Improved structural alignment algorithms
Identify symmetry, circular permutations, topology-independent alignments
Discussed more in report
Future Research
Improve the network
1. Improve all-vs-all comparison algorithm
2. Tune parameters during graph generation
Annotate the network & draw biological inferences
3. Annotate nodes with functional information
4. Compare with other networks
Create new networks
5. Enhance structural comparison algorithms
1. Improve all-vs-all comparison algorithm
Need domain decomposition
Use Combinatorial Extension (CE)
2. Tune parameters during graph generation
Don’t use p-values
Shouldn’t compare p-values, statistically*
Not normalized by secondary structure
Not accurate due to multiple testing problem
Use TM-score
RMSD, normalized to the alignment length
Determine optimal thresholds for determining “significance”
For instance, train an SVM
* Technically ok here, since one-to-one with the FATCAT score
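For reference, the TM-score as published by Zhang and Skolnick (2004) is a length-normalized sum over aligned residue pairs, TM = (1/L) Σ 1/(1 + (d_i/d0)^2), with d0 = 1.24 (L - 15)^(1/3) - 1.8 Å and L the length of the target protein. The sketch below evaluates that expression for one fixed superposition; the full score also maximizes over superpositions, which is omitted here. The function names and the 0.5 Å floor on d0 for short chains are assumptions made for this illustration.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0, in angstroms."""
    if target_length <= 15:
        # Assumed floor: the cube-root formula is not meaningful for short chains.
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length):
    """TM-score of one superposition.

    distances: C-alpha distances (angstroms) for each aligned residue pair.
    target_length: number of residues in the protein used for normalization.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


# Illustrative values only.
perfect = tm_score([0.0] * 100, 100)        # every residue aligned at zero distance
half = tm_score([tm_d0(100)] * 50, 100)     # half the residues aligned, each at distance d0
```

Because unaligned residues contribute nothing to the sum but still count in the normalization, short local matches score low, which is the property that makes the score comparable across protein sizes.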
FATCAT p-value by Class
Performs poorly on all-alpha in “twilight zone”
Terrible on membrane proteins
Probably reflects non-structural considerations in SCOP assignment
3. Annotate nodes with functional information
SCOP/CATH classifications
GO terms
Metal binding
Ligand binding
Symmetry
Classes: α, β, α/β, α+β, Multi, Membrane, Small
4. Compare with other networks
Define other types of network over the set of protein representatives
Protein-protein interactions
Co-expression
Correlate to the structural similarities
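One simple way to start comparing two networks defined over the same set of representatives is the Jaccard index of their edge sets. This is only one possible measure, chosen here for illustration rather than taken from the proposal, and the protein labels below are made up.

```python
def edge_set(pairs):
    """Normalize (a, b) pairs into undirected edges, dropping self-loops."""
    return {frozenset(pair) for pair in pairs if pair[0] != pair[1]}


def jaccard(edges_a, edges_b):
    """Shared edges divided by all edges present in either network."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 0.0


# Made-up edges for illustration only.
structural = edge_set([("P1", "P2"), ("P2", "P3"), ("P3", "P4")])
interaction = edge_set([("P2", "P1"), ("P3", "P4"), ("P1", "P4")])
overlap = jaccard(structural, interaction)
```

An overlap score on its own says little without a baseline, so in practice it would be compared against the value obtained after shuffling node labels in one of the networks.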
Structural similarity
Protein-protein interaction
5. Enhance structural comparison algorithms
Improve automated pseudo-symmetry detection
Find topology-independent relationships
C3
Summary
Fold space as network
Improve network creation
Annotate network with functional information
Improve structural similarity detection
Acknowledgments
Bourne Lab
Philip Bourne
Andreas Prlić
Lab & PDB members
Qualifying Exam Committee
Ruben Abagyan
Patricia Jennings
Andy McCammon
Collaborators
Philippe Youkharibache
Jean-Pierre Changeux
Rotation Advisors
Pavel Pevzner
Philip Bourne
José Onuchic & Pat Jennings
Mike MacCoss
Virgil Woods