Practical session 2b
Introduction to 3D modeling and threading, 9:30am-10:00am
3D modeling and threading, 10:00am-10:30am
Analysis of mutations in MYH6
Miguel Andrade
Max Delbrück Center for Molecular Medicine
Introduction
Protein tertiary structure
Secondary structure elements fold together
Introduction
Protein quaternary structure
Folded proteins form a complex
Introduction
Protein domains are structural units (average 160 aa) that share:
• Function
• Folding
• Evolution
Proteins are normally multidomain (average 300 aa)
Determination of protein structure
X-ray crystallography (70,714 in PDB)
• needs crystals
Nuclear Magnetic Resonance (NMR) (9,312)
• proteins in solution
• limited to smaller proteins (up to 600 aa)
Electron microscopy (422)
• low resolution (>5 Å)
[Figure: example structure at 2.4 Å resolution]
Structural genomics
Currently: 81K 3D structures from around 27K sequences
16M sequences in UniProt
Only 0.17% of UniProt sequences have a known 3D structure!
50% of sequences covered (25% in 1995)
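The 0.17% figure follows directly from the two counts quoted above. A minimal check, assuming "around 27K" and "16M" are taken at face value as 27,000 and 16,000,000 (the variable and function names are illustrative only):

```python
# Counts quoted on the slide (rounded values, taken at face value here).
sequences_with_structure = 27_000   # distinct sequences behind the 81K structures
uniprot_sequences = 16_000_000      # sequences in UniProt


def percent_covered(with_structure: int, total: int) -> float:
    """Percentage of sequences that have an experimental 3D structure."""
    return 100.0 * with_structure / total


coverage = percent_covered(sequences_with_structure, uniprot_sequences)
print(f"{coverage:.2f}% of UniProt sequences have a 3D structure")
```

With these rounded inputs the ratio is 27,000 / 16,000,000 = 0.0016875, which rounds to 0.17%.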
Strategy for analysis
Query sequence
Predict domains, cut
Similar to PDB sequence?
• Yes: 3D modeling by homology
• No: 2D prediction, 3D ab initio, 3D threading
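The decision in this flow chart can be written as a small function applied to each predicted domain. This is only a sketch: the 30% identity cut-off is borrowed from the identity ranges quoted for comparative modeling and threading elsewhere in these slides and is a rule of thumb, and the function and constant names are hypothetical.

```python
from typing import List, Optional

# Assumed cut-off (rule of thumb, not a hard limit): below this identity
# to a sequence of known structure, homology modeling is not attempted.
HOMOLOGY_MIN_IDENTITY = 30.0


def choose_methods(best_pdb_identity: Optional[float]) -> List[str]:
    """Pick prediction methods for one domain.

    best_pdb_identity is the percent identity of the best hit against
    sequences of known structure, or None if no hit was found.
    """
    if best_pdb_identity is not None and best_pdb_identity >= HOMOLOGY_MIN_IDENTITY:
        return ["3D modeling by homology"]
    return ["2D prediction", "3D ab initio", "3D threading"]


for identity in (45.0, 12.0, None):
    print(identity, "->", choose_methods(identity))
```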
3D structure prediction
Approaches
Class 1: need similarity to a known structure
• Comparative modeling
• Threading
Class 2: need sequence only
• Ab initio
3D structure prediction
Approaches
Comparative modeling (30% to 50% identity)
• Search for sequences of known structure similar to the target
• Model from template core regions and from loops and side chains of structures that might be unrelated
• Atom coordinates from conserved residues
• Optimize distance constraints derived from the sequence-template alignment
• Extra methods for loops, turns, and side chains
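The identity ranges quoted for comparative modeling refer to the alignment between target and template. A minimal sketch of how percent identity can be computed from a pairwise alignment is given here. It assumes identity is counted over aligned (non-gap) positions only; other definitions divide by the alignment length or the shorter sequence length and give different numbers. The function name and example sequences are made up for illustration.

```python
def percent_identity(aligned_target: str, aligned_template: str, gap: str = "-") -> float:
    """Percent identity over positions where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    aligned_pairs = [
        (a, b)
        for a, b in zip(aligned_target, aligned_template)
        if a != gap and b != gap
    ]
    if not aligned_pairs:
        return 0.0
    matches = sum(1 for a, b in aligned_pairs if a == b)
    return 100.0 * matches / len(aligned_pairs)


# Illustrative alignment (not real data).
target = "MKT-AYIAKQR"
template = "MKSLAY-AKHR"
print(f"{percent_identity(target, template):.1f}% identity")
```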
3D structure prediction
Approaches
Threading (identity can be lower than 30%)
• Thread the target sequence through a library of known folds
• Select the right fold based on energy considerations
• Higher computational cost, but detects more distant relationships
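The idea of threading a sequence through a fold library and picking the lowest-energy fold can be illustrated with a toy example. Everything here is an assumption made for illustration: folds are reduced to strings of buried (B) and exposed (E) positions, the energy only rewards hydrophobic residues in buried positions and polar residues in exposed ones, and no gaps are allowed. Real threading programs use far richer potentials and alignments.

```python
from typing import Dict, Tuple

HYDROPHOBIC = set("AVLIMFWC")  # assumed, simplified hydrophobic alphabet


def position_energy(residue: str, environment: str) -> int:
    """Toy energy: -1 if the residue type suits the environment, +1 otherwise."""
    hydrophobic = residue in HYDROPHOBIC
    buried = environment == "B"
    return -1 if hydrophobic == buried else 1


def thread_energy(sequence: str, fold: str) -> int:
    """Lowest energy over all ungapped placements of the sequence on the fold."""
    if len(sequence) > len(fold):
        raise ValueError("sequence is longer than the fold")
    return min(
        sum(position_energy(r, e) for r, e in zip(sequence, fold[offset:]))
        for offset in range(len(fold) - len(sequence) + 1)
    )


def best_fold(sequence: str, library: Dict[str, str]) -> Tuple[str, int]:
    """Return (fold name, energy) for the lowest-energy fold that fits."""
    scores = {
        name: thread_energy(sequence, fold)
        for name, fold in library.items()
        if len(fold) >= len(sequence)
    }
    if not scores:
        raise ValueError("no fold in the library is long enough")
    name = min(scores, key=scores.get)
    return name, scores[name]


# Hypothetical two-fold library and query sequence.
library = {"fold_A": "EBBEEBBEEB", "fold_B": "BEBEBEBEBE"}
print(best_fold("KLVDEAIKRL", library))
```

The extra cost mentioned on the slide shows up even in this toy: every sequence has to be scored against every fold and every placement, rather than against a single template.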
3D structure prediction
Approaches
Ab initio
• Explore conformational space
• Limit the number of atoms
• Break the problem into fragments of sequence
• Optimize hydrophobic residue burial and pairing of beta-strands
• Limited success
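Breaking the problem into fragments of sequence can be pictured as sliding a fixed-size window along the target. The sketch below only shows that splitting step; the window size of 9 residues and the function name are assumptions chosen for illustration, and the conformational search that follows is not shown.

```python
from typing import List, Tuple


def sequence_fragments(sequence: str, size: int = 9, step: int = 1) -> List[Tuple[int, str]]:
    """Return (start index, fragment) for every window of `size` residues."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    return [
        (start, sequence[start:start + size])
        for start in range(0, len(sequence) - size + 1, step)
    ]


# Illustrative sequence (not real data).
for start, fragment in sequence_fragments("MKTAYIAKQRQISFVK"):
    print(start, fragment)
```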
Relation between sequence identity and accuracy/applications
From: Baker and Sali (2001) Science
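3D structure prediction
Applications: target design
[Figure: a query sequence (residues G, K) is similar to a sequence of known 3D structure (residues L, G); the catalytic center of the known 3D structure contains Leu and Gly, and the 3D model built by homology shows Gly and Lys+ in their place]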
3D structure prediction
Applications: fit to low resolution 3D
Query sequence 1
low resolution 3D (electron microscopy)
Query sequence 2
3D structure prediction
GenTHREADER
David Jones http://bioinf.cs.ucl.ac.uk/psipred/
Input sequence
Relatively quick, 5 minutes
GenTHREADER Jones (1999) J Mol Biol
Output GenTHREADER
3D structure prediction
Phyre
http://www.sbg.bio.ic.ac.uk/phyre2/
Kelley et al (2000) J Mol Biol
Kelley and Sternberg (2009) Nature Protocols
3D structure prediction
I-Tasser
Jeffrey Skolnick: Tasser
Yang Zhang: I-Tasser
Lee and Skolnick (2008) Biophysical Journal
Roy et al (2010) Nature Methods
Threading
Folds 66% of sequences <200 aa long with low homology to PDB
Just submit your sequence and wait… (some days)
Outputs are predicted structures (PDB format)
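Since the predicted structures come back as PDB files, a short sketch of how to read the C-alpha coordinates from such a file may help when inspecting the models. It relies on the fixed-column layout of ATOM records in the PDB format; the function name is hypothetical and the coordinates in the example are made up.

```python
from typing import List, Tuple


def read_ca_atoms(pdb_text: str) -> List[Tuple[str, int, float, float, float]]:
    """Extract (residue name, residue number, x, y, z) for every C-alpha atom."""
    atoms = []
    for line in pdb_text.splitlines():
        # ATOM records use fixed columns; the atom name is in columns 13-16.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            atoms.append((
                line[17:20].strip(),  # residue name (columns 18-20)
                int(line[22:26]),     # residue number (columns 23-26)
                float(line[30:38]),   # x (columns 31-38)
                float(line[38:46]),   # y (columns 39-46)
                float(line[46:54]),   # z (columns 47-54)
            ))
    return atoms


# Illustrative records with made-up coordinates.
example = "\n".join([
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38",
    "ATOM      9  CA  LYS A   2      25.112  24.880   5.612  1.00 11.02",
])
for atom in read_ca_atoms(example):
    print(atom)
```

For anything beyond a quick look, an established PDB parser is a safer choice than slicing columns by hand.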
Roy et al (2010) Nature Methods
http://zhanglab.ccmb.med.umich.edu/I-TASSER/
3D structure prediction
QUARK
http://zhanglab.ccmb.med.umich.edu/QUARK/
3D structure prediction
MODbase
Andrej Sali http://modbase.compbio.ucsf.edu/
Pieper et al (2011) Nucleic Acids Research