Several SER Structures, Strategies, Surfaces, and Such.
The Derewenda Lab
University of Virginia
Earth Day, 2008.
Sponsored by the letter S.
Proteins crystallized in our group by the surface engineering approach, with solved crystal structures (as of March 2008)
The RGSL domain of PDZRhoGEF (Longenecker KL, et al. & Derewenda ZS. Structure, 2001, 9:559-69)
The LcrV antigen of the plague-causing bacterium Yersinia pestis (Derewenda, U. et al. & Waugh, D.S. Structure, 2001, 9:559-69)
Product of the YkoF B. subtilis gene (Devedjiev, Y. et al. & Derewenda, Z.S. J Mol Biol. 2004, 343:395-406)
Product of the YdeN B. subtilis gene (Janda, I. et al. & Derewenda, Z.S. Acta Crystallogr 2004, D60: 1101-1107)
Product of the Hsp33 B. subtilis gene (Janda, I. et al. & Derewenda, Z.S. Structure 2004, 12:1901-1907)
The product of the YkuD B. subtilis gene (Bielnicki, J. et al. & Derewenda, Z.S. Proteins, 2006, 62:144-51)
The Ohr protein of B. subtilis (Cooper, D. et al. & Derewenda, Z.S. Acta Cryst 2007, D63:1269-1273)
The N-DCX domain of human doublecortin (Cierpicki, et al. & Derewenda, Z.S. Proteins; 2006:D64:874-882)
The p23-like domain of the human nuclear migration NudC protein (Zheng, M. et al. & Derewenda, Z.S. in preparation)
APC 1446 Bacillus subtilis (Derewenda, U. et al. & Derewenda, Z.S. in preparation)
DinB Bacillus subtilis (Cooper, D.R. et al. & Derewenda, Z.S. in preparation)
Tm0439 – VanR family transcription factor (Zheng, M. et al. & Derewenda, Z.S. in preparation)
TM1865 – endonuclease V (Utepbergenov, D. et al. & Derewenda, Z.S. in preparation)
Tm0260 – Phosphate transport regulator (Zheng, M. et al. & Derewenda, Z.S. in preparation)
Tm1382 – NUDIX hydrolase (possible MutT family member) (Choi, W.C., et al. & Derewenda, Z.S. in preparation)
Publications by other groups reporting crystallization of novel proteins (green), or preparations of higher quality crystal forms (red) of proteins previously crystallized, by the SER method (as of March 2008)
The CUE:ubiquitin complex (Prag G et al., & Hurley JH, Cell. 2003, 113:609-20)
Unactivated insulin-like growth factor-1 receptor kinase (Munshi, S. et al. & Kuo, L.C. Acta Cryst. 2003, D59:1725-1730)
Human choline acetyltransferase (Kim, A-R., et al. & Shilton, B. H. Acta Cryst. 2005, D61, 1306-1310)
Activated factor XI in complex with benzamidine (Jin, L., et al. & Strickler, J.E. Acta Cryst. 2005, D61, 1418-1425)
Axon guidance protein MICAL (Nadella, M., et al. & Amzel, M.L. PNAS, 2005, 102, 16830-16835)
Functionally intact Hsc70 chaperone (Jiang, J., et al. & Sousa, R. Molecular Cell, 2005, 20, 513-524)
EscJ protein from the Type III secretion system (Yip, C.K., et al. & Strynadka, N.C.J. Nature, 435: 702-707)
L-rhamnulose kinase from E. coli (Grueninger D, & Schultz, G.E. J. Mol. Biol., 2006, 359, 787-797)
T4 vertex gp24 protein (Boeshans, K.M., et al. & Ahvazi, B. Protein Expr. Purif., 2006, 49, 235-243)
Borrelia burgdorferi outer surface protein A (Makabe, K., et al. & Koide, S. Protein Science, 2006, 15, 1907-1914)
SH2 domain from the SH2-B murine adapter protein (Hu, J., & Hubbard, S.R. J. Mol. Biol., 2006, 361, 69-79)
Mycoplasma arthritidis-derived mitogen (Guo, Y., et al., & Li, H. J., Acta Cryst. 2006, F62, 238-241)
KChIP1 – Kv4.3 T1 complex (Pioletti, M., et al. & Minor, D. L., Nature Str & Mol Bio. 2006, 13: 988-995)
Kinase domain of serum and glucocorticoid-regulated kinase 1 in complex with AMP-PNP (R126A) (Zhao, B., et al. & Schackenberg, C.G., Protein Science, 2007, 16, 2761-2769)
Human IL-7 bound to unglycosylated and glycosylated forms of its receptor (Wickham, J. Jr. and Walsh, S.T.R., Acta Crystallographica, 2007, F63, 865-869)
Human cyclin B1 (C167S, C283S, C350S, E183A, E184A) (Petri, E.T., et al. & Basavappa, R. Cell Cycle, 2007, 6: 1342-1349)
Candida boidinii formate dehydrogenase (Schirwitz, K., Schmidt, A. & Lamzin, V.S. Protein Science, 2007, 16: 1146-1156)
EpsI/EpsJ complex (Yanez, M.E., et al., Hol, W.G.J. J. Mol. Biol., 2008, 375:471-486)
Periplasmic domain of E. coli YidC (Paetzel, M & Oliver, D.C. J. Biol. Chem., 2008, 283:5208-5216)
β-ketoacyl acyl carrier protein from Streptococcus pneumoniae (FabF) (Parthasarathy, G. et al., & Soisson, Stephen, M. 2008, Acta Crystallographica, D64:141-148)

Our Current SER strategy
Target evaluation and selection
  See the slides after the acknowledgements for information on:
    PSI Structural Genomics Knowledgebase (http://kb.psi-structuralgenomics.org/KB/)
    DisMeta (a disorder meta-server)
    XtalPred
Expression of wild type, taken through to crystallization trials
  Purification is performed on a chromatography system with gradient elution to determine the optimal washing concentration of imidazole.
  We work with WT crystals for ~2 months before undertaking mutagenesis.
Mutation site and replacement residue selection
  We use the SERp server and take the three best sites. We make Ala and Tyr variants for each of the top 3 clusters.
QuikChange mutagenesis
  We make them all at once.
Purification, crystallization
  We use gravity columns and wash with the imidazole concentration determined for the wild type protein.
  Some lab members like to purify all 6 at once; others like to purify the 1A and 1Y variants first.
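The variant bookkeeping above (three sites, Ala and Tyr at each, six constructs) can be sketched in a few lines of Python. This is only an illustration: the function name `ser_variants` is hypothetical, the label scheme ("1A", "1Y", ...) is inferred from the variant names used in this deck, and the site list is the one quoted for Tm1865.

```python
def ser_variants(sites, replacements=("A", "Y")):
    """Return {label: [point mutations]} for every site/replacement pair.

    sites        -- list of residue lists, ranked best first (e.g. from SERp)
    replacements -- one-letter codes of the replacement residues
    """
    variants = {}
    for rank, residues in enumerate(sites, start=1):
        for new in replacements:
            # every residue in the cluster is mutated to the same residue
            variants[f"{rank}{new}"] = [f"{res}{new}" for res in residues]
    return variants


# Sites quoted for Tm1865 in this deck
tm1865_sites = [["K49", "E50", "E51"], ["K173", "E174"], ["K25", "K26", "K28"]]
variants = ser_variants(tm1865_sites)

for label in sorted(variants):
    print(label, "/".join(variants[label]))
```

Purifying "the 1A and 1Y variants first" then simply means starting with the two entries whose label begins with 1.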

Tm1865
Site 1) K49, E50, E51   Site 2) K173, E174   Site 3) K25, K26, K28
MW              25.5 kDa
# of Residues   225
pI              8.93
Gravy Index     -0.21
# of Mets       4
Endonuclease V (TM1865) is a DNA repair enzyme. It cleaves the second phosphodiester bond (in 5’ direction) from a deaminated base.
Recognizes an unusually broad range of irregularities in the DNA structure: hairpins, unpaired/mispaired bases, deaminated residues, abasic sites, etc.
ATGCxTGCTACGTACG
• Found throughout nature – homologs in human, bacteria, archaea
• Structure unknown, function is believed to be DNA repair
• However, E. coli deficient in EndoV are generally normal and resistant to mutagens (except nitrosating agents). The enzyme is important for the resistance of E. coli to mutagenesis during nitrate/nitrite respiration.
• Enzyme is used for mutagenesis and for high throughput detection of mutations in clinical samples
• E. coli enzyme commercially available from NEB
• Thermotoga enzyme commercially available from Fermentas
TM1865 – crystallization, structure solution
Purifies and crystallizes easily as a wild type, no need to apply SER
Crystals of SeMet derivative were obtained directly from the JCSG screen (24% PEG1500, 20% glycerol) using 1.5 M NaCl in reservoir.
P2₁2₁2₁, a=69.27, b=71.37, c=119.78
Scaled at 2.7 Å
3 molecules per ASU, solution from Shelx, model with Solve/Resolve and O.
Current R-factor 18% (Rfree – 29%); further refinement is still necessary
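As a plausibility check on "3 molecules per ASU", the Matthews coefficient can be estimated from the cell above. This is a sketch under stated assumptions: 25.5 kDa per chain (from the Tm1865 parameter table), Z = 4 asymmetric units per cell for P2₁2₁2₁, and the usual 1.23/Vm approximation for solvent content. The function names are illustrative.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit-cell volume in cubic angstroms (general triclinic formula)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews(volume, z, n_per_asu, mw_da):
    """Return (Vm in A^3/Da, approximate solvent fraction)."""
    vm = volume / (z * n_per_asu * mw_da)
    solvent = 1.0 - 1.23 / vm  # standard approximation for proteins
    return vm, solvent


# TM1865: P2(1)2(1)2(1), cell from the slide; Z = 4 and 25.5 kDa are assumptions
volume = cell_volume(69.27, 71.37, 119.78)
vm, solvent = matthews(volume, z=4, n_per_asu=3, mw_da=25500)
print(f"Vm = {vm:.2f} A^3/Da, solvent ~ {solvent:.0%}")
```

With these inputs Vm comes out a little under 2 Å³/Da, which is on the tightly packed side but within the range commonly seen for protein crystals, so three copies per ASU is at least consistent with the cell.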
TM1865 – overall structure
Asymmetric trimer Monomer
TM1865 belongs to the RNaseHI superfamily.
RNaseHI overall structure: Structure of catalytic center:
Catalytic site consists of 3-5 residues coordinating two metal ions (Mg or Mn). Metals are known to be crucial for catalysis: one is believed to lower the pKa of the attacking nucleophile (water), the other is believed to stabilize the negative charge on the pentacovalent intermediate that is formed.
RNaseHI fold family – proteins in PDB with RNaseHI-like fold
RNaseHI - cleaves RNA strand if it is in duplex with DNA
UvrC – major part of bacterial DNA repair system. Recognizes irregularities in the DNA structure
RuvC – Holliday junction resolvase
Retroviral Integrase – integrates viral genome into host’s DNA
Argonaute – Important players in RNA interference
Transposase – incorporates DNA fragments into another DNA
Mitochondrial Resolvase
RNaseHII - cleaves RNA strand if it is in duplex with DNA
All these proteins cleave DNA or RNA strands to perform their function
Closest homologs in PDB
2nrt (magenta): subdomain of UvrC protein from TM. Uvr is a major DNA repair system in bacteria.
2dqe: protein with unknown function. UPF0125 proteins are found in some organisms living in extreme conditions.
Active sites of TM1865 (yellow) and UvrC (gray) seem to be identical
Tm1865 Conclusions
Endonuclease V belongs to the RNase H superfamily of proteins
There are no structures of Endonuclease V in the PDB, but 2 recent structures have a similar fold; there are more similar structures known within the RNase H superfamily.
Catalytic sites of UvrC and Endonuclease V are identical

Tm0439
Site 1) E188, K119, K122   Site 2) K2, K3   Site 3) E30, K31
MW              25.0 kDa
# of Residues   214
pI              5.36
Gravy Index     -0.38
# of Mets

Rigali, S. et al. J. Biol. Chem. 2002;277:12507-12515
Unrooted tree of the proteins of the GntR family
HTH motif
Effector binding domain
Four subfamilies: FadR, HutC, MocR, and YtrA.
FadR subfamily: FadR and VanR
FadR 1st, regroups 40%
All helical C-terminal domain
7 or 6 helices
VanR-like regulators, 170 aa and 150 aa
Regulation of oxidized substrates

Data collection statistics (values in parentheses are for the highest bin)
                    1 (inflection)    2 (peak)          3 (remote)
Wavelength (Å)      0.97980           0.97960           0.95370
Space group         C2                C2                C2
Unit cell           a=85.09 Å, b=72.72 Å, c=43.32 Å, α=90º, β=104.6º, γ=90º
Resolution (Å)      41.92-2.10        41.92-2.10        41.92-2.10
Highest bin (Å)     2.18-2.10         2.18-2.10         2.18-2.10
Redundancy          6.5 (3.6)         6.8 (4.2)         7.0 (5.1)
Completeness (%)    81.7 (27.4)       94.5 (64.9)       97.8 (84.2)
Rmerge              0.63 (0.358)      0.053 (0.286)     0.054 (0.209)
I/σ(I)              31.2 (2.5)        42.4 (3.4)        12.3 (1.3)

Refinement statistics
Resolution (Å)                                41.92-2.2
Reflections (working)                         11966
Reflections (test)                            620
Rfree test (%)                                4.9
R (%)                                         17.7
Rfree (%)                                     24.1
Number of waters                              160
Number of molecules in the asymmetric unit    1
r.m.s. deviations
  Bonds (Å)                                   0.010
  Angles (º)                                  1.120
SERp
Crystal

Crystal contact of Tm0439
Figure: crystal contact between two molecules (N and C termini labelled), with residues 130A, 131A and 134A marked.
Wild type: poor crystals
Mutant: E130, K131, K134 → AAA (1A), good quality

DNA-binding domain of Tm0439
Figure: Tm0439, 2HS5 and 1E2X side by side with helices α1, α2 and α3 labelled; residue labels S7, D19, T25, A33, V46, E54, D58, E70, N76, D78, D85 and V91.
An HTH motif: α2 and α3, tight turn
Superimpose: conserved secondary structure element, HTH motif: Tm0439: V46-E70, 2HS5: E54-D78, 1E2X: A33-D58
β1-β2 loops: equal length and conformation

Stereo model of Tm0439-DNA complex
Figure: stereo pair with elements 1, 2 and 3 labelled.
Putative DNA contacts: 4 distinct regions
1: At the N-terminus, side chains of V18, L19 and V21; M13-E17 could not be seen
2: At the beginning of helix α2, V46 and R47
3: α3, major groove, residues S56, F57, T58, P59 and R61
4: At the tip of the β1-β2 hairpin, P78 and R79
The proposed Tm0439-DNA binding mode

Effector-binding domain of Tm0439
Figure: 1E2X, 2HS5 and Tm0439 with helices α4-α9 labelled; residue labels 86 and 226.
C-terminal domain: 6 α-helices (α4-α9) with short connecting loops form a bundle
1E2X has 7 helices
2HS5 has 6 helices
All helical bundles superimposed together

The putative switch mechanism of Tm0439
Figure: Tm0439 and FadR monomers, and Tm0439 and FadR dimers; helices α4-α9, the cavity, and the N and C termini are labelled.

Tm1382
Site 1) K158, E159, K160   Site 2) K77, Q78, E80   Site 3) E47, E49
MW              22.9 kDa
# of Residues   199
pI              4.98
Gravy Index     -0.31
# of Mets       4

Nudix Hydrolase Superfamily
Pyrophosphohydrolases that act upon a Nucleoside DIphosphate connected to another moiety (X)
Such substrates include (d)NTPs (both canonical and oxidised derivatives), nucleotide sugars and alcohols, dinucleoside polyphosphates (NpnN), dinucleotide coenzymes and capped RNAs.
The substrate diversity requires equally diverse chemistries.
Tm1382 is classified as a MutT hydrolase by the JCSG, but it is 50% larger than most members of the family.
Consensus Nudix Sequence: Gx5Ex5[UA]xREx2EExGU
Tm1382 Sequence: Gx4Ex5LxREx2EExDV
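The two motif strings can be turned into regular expressions and searched against the Tm1382 wild-type sequence shown on the alignment slide. This is a minimal sketch: the helper `motif_to_regex` is hypothetical, and reading U as a bulky aliphatic residue (Ile, Leu or Val) follows the usual Nudix-box convention rather than anything stated on the slide.

```python
import re

BULKY = "ILV"  # assumption: U = bulky aliphatic residue (Ile, Leu, Val)


def motif_to_regex(motif):
    """Translate shorthand such as 'Gx5Ex5[UA]xRE' into a regex string."""
    parts = []
    for tok in re.findall(r"\[[A-Z]+\]|x\d*|[A-Z]", motif):
        if tok.startswith("x"):            # x or xN: any N residues
            n = int(tok[1:] or 1)
            parts.append("." if n == 1 else ".{%d}" % n)
        elif tok.startswith("["):          # alternatives, expanding U
            parts.append("[" + tok[1:-1].replace("U", BULKY) + "]")
        elif tok == "U":
            parts.append("[" + BULKY + "]")
        else:                              # literal residue
            parts.append(tok)
    return "".join(parts)


# Tm1382 wild-type sequence as given on the alignment slide (199 residues)
TM1382 = (
    "MKSERILVVKTEDFLKEFGEFEGFMRVNFEDFLNFLDQYGFFRERDEAEYDETTKQVIPY"
    "VVIMDGDRVLITKRTTKQSEKRLHNLYSLGIGGHVREGDGATPREAFLKGLEREVNEEVD"
    "VSLRELEFLGLINSSTTEVSRVHLGALFLGRGKFFSVKEKDLFEWELIKLEELEKFSGVM"
    "EGWSKISAAVLLNLFLTQN"
)

consensus = re.search(motif_to_regex("Gx5Ex5[UA]xREx2EExGU"), TM1382)
variant = re.search(motif_to_regex("Gx4Ex5LxREx2EExDV"), TM1382)

print("consensus match:", consensus)
if variant:
    # report 1-based residue numbers
    print("Tm1382 motif:", variant.group(), variant.start() + 1, "-", variant.end())
```

The strict consensus does not match this sequence, while the Tm1382 variant of the motif does, which is the point the slide makes by listing the two strings side by side.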

Tm1382
Spacegroup          P2₁
Cell                a=47.6 b=62.65 c=74.5 β=98.5
Resolution (Å)      40 – 2.3 (2.38-2.30)
Completeness (%)    90.8 (67.3)
Rsym (%)            7.8 (26.5)
Average I/σ         21.6 (3.19)
Current Working Model

Some parts are missing
tm1382-wt   MKSERILVVKTEDFLKEFGEFEGFMRVNFEDFLNFLDQYGFFRERDEAEYDETTKQVIPY 60
working-chA --GGG---GGGGGFLKEFGEFEGFMRVNFEDFLNFLDQYGFFRERDEAEYDETTKQVIPY 55
working-chB -----ILVVKTEDFLKEFGEFEGFMRVNFEDFLNFLDQYGFFRERDEAEYDETTKQVIPY 55
                        .***********************************************

tm1382-wt   VVIMDGDRVLITKRTTKQSEKRLHNLYSLGIGGHVREGDGATPREAFLKGLEREVNEEVD 120
working-chA VVIMDGDRVLITK-------------YSLGIGGHVRR-------EAFLKGLEREVNEEVD 95
working-chB VVIMDGDRVLIT--------------YSLGIGGHVRE------REAFLKGLEREVNEEVD 95
            ************              **********.       ****************

tm1382-wt   VSLRELEFLGLINSSTTEVSRVHLGALFLGRGKFFSVKEKDLFEWELIKLEELEKFSGVM 180
working-chA VGGGGGGFLGLINSSTTEVSRVHLGALFLGRGKFFSVGGGGG------GGGGGGGFSGVM 149
working-chB VSLRELEFLGLINSSTTEVSRVHLGALFLGRGKFFSVGGGGG------GGGGGGGFSGVM 149
            *.     ******************************   .              *****

tm1382-wt   EGWSKISAAVLLNLFLTQN 199
working-chA EGWSKISAAVLAG---GGG 165
working-chB EGWSKISAAVLL------- 161
            ***********

Gx4Ex5LxREx2EExDV

Some Distant Homologues (Top Dali Hits)
1htz
1hx3
2fkb
ModBase Model: found on the PSI Knowledgebase

Tm1679
Site 1) K159, E160   Site 2) K78, E79   Site 3) K100, K101
MW              28.5 kDa
# of Residues   255
pI              5.95
Gravy Index     -0.25
# of Mets       4

Tm1679
We thought there was no viable MR model (see below), but thanks to the PSI Structural Genomics Knowledgebase, we have the structure. (http://kb.psi-structuralgenomics.org/KB/)
2p4z, 35% identity
RFZ=7.3 TFZ=8.8 PAK=0 LLG=74 LLG=74
The Surface problem
“In accordance with the assumption that solvent exposure of a residue is directly related to its probability of forming random contacts, accessible surface area might be used as the basis of a reference state to compute the number of random contacts expected.” (Dasgupta 1997)
surface = sum over all atoms.
85% of residues have ASA > 0
Figure labels: ASA, VdW, contacts
Selection is futile
Area-based comparisons are almost as bad as number based.
No ASA or rASA threshold will fix different distributions
Figure labels: Leu, Lys
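The ASA-based reference state in the Dasgupta quote amounts to distributing the observed contacts in proportion to exposed area. A minimal sketch follows; the function names are illustrative and the Leu/Lys numbers are made up purely to show the arithmetic, they are not measurements from this work.

```python
def expected_contacts(asa_by_type, total_contacts):
    """Expected random contacts per residue type, proportional to its ASA."""
    total_asa = sum(asa_by_type.values())
    return {res: total_contacts * asa / total_asa
            for res, asa in asa_by_type.items()}


def contact_propensity(observed, expected):
    """Observed/expected ratio; > 1 means more contacts than the reference."""
    return {res: observed[res] / expected[res]
            for res in expected if expected[res] > 0}


# Hypothetical totals (summed ASA in A^2 and contact counts), for illustration
asa = {"Leu": 200.0, "Lys": 600.0}
observed = {"Leu": 30, "Lys": 50}

expected = expected_contacts(asa, sum(observed.values()))
propensity = contact_propensity(observed, expected)
print(expected)
print(propensity)
```

The slide's caveat still applies: if Leu and Lys have very differently shaped ASA distributions, a single area-based ratio like this hides that difference, which is why no ASA or rASA cut-off rescues the comparison.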
Patch analysis of crystal contacts
Jones & Thornton introduced a patch methodology to analyse properties of biologically relevant interfaces on the protein surface.
The major problems are:
  defining a single contact (interface):
    coordination number (only binary)
    clustering (artifacts)
  sampling the surface:
    make random interfaces
Spherical protein approximation
coordinate system and distance measure:
  x,y → r,φ
  r,φ → φ
  in 3D:
    - three (0,2π) angles, one for each axis
    - + r, the radius
  Pros:
    - easy to cluster!
    - with r, mahalanobis
  do we need r?
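To make the "do we need r?" question concrete, here is a small sketch of a distance measure that ignores the radius entirely. Note the assumptions: it uses the conventional two-angle spherical parametrization (polar θ, azimuth φ) rather than the three-angle scheme on the slide, and coordinates are taken relative to the protein centroid. Function names are illustrative.

```python
import math


def to_spherical(x, y, z):
    """Cartesian (relative to the centroid) -> (r, theta, phi)."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r else 0.0   # polar angle, 0..pi
    phi = math.atan2(y, x) % (2 * math.pi)   # azimuth, 0..2*pi
    return r, theta, phi


def angular_distance(p, q):
    """Great-circle angle between two (r, theta, phi) points; r is ignored."""
    _, t1, p1 = p
    _, t2, p2 = q
    cos_angle = (math.cos(t1) * math.cos(t2)
                 + math.sin(t1) * math.sin(t2) * math.cos(p1 - p2))
    # clamp against rounding error before taking acos
    return math.acos(max(-1.0, min(1.0, cos_angle)))


a = to_spherical(1.0, 0.0, 0.0)
b = to_spherical(0.0, 1.0, 0.0)
c = to_spherical(5.0, 0.0, 0.0)   # same direction as a, different radius
print(angular_distance(a, b), angular_distance(a, c))
```

Under this measure two atoms on the same ray from the centroid are at distance zero regardless of depth, which is exactly the information that is lost if r is dropped.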
“Space is the place” (Sun Ra)
We need to measure the distance between atoms to make continuous patches on the surface:
  the coordinate space affects sampling frequency, possibly introducing bias.
zenpdb
getting information from pdb files
robust ... workflow based ... scalable
object oriented
outsourcing:
  Areaimol, Ncont/Act, Stride, MSMS
  numpy/scipy (k-means clustering)
  scipy-cluster (hierarchical clustering)
  Bio.KDTree (NN distance look-up)
  scikits.ANN (NN k look-up)
  CGAL, CGAL-python (voronoi)
  PyTables (bindings for hdf5)
The noble 8-fold path:
from zenpdb import *

file_name = 'some_pdb_file'
# parse the PDB file into a structure object
parser = PDBParser(forgive=1)
parser.set_file(file_name)
structure = parser.get_structure(file_name[0:4])
# annotate the structure with atom contacts
ACTAtomContacts(file_name, structure)
# collect residues, then keep those carrying a 'CNT_ACT_X' annotation
residues = einput(structure, 'R')
r_x = residues._select_children({}, 'gt', \
    'CNT_ACT_X', xtra=True).values()
# hierarchical clustering of those residues into at most 6 clusters
HierarchicalResidueClusters(r_x, dmethod='mahalanobis',
    lmethod='average', criterion='maxclust', t=6)
# write a new PDB file carrying the per-residue 'H_CLUST' labels
BeQu('new_pdb_file.pdb', structure, 'R', 'H_CLUST')
http://code.google.com/p/zenpdb/

Structures Around the Corner (need phasing power)
Tm0260
  Several data sets diffracting to ~2.2 Å (R32)
  Should have 8 Seleniums in the ASU
  MR encouraging
Tm1024
  Lots of beautiful crystals
  Several data sets to ~2.4 Å of 1A and 1Y mutants
  Only 1 Methionine.
  Creating several L->M mutations
  Creating the 1M Mutant (K45M, K46M)

Tm0260
Putative phosphate regulatory protein
Site 1) K153, E154, K155   Site 2) E10, E11   Site 3) E78, K79
MW              25.8 kDa
# of Residues   222
pI              5.05
Gravy Index     -0.36
# of Mets       8

MR encouraging, but…
The closest model (2iiu) is only 16% identical and is symmetrical. Long helices can be seen, but there are no side chain features and the ends are ambiguous.

UVA: Zygmunt Derewenda, Jakub Bielnicki, Marvin Cieslik, WonChan Choi, David Cooper, Ulla Derewenda, Monika Kijanska, Natalya Olekhnovich, Darkhan Utepbergenov, Jennifer Wingard, Meiying Zheng
Tomek Boczek, Kasia Grelewska, Gosia Pinkowska, Michal Zawadzki, Eliza Zylkiewic
Los Alamos Nat’l Lab: Tom Terwilliger, Chang Yub Kim
UCLA: David Eisenberg, Luki Goldschmidt, Tom Holton
Lawrence Berkeley Nat’l Lab: Li-Wei Hung, Minmin Yu (Big Thanks), Jeff Habel
And ALL ISFI members!
The ISFI is funded by NIH U54 GM074946.
Several slides follow.

DisMeta – a NESG MetaServer
http://www-nmr.cabm.rutgers.edu/bioinformatics/disorder/
Queries up to 12 different disorder prediction servers.

http://kb.psi-structuralgenomics.org/KB/
Submit a sequence!

http://kb.psi-structuralgenomics.org/KB/
Click Here
To access these tabs

http://ffas.burnham.org/XtalPred-cgi/xtal.pl