Computational Protein Design
2. Computational Protein Design Techniques

Pablo Carbonell

iSSB, Institute of Systems and Synthetic Biology
Genopole, University d’Évry-Val d’Essonne, France

mSSB: December 2010

Pablo Carbonell (iSSB) Computational Protein Design mSSB: December 2010 1 / 45


Outline

1 Introduction

2 Computational Protein Descriptors

3 Sequence-based CPD

4 Structure-based CPD

5 Search Algorithms in CPD

6 De Novo Design

7 Challenges in Sequence and Structure-Based CPD


Computational Protein Design


A Blueprint of CPD Approaches

∗RS: research studies


Molecular Signature Descriptors

A 2D representation of the molecule as an undirected colored graph G(V, E, C), with V: atoms, E: bonds, C: atom types

The signature descriptor of height h of atom x in the molecular graph G, written ^hσ(x), is a canonical representation of the subgraph of G containing all atoms that are within distance h of x

Molecular signature (sum of the atomic signatures):

$$ {}^{h}\sigma(G) = \sum_{x \in V} {}^{h}\sigma(x) \tag{1} $$

The signature is a systematic codification of the molecular graph [Faulon et al., 2004]

σ(methylcyclopropane) =

1 [C]([H][C]([H][H][C,0])[C,0]([H][H])[C]([H][H][H]))
2 [C]([H][H][C]([H][C,0][C]([H][H][H]))[C,0]([H][H]))
1 [C]([H][H][H][C]([H][C]([H][H][C,0])[C,0]([H][H])))
1 [H]([C]([C]([H][H][C,0])[C,0]([H][H])[C]([H][H][H])))
4 [H]([C]([H][C]([H][C,0][C]([H][H][H]))[C,0]([H][H])))
3 [H]([C]([H][H][C]([H][C]([H][H][C,0])[C,0]([H][H]))))
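The idea of summing atomic signatures can be sketched in a few lines of Python. This is a simplified illustration, not the canonical algorithm of Faulon et al.: it expands a tree of neighbors up to height h and sorts the child strings, but it does not emit ring-closure labels such as [C,0]. The function names and the atom numbering for methylcyclopropane are assumptions made for this example.

```python
from collections import Counter

def atom_signature(graph, colors, x, h, parent=None):
    """Simplified height-h signature of atom x (no ring-closure labels)."""
    label = f"[{colors[x]}]"
    if h == 0:
        return label
    # Expand every neighbor except the atom we came from, then sort so the
    # string does not depend on the order in which neighbors are stored.
    children = sorted(
        atom_signature(graph, colors, y, h - 1, parent=x)
        for y in graph[x] if y != parent
    )
    return label + ("(" + "".join(children) + ")" if children else "")

def molecular_signature(graph, colors, h):
    """Sum of atomic signatures, stored as signature -> number of occurrences."""
    return Counter(atom_signature(graph, colors, x, h) for x in graph)

# Methylcyclopropane with a hypothetical numbering:
# C0 = methyl carbon, C1 = ring CH, C2 and C3 = ring CH2, atoms 4..11 = hydrogens.
bonds = [(0, 1), (1, 2), (1, 3), (2, 3),
         (0, 4), (0, 5), (0, 6), (1, 7), (2, 8), (2, 9), (3, 10), (3, 11)]
colors = {i: "C" for i in range(4)}
colors.update({i: "H" for i in range(4, 12)})
graph = {i: [] for i in colors}
for a, b in bonds:
    graph[a].append(b)
    graph[b].append(a)

signature_h1 = molecular_signature(graph, colors, 1)
for sig, count in sorted(signature_h1.items()):
    print(count, sig)
```

At height 1 the two ring CH2 carbons collapse onto the same string, which is why the signature is stored as counts rather than as a list of atoms.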


Molecular Signature of Reactions and Proteins

Signature of a reaction. The signature of reaction R

$$ S_1 + S_2 + \ldots + S_n \rightarrow P_1 + P_2 + \ldots + P_m \tag{2} $$

that transforms n substrates into m products is given by the difference between the signature of the products and the signature of the substrates:

$$ {}^{h}\sigma(R) = \sum_{p \in P} {}^{h}\sigma(p) - \sum_{s \in S} {}^{h}\sigma(s) \tag{3} $$

Signature of protein sequences. The protein P is represented by the linear chain given by its collapsed graph at residue level, a reduced molecular graph representation G(V, E, C) known as the string signature, where V: residues a ∈ A, E: contiguity in sequence, C: amino acid type

$$ {}^{h}\sigma(P) = \sum_{a \in A} {}^{h}\sigma(a) \tag{4} $$
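Both definitions reduce to arithmetic on signature counts, as the following sketch suggests. The signature keys ("a", "b", "c") are placeholders, and the string signature here simply records each residue together with its sequence neighbors within h positions, read in the N-to-C direction; that windowing choice is an assumption of this example rather than something stated on the slide.

```python
from collections import Counter

def reaction_signature(substrate_sigs, product_sigs):
    """Products minus substrates; entries that cancel out are dropped."""
    total = Counter()
    for sig in product_sigs:
        total.update(sig)
    for sig in substrate_sigs:
        total.subtract(sig)  # subtract() keeps negative counts
    return {key: count for key, count in total.items() if count != 0}

def string_signature(sequence, h):
    """Height-h signature of a linear chain of residues."""
    sig = Counter()
    for i, residue in enumerate(sequence):
        left = sequence[max(0, i - h):i]
        right = sequence[i + 1:i + 1 + h]
        sig[f"{left}[{residue}]{right}"] += 1
    return sig

# Toy reaction with placeholder atomic signatures.
substrates = [Counter({"a": 2, "b": 1})]
products = [Counter({"a": 1, "c": 1})]
print(reaction_signature(substrates, products))
print(string_signature("ACA", 1))
```

Negative counts in the reaction signature mark environments consumed by the reaction, positive counts mark environments it creates.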


Protein Contact Maps

The protein contact map is a graph representation of the 3D interactions at residue level, G(V, E, C), where V: residues, E: contacts, C: amino acid type

Two residues are considered to interact when atoms of the two residues are at a distance lower than a predetermined threshold (typically 4.5 to 5 Å)

Contact maps can account for long-range interactions and conformational states

Song et al. [2010]
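A minimal sketch of the contact criterion described above, assuming each residue is given as a list of atomic (x, y, z) coordinates in Å and using a 4.5 Å threshold on the closest pair of atoms. The function name and the toy coordinates are illustrative only.

```python
import math

def contact_map(residues, threshold=4.5):
    """Boolean residue-residue contact matrix from per-residue atom coordinates."""
    n = len(residues)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Two residues are in contact if their closest atoms are
            # nearer than the threshold.
            dmin = min(math.dist(a, b) for a in residues[i] for b in residues[j])
            contacts[i][j] = contacts[j][i] = dmin < threshold
    return contacts

toy_residues = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # residue 0, two atoms
    [(4.0, 0.0, 0.0)],                   # residue 1
    [(20.0, 0.0, 0.0)],                  # residue 2, far away
]
toy_map = contact_map(toy_residues)
```

In practice the coordinates would come from a PDB file, and sequence-adjacent residues are often excluded so that only non-trivial contacts remain.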


Sequence and Structure-Based CPD

Sequence-based CPD methods are in some cases a good trade-off between complexity of the model and accuracy of the predictions


Sequence-based Knowledge-based potentials

The simplest way to score a protein and to identify active regions is through amino acid scales or indexes

AAindex is a database of
  544 amino acid indexes
  94 amino acid matrices
  47 amino acid pair-wise contact potentials

Examples: hydrophobicity, accessibility, van der Waals volume, secondary structure propensity, flexibility

This approach is widely used when analyzing conserved motifs and correlated mutations in protein fold families through multiple alignments
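As an illustration of scoring with an amino acid scale, the sketch below averages the Kyte-Doolittle hydropathy index over a sliding window. The choice of this particular scale and of the window size is an assumption for the example; any index from AAindex could be substituted.

```python
# Kyte-Doolittle hydropathy index (one of many amino acid scales).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def scale_profile(sequence, scale=KYTE_DOOLITTLE, window=3):
    """Mean scale value in each window of consecutive residues."""
    values = [scale[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

profile = scale_profile("ILVDEK", window=3)
```

Windows with a high average point to hydrophobic stretches, while strongly negative windows point to charged or polar stretches.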


Quantitative Structure-Activity Relationship (QSAR) Techniques

QSAR is a statistical method used extensively by the chemical and pharmaceutical industries in small-molecule and peptide optimization

The goal is to model causal relationships between
  structures of interacting molecules
  measurable properties of scientific or commercial interest, such as ADME/Tox (absorption, distribution, metabolism, excretion, and toxicity) of drugs


QSAR Model Evaluation

Model predictability is generally evaluated through the leave-one-out (LOO) cross-validation correlation coefficient q²

Partial least-squares (PLS) regression is commonly used

Additional nonlinear terms can be added through nonlinear regression or machine learning techniques (kernel methods, random forests, etc.)
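The leave-one-out q² can be sketched as follows, assuming the common definition q² = 1 − PRESS / SS, where PRESS sums the squared errors of predictions made for each sample by a model fitted without it, and SS is the total sum of squares about the mean. For brevity the model is an ordinary least-squares line on a single descriptor rather than PLS; both simplifications are assumptions of this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_q2(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without sample i, then predict sample i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss

descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity_linear = [2.0 * x + 1.0 for x in descriptor]
activity_noisy = [3.1, 4.7, 7.4, 8.8, 11.2]
print(loo_q2(descriptor, activity_linear), loo_q2(descriptor, activity_noisy))
```

Unlike the fitted r², q² can become negative when the model predicts held-out samples worse than the mean does.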


QSAR Modeling Workflow


The ProSAR Algorithm

An extension of SAR-based approaches to CPD

It formalizes the decision-making processes about which mutations to include incombinatorial libraries

$$ y = \sum_{i=1}^{N} \sum_{j \in A} c_{ij} x_{ij} \tag{5} $$

y: the predicted function (activity) of the protein sequence
c_ij: the regression coefficient corresponding to the mutational effect of having residue j, among the 20 amino acids A, at position i
x_ij: binary variable indicating the presence or absence of residue j at position i
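Because x_ij is a binary indicator, evaluating equation (5) for a given sequence amounts to looking up one coefficient per position. The sketch below assumes the coefficients are stored in a dictionary keyed by (position, residue), with 0-based positions and missing entries treated as zero; the coefficient values are invented for illustration.

```python
def prosar_predict(sequence, coefficients):
    """Additive prediction: sum of c_ij over the residues present in the sequence."""
    return sum(coefficients.get((i, aa), 0.0) for i, aa in enumerate(sequence))

def rank_mutations(coefficients):
    """Order (position, residue) pairs from most beneficial to most deleterious."""
    return sorted(coefficients, key=coefficients.get, reverse=True)

# Hypothetical regression coefficients for a two-position library.
toy_coefficients = {(0, "A"): 0.5, (1, "G"): -0.2, (1, "L"): 0.3}
print(prosar_predict("AL", toy_coefficients))
print(rank_mutations(toy_coefficients))
```

Ranking the coefficients is one way to formalize which mutations to carry into the next combinatorial library: strongly positive ones are kept, strongly negative ones are dropped.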


Improving Catalytic Function by ProSAR-driven Enzyme Evolution

Codexis Inc.

Statistical analysis of protein sequence-activity relationships

Bacterial biocatalysis of Atorvastatin (Lipitor), a cholesterol-lowering drug


Structure-based CPD

Energy functions and molecular force fields

Local conformational restrictions

Predicting entropic factors

Protein topological properties

From Narasimhan et al. [2010]


Energy Functions and Molecular Force Fields

In structure-based CPD, folds are usually represented by the spatial coordinates of the backbone atoms, or design scaffold

Protein design is done by placing amino acid side chains along the scaffold

Side chains are only permitted to assume a discrete set of statistically preferred conformations: rotamers

Rotamer/backbone and rotamer/rotamer interaction energies are tabulated

These potential energies can then be approximated by using any of the standard force fields: CHARMM, AMBER, GROMOS


Molecular Force Fields

AMBER: a classical force field for energy and MD calculations:

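$$
V(r^N) = \sum_{\text{bonds}} \tfrac{1}{2} k_b (l - l_0)^2
       + \sum_{\text{angles}} \tfrac{1}{2} k_a (\theta - \theta_0)^2
       + \sum_{\text{torsions}} \tfrac{1}{2} V_n \left[ 1 + \cos(n\omega - \gamma) \right]
       + \sum_{j=1}^{N-1} \sum_{i=j+1}^{N} \left\{ \varepsilon_{i,j} \left[ \left( \frac{r_{0ij}}{r_{ij}} \right)^{12} - 2 \left( \frac{r_{0ij}}{r_{ij}} \right)^{6} \right] + \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} \right\} \tag{6}
$$

1 $\sum_{\text{bonds}}(\cdot)$: energy between covalently bonded atoms.

2 $\sum_{\text{angles}}(\cdot)$: energy due to the geometry of electron orbitals involved in covalent bonding.

3 $\sum_{\text{torsions}}(\cdot)$: energy for twisting a bond due to bond order (e.g. double bonds) and neighboring bonds or lone pairs of electrons.

4 $\sum_{j=1}^{N-1} \sum_{i=j+1}^{N}(\cdot)$: non-bonded energy between all atom pairs:
  1 van der Waals energies
  2 Electrostatic energies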
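The individual terms of equation (6) are simple to evaluate once the parameters are known. The sketch below is illustrative only: the parameter values are made up, the Coulomb prefactor of about 332.06 kcal·Å/(mol·e²) assumes energies in kcal/mol, distances in Å and charges in units of e, and the exclusions and scaling that real force fields apply to bonded neighbors are ignored.

```python
import math

COULOMB_KCAL = 332.0637  # assumed unit system: kcal*A/(mol*e^2)

def harmonic(k, value, reference):
    """Bond or angle term: 1/2 k (value - reference)^2."""
    return 0.5 * k * (value - reference) ** 2

def torsion(v_n, n, omega, gamma):
    """Torsion term: 1/2 V_n [1 + cos(n*omega - gamma)], angles in radians."""
    return 0.5 * v_n * (1.0 + math.cos(n * omega - gamma))

def nonbonded(coords, eps, r0, charges):
    """Lennard-Jones 12-6 plus Coulomb energy summed over all atom pairs."""
    energy = 0.0
    n = len(coords)
    for j in range(n - 1):
        for i in range(j + 1, n):
            r = math.dist(coords[i], coords[j])
            ratio6 = (r0[i][j] / r) ** 6
            energy += eps[i][j] * (ratio6 ** 2 - 2.0 * ratio6)
            energy += COULOMB_KCAL * charges[i] * charges[j] / r
    return energy

# Two neutral atoms placed exactly at their equilibrium separation r0.
pair_coords = [(0.0, 0.0, 0.0), (3.5, 0.0, 0.0)]
pair_eps = [[0.0, 0.1], [0.1, 0.0]]
pair_r0 = [[0.0, 3.5], [3.5, 0.0]]
pair_energy = nonbonded(pair_coords, pair_eps, pair_r0, [0.0, 0.0])
```

With this form of the Lennard-Jones term the minimum sits at r = r0 with depth −ε, which the two-atom example reproduces.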


Structure-based Knowledge-based Potentials

They are built by performing a large-scale statistical study of structural databases such as PDB (Protein Data Bank)

Rotamer libraries (∼ 150 rotameric states)
Binary patterning: only some types of amino acids are allowed based on the hydrophobic environment
An implicit solvation model
Secondary structure propensity
Frequency of small segments in the PDB
Pairwise potentials
van der Waals interactions
Hydrogen bonding
Electrostatics
Entropy-based penalties for flexible side-chains

From Boas and Harbury [2007]


Energy Functions

Design along the backbone or scaffold

Rotamer/backbone and rotamer/rotamer interaction energies are tabulated

Precomputed from molecular force fields: CHARMM, AMBER, GROMOS

Total energy of the protein

$$ E_{TOT} = \sum_{k} E_k(r_k) + \sum_{k \neq l} E_{kl}(r_k, r_l) \tag{7} $$

N: length of the protein

r_k: the rotamer of the kth side chain

E_k(r_k): the self-energy of a particular rotamer r_k

E_kl(r_k, r_l): the pair energy of rotamers r_k, r_l
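With tabulated energies, evaluating equation (7) for one rotamer assignment is a matter of table look-ups. In this sketch the tables are nested Python lists, `self_e[k][a]` and `pair_e[k][l][a][b]`, and each unordered pair of positions is counted once; that layout and normalization are assumptions of the example, and the toy numbers are invented.

```python
def total_energy(rotamers, self_e, pair_e):
    """Self energies plus pair energies for one rotamer assignment."""
    n = len(rotamers)
    energy = sum(self_e[k][rotamers[k]] for k in range(n))
    for k in range(n):
        for l in range(k + 1, n):  # each unordered pair counted once
            energy += pair_e[k][l][rotamers[k]][rotamers[l]]
    return energy

# Two positions with two rotamers each.
toy_self = [[0.0, 10.0], [0.0, 0.5]]
toy_pair = [
    [None, [[0.0, 1.0], [0.0, 1.0]]],   # pair_e[0][1][a][b]
    [[[0.0, 0.0], [1.0, 1.0]], None],   # pair_e[1][0][b][a], the transpose
]
print(total_energy([0, 1], toy_self, toy_pair))
```

Because every term is a look-up, a search algorithm can evaluate millions of assignments without recomputing any force-field interaction.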


The Role of Dynamics

Besides protein structure, protein dynamics can play a direct role in molecular recognition

Flexible proteins recognize their targets through induced fit or conformational selection, likely showing promiscuity

Binding is commonly enthalpy-driven, but in some cases entropy is important, for instance:
  Proteins with multiple binding sites
  Small hydrophobic molecules

Two types of sources of protein motions:
  Protein flexibility: intraconformational dynamics (fast time scale motions)
  Conformational heterogeneity: interconformational dynamics

Gibbs free energy:

$$ \Delta G = \Delta H - T \Delta S \tag{8} $$

$$ \Delta S = \Delta S_{\mathrm{solv}} + \Delta S_{\mathrm{conf}} + \Delta S_{\mathrm{rt}} \tag{9} $$

ΔS_conf: conformational entropy of protein and ligand

ΔS_rt: rotational and translational degrees of freedom
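A short worked example of equations (8) and (9), with invented numbers, shows how an unfavorable entropy change offsets part of a favorable enthalpy. Units are assumed to be kcal/mol for ΔH and ΔG, kcal/(mol·K) for the entropy terms and K for T.

```python
def binding_free_energy(delta_h, temperature, ds_solv, ds_conf, ds_rt):
    """Delta G = Delta H - T * (dS_solv + dS_conf + dS_rt)."""
    delta_s = ds_solv + ds_conf + ds_rt
    return delta_h - temperature * delta_s

# Hypothetical binding event: favorable enthalpy and solvent entropy,
# unfavorable conformational and rotational/translational entropy.
example_dg = binding_free_energy(
    delta_h=-12.0, temperature=300.0,
    ds_solv=0.030, ds_conf=-0.020, ds_rt=-0.015,
)
print(round(example_dg, 3))
```

Here the net entropy change is −0.005 kcal/(mol·K), which costs 1.5 kcal/mol at 300 K and leaves ΔG = −10.5 kcal/mol.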


Predicting Side-chain Dynamics from Structural Descriptors

The Lipari-Szabo model-free approach quantifies motions from NMR experiments by computing the generalized order parameter S²

Protein backbone dynamics: ¹⁵NH and ¹³CαH NMR relaxation methods
Protein side-chain methyl dynamics: ¹³CαH NMR relaxation methods (side-chain motions in the picosecond-to-nanosecond time regime)

From the BMRB we compiled S² data for 18 proteins, including 10 proteins in 2 or more different states: calmodulin, barnase, PDZ, MUP, DHFR, staphylococcal nuclease, Pin1, SH3 domain, MSG

This technique only provides measurements for the Cα of methyl groups in side chains: ALA, LEU, ILE, MET, THR, VAL


Structural Descriptors of Methyl Dynamics

We consider the following parameters influencing side-chain dynamics:

Packing density at the methyl site i and its neighboring residues j within a sphere of r = 5 Å

$$ P_i = \sum_{r_{ij} < 5\,\text{Å}} C_j e^{-r_{ij}} = \sum_{r_{ij} < 5\,\text{Å}} \left( \sum_{r_{jk} < 5\,\text{Å}} e^{-r_{jk}} \right) e^{-r_{ij}} \tag{10} $$

Side chain stiffness: number of dihedral angles separating the backbone from the methyl carbon, weighted by the side-chain packing
Rotameric state: angular distance Δχ = χ − χ0 to the closest rotameric state χ0 in the library
Elongation: distance from the methyl site to the Cα
Pairwise contact potential: a knowledge-based potential of the frequency of contacts between residues at several distances, computed from the PDB
Solvation effect: DSSP accessibility and residue hydrophobicity
Van der Waals contacts
Hydrogen bonds (in the case of threonine)
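The packing density of equation (10) can be sketched as a two-level neighbor sum. The example assumes that each site is a single (x, y, z) point in Å and that a site is never counted as its own neighbor (j ≠ i and k ≠ j); the slide does not state either convention, so both are assumptions.

```python
import math

def packing_density(sites, i, cutoff=5.0):
    """P_i = sum over neighbors j of C_j * exp(-r_ij), with C_j = sum_k exp(-r_jk)."""
    def neighbors(a):
        # (index, distance) for every other site closer than the cutoff
        result = []
        for b in range(len(sites)):
            if b != a:
                r = math.dist(sites[a], sites[b])
                if r < cutoff:
                    result.append((b, r))
        return result

    total = 0.0
    for j, r_ij in neighbors(i):
        c_j = sum(math.exp(-r_jk) for _, r_jk in neighbors(j))
        total += c_j * math.exp(-r_ij)
    return total

# Three collinear sites spaced 3 A apart.
toy_sites = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
p_end = packing_density(toy_sites, 0)
p_middle = packing_density(toy_sites, 1)
```

The middle site has two neighbors within 5 Å and the end site only one, so the middle site receives the larger packing density.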


Predicting Methyl Side-chain Dynamics

Algorithm: neural network
Cross-validation: r = 0.71 ± 0.029 (p-value = 4.6 × 10⁻⁸⁷)

Protein      MD method    r (MD)   r (nnet)
ubiquitin    AMBER99SB    0.81     0.81
TNfn3        CHARMM 22    0.62     0.79
FNfn10       CHARMM 22    0.51     0.64
barnase      OPLS-AA/L    0.55     0.64
calmodulin   FDPB         0.60     0.72

Example: experimental and predicted changes ΔS² of barnase after binding barstar

ΔS² > 0: rigidification
ΔS² < 0: flexibilization

[Carbonell and del Sol, 2009]


Search Algorithms in CPD


Search Algorithms

Objective: finding the best design within the space of all possible amino acid/rotameric states

A vast search space: 20^N or p^N
  N: number of positions to mutate
  p: number of rotameric states

Strategies
  Deterministic algorithms
    Dead-end elimination (DEE) algorithm: a pruning method.
    Some accelerations of the DEE algorithm: upper-bound estimation; the “magic bullet” metric; conformational splitting; background optimization
  Stochastic algorithms
    Monte Carlo
    Simulated annealing
    Genetic algorithms
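A quick calculation makes the size of this search space concrete. The helper name is hypothetical, and the rotamer count of 150 in the last line is borrowed from the approximate library size quoted elsewhere in these slides, used here only as an example value of p.

```python
def search_space_size(n_positions, states_per_position=20):
    """Number of designs when every position can take any of the allowed states."""
    return states_per_position ** n_positions

print(search_space_size(10))        # 20^10 sequences for 10 designed positions
print(search_space_size(10, 150))   # 150^10 assignments with 150 states per position
```

Ten designed positions already give about 10^13 sequences, which is why exhaustive enumeration is replaced by pruning or stochastic search.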


The DEE Algorithm

It assumes that the energy of the protein can be written as

$$ E_{TOT} = \sum_{k} E_k(r_k) + \sum_{k \neq l} E_{kl}(r_k, r_l) \tag{11} $$

N: length of the protein

r_k: the rotamer of the kth side chain

E_k(r_k): the self-energy of a particular rotamer r_k

E_kl(r_k, r_l): the pair energy of the rotamers r_k, r_l

Complexity:
  Single search scales quadratically with the total number of rotamers: O((p × N)²)
  Pair search scales cubically: O((p × N)³)
  Brute force enumeration: O(p^N)


The DEE Algorithm

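Single rotamers and rotamer pairs are eliminated during the computational cycles

Single elimination: eliminate rotamer r_k^A if some other rotamer r_k^B at the same position always gives a better energy, that is, if the best case for A is still worse than the worst case for B

$$ E_k(r_k^A) + \sum_{l=1,\, l \neq k}^{N} \min_X E_{kl}(r_k^A, r_l^X) > E_k(r_k^B) + \sum_{l=1,\, l \neq k}^{N} \max_X E_{kl}(r_k^B, r_l^X) \tag{12} $$

Pairs elimination: eliminate a pair of rotamers at two positions if there exists another pair that gives a better energy

$$ U_{kl}^{AB} \overset{\text{def}}{=} E_k(r_k^A) + E_l(r_l^B) + E_{kl}(r_k^A, r_l^B) \tag{13} $$

$$ U_{kl}^{AB} + \sum_{i=1,\, i \neq k,l}^{N} \min_X \left( E_{ki}(r_k^A, r_i^X) + E_{li}(r_l^B, r_i^X) \right) > U_{kl}^{CD} + \sum_{i=1,\, i \neq k,l}^{N} \max_X \left( E_{ki}(r_k^C, r_i^X) + E_{li}(r_l^D, r_i^X) \right) \tag{14} $$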

Values are precomputed and stored in energy matrices
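The single-elimination criterion can be sketched directly on precomputed energy tables. This is a minimal illustration of the original min/max criterion only: it does not implement pair elimination or any of the accelerations, and the table layout (`self_e[k][a]`, `pair_e[k][l][a][x]`) and toy energies are assumptions of the example.

```python
def dee_single_pass(self_e, pair_e, alive):
    """One sweep of single elimination; returns the (position, rotamer) pairs removed."""
    n = len(self_e)
    removed = []
    for k in range(n):
        for a in sorted(alive[k]):
            # Best energy rotamer a could ever contribute.
            best_a = self_e[k][a] + sum(
                min(pair_e[k][l][a][x] for x in alive[l])
                for l in range(n) if l != k
            )
            for b in sorted(alive[k]):
                if b == a:
                    continue
                # Worst energy rotamer b could ever contribute.
                worst_b = self_e[k][b] + sum(
                    max(pair_e[k][l][b][x] for x in alive[l])
                    for l in range(n) if l != k
                )
                if best_a > worst_b:  # a can never beat b: dead end
                    alive[k].discard(a)
                    removed.append((k, a))
                    break
    return removed

def dee(self_e, pair_e):
    """Repeat single elimination until no rotamer can be removed."""
    alive = [set(range(len(e))) for e in self_e]
    while dee_single_pass(self_e, pair_e, alive):
        pass
    return alive

# Two positions with two rotamers each.
toy_self = [[0.0, 10.0], [0.0, 0.5]]
toy_pair = [
    [None, [[0.0, 1.0], [0.0, 1.0]]],   # pair_e[0][1][a][x]
    [[[0.0, 0.0], [1.0, 1.0]], None],   # pair_e[1][0][x][a], the transpose
]
surviving = dee(toy_self, toy_pair)
```

In this toy case pruning alone leaves a single rotamer per position, which is the global minimum; in larger problems the surviving rotamers are usually passed on to enumeration or a stochastic search.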


Stochastic Algorithms

Search in the space of feasible designs by making a series of combinations of random and directed moves

Monte Carlo Metropolis: a move consists of exchanging one rotamer for another at a randomly chosen position; a modification is always accepted if it lowers the energy, and otherwise accepted with probability exp(−ΔE/kT)

Simulated annealing: the temperature is lowered gradually, which allows the search to explore nearby higher-energy solutions in the initial cycles

Genetic algorithms: a population of models is propagated (evolved) throughout the course of the run, and genetic operators, such as recombination, are used to create new models from existing parents

They are fast, can be scaled up to problems of large complexity

They are not guaranteed to converge to the optimal solution
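A Metropolis search with a simulated-annealing schedule over rotamer assignments might look like the sketch below. The geometric cooling schedule, the step count, the temperatures (in the same arbitrary units as the energies, with k folded into T) and the toy energy tables are all assumptions chosen for illustration.

```python
import math
import random

def total_energy(state, self_e, pair_e):
    """Self plus pair energies, each unordered pair counted once."""
    n = len(state)
    energy = sum(self_e[k][state[k]] for k in range(n))
    for k in range(n):
        for l in range(k + 1, n):
            energy += pair_e[k][l][state[k]][state[l]]
    return energy

def anneal(self_e, pair_e, steps=2000, t_start=5.0, t_end=0.05, seed=0):
    """Metropolis Monte Carlo with geometric cooling; returns the best state seen."""
    rng = random.Random(seed)
    state = [rng.randrange(len(e)) for e in self_e]
    energy = total_energy(state, self_e, pair_e)
    best_state, best_energy = list(state), energy
    for step in range(steps):
        temperature = t_start * (t_end / t_start) ** (step / (steps - 1))
        # Move: swap the rotamer at one randomly chosen position.
        k = rng.randrange(len(state))
        trial = list(state)
        trial[k] = rng.randrange(len(self_e[k]))
        trial_energy = total_energy(trial, self_e, pair_e)
        delta = trial_energy - energy
        # Metropolis criterion: downhill always, uphill with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            state, energy = trial, trial_energy
            if energy < best_energy:
                best_state, best_energy = list(state), energy
    return best_state, best_energy

toy_self = [[0.0, 10.0], [0.0, 0.5]]
toy_pair = [
    [None, [[0.0, 1.0], [0.0, 1.0]]],
    [[[0.0, 0.0], [1.0, 1.0]], None],
]
best_state, best_energy = anneal(toy_self, toy_pair)
```

Keeping track of the best state seen means the reported energy never exceeds the starting energy, but nothing in the procedure certifies that it is the global minimum.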


The SCHEMA Algorithm

Equivalent to an in silico directed evolution
Consists of scoring libraries of hybrid protein sequences against the parental sequence
Scoring:
  Calculate the number of interactions between residues (contacts within 4.5 Å) that are disrupted in the creation of hybrid proteins
  Hybrids are scored for stability by counting the number of disruptions
  Protein is partitioned into blocks that should not be interrupted by crossovers (analogous to genetic algorithms)

From [Meyer et al., 2006]
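The disruption count can be sketched as follows, assuming aligned parent sequences of equal length and a precomputed list of contacting position pairs. A contact is treated as disrupted when the hybrid's residue pair at those positions does not occur in any parent, which is one common reading of the SCHEMA score; the sequences and contacts below are invented.

```python
def schema_disruption(hybrid, parents, contacts):
    """Number of contacts whose residue pair in the hybrid is seen in no parent."""
    broken = 0
    for i, j in contacts:
        pair = (hybrid[i], hybrid[j])
        if all((parent[i], parent[j]) != pair for parent in parents):
            broken += 1
    return broken

toy_parents = ["AAAA", "BBBB"]
toy_contacts = [(0, 3), (1, 2)]  # position pairs within 4.5 A in the parent structure
print(schema_disruption("AABB", toy_parents, toy_contacts))
print(schema_disruption("ABBA", toy_parents, toy_contacts))
```

The hybrid "ABBA" keeps both contacting pairs inside a single parent and scores 0, while "AABB" splits both contacts across parents and scores 2. This is why crossover points are chosen so that contacting residues stay in the same block.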


The OPTCOM and IPRO Algorithms for Library Design

The OPTCOM algorithm:
  Balances size and quality of the library

The IPRO algorithm:
  Identifies point mutations in the parent sequences using energy-based scoring functions
  Residue and rotamer choices are driven by a mixed-integer linear programming (MILP) formulation

From [Saraf et al., 2006]


Some Web Resources

IPRO: Iterative Protein Redesign and Optimization. http://maranas.che.psu.edu/IPRO.htm

EGAD: A Genetic Algorithm for protein Design. http://egad.ucsd.edu/software.php

RosettaDesign: A software package. http://rosettadesign.med.unc.edu/

SCHEMA: A pair-wise energy function for scoring protein chimeras made from homologous proteins. http://www.che.caltech.edu/groups/fha/schema-tools/schema-overview.html

SHARPEN: Systematic Hierarchical Algorithms for Rotamers and Proteins on an Extended Network. http://koko.che.caltech.edu/sharpenabout.html

WHAT IF: Software for protein modelling, design, validation, and visualisation. http://swift.cmbi.ru.nl/whatif/

FoldX: A force field for energy calculations and protein design. http://foldx.crg.es/


De Novo-Designed Proteins

In de novo designs, some assumptions are needed in order to make the search space tractable

Usually we start from some basic motifs or domains as scaffolds for the design

Examples:
  ββα motif resembling a zinc finger
  3 and 4 helix bundles
  Helical coiled-coils

Helix bundle motifs can be parametrized using a few global variables that describe the global structure

Applications:
  New metal-binding sites
  Nonbiological cofactors for novel biomaterials and electromechanical devices
  Novel enzymatic activities


Example: De Novo Design of a Metalloprotein

Computational de novo design of a four-helix (108 residues) bundle containing the non-biological cofactor iron diphenyl porphyrin (DPP-Fe) [Bender et al., 2007]

The initial helix bundle was selected as a low-energy structure computed with MCSA
STITCH: a program to select loops connecting helices from PDB Select
CHARMM and PROCHECK for removing overlaps
4 His and 4 Thr residues to support the 6-point coordination of the Fe(III) cations
SCADS: provides site-dependent amino acid probabilities in each round


Challenges in Sequence and Structure-Based CPD

Modeling
  Greater availability of 3D protein structural information
  More accurate energy functions
  Improvement of rigid and flexible docking

Design
  Improvement in search algorithms
  Parametrization for non-natural amino acids

Prediction
  Beyond additive models: using machine-learning algorithms
  More complete environment descriptors


Bibliography I

Gretchen M. Bender, Andreas Lehmann, Hongling Zou, Hong Cheng, H. Christopher Fry, Don Engel, Michael J. Therien, J. Kent Blasie, Heinrich Roder, Jeffrey G. Saven, and William F. DeGrado. De Novo Design of a Single-Chain Diphenylporphyrin Metalloprotein. Journal of the American Chemical Society, 129(35):10732–10740, September 2007. ISSN 0002-7863. doi: 10.1021/ja071199j. URL http://dx.doi.org/10.1021/ja071199j.

F. Edward Boas and Pehr B. Harbury. Potential energy functions for protein design. Current Opinion in Structural Biology, 17(2):199–204, April 2007. ISSN 0959-440X. doi: 10.1016/j.sbi.2007.03.006. URL http://dx.doi.org/10.1016/j.sbi.2007.03.006.

Pablo Carbonell and Antonio del Sol. Methyl side-chain dynamics prediction based on protein structure. Bioinformatics, pages btp463+, July 2009. doi: 10.1093/bioinformatics/btp463. URL http://dx.doi.org/10.1093/bioinformatics/btp463.

Jean-Loup L. Faulon, Michael J. Collins, and Robert D. Carr. The signature molecular descriptor. 4. Canonizing molecules using extended valence sequences. Journal of Chemical Information and Computer Sciences, 44(2):427–436, 2004. ISSN 0095-2338. doi: 10.1021/ci0341823. URL http://dx.doi.org/10.1021/ci0341823.

Michelle M. Meyer, Lisa Hochrein, and Frances H. Arnold. Structure-guided SCHEMA recombination of distantly related β-lactamases. Protein Engineering Design and Selection, 19(12):563–570, December 2006. ISSN 1741-0126. doi: 10.1093/protein/gzl045. URL http://dx.doi.org/10.1093/protein/gzl045.

Diwahar Narasimhan, Mark R. Nance, Daquan Gao, Mei-Chuan Ko, Joanne Macdonald, Patricia Tamburi, Dan Yoon, Donald M. Landry, James H. Woods, Chang-Guo Zhan, John J. G. Tesmer, and Roger K. Sunahara. Structural analysis of thermostabilizing mutations of cocaine esterase. Protein Engineering Design and Selection, 23(7):537–547, July 2010. doi: 10.1093/protein/gzq025. URL http://dx.doi.org/10.1093/protein/gzq025.

Manish C. Saraf, Gregory L. Moore, Nina M. Goodey, Vania Y. Cao, Stephen J. Benkovic, and Costas D. Maranas. IPRO: an iterative computational protein library redesign and optimization procedure. Biophysical Journal, 90(11):4167–4180, June 2006. ISSN 0006-3495. doi: 10.1529/biophysj.105.079277. URL http://dx.doi.org/10.1529/biophysj.105.079277.

Jiangning Song, Kazuhiro Takemoto, Hongbin Shen, Hao Tan, Michael M. Gromiha, and Tatsuya Akutsu. Prediction of Protein Folding Rates from Structural Topology and Complex Network Properties. IPSJ Transactions on Bioinformatics, 3:40–53, 2010. doi: 10.2197/ipsjtbio.3.40. URL http://dx.doi.org/10.2197/ipsjtbio.3.40.
