
PROCOS: Computational Analysis of Protein–Protein Complexes

FLORIAN FINK,1 JOCHEN HOCHREIN,1 VINCENT WOLOWSKI,2 RAINER MERKL,3 WOLFRAM GRONWALD1

1Institute of Functional Genomics, University of Regensburg, Regensburg, Germany

2Faculty of Mathematics and Computer Science, University of Hagen, Hagen, Germany

3Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany

Received 9 November 2010; Revised 15 April 2011; Accepted 15 April 2011

DOI 10.1002/jcc.21837

Published online 31 May 2011 in Wiley Online Library (wileyonlinelibrary.com).

Abstract: One of the main challenges in protein–protein docking is a meaningful evaluation of the many putative solutions. Here we present a program (PROCOS) that calculates a probability-like measure to be native for a given complex. In contrast to scores often used for analyzing complex structures, the calculated probabilities offer the advantage of providing a fixed range of expected values. This will allow, in principle, the comparison of models corresponding to different targets that were solved with the same algorithm. Judgments are based on distributions of properties derived from a large database of native and false complexes. For complex analysis PROCOS uses these property distributions of native and false complexes together with a support vector machine (SVM). PROCOS was compared to the established scoring schemes of ZRANK and DFIRE. Employing a set of experimentally solved native complexes, high probability values above 50% were obtained for 90% of these structures. Next, the performance of PROCOS was tested on the 40 binary targets of the Dockground decoy set, on 14 targets of the RosettaDock decoy set and on 9 targets that participated in the CAPRI scoring evaluation. Again the advantage of using a probability-based scoring system becomes apparent and a reasonable number of near native complexes was found within the top ranked complexes. In conclusion, a novel fully automated method is presented that allows the reliable evaluation of protein–protein complexes.

© 2011 Wiley Periodicals, Inc. J Comput Chem 32: 2575–2586, 2011

Key words: protein–protein complex; docking; scoring; reranking; support vector machine

Introduction

Protein–Protein Interactions

Proteins are an essential part of nearly all cellular processes. One important aspect of proteins is their three-dimensional structure, which must be known to understand their function in detail. Most frequently, protein structures are determined by means of X-ray crystallography and NMR spectroscopy, leading to a rapidly growing number of solved structures. To date, more than 60,000 protein structures are deposited in the Protein Data Bank (PDB) available at www.rcsb.org.1 However, cellular functions are rarely carried out by single proteins but by complexes composed of several interacting proteins. It has been estimated that each protein has nine interaction partners on average.2 However, due to experimental complexity only a very small part of the deposited structures consists of protein–protein complexes. High-throughput methods for detecting protein interactions, like yeast two-hybrid assays or tandem-affinity-purification mass spectrometry, predict a large number of protein–protein interactions. These experimental approaches are supplemented by bioinformatic methods such as phylogenetic profiling, investigations of gene neighborhoods, and gene fusion analysis. Unfortunately, it is not possible to determine the structures of all these protein complexes by experimental methods due to limitations concerning large or transient complexes. In addition, the experimental structure determination of protein–protein complexes is in most cases a time-consuming and challenging process. For that reason, computational approaches like docking algorithms that predict the structure of these complexes are needed. During the last few years, considerable effort has been put into the development and application of docking algorithms; for a review see ref. 3. The success of docking algorithms has consistently improved over the last years, as measured by the CAPRI blind docking experiment.4, 5 Due to such efforts, on one hand the applicability of in silico created complexes is becoming more widely accepted, and on the other hand the various available docking algorithms can be objectively compared.

Correspondence to: W. Gronwald; e-mail: [email protected]

Contract/grant sponsor: Bavarian Genome Research Network (BAYGENE)


2576 Fink et al. • Vol. 32, No. 12 • Journal of Computational Chemistry

Docking

All docking approaches assume that the native complex is near the global minimum of the energy landscape constituted by the set of all theoretically possible complex conformations of the interacting proteins. The main challenges of any docking algorithm can be divided into three separate elements: First, the possible docking orientations have to be enumerated at a sufficiently high resolution. Second, minor or even major structural changes that occur upon complex formation have to be considered. Third, from all putative solutions the near native ones have to be selected. For a reliable decision, a scoring function that distinguishes native(-like) from non-native(-like) docking solutions is necessary.

In the following, we focus on optimizing the third task, the selection step. Usually, several factors are considered in the identification of near native models, including steric surface complementarity,6, 7 electrostatic interactions,8–10 hydrogen bonding,11 knowledge based pair-potentials,12, 13 desolvation energies14 and van der Waals interactions.15 It has been shown that evaluation of complexes can be improved considerably by combining the information of several analysis functions,16 and this is increasingly becoming common practice.17–20 Despite these efforts the selection step still remains a challenging task18, 21 and frequently, high scores are obtained for non-native solutions. One important drawback of many scoring approaches used today is that in most cases score values strongly depend on other factors, e.g., the size of the proteins, and vary widely even for correct solutions from target to target. Therefore, it is difficult to define a priori thresholds up to which a docked complex should be retained for further analysis. In addition, it is not possible to directly compare the scores of different complexes with each other.

Here, we describe with the PROtein COmplex analysis Server (PROCOS) a novel approach for the evaluation of both computationally and experimentally derived protein complexes that overcomes several of the limitations mentioned earlier. For complex analysis PROCOS uses a combination of well established analysis functions. Specifically, van der Waals and electrostatic energies and knowledge based pair potentials are used. As detailed later, the novel part of PROCOS concerns the way in which these functions together with a large database of native and false complexes are used to calculate a probability-like measure to be native for each given complex.

Methods

The underlying idea of PROCOS is to classify complexes based on Bayes’s theorem,22 which is used to calculate the probability p that a complex with a global score value S belongs to the class of native complexes N [eq. (1)]. Here, p(N) and p(F) denote the a priori probabilities that a complex belongs to the class of native complexes (N) or to the class of false ones (F). In addition, estimates of the probability distributions DN and DF of the scores S for the classes N and F are required for obtaining the probabilities p(S|N) and p(S|F), respectively. Although it is possible to formulate a priori assumptions on these distributions, the extraction of this information from known complex structures is more robust. Therefore, native complexes were taken from a database, termed the Mintz database in the following, which contains 2541 nonhomologous native protein complexes.23 A meaningful antipode of false complexes was taken from CAPRI scoring data as detailed later.

For each of the native and false complexes the values of three analysis functions were calculated: intermolecular electrostatic energy (e), intermolecular van der Waals energy (v), and the score of an intermolecular amino acid based pair-potential (k).24 The e, v, and k values obtained for each complex were used to train a support vector machine (SVM) with two classes. In this case, the property S is related to the position of an individual complex relative to the separating hyperplane of the SVM model. Next, using these data, probability distributions were obtained for the two classes N and F. Figure 1 gives an overview of the procedure which is detailed later.

Finding reasonable values for the a priori probabilities p(N) and p(F) that a complex belongs to the class of native complexes N or to the class of false complexes F is a difficult task that depends on several factors such as the docking algorithm used, the system under investigation, etc. As an approximation, we used p(N) = p(F) = 0.5. This does, of course, not at all reflect the real proportion between the amount of true solutions and all theoretically possible conformations. However, it would be meaningless to select some other arbitrarily chosen values as long as there are no facts available resulting in more reasonable estimates for the priors. This affects the results in a way that our so-called “probabilities” are not real probabilities to be native structures. To obtain somewhat more realistic priors, one could scan the solutions of typical docking runs for the fraction of native and non-native complexes. For example, near native and false complexes of the recent CAPRI scoring competitions could be used for this purpose. This would lead to priors p(N) = 0.062 and p(F) = 0.938. However, it should be noted that these are no general values and therefore, in this contribution priors of p(N) = p(F) = 0.5 were used.

For our approach it is necessary to obtain a reasonable set of false complexes. Creating this set, one cannot simply join two proteins in an arbitrary way, since the resulting complexes would be extremely unrealistic. For a realistic set, false complexes are needed that do not exist in nature but are, nevertheless, optimized in a way that they could theoretically exist. As a possible solution to this problem we took already existing decoys from targets of the last CAPRI scoring competitions (T29, T32, T35, T36, T37_1, T38, T39, T40_CA, T41) that were generated by many different predictor groups using a variety of different algorithms. Of those, for each target 25% arbitrarily chosen complexes (2194 structures) that were marked as incorrect according to the CAPRI criteria were used for the calculation of the probability distributions of the false complexes. This approach ensures that the resulting distributions are not biased towards a single algorithm used for calculating the structures.

In addition, to provide roughly the same amount of structures for the computation of the probability distributions of the classes N and F, only a subset of the incorrect CAPRI structures was used. Note that for targets 37 and 40 two evaluations were performed by CAPRI. For T37 this was done due to high symmetry between the two chains in the ligand of T37 and their close proximity to each other and the interface. For target 40 there are two possible interfaces at opposite sides of the receptor (see CAPRI homepage for details25). However, to not overuse the structures of these targets, they were used only once for the generation of probability distributions. The so obtained probability distributions for the false complexes should represent a meaningful antipode to the group of the native complexes. The remaining 75% of false complexes together with the CAPRI complexes that were marked as being at least acceptable were later used for testing the algorithm.

Using the above definitions the calculation of the probability p that a complex with a global score value S belongs to the class of native complexes N is described by the following equation:

$$p(N \mid S) = \frac{p(N)\, p(S \mid N)}{p(N)\, p(S \mid N) + p(F)\, p(S \mid F)} \tag{1}$$
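The effect of the priors in eq. (1) can be illustrated with a short calculation. The following is a minimal sketch, not the PROCOS implementation; the function name and the two density values are hypothetical and chosen only for illustration, while the priors 0.5 and 0.062 are the ones quoted in the text.

```python
def posterior_native(p_s_given_n, p_s_given_f, prior_native=0.5):
    """Evaluate eq. (1): p(N|S) from the class-conditional densities and priors."""
    prior_false = 1.0 - prior_native
    numerator = prior_native * p_s_given_n
    denominator = numerator + prior_false * p_s_given_f
    if denominator == 0.0:
        raise ValueError("both class densities are zero at this score")
    return numerator / denominator


# Hypothetical density values p(S|N) and p(S|F) at one score S (illustration only).
equal_priors = posterior_native(0.03, 0.01)                      # p(N) = p(F) = 0.5
capri_priors = posterior_native(0.03, 0.01, prior_native=0.062)  # p(N) = 0.062
```

With equal priors the result depends only on the ratio of the two densities; with the CAPRI-derived prior the same score yields a much smaller value, which is why the text refers to the output as a probability-like measure.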

Complex Analysis

A central part of the present work is the design of a global probability-like measure deduced from a variety of individual analysis functions. Currently, we include intermolecular electrostatic energies, intermolecular van der Waals energies and a term based on knowledge based amino acid pair-potentials. However, the algorithm easily allows the addition of further functions. A good overview of discriminating features that could be used for the analysis of protein–protein interactions is given in the recent review of Ezkurdia et al.26 Note that the energy functions used within PROCOS mainly aim at the analysis of complexes that only contain a moderate number of clashes. Before the individual energy terms are computed within PROCOS, all hydrogens present are first removed followed by the addition of polar and nonpolar hydrogens using the program REDUCE3.10.27 This ensures a comparable protonation state and atom nomenclature for all pdb-files investigated.

Electrostatic Energy

The electrostatic energy of the complex is the sum of the individual electrostatic energies of all intermolecular atom pairs in the complex. The function used [eq. (2)] is similar to the one used by the molecular dynamics program CNS28 and the docking program HADDOCK,20 which is based on CNS.

$$E_{\mathrm{elec}} = \sum_{n,m} \frac{q_n q_m C}{\varepsilon_0 R} \left[ 1 - \frac{R^2}{R_{\mathrm{off}}^2} \right] \tag{2}$$

where n and m enumerate the atoms of the first and second protein, respectively; q is the charge of the atom. The individual partial charges are similar to those taken by HADDOCK2.0 (see HADDOCK distribution, file “topallhdg5.3.pro”); C is a scaling factor as it is also used in CNS and HADDOCK; the dielectric constant ε0 is set to one; R denotes the distance between the atoms. The term in brackets ensures that the electrostatic energy approaches zero at a cutoff value of Roff = 8.5 Å. This cutoff saves computation time and the introduced error is negligible.
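The pairwise sum of eq. (2) can be sketched as follows. This is an illustrative sketch rather than the PROCOS code: the atom representation, the function name and the default scaling factor `c=1.0` are assumptions, as is the choice to treat pairs at or beyond the cutoff as contributing zero. Real partial charges and the actual value of C would have to come from the CNS/HADDOCK parameter files named in the text.

```python
import math

R_OFF = 8.5  # cutoff distance in Å, as stated in the text


def electrostatic_energy(atoms_a, atoms_b, c=1.0, eps0=1.0, r_off=R_OFF):
    """Sum eq. (2) over all intermolecular atom pairs.

    Each atom is a tuple (x, y, z, partial_charge).
    c is a placeholder for the scaling factor C (assumption).
    Pairs at or beyond r_off are taken to contribute zero (assumption).
    """
    energy = 0.0
    for xa, ya, za, qa in atoms_a:
        for xb, yb, zb, qb in atoms_b:
            r = math.dist((xa, ya, za), (xb, yb, zb))
            if r == 0.0 or r >= r_off:
                continue  # skip coincident atoms and pairs outside the cutoff
            damping = 1.0 - (r * r) / (r_off * r_off)  # bracketed term of eq. (2)
            energy += qa * qb * c / (eps0 * r) * damping
    return energy


# Toy system: one positive atom, one negative atom inside and one outside the cutoff.
protein_a = [(0.0, 0.0, 0.0, 1.0)]
protein_b = [(4.25, 0.0, 0.0, -1.0), (9.0, 0.0, 0.0, -1.0)]
example_energy = electrostatic_energy(protein_a, protein_b)
```

In the toy system only the pair at 4.25 Å contributes, and its Coulomb term is scaled by the bracketed factor 1 − (4.25/8.5)² = 0.75.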

Van der Waals Energy

The van der Waals energy is a combination of the Pauli repulsion and the van der Waals attraction. Similar to the electrostatic energy, it is calculated as a sum over all intermolecular atom pairs. As described for the electrostatic energy, the function used [eq. (3)] is similar to the one used by CNS and HADDOCK.

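$$E_{\mathrm{vdw}} = \sum_{n,m} 4\varepsilon \left[ \left( \frac{\sigma}{R} \right)^{12} - \left( \frac{\sigma}{R} \right)^{6} \right] SW(R, R_{\mathrm{on}}, R_{\mathrm{off}}) \tag{3}$$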

with

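$$SW = \begin{cases} 0 & \text{if } R > R_{\mathrm{off}} \\ \dfrac{\left(R^2 - R_{\mathrm{off}}^2\right)^2 \cdot \left(R^2 - R_{\mathrm{off}}^2 - 3\left(R^2 - R_{\mathrm{on}}^2\right)\right)}{\left(R_{\mathrm{off}}^2 - R_{\mathrm{on}}^2\right)^3} & \text{if } R_{\mathrm{off}} > R > R_{\mathrm{on}} \\ 1 & \text{if } R < R_{\mathrm{on}} \end{cases} \tag{4}$$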

where ε and σ parametrize the Lennard-Jones potential of identical atom types. Between different atom types, the following combination rule is used: $\sigma_{ij} = (\sigma_{ii} + \sigma_{jj})/2$ and $\varepsilon_{ij} = \sqrt{\varepsilon_{ii}\varepsilon_{jj}}$. The individual values were taken from the literature and are similar to those used by HADDOCK2.0 (see HADDOCK distribution, file “toppar/parallhdg5.3.pro”).20, 29 Ron and Roff were set to 6.5 Å and 8.5 Å, respectively.
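A single term of eq. (3), together with the switching function and the combination rule, might be sketched as below. The function names are hypothetical. One assumption should be noted explicitly: the switch is written with the factor (R_off² − R²), so that it runs continuously from 1 at R_on to 0 at R_off, as the boundary cases of eq. (4) require. The sign convention inside the middle case is therefore an interpretation, not a transcription.

```python
import math

R_ON, R_OFF = 6.5, 8.5  # switching distances in Å, as stated in the text


def switch(r, r_on=R_ON, r_off=R_OFF):
    """Switching function: 1 below r_on, 0 above r_off, smooth in between.

    Sign convention chosen so that the function is continuous at both
    boundaries (assumption, see lead-in).
    """
    if r >= r_off:
        return 0.0
    if r <= r_on:
        return 1.0
    off2, on2, r2 = r_off * r_off, r_on * r_on, r * r
    return (off2 - r2) ** 2 * (off2 - r2 - 3.0 * (on2 - r2)) / (off2 - on2) ** 3


def combine(sigma_ii, sigma_jj, eps_ii, eps_jj):
    """Combination rule from the text: arithmetic mean of sigma, geometric mean of epsilon."""
    return 0.5 * (sigma_ii + sigma_jj), math.sqrt(eps_ii * eps_jj)


def lj_pair_energy(r, sigma, eps):
    """One summand of eq. (3) for a single intermolecular atom pair."""
    x6 = (sigma / r) ** 6
    return 4.0 * eps * (x6 * x6 - x6) * switch(r)


# Hypothetical Lennard-Jones parameters for two atom types (illustration only).
sigma_ij, eps_ij = combine(3.0, 4.0, 0.1, 0.4)
r_min = sigma_ij * 2.0 ** (1.0 / 6.0)  # location of the Lennard-Jones minimum
well_depth = lj_pair_energy(r_min, sigma_ij, eps_ij)
```

At the Lennard-Jones minimum, which lies well inside R_on for these parameters, the pair energy equals −ε_ij because the switch is still 1 there.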

Pair-Potential

A knowledge based potential for the occurrence of amino acids in protein interfaces has been deduced from the nonhomologous complexes of the Mintz database.24 A score Sinter(aa1, aa2) has been calculated for each intermolecular amino acid pair in contact at an interface, according to the following equation:

$$S_{\mathrm{inter}}(aa_1, aa_2) = \log \left[ \frac{f_{\mathrm{pair}}(aa_1, aa_2)}{f_{\mathrm{surface}}(aa_1)\, f_{\mathrm{surface}}(aa_2)} \right] \tag{5}$$

Sinter is a typical log-odds-score. In (5) the denominator models the frequency of finding a contacting pair given the occurrence of amino acids at the surface of proteins. The numerator is the observed frequency deduced from protein interfaces in the Mintz database. Therefore, Sinter(aa1, aa2) is positive if a contacting pair of amino acids occurs in interfaces more frequently than expected, given the amino acid frequencies for the protein surface. Analogously, the score is negative if the amino acid pair is found less frequently than expected. For example, Sinter(Val, Trp) is 3.1 and Sinter(Glu, Asp) is −1.9, which are among the most extreme values. In the following, the pair-potential score of a complex is the sum of the scores of all contacting amino acid pairs.12
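The log-odds score of eq. (5) and its summation over an interface can be sketched in a few lines. The function names, the lookup-table layout and the example frequencies are hypothetical; the base of the logarithm is not stated in the text, so the natural logarithm used here is an assumption. The two table entries are the example values quoted above.

```python
import math


def log_odds(f_pair, f_surface_1, f_surface_2):
    """Eq. (5): observed pair frequency relative to the surface-based expectation.

    Natural logarithm assumed; the text does not specify the base.
    """
    return math.log(f_pair / (f_surface_1 * f_surface_2))


def complex_pair_score(contacts, score_table):
    """Sum the scores of all contacting residue pairs; pair order is irrelevant."""
    return sum(score_table[tuple(sorted(pair))] for pair in contacts)


# Score table restricted to the two example values quoted in the text.
score_table = {("TRP", "VAL"): 3.1, ("ASP", "GLU"): -1.9}
contacts = [("VAL", "TRP"), ("GLU", "ASP")]  # hypothetical interface contacts
total_score = complex_pair_score(contacts, score_table)

# Hypothetical frequencies: the pair occurs twice as often as expected.
enriched = log_odds(0.02, 0.1, 0.1)
```

Sorting each pair before the lookup means the table only needs one entry per unordered residue pair.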

Visualization Through Probability Distribution Plots

Figure 1. Overview of the work-flow to obtain probability distributions for native and false protein complexes: Protein complexes from the Mintz database23 are used as native complexes. False complexes were taken from data of the CAPRI scoring competition according to the definitions provided by CAPRI. For all complexes three different analysis functions were used, namely van der Waals energies, electrostatic energies, and amino acid based pair-potentials. Resulting values were rescaled for reasons of data comparison. A support vector machine (SVM) was trained with the different scores and a measure related to the distance of every complex to the separating hyperplane was calculated. This data was used to calculate a new set of probability distributions for the two classes N and F. The data flow of native and false complexes is represented by blue and red arrows, respectively.

As the functions above are very diverse in their physical meaning, rescaling of the individual functions was performed for easier visual inspection. Therefore, in all data and each analysis function the point where the value of the analysis function reached zero was set to the origin. Going in the direction of more favorable values, a maximum number of 1000 was assigned to the point where the probability density values for the distributions of the native and false complexes both approached a value of zero, i.e., they were both below 0.1% of the largest obtained probability density value of this analysis function. Using the step size that was derived in the calculation above and the same cutoff criteria, rescaling was also performed in the opposite direction. Using the rescaled data, probability distributions were obtained for each analysis function for the groups of native and false complexes. Curves were calculated and smoothed using the following Kernel Density Estimator:

$$D(x) = \sum_{n} \frac{1}{\sigma_n \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu_n}{\sigma_n} \right)^2 \right] \tag{6}$$

From every data-point n and its m neighbors the mean µn and the variance σn were calculated. These values were used to derive a Gauss function for the corresponding data point and all Gaussians were added to produce the density. The parameter m (number of neighbors) determines the degree of smoothing and was set to 200.
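The neighbor-based estimator of eq. (6) could be sketched as follows. Several details are assumptions, since the text does not fix them: the m neighbors are taken as the m/2 nearest values on each side in sorted order, σ_n enters the Gaussian as the population standard deviation of that window, and the sum is left unnormalized exactly as eq. (6) is written. The function name is hypothetical.

```python
import math
import statistics


def neighbor_kde(data, m=200):
    """Build D(x) of eq. (6) from one Gaussian per data point.

    For each point, mu_n and sigma_n are computed from the point and its
    neighbors in sorted order (m // 2 on each side; assumption).
    """
    points = sorted(data)
    half = m // 2
    gaussians = []
    for i in range(len(points)):
        window = points[max(0, i - half): min(len(points), i + half + 1)]
        mu = statistics.mean(window)
        sigma = statistics.pstdev(window)  # used as the Gaussian width (assumption)
        if sigma > 0.0:
            gaussians.append((mu, sigma))

    def density(x):
        # Sum of Gaussians as in eq. (6); divide by len(gaussians) to normalize.
        norm = math.sqrt(2.0 * math.pi)
        return sum(math.exp(-0.5 * ((x - mu) / s) ** 2) / (s * norm)
                   for mu, s in gaussians)

    return density


# Tiny symmetric toy data set with a small window, for illustration only.
toy_density = neighbor_kde([0.0, 1.0, 2.0, 3.0, 4.0], m=2)
```

Larger values of m widen each window, which increases σ_n and therefore the degree of smoothing, in line with the role of m described in the text.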

The resulting rescaled probability distributions are shown in Figure 2. Analysis of the diagrams showed that in all cases distinct differences were obtained between the distributions of the native and false complexes. For reasons of comparison also distributions obtained from near native complex structures of the latest CAPRI scoring competitions were included. Note that the latter distributions were not used for any calculations.

Calculation of Probabilities with an SVM

Figure 2. Probability distribution plots for electrostatic energy (top left), van der Waals energy (top right) and amino acid based pair-potentials (bottom). The curves for the native complexes (DN) are plotted in blue, those for the false complexes in red (DF). For reasons of comparison also distributions obtained for the near native structures of the CAPRI test data are included (green). All values are rescaled, see Methods section for details.

To combine the three calculated scores to one global probability measure, an SVM was trained using the libSVM library.30 For training, the e, v, and k values obtained from the complexes of the Mintz database and from the selected false complexes of the CAPRI set were used. In all cases a radial basis function kernel was utilized.

Figure 3. Probability distributions of the obtained SVM model. The distributions of native (DN) and false (DF) complexes are plotted in blue and red, respectively.

The standard output of an SVM is a yes/no answer. In our case, the SVM decides whether the complex belongs to the group of the native complexes or not. However, as mentioned before, the aim of PROCOS is to calculate a probability-like measure that a complex belongs to the class of native complexes. For this, after training, a measure related to the distance of every complex to the separating hyperplane is computed. This measure is called the decision value. Based on this data, probability distributions DN and DF are calculated as described earlier. Figure 3 shows the corresponding distributions for native and false complexes.

For a newly investigated complex the e, v, and k values are calculated, and based on this data, the position relative to the separating hyperplane is calculated according to the previously trained model. Using the position relative to the separating hyperplane and the distributions DN and DF shown in Figure 3 probabilities p(S|N) and p(S|F) are estimated. With eq. (1) and the a priori probabilities p(N) and p(F) the PROCOS probability-like measure p(N|S) that this complex belongs to the class of native complexes is computed.
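The scoring path for a new complex, from (e, v, k) values to a decision value and then to p(N|S), can be sketched with a toy model. Nothing below reproduces the trained PROCOS model: the support vectors, coefficients, bias, kernel width and the two Gaussian stand-ins for DN and DF are all hypothetical, and the decision function is the generic kernel-SVM form rather than a call into libSVM.

```python
import math


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def decision_value(features, support_vectors, dual_coefs, bias, gamma):
    """Generic kernel-SVM decision value: weighted kernel sum plus bias."""
    return sum(c * rbf_kernel(sv, features, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias


def gaussian_density(mu, sigma):
    """Toy stand-in for an estimated distribution DN or DF (assumption)."""
    def pdf(s):
        return math.exp(-0.5 * ((s - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return pdf


def probability_like_measure(s, d_native, d_false, prior_native=0.5):
    """Eq. (1) evaluated at decision value s."""
    num = prior_native * d_native(s)
    return num / (num + (1.0 - prior_native) * d_false(s))


# Hypothetical two-support-vector model on rescaled (e, v, k) features.
support_vectors = [(1.0, 1.0, 1.0), (-1.0, -1.0, -1.0)]
dual_coefs = [1.0, -1.0]
d_native, d_false = gaussian_density(1.0, 0.5), gaussian_density(-1.0, 0.5)

s_new = decision_value((1.0, 1.0, 1.0), support_vectors, dual_coefs, bias=0.0, gamma=0.5)
p_new = probability_like_measure(s_new, d_native, d_false)
```

Keeping the decision value and the density lookup as separate steps mirrors the description above: the SVM supplies only the position relative to the hyperplane, and the probability-like measure comes from the distributions and eq. (1).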

User Interface

We have implemented a web interface, PROCOS (http://compdiag.uni-regensburg.de/procos), that allows the analysis of binary protein complexes. For data processing single or multiple models are uploaded as a pdb-file. After parsing the input data, values of the analysis functions are calculated and displayed together with the corresponding probability distribution plots and the actual values marked by colored bars within them. This data is provided as additional information to the calculated probability measures.

Validation

To validate the performance of PROCOS, results were compared to those from ZRANK 2.0 and DFIRE. ZRANK is a well established program for the analysis and reranking of docked complexes.31 Like PROCOS, it uses a combination of different scoring terms, namely van der Waals, electrostatic and desolvation energies, whereas DFIRE uses an all-atom knowledge-based potential for decision making.13 Here, the newest version dDFIRE was used.32 DFIRE is available as web server (http://sparks.informatics.iupui.edu/yueyang/DFIRE/dDFIRE-service) as well as a stand-alone program.

Results and Discussion

Analysis of Native Complexes

One of the potential goals of PROCOS includes the comparison between different targets. Therefore, for complexes of similar quality but stemming from different source proteins, comparable values should be obtained. To further investigate this claim we chose to analyze a set of experimentally solved native complexes, since for most of these structures a comparable high quality can be assumed. For this task, 95 native complexes were selected in a random manner from the PDB. All of them were, on the sequence level, less than 25% identical to an entry in our training set. These complexes were analyzed by PROCOS, ZRANK, and DFIRE. Results show that PROCOS yielded for 87 of these complexes probability-like values p(N|S) between 100% and 50% and only for 8 of them lower values. The average probability value obtained for all native test structures amounts to 85.2%. In addition to the global probability-like values, PROCOS provides the values of the individual analysis functions to allow for a detailed evaluation of the results. Analyzing the complex showing the lowest probability value of 7.9%, it became apparent that this complex shows very high van der Waals energies, indicating a possible problem with the experimental structure determination. There were a few other complexes with low probability values; in these cases this is mostly due to an unfavorable pair potential score.

Figure 4. Receiver operating characteristics (ROC) obtained for theDockground decoy set employing PROCOS (blue), ZRANK (red), andDFIRE (yellow).



This data clearly shows the advantage of using a probability based analysis scheme, since the values obtained for a set of very different complexes are directly comparable with each other. The probability values between 100% and 50% that were obtained by PROCOS for native structures provide for the analysis of docking runs an expected upper-range of values that may be reached using high resolution docking algorithms. When using more conventional scoring schemes like ZRANK and DFIRE, for the same set of 95 native complex structures a range of scoring values between −814 and −14 (ZRANK) and −234 and 301 (DFIRE) is obtained. These values are potentially more difficult to interpret than the probability-like values obtained by PROCOS. To obtain these results, it was not necessary to divide the complexes into different groups, e.g., enzyme-inhibitor, antibody-antigen and others, as proposed in the literature33, 34 and adapted by several scoring approaches (e.g., refs. 17, 19). This enhances the usability and general applicability of PROCOS.

Investigated Decoy Sets

The main application of PROCOS is the analysis of the many models generated by a docking approach. We tested PROCOS’ performance on three different sets of test data. First, decoys from all 40 binary targets of the Dockground decoy set35 that are listed in Tables 1 and 3 were investigated. For all complexes the sequence identity between targets is less than 30%. Decoys were generated using the GRAMM-X FFT docking method, where top scoring predictions were subjected to conjugate gradient energy minimization using a smoothed Lennard-Jones potential.36 The decoy set contains for each target the 100 lowest energy non-native structures and at least one near-native structure. Second, decoys generated employing RosettaDock37 were obtained from the website of the Gray lab (http://graylab.jhu.edu/docking/decoys/unbound_global.tgz). For each target the top 200 structures from global searches starting from unbound starting structures with rebuilt sidechains were available. To ensure that a reasonable amount of near native structures was available, decoys for each target were screened for near native structures leading to the selection of the decoys of 14 enzyme/inhibitor complexes, listed in Table 4. Next, decoys of the last CAPRI scoring competition were selected for testing. For each target decoys were generated by many different predictor groups using a large variety of algorithms. Note that for the CAPRI targets 25% of the complexes that were marked as incorrect were used for training of PROCOS. Therefore, these complexes were excluded from testing. For CAPRI targets T36 and T38 no near native structures were present and therefore, these targets were also excluded leading to the selection of nine targets (T29, T32, T35, T37_1, T37_2, T39, T40_CA, T40_CB, and T41). Due to high symmetry between the two chains in the ligand of T37 and their close proximity to each other and the interface, two assessments were performed. For target 40 there are two possible interfaces at opposite sides of the receptor. Therefore, two examinations were performed (see CAPRI homepage for details25). In the following evaluation we distinguished only between two groups of complexes: incorrect structures and near native structures. Those that are acceptable or better according to criteria proposed by CAPRI5 are termed as near native structures. All structures were reranked by PROCOS, ZRANK, and DFIRE. The results are shown in Tables 1–6 and Figure 4 and are discussed in the following.

Analysis of Near Native Structures

First, results obtained for the experimentally solved native structures were compared to those of the near native structures from docking. When looking at the average probability values that were obtained by PROCOS for the near native structures of the various complexes of the Dockground decoy set (Table 1), a range between 55.9% and 0.3% with an average value over all targets of 10.4% ± 11.8% (Table 2) can be seen. The first obvious observation is that the average values of the near native structures of the Dockground decoys are considerably smaller than those calculated for the experimentally solved native complex structures. As noted earlier, for 87 out of the 95 experimentally solved structures probability values between 100% and 50% were achieved. This is a clear indication that the average quality of the near native Dockground structures is still well below the quality of experimentally solved structures. When analyzing the Rosetta decoys, considerably higher average values of 65.1% ± 20.6% were obtained by PROCOS for the near native structures (Table 2). These values are already in the range of those obtained for the experimental structures. These data indicate a considerable influence of the used docking algorithm.

To further investigate the average quality of near native structures obtained from docking runs, the CAPRI test data was employed. Here, average values of 27.5% ± 23.8% were computed by PROCOS for the near native structures. These figures are somewhat in between those obtained for the Dockground and Rosetta data, which might be explained by the fact that the CAPRI data were generated by numerous different algorithms. In addition, probability distributions for the near native CAPRI decoys were computed. These curves are indicated in green in Figure 2. Analysis of Figure 2 shows that for the van der Waals and electrostatic energies as well as for the pair-potentials the green curves are somewhere between the curves of the native and false complexes. Especially for the pair-potentials, the curve of the near native complexes is far apart from the curve of the native ones and almost identical to the curve obtained for the false complex structures. This data is a further indication that often still considerable differences exist between experimentally solved native complex structures and structures termed near native that were generated by docking. This observation is probably one of the main causes why in many cases scoring approaches have difficulties to single out near native structures.

Threshold Selection

One of the main problems when analyzing docking runs is the selection of an appropriate threshold for the used scoring function to pick structures that should be considered for further analysis. We analyzed whether it is possible to achieve such a goal with PROCOS.

When comparing the average PROCOS probability-like values for the near native structures of the Dockground decoy set (Table 1) with those of the worst 25% solutions, with a corresponding range between 0.1 and 0.2%, it is apparent that the probability-like values for these two sets of structures do not overlap. Therefore, setting of a global threshold to safely remove a considerable subset of the wrong solutions seems feasible. The advantage of such a global threshold is that it may be selected a priori, independent of the investigated target.



Table 1. Analysis of Average Results of the Dockground Decoy Set.

Average results near Average results 25%native solutions worst solutions

Complex PROCOS ZRANK DFIRE PROCOS ZRANK DFIRE

1avw_A_B 20.62 623.10 −20.27 0.21 1522.81 −14.001bui_A_C 20.85 455.49 −10.27 0.17 1805.63 −10.021bui_B_C 1.07 623.70 −16.36 0.18 1814.42 −12.011bvn_P_T 24.40 1050.17 −30.70 0.20 1707.82 −16.881cho_E_I 29.81 283.83 −20.73 0.20 1047.46 −12.601dfj_E_I 36.36 3253.77 −11.83 0.10 2786.44 −8.641e96_A_B 0.30 627.08 −18.29 0.19 1597.88 −9.511ewy_A_C 6.17 356.14 −15.42 0.17 1801.21 −11.131f6m_A_C 0.28 970.85 −12.21 0.20 1878.92 −11.411fm9_A_D 18.30 903.57 −28.02 0.17 1841.94 −9.921g6v_A_K 1.00 399.81 −10.85 0.18 1566.24 −10.411gpq_A_D 11.01 431.02 −14.61 0.22 1220.22 −9.221gpw_A_B 1.24 1672.01 −20.09 0.18 1944.96 −8.771he1_A_C 2.48 799.12 −14.87 0.15 1791.32 −10.071he8_A_B 55.93 118.37 −10.28 0.11 2489.49 −11.421ku6_A_B 2.30 733.64 −13.19 0.19 1564.13 −12.771ma9_A_B 8.71 1269.32 −28.27 0.10 2864.80 −15.721nbf_A_D 14.35 220.88 −10.09 0.14 1913.58 −4.781oph_A_B 3.40 510.61 −14.54 0.14 2016.66 −11.831ppf_E_I 22.21 37.68 −18.52 0.24 1277.76 −11.471r0r_E_I 1.85 476.91 −17.25 0.23 1304.52 −10.231s6v_A_B 2.29 261.11 −8.82 0.15 1704.70 −4.801t6g_A_C 7.86 1142.53 −28.15 0.20 1843.32 −18.911tmq_A_B 5.08 990.57 −27.09 0.20 1621.02 −15.851tx6_A_I 1.11 1223.19 −18.24 0.18 2158.75 −18.001u7f_A_B 17.99 366.85 −17.46 0.20 1729.67 −15.331ugh_E_I 3.53 1388.10 −26.31 0.20 1636.86 −8.071w1i_A_F 1.06 539.27 −8.38 0.10 2546.49 −10.501wq1_R_G 0.30 1151.19 −14.41 0.16 2426.31 −10.401xd3_A_B 17.12 625.74 −17.48 0.19 1685.97 −6.011yvb_A_I 8.11 65.21 −19.16 0.21 1630.27 −11.492a5t_A_B 0.26 1583.81 204.80 0.12 2394.10 213.732bkr_A_B 0.77 1217.61 −14.90 0.19 1638.06 −6.192btf_A_P 18.89 584.26 −18.13 0.13 1799.41 −12.042ckh_A_B 6.94 160.95 −12.39 0.18 1528.02 −4.722fi4_E_I 11.64 649.15 −14.86 0.22 1144.50 −12.472goo_A_C 0.27 714.86 −22.68 0.21 1205.66 −15.852sni_E_I 8.35 627.15 −15.50 0.22 1317.28 −9.563fap_A_B 9.53 201.55 −11.76 0.23 1087.03 −10.393sic_E_I 12.24 155.81 −17.21 0.19 1381.82 −13.95

For all targets, the average scores (ZRANK and DFIRE) and probability values in % (PROCOS) of the near native solutions according to CAPRI criteria were calculated. The corresponding values are shown in the first three columns. The data contained in the three columns to the right were calculated by taking the mean of the 25% worst solutions of a target according to the measures calculated by PROCOS, ZRANK, and DFIRE.

The average ZRANK scores of the near native solutions of the various targets of the Dockground decoy set (Table 1) lie between 37.7 and 3253.8 with an average of 736.6 ± 588.0, while for the 25% worst solutions they range from 1047.5 to 2864.8 with an average of 1755.9 ± 441.4 (Table 2). For DFIRE, a range between −30.7 and 204.8 with an average of −11.6 ± 35.6 is computed for the near native solutions, while for the 25% worst solutions there is a range between −18.9 and 213.7 with a corresponding average of −5.6 ± 35.7 (Tables 1 and 2). Especially for DFIRE, considerable overlap exists between the score values obtained for the near native structures and the 25% worst structures, which makes setting a target-independent threshold for selection purposes more difficult.
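The overlap argument can be checked with simple interval arithmetic. The following Python sketch is only an illustration and not part of PROCOS; the helper name `interval_overlap` is hypothetical. The ZRANK and DFIRE ranges are the ones quoted in the text, and the PROCOS ranges are the minima and maxima read off the PROCOS columns of Table 1.

```python
def interval_overlap(a, b):
    """Return the width of the overlap of two closed intervals (0.0 if disjoint)."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return max(0.0, high - low)


# (min, max) of the per-target averages: near native vs. 25% worst solutions.
RANGES = {
    "PROCOS": ((0.26, 55.93), (0.10, 0.24)),       # read off Table 1, in %
    "ZRANK": ((37.7, 3253.8), (1047.5, 2864.8)),   # quoted in the text
    "DFIRE": ((-30.7, 204.8), (-18.9, 213.7)),     # quoted in the text
}

overlaps = {
    method: interval_overlap(near_native, worst)
    for method, (near_native, worst) in RANGES.items()
}

for method, width in overlaps.items():
    print(f"{method}: overlap width = {width:.2f}")
```

With these inputs the ZRANK and DFIRE intervals overlap, whereas the two PROCOS intervals of Table 1 are disjoint. Note that this compares only ranges of per-target averages, not the distributions of individual decoys.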

Table 2. Analysis of Average Results with Respect to Used Docking Algorithm.

            Average results near native solutions                   Average results 25% worst solutions

Decoy set   PROCOS           ZRANK             DFIRE               PROCOS          ZRANK              DFIRE

Dockground  10.40 ± 11.78    736.65 ± 587.98   −11.62 ± 35.57      0.18 ± 0.04     1755.94 ± 441.38   −5.59 ± 35.73
Rosetta     65.05 ± 20.60    −195.98 ± 40.34   −751.72 ± 303.64    23.86 ± 12.83   −94.11 ± 20.28     −747.63 ± 303.78
CAPRI       27.49 ± 23.84    147.80 ± 407.77   −593.18 ± 318.18    0.26 ± 0.04     1109.01 ± 590.77   −160.10 ± 165.68

Analysis of average results with respect to the used docking algorithm. For all used targets of a given decoy set, average score values and corresponding standard deviations are given according to the measures calculated by PROCOS, ZRANK, and DFIRE. The Dockground decoy set was generated using the GRAMM-X docking approach, the Rosetta decoy set was obtained by RosettaDock, and the decoys of the CAPRI targets were generated by a multitude of different algorithms. Targets which were used for calculating average values are summarized in Tables 3–5 for the Dockground, Rosetta, and CAPRI targets, respectively.

Next, it was investigated whether it is possible to set a threshold for selection purposes independent of the docking method used. As mentioned earlier, all decoy sets investigated were obtained by different methods. Table 2 summarizes the average score values and standard deviations obtained for the near native solutions and the 25% worst solutions for all targets of one decoy set combined. In general, the same trend between near native structures and 25% worst solutions that was observed for the Dockground decoy set is also visible for the other decoy sets. However, for PROCOS as well as for ZRANK and DFIRE, the results strongly depend on the method used for decoy computation. The lowest, most favorable scores (ZRANK and DFIRE) and the highest probability values (PROCOS) were obtained for the decoys generated by RosettaDock, while the decoys generated by GRAMM-X yielded considerably higher scores and lower probability values. The values obtained for the CAPRI test data are somewhat in between. This is true for both the near native structures and the 25% worst solutions. Further analysis shows that, for PROCOS, ZRANK, and DFIRE alike, the average values obtained for the 25% worst solutions of the Rosetta data are more favorable than the averages calculated for the near native structures of the Dockground set. These results clearly show that the docking method used has to be taken into account when setting thresholds a priori.
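The dependence on the docking method can be made concrete with the mean values of Table 2. The sketch below is illustrative only: the helper names and the PROCOS cut-off of 5% are hypothetical and do not come from the paper. It assumes that higher PROCOS values and lower ZRANK and DFIRE scores are more favorable, as stated in the text.

```python
# Mean values copied from Table 2 as (near native, 25% worst) per decoy set.
TABLE2_MEANS = {
    "PROCOS": {"Dockground": (10.40, 0.18), "Rosetta": (65.05, 23.86), "CAPRI": (27.49, 0.26)},
    "ZRANK": {"Dockground": (736.65, 1755.94), "Rosetta": (-195.98, -94.11), "CAPRI": (147.80, 1109.01)},
    "DFIRE": {"Dockground": (-11.62, -5.59), "Rosetta": (-751.72, -747.63), "CAPRI": (-593.18, -160.10)},
}

# PROCOS is probability-like (higher is better); ZRANK and DFIRE are scores (lower is better).
HIGHER_IS_BETTER = {"PROCOS": True, "ZRANK": False, "DFIRE": False}


def more_favorable(method, a, b):
    """Return True if value a is more favorable than value b for this method."""
    return a > b if HIGHER_IS_BETTER[method] else a < b


def separates(method, threshold, near_native, worst):
    """True if the threshold lies between the near native mean and the worst mean."""
    return more_favorable(method, near_native, threshold) and more_favorable(method, threshold, worst)


# Is the mean of the 25% worst Rosetta decoys better than the Dockground near native mean?
rosetta_worst_beats_dockground_native = {
    method: more_favorable(method, sets["Rosetta"][1], sets["Dockground"][0])
    for method, sets in TABLE2_MEANS.items()
}

# Hypothetical global PROCOS cut-off of 5% (illustrative value only).
procos_cutoff = 5.0
cutoff_works = {
    decoy_set: separates("PROCOS", procos_cutoff, *means)
    for decoy_set, means in TABLE2_MEANS["PROCOS"].items()
}

print(rosetta_worst_beats_dockground_native)
print(cutoff_works)
```

In this example the hypothetical cut-off separates the mean values for the Dockground and CAPRI sets but not for the Rosetta set, where even the 25% worst decoys lie above it on average.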

Influence of Priors

As explained in the Methods section, a PROCOS probability value cannot yet be interpreted as a real probability that a given complex structure is close to its native form, because appropriate priors are difficult to define. Currently PROCOS uses the approximation p(N) = p(F) = 0.5 in eq. (1). It was therefore also investigated to what degree the results are influenced by this choice. Employing the values derived from the data of the CAPRI scoring competition, 0.062 for p(N) and 0.938 for p(F), the calculations were repeated for target 1dfj_E_I of the Dockground decoy set. The average values obtained by PROCOS for the near native and the 25% worst solutions are then 6.17% and 0.01%, respectively. These probability-like values would of course be nearer to real probabilities that native complexes are found, but they would depend significantly on the arbitrarily chosen dataset from which the priors were derived. A comparison with the previous values of 36.36% for the near native structures and 0.10% for the 25% worst solutions (Table 1) shows that the selection of the priors causes a relatively large shift, but that in both cases a clear gap is visible between the near native structures and the 25% worst solutions. Also, the selection of the priors has only a negligible influence on the ranking performance of PROCOS (data not shown). As a consequence, we chose to keep the approximation p(N) = p(F) = 0.5.
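The effect of the priors can be illustrated with a short sketch. It assumes that eq. (1) has the usual two-class Bayes form p(N|x) = p(x|N)p(N) / (p(x|N)p(N) + p(x|F)p(F)); this form is an assumption here, since the equation itself is given in the Methods section. The function name `reweight_posterior` is hypothetical and the code is not taken from PROCOS.

```python
def reweight_posterior(p_equal, prior_native):
    """Convert a posterior computed with p(N) = p(F) = 0.5 to another prior p(N).

    With equal priors the posterior odds equal the likelihood ratio
    p(x|N) / p(x|F), so the new posterior follows from that ratio alone.
    """
    if not 0.0 <= p_equal <= 1.0:
        raise ValueError("p_equal must lie in [0, 1]")
    if not 0.0 < prior_native < 1.0:
        raise ValueError("prior_native must lie in (0, 1)")
    if p_equal == 1.0:
        return 1.0
    likelihood_ratio = p_equal / (1.0 - p_equal)
    numerator = likelihood_ratio * prior_native
    return numerator / (numerator + (1.0 - prior_native))


# Example for single decoys (hypothetical input values), priors from the CAPRI data.
decoy_values = [0.001, 0.05, 0.3636, 0.80]
reweighted = [reweight_posterior(p, 0.062) for p in decoy_values]
for before, after in zip(decoy_values, reweighted):
    print(f"{100 * before:6.2f}% -> {100 * after:6.2f}%")
```

Under this assumption the mapping is strictly increasing, so the order of decoys within a target does not change, which would be consistent with the negligible influence on ranking reported above. The sketch applies to individual decoys; because the mapping is nonlinear, it should not be expected to reproduce the averaged values quoted in the text when applied to an average.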

Reranking of Decoys

As shown earlier, setting a global threshold for the selection of decoys for further analysis is often difficult. In many applications, the 10 or 20 best structures according to their scores are therefore selected instead. The results of the rerankings by PROCOS, ZRANK, and DFIRE for the Dockground decoy set are given in Table 3, which shows the number of near native solutions found in the top 10, top 20, and top 50 structures. For method comparison, the number of near native structures within the 10 top ranked structures was counted. If two methods found the same number, the following line of Table 3 (top 20 ranked solutions) was evaluated. If the top two lines gave no distinction, the compared methods were considered equal. Inspection of Table 3 shows that, in comparison with ZRANK, PROCOS performs better in 23 cases, both methods perform equally in 3 cases, and ZRANK outperforms PROCOS in 14 cases. In comparison with DFIRE, PROCOS performs better in 29 cases, equally in 4 cases, and worse in 7 cases. Adding up the near native structures found in the top 10 ranked structures for all targets, PROCOS detects 159 structures, ZRANK 105, and DFIRE 98 (Table 6). To further investigate the performance of the different methods, receiver operating characteristics (ROCs) were calculated for the Dockground decoy set. Analysis of the ROC curves in Figure 4 shows that PROCOS obtained an area under the curve (AUC) of 0.71, followed by DFIRE with an AUC of 0.62 and ZRANK with an AUC of 0.58. These data show that for the Dockground decoy set PROCOS performs well in terms of reranking decoys.
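The comparison rule and the AUC can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the counts are the (top 10, top 20) values of the first four Dockground targets in Table 3, and the AUC is computed with the standard rank-based definition on made-up toy values. How the decoys were pooled for the ROC analysis of Figure 4 is not specified here, so the AUC part is a generic sketch.

```python
def compare(counts_a, counts_b):
    """Compare two methods by near native counts, e.g. (top 10, top 20).

    Earlier entries take precedence; later entries only break ties.
    Returns 1 if A is better, -1 if B is better, and 0 if both are equal.
    """
    for a, b in zip(counts_a, counts_b):
        if a != b:
            return 1 if a > b else -1
    return 0


def tally(table, method_a, method_b):
    """Return (wins, ties, losses) of method_a against method_b over all targets."""
    results = [compare(row[method_a], row[method_b]) for row in table.values()]
    return results.count(1), results.count(0), results.count(-1)


def roc_auc(scores, is_near_native):
    """Rank-based AUC, assuming that higher scores are more favorable.

    This is the fraction of (near native, other) pairs in which the near
    native decoy has the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, is_near_native) if y]
    neg = [s for s, y in zip(scores, is_near_native) if not y]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# (top 10, top 20) counts for the first four targets of Table 3.
TABLE3_SUBSET = {
    "1avw_A_B": {"PROCOS": (4, 5), "ZRANK": (2, 3), "DFIRE": (0, 0)},
    "1bui_A_C": {"PROCOS": (5, 8), "ZRANK": (1, 4), "DFIRE": (0, 0)},
    "1bui_B_C": {"PROCOS": (0, 5), "ZRANK": (1, 4), "DFIRE": (0, 0)},
    "1bvn_P_T": {"PROCOS": (7, 8), "ZRANK": (0, 2), "DFIRE": (10, 12)},
}

print(tally(TABLE3_SUBSET, "PROCOS", "ZRANK"))
print(tally(TABLE3_SUBSET, "PROCOS", "DFIRE"))

# Toy AUC example with made-up values.
toy_auc = roc_auc([0.9, 0.8, 0.3, 0.1], [True, False, True, False])
print(toy_auc)
```

For ZRANK and DFIRE, where lower scores are more favorable, the scores would have to be negated before they are passed to `roc_auc`. For target 1bui_B_C the top 10 counts of PROCOS and DFIRE are tied, so the top 20 line decides the comparison.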

For the 14 targets of the Rosetta decoy set, results are shown in Table 4, and methods were compared as described earlier. For these targets PROCOS performs better than ZRANK in 1 case, while the opposite is true in 10 cases, and in 3 cases both methods perform equally. The comparison with DFIRE shows a better performance of PROCOS for 4 targets, while in 5 cases DFIRE outperforms



Table 3. Analysis of Dockground Targets.

Complex       1avw_A_B               1bui_A_C               1bui_B_C               1bvn_P_T
              PROCOS ZRANK  DFIRE    PROCOS ZRANK  DFIRE    PROCOS ZRANK  DFIRE    PROCOS ZRANK  DFIRE
Top 10        4      2      0        5      1      0        0      1      0        7      0      10
Top 20        5      3      0        8      4      0        5      4      0        8      2      12
Top 50        8      5      5        10     9      0        6      6      0        10     6      12
Near natives  10 of 110              10 of 110              10 of 110              12 of 110

Complex 1cho_E_I 1dfj_E_I 1e96_A_B 1ewy_A_C


Complex 1f6m_A_C 1fm9_A_D 1g6v_A_K 1gpq_A_D


Complex 1gpw_A_B 1he1_A_C 1he8_A_B 1ku6_A_B


Complex 1ma9_A_B 1nbf_A_D 1oph_A_B 1ppf_E_I


Complex 1r0r_E_I 1s6v_A_B 1t6g_A_C 1tmq_A_B


Complex 1tx6_A_I 1u7f_A_B 1ugh_E_I 1w1i_A_F


Complex 1wq1_R_G 1xd3_A_B 1yvb_A_I 2a5t_A_B


Complex 2bkr_A_B 2btf_A_P 2ckh_A_B 2fi4_E_I


Complex 2goo_A_C 2sni_E_I 3fap_A_B 3sic_E_I


Reranking results of all 40 binary targets from the Dockground decoy set that were generated using the GRAMM-X docking approach. For every target the number of near native structures in the top 10, top 20, and top 50 ranked solutions is shown, and the last line contains the total number of near native structures for each target.



Table 4. Analysis of Rosetta Targets.


Complex 1acb 1brs 1cgi 1cse


Complex 1fss 1mah 1tgs 1ugh


Complex 2ptc 2sic 2sni 1stf


Complex       1tab                   2tec
              PROCOS ZRANK  DFIRE    PROCOS ZRANK  DFIRE
Top 10        0      0      0        0      1      1
Top 20        0      0      0        0      2      2
Top 50        0      1      2        1      9      9
Near natives  44 of 200              18 of 200

Results achieved by PROCOS, ZRANK, and DFIRE in terms of reranking Rosetta targets. For every target the number of near native structures in the top 10, top 20, and top 50 ranked solutions is shown, and the last line contains the total number of near native structures for each target.

Table 5. Analysis of CAPRI Targets.


Complex       T29                    T32                    T35                    T37_1
              PROCOS ZRANK  DFIRE    PROCOS ZRANK  DFIRE    PROCOS ZRANK  DFIRE    PROCOS ZRANK  DFIRE
Top 10        1      0      0        0      0      0        0      0      0        0      0      1
Top 5%        58     28     26       0      0      0        0      0      0        0      0      3
Top 10%       86     69     66       0      0      1        0      0      0        1      5      3
Top 25%       92     99     109      0      0      9        0      0      2        9      9      13
Top 50%       108    132    137      6      6      15       0      0      2        20     20     25
Near natives  143 of 1607            15 of 386              2 of 351               34 of 843

Complex       T37_2                  T39                    T40_CA                 T40_CB
              PROCOS ZRANK  DFIRE    PROCOS ZRANK  DFIRE    PROCOS ZRANK  DFIRE    PROCOS ZRANK  DFIRE
Top 10        0      0      0        0      0      0        2      7      0        2      2      6
Top 5%        0      0      0        0      0      0        4      29     17       52     35     23
Top 10%       0      1      0        0      0      0        8      63     34       79     59     33
Top 25%       1      1      5        0      0      0        81     180    95       91     84     54
Top 50%       5      2      6        0      0      0        202    245    147      106    96     67
Near natives  7 of 843               4 of 1049              346 of 1603            123 of 1603

Complex       T41
              PROCOS ZRANK  DFIRE
Top 10        5      1      4
Top 5%        22     8      14
Top 10%       40     36     19
Top 25%       101    128    43
Top 50%       192    186    178
Near natives  295 of 893

Results achieved by PROCOS, ZRANK, and DFIRE in terms of reranking recent CAPRI targets. The number of near native structures found in the top 10 ranked structures is given in the first line (this is what would have been submitted in a CAPRI participation). Since the number of available decoys greatly varies between targets, the number of near native structures in the top 5%, top 10%, top 25%, and top 50% are provided for better comparison between targets. The last line shows the total number of near native structures together with the total number of decoys.



Table 6. Total Number of Near Native Structures Found in the Top 10 Structures of All Targets.

            Detected near native structures

Decoy set   PROCOS   ZRANK   DFIRE

Dockground     159     105      98
Rosetta         37      62      32
CAPRI           10      10      11

Total number of near native structures found by PROCOS, ZRANK, and DFIRE in the top 10 structures of all targets of the Dockground, Rosetta, and CAPRI decoy sets.

PROCOS. For 5 targets both methods perform equally. When analyzing the total number of near native structures found in the top 10 structures of all targets, ZRANK detects 62 structures, followed by PROCOS with 37 structures and DFIRE with 32 structures (Table 6). These results show that for the Rosetta decoys ZRANK shows the best performance in terms of reranking, while PROCOS and DFIRE perform almost equally, with a slight advantage for PROCOS when considering the total number of near native decoys in the top 10 structures.

For the decoys of the CAPRI scoring competition, the results of the rerankings obtained by PROCOS, ZRANK, and DFIRE are given in Table 5. As for the other decoy sets, the number of near native structures found in the top 10 ranked structures is given in the first line. Since the number of available decoys varies greatly between targets, the numbers of near native structures in the top 5%, top 10%, top 25%, and top 50% are provided to allow for a meaningful comparison between different targets. Note that the decoys for targets 36 and 38 contained no near native structures and were therefore omitted from Table 5. As noted in the description of the different decoy sets, two evaluations were performed for target T37 and for target T40. As described earlier, the first two lines of Table 5 were evaluated for method comparison. Analysis of the results shows that PROCOS outperforms ZRANK for 3 targets, whereas ZRANK performs better for 1 target, and for the remaining 5 targets PROCOS and ZRANK perform equally well. In comparison with DFIRE, PROCOS performs better in 3 cases and worse in 2 cases, and for the remaining 4 targets both methods perform equally. Inspecting the total number of near native structures found in the top 10 structures of all targets shows that DFIRE detects 11 structures, followed by ZRANK and PROCOS with 10 each (Table 6). For targets T35 and T39 neither PROCOS, nor ZRANK, nor DFIRE performed well; however, only two near native structures were available for T35 and only four for T39.
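Why percentage cut-offs are more comparable than a fixed top 10 can be seen from the decoy totals in Table 5. The following sketch is illustrative only: the function names are hypothetical, and rounding the cut-off down is an assumption, since the rounding convention used for Table 5 is not stated.

```python
def top_percent_cutoff(n_decoys, percent):
    """Number of decoys in the top `percent` of a ranked list (rounded down)."""
    return n_decoys * percent // 100


def near_natives_in_top_percent(ranked_is_near_native, percent):
    """Count near native decoys in the top `percent` of a best-first ranking."""
    cutoff = top_percent_cutoff(len(ranked_is_near_native), percent)
    return sum(ranked_is_near_native[:cutoff])


# Total number of decoys per target, taken from the last line of Table 5.
DECOY_TOTALS = {
    "T29": 1607, "T32": 386, "T35": 351, "T37_1": 843, "T37_2": 843,
    "T39": 1049, "T40_CA": 1603, "T40_CB": 1603, "T41": 893,
}

cutoffs_5_percent = {t: top_percent_cutoff(n, 5) for t, n in DECOY_TOTALS.items()}
for target, cutoff in cutoffs_5_percent.items():
    share_of_top10 = 100 * 10 / DECOY_TOTALS[target]
    print(f"{target}: top 5% = {cutoff} decoys; top 10 = {share_of_top10:.1f}% of all decoys")

# Toy ranking (made-up labels): 40 decoys, near natives at ranks 1 and 4.
toy_ranking = [True, False, False, True] + [False] * 36
print(near_natives_in_top_percent(toy_ranking, 5))
print(near_natives_in_top_percent(toy_ranking, 10))
```

Under this convention the top 10 structures correspond to well under 1% of the decoys for T29 but to almost 3% for T35, which is why the percentage-based lines of Table 5 allow a fairer comparison between targets.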

Combining all three investigated decoy sets, a total of 63 targets was analyzed. For 45 of these targets both PROCOS and ZRANK found at least one near native structure within the top 10 structures, while for DFIRE this was true in 26 cases.

A summary of the results discussed previously shows that PROCOS compares favorably with the other methods investigated in a considerable number of cases. However, it also becomes clear that none of the methods is superior to the others in all cases. For the Dockground decoy set a clear advantage for PROCOS is visible on average, while for the structures calculated by RosettaDock ZRANK performs quite well. For each of the CAPRI targets, decoys were generated by a number of different approaches. Here a slight advantage for PROCOS is visible.

Conclusion

In our opinion, the PROCOS probability-like values are easier to interpret than the score values obtained by other approaches. One of the advantages of the PROCOS values is that they are by definition within well defined limits between 0 and 100% and are therefore more general in nature than scores. In addition, they open the possibility of comparing the docking results of different targets that were generated by the same docking method. The analysis of native structures has shown that for structures of comparably high quality, as can be assumed for native structures, PROCOS consistently yields high probability values irrespective of the investigated target. For the analysis of docking applications these values might serve as an expected upper range of values that may be reached in some cases using a high resolution docking algorithm. A considerable subset of false complexes can be eliminated from further analysis by setting an appropriate threshold a priori. As detailed earlier, this threshold has to be set with respect to the docking method used. This is in part due to the functions used for calculating the van der Waals and electrostatic energies, which are sensitive to small structural changes. In addition, it was shown that PROCOS performs well in terms of reranking existing decoys, as exemplified on the different decoy sets.

PROCOS is freely available as an easy-to-use web server. Processing a PDB file containing one or several models of a protein complex, it calculates for each model a probability-like measure that this structure belongs to the class of native complex structures. To support the user's decision, the computed values are visualized in a plot which represents the probability distributions of the training data. In future developments we expect further improvements from adding discriminatory features such as hydrophobicity or solvent accessible surface area, as mentioned in ref. 26. Due to the modular concept of PROCOS, this can easily be achieved.
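How a PDB file with several models might be split before each model is scored can be sketched as follows. This is not the PROCOS server code, whose input handling is not described here; the function names are hypothetical. The sketch relies only on the standard PDB format conventions: MODEL and ENDMDL records delimit models, and the chain identifier of an ATOM record is in column 22.

```python
def split_models(pdb_text):
    """Split PDB-format text into per-model lists of coordinate lines.

    Models are delimited by MODEL/ENDMDL records; a file without MODEL
    records is treated as a single model.
    """
    models, current = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = []
        elif record == "ENDMDL":
            models.append(current)
            current = []
        elif record in ("ATOM", "HETATM", "TER"):
            current.append(line)
    if current:
        models.append(current)
    return models


def chain_ids(model_lines):
    """Return the set of chain identifiers (column 22) of the ATOM records."""
    return {line[21] for line in model_lines if line.startswith("ATOM") and len(line) > 21}


def atom_line(serial, chain, x):
    """Build a toy CA ATOM record with PDB fixed-column layout (example data only)."""
    return f"ATOM  {serial:5d} {' CA ':<4s} {'ALA':>3s} {chain}{1:4d}    {x:8.3f}{0.0:8.3f}{0.0:8.3f}"


# Toy input: two models, each with one atom in chain A and one in chain B.
toy_pdb = "\n".join([
    "MODEL        1", atom_line(1, "A", 0.0), atom_line(2, "B", 5.0), "ENDMDL",
    "MODEL        2", atom_line(1, "A", 0.0), atom_line(2, "B", 7.5), "ENDMDL",
    "END",
])

models = split_models(toy_pdb)
for index, model in enumerate(models, start=1):
    print(f"model {index}: {len(model)} atoms, chains {sorted(chain_ids(model))}")
```

A check that every model contains at least two chains, as in the last loop, is a simple way to confirm that each model really describes a complex before it is evaluated.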

Acknowledgments

The authors thank Joël Janin and Marc Lensink for providing access to the CAPRI scoring data. They also thank Tully Ernst for carefully reading the manuscript.

References

1. Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. Nucleic Acids Res 2000, 28, 235.
2. Aloy, P.; Russell, R. B. Nat Biotech 2004, 22, 1317.
3. Ritchie, D. W. Curr Prot Pep Sci 2008, 9, 1.
4. Janin, J.; Wodak, S. Structure 2007, 15, 755.
5. Lensink, M. F.; Méndez, R.; Wodak, S. Proteins 2007, 69, 704.
6. Katchalski-Katzir, E.; Shariv, I.; Eisenstein, M.; Friesem, A. A.; Aflalo, C.; Vakser, I. A. Proc Natl Acad Sci USA 1992, 89, 2195.
7. Walls, P. H.; Sternberg, M. J. J Mol Biol 1992, 228, 277.
8. Gabb, H. A.; Jackson, R. M.; Sternberg, M. J. J Mol Biol 1997, 272, 106.
9. Mandell, J. G.; Roberts, V. A.; Pique, M. E.; Kotlovyi, V.; Mitchell, J. C.; Nelson, E.; Tsigelny, I.; Ten Eyck, L. F. Protein Eng 2001, 14, 105.
10. Heifetz, A.; Katchalski-Katzir, E.; Eisenstein, M. Protein Sci 2002, 11, 571.
11. Meyer, M.; Wilson, P.; Schomburg, D. J Mol Biol 1996, 264, 199.
12. Moont, G.; Gabb, H. A.; Sternberg, M. J. Proteins 1999, 35, 364.
13. Zhang, C.; Liu, S.; Zhou, H.; Zhou, Y. Protein Sci 2004, 13, 400.
14. Fernández-Recio, J.; Totrov, M.; Skorodumov, C.; Abagyan, R. Proteins 2005, 58, 134.
15. Camacho, C. J.; Gatchell, D. W.; Kimura, S. R.; Vajda, S. Proteins 2000, 40, 525.
16. Murphy, J.; Gatchell, D. W.; Prasad, J. C.; Vajda, S. Proteins 2003, 53, 840.
17. Li, C. H.; Ma, X. H.; Shen, L. Z.; Chang, S.; Chen, W. Z.; Wang, C. X. Biophys Chem 2007, 129, 1.
18. Müller, W.; Sticht, H. Proteins 2007, 67, 98.
19. Martin, O.; Schomburg, D. Proteins 2008, 70, 1367.
20. Dominguez, C.; Boelens, R.; Bonvin, A. M. J. J. J Am Chem Soc 2003, 125, 1731.
21. Kastritis, P. L.; Bonvin, A. M. J. J. J Proteome Res 2010, 9, 2216.
22. Cornfield, J. Biometrics 1969, 25, 617.
23. Mintz, S.; Shulman-Peleg, A.; Wolfson, H. J.; Nussinov, R. Proteins 2005, 61, 6.
24. Wolowski, V. R. Computational analysis of protein–protein complexes related to knowledge based predictions of interaction, Master thesis, Department of Computer Science, University of Hagen, Germany, 2008.
25. Capri homepage. Available at http://www.ebi.ac.uk/msd-srv/capri/.
26. Ezkurdia, I.; Bartoli, L.; Fariselli, P.; Casadio, R.; Valencia, A.; Tress, M. L. Brief Bioinform 2009, 10, 233.
27. Word, J. M.; Lovell, S. C.; Richardson, J. S.; Richardson, D. C. J Mol Biol 1999, 285, 1735.
28. Brünger, A. T.; Adams, P. D.; Clore, G. M.; DeLano, W. L.; Gros, P.; Grosse-Kunstleve, R. W.; Jiang, J. S.; Kuszewski, J.; Nilges, M.; Pannu, N. S.; Read, R. J.; Rice, L. M.; Simonson, T.; Warren, G. L. Acta Cryst D 1998, 54, 905.
29. de Vries, S. J.; van Dijk, A. D. J.; Krzeminski, M.; van Dijk, M.; Thureau, A.; Hsu, V.; Wassenaar, T.; Bonvin, A. M. J. J. Proteins 2007, 69, 726.
30. Chang, C. C.; Lin, C. J. http://www.csie.ntu.edu.tw/~cjlin/libsvm 2001.
31. Pierce, B.; Weng, Z. Proteins 2007, 67, 1078.
32. Yang, Y.; Zhou, Y. Proteins 2008, 72, 793.
33. Mintseris, J.; Wiehe, K.; Pierce, B.; Anderson, R.; Chen, R.; Janin, J.; Weng, Z. Proteins 2005, 60, 214.
34. Chen, R.; Mintseris, J.; Janin, J.; Weng, Z. Proteins 2003, 52, 88.
35. Liu, S.; Gao, Y.; Vakser, I. A. Bioinformatics 2008, 24, 2634.
36. Tovchigrechko, A.; Vakser, I. A. Proteins 2005, 60, 296.
37. Gray, J. J.; Moughon, S.; Wang, C.; Schueler-Furman, O.; Kuhlman, B.; Rohl, C. A.; Baker, D. J Mol Biol 2003, 331, 281.