
PhosCalc: A tool for evaluating the sites of peptide phosphorylation from Mass Spectrometer data


BMC Research Notes (BioMed Central)

Open Access

Technical Note

Daniel MacLean, Michael A Burrell, David J Studholme and Alexandra ME Jones*

Address: The Sainsbury Laboratory, John Innes Centre, Norwich Research Park, Colney Lane, Norwich, NR4 7UH, UK

Email: Daniel MacLean - [email protected]; Michael A Burrell - [email protected]; David J Studholme - [email protected]; Alexandra ME Jones* - [email protected]

* Corresponding author

Abstract

Background: We have created a software implementation of a published and verified method for assigning probabilities to potential phosphorylation sites on peptides using mass spectrometric data. Our tool, named PhosCalc, determines the number of possible phosphorylation sites and calculates the theoretical masses for the b and y fragment ions of a user-provided peptide sequence. A corresponding user-provided mass spectrum is examined to determine which putative b and y ions have support in the spectrum and a probability score is calculated for each combination of phosphorylation sites.

Findings: We test the implementation using spectra of phosphopeptides from bovine beta-casein and we compare the results from the implementation to those from manually curated and verified phosphopeptides from our own experiments. We find that the PhosCalc scores are capable of helping a user to identify phosphorylated sites and can remove a bottleneck in high throughput proteomics analyses.

Conclusion: PhosCalc is available as a web-based interface for examining up to 100 peptides and as a downloadable tool for examining larger numbers of peptides. PhosCalc can be used to speed up identification of phosphorylation sites and can be easily integrated into data handling pipelines, making it a very useful tool for those involved in phosphoproteomic research.

Published: 23 June 2008

BMC Research Notes 2008, 1:30 doi:10.1186/1756-0500-1-30

Received: 8 May 2008
Accepted: 23 June 2008

This article is available from: http://www.biomedcentral.com/1756-0500/1/30

© 2008 MacLean et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Findings

Challenges of detecting phosphorylated residues in mass spectrometer data

Phosphorylation is probably the most common of protein post translational modifications (PTMs), with 30% of eukaryotic proteins estimated to be modified this way [1]. Phosphorylation is essential to the cell, playing a central role in signal transduction cascades, regulation of protein activity and protein-protein interactions. Therefore, protein phosphorylation is one of the most intensely studied PTMs. Protein phosphorylation can be detected as a mass shift (+79.99 Da) in mass spectra, which corresponds to the addition of HPO3 to a peptide, generally at serine, threonine or tyrosine residues. In the mass spectrometer, peptides fragment in predictable ways and programs such as MASCOT [2] use algorithms to match predicted fragmentation patterns of peptides from sequence databases to those observed in MS spectra. While these programs allow for modification to peptides, they do not explicitly compare the evidence that may support localisation of a modification to a specific residue rather than a neighbouring position. Nor are they explicit when the data cannot distinguish between alternatives. As phosphorylation may occur at rather common amino acid residues, it is not unusual for a peptide to contain several possible sites. The evidence that allows one to discriminate between two possibilities can be as low as one or two peaks in a mass spectrum. Alternatively, if the potential sites are well separated on the peptide, there may be direct evidence in the form of several well identified peaks to support one site over another.

It is important to be explicit about the level of confidence a mass spectrum can provide for a particular phosphorylation site because this information has a large impact on subsequent laboratory work (for example, identifying targets for site-directed mutagenesis). However, it is hard to evaluate MS data and accurately judge the information provided by MS2 fragmentation spectra (MS2 spectra are spectra from the first fragmentation, MS3 spectra are from ions selected from the MS2 fragmentation and so forth) without time-consuming manual examination by experienced personnel. The interpretation of mass spectra of phosphopeptides, particularly from ion trap instruments, is further complicated by the tendency of phosphopeptides to preferentially fragment at the labile phosphoester bond (with neutral loss of 98 Da; H3PO4), often accompanied by poor fragmentation along the peptide backbone. This problem can be addressed (in ion traps) by a further fragmentation event (MS3) on the neutral loss product ion produced in the MS2 fragmentation event.

An algorithm to identify the phosphorylation site with best support in the spectrum

Recently an algorithm has been developed to provide further support of peptide identifications from MS2 spectra by comparison to MS3 [3]. The use of such an algorithm in an analysis pipeline allows automatic phosphorylation site identification or allows pre-selection of spectra for manual identification. The method was subsequently developed to validate the position of phosphorylation from similar data [4]. The Olsen-Mann algorithm uses the four most intense peaks per 100 m/z units in an MS2 or MS3 spectrum, determines the theoretical masses of b and y ions, and makes corrections for the masses of the ions appropriate to whether the peptide sequence and spectrum are derived from an MS2 or MS3 spectrum. By calculating all possible b and y ions and all combinations of phosphorylation site the algorithm is able to work on peptide sequences with any number of potential phosphorylation sites. The algorithm counts the matches of the four most intense peaks to each theoretical b and y ion. A match is called whenever a peak from the spectrum falls within a user-specified window of error around the theoretical ion mass; the mass accuracy of the mass spectrometer used to generate the spectrum file should determine the size of the appropriate window of error.
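The peak selection and matching steps described above can be sketched in a few lines of Python. This is an illustrative sketch, not the PhosCalc source (which is written in Perl): the function names are hypothetical, the 100 m/z bins are assumed to start at zero, and the window of error is assumed to be a symmetric tolerance around each theoretical ion. The peak list is made up for the example.

```python
from collections import defaultdict


def top_peaks_per_window(peaks, per_window=4, window=100.0):
    """Keep the `per_window` most intense peaks in each `window`-wide m/z bin.

    `peaks` is a list of (mz, intensity) pairs. Fixed bins starting at 0 are
    an assumption of this sketch; the text does not say how bins are placed.
    """
    bins = defaultdict(list)
    for mz, intensity in peaks:
        bins[int(mz // window)].append((mz, intensity))
    kept = []
    for members in bins.values():
        members.sort(key=lambda peak: peak[1], reverse=True)
        kept.extend(members[:per_window])
    return sorted(kept)


def count_matches(theoretical_mz, peaks, tolerance=0.4):
    """Count theoretical ions that have at least one peak within +/- tolerance."""
    return sum(
        1
        for ion in theoretical_mz
        if any(abs(mz - ion) <= tolerance for mz, _ in peaks)
    )


# Made-up spectrum: five peaks fall in the 100-200 bin, so the weakest is dropped.
spectrum = [(101.0, 5.0), (120.0, 9.0), (130.0, 1.0),
            (150.0, 7.0), (199.0, 8.0), (250.3, 3.0)]
filtered = top_peaks_per_window(spectrum)
matched = count_matches([120.2, 250.0, 400.0], filtered)
```

The count returned by `count_matches` plays the role of the number of successful matches for one phosphosite variant; a tighter tolerance would suit an instrument with higher mass accuracy.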

The probability of phosphorylation [3] at a site is given as

$$p(\mathrm{phos}) = \binom{n}{k} \cdot P^{k} \cdot (1 - P)^{(n - k)} \qquad (1)$$

where

n = the total number of possible b and y ions,

k = the number of successful matches,

P = 0.04 (as 4 peaks are allowed per 100 Da [2]).

The probability score is

$$-10 \log_{10}\bigl(p(\mathrm{phos})\bigr) \qquad (2)$$

PhosCalc: an implementation of the algorithm

The algorithm has gained popularity as the number of large-scale projects increases and is incorporated into the open source program MSQuant [4] but, surprisingly, is not available as a stand-alone tool. Therefore, we have implemented an exact version of the leading method described and verified in [5] and used in other published studies that calculates a probability based score for each potential phosphorylation site. An existing implementation of the algorithm, Ascore [6], described in [7], is available but unlike PhosCalc is apparently tied to an underlying human protein database for a compulsory peptide identification step, removing its applicability to data derived from other organisms. PhosCalc allows the user to provide a peptide sequence from any source. Unlike Ascore, PhosCalc will analyse data from MS2 and MS3 spectra, not just MS2 spectra. PhosCalc also permits the user to vary the window of error for peak matching, allowing for analysis of data from mass spectrometers of different mass accuracy.

Our implementation, called PhosCalc, is available as a web-tool or downloadable Perl script. In both versions, the user must provide predicted peptide sequences, the number of phosphorylation sites and a corresponding mass spectrum in the Sequest DTA format. Data can be entered via the simple web-interface for one to 100 peptide sequences (Figure 1A); two files are required: a list of peptide sequences and the zipped peak lists as dta files. The standalone package can handle an unlimited number of sequences. The PhosCalc tool returns an easily interpreted table showing probability of phosphorylation (defined as p(phos) in formula 1) and score (defined in formula 2) for each combination of phosphorylation sites and the number of peak matches in the spectrum for that variant (Figure 2A). Our implementation uses mono-isotopic residue masses and the value for cysteine is unmodified (i.e., not alkylated). These values can be easily altered in the downloadable version of PhosCalc. The variable oxidation of methionine is indicated by 'M*', which adds 15.99 Da. The mass value for phosphorylation is +79.9799 Da and is indicated by @, # or ^ in the peptide sequence. As MS3 spectra are triggered by loss of phosphate plus water, a loss of 18.0105 Da is expected in place of the phosphorylation mass at the modified site in MS3 spectra.
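Formulas 1 and 2 are simple enough to evaluate directly. The sketch below is an illustration under stated assumptions rather than the PhosCalc code: the function names are hypothetical, the logarithm is taken to be base 10, and `math.comb` requires Python 3.8 or later.

```python
import math


def p_phos(n, k, p_random=0.04):
    """Formula 1: binomial probability of exactly k matches among n ions."""
    return math.comb(n, k) * p_random ** k * (1.0 - p_random) ** (n - k)


def ptm_score(n, k, p_random=0.04):
    """Formula 2: -10 * log10(p(phos)); a larger score means stronger support."""
    return -10.0 * math.log10(p_phos(n, k, p_random))


# Two variants of one peptide with 30 predicted ions: 20 matches versus 11.
score_a = ptm_score(30, 20)
score_b = ptm_score(30, 11)
```

With the same number of predicted ions, the variant with more matched ions has the smaller p(phos) and therefore the larger score, which is how alternative sites on one peptide are ranked.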

Using PhosCalc

As a minimum, PhosCalc requires that the user provide a peptide sequence that is thought to be phosphorylated and a corresponding mass/intensity (dta) file from an MS experiment. File formats for both the command line and web versions are equivalent. The command line version runs non-interactively from a single command line invocation and creates output in a tab-delimited text file so that the tool can be easily incorporated into pipelines and workflows. The following description is of web-tool usage; instructions on how to run the downloadable version are available in the README file that comes with the download.

Figure 1: Data input screens from the web version of PhosCalc.

• On the main PhosCalc page [8] insert into the box Peptide Sequence (Figure 1A) the amino acid sequence of the phosphorylated peptide and the markers representing the potentially phosphorylated sites. The amino acid sequence of the peptide may be represented using the following standard IUPAC amino acid symbols: A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or Y and the potentially phosphorylated sites may be identified by insertion of one of @, # or ^ (as far as PhosCalc is concerned these symbols are interchangeable, but some upstream analysis software makes distinctions) after the putative phosphorylation site suggested by the MS software; e.g., the peptide sequence YNS#DTPEGVNSNWQR indicates that the peptide is thought to carry a single phosphorylation, possibly on the first serine (the calculator will assess and return the likelihood of phosphorylation at all possible sites within the peptide, irrespective of the putative phosphorylation site suggested by the MS software). An oxidised methionine residue may be indicated by entering 'M*'.

• Select the spectrum file to be evaluated (Figure 1A); this is the output from the MS machine that represents the mass peaks associated with this peptide. PhosCalc expects this to be a .dta format file. A dta file is a mass/intensity pair list that is a representation of the original MS/MS spectrum and consists of two columns of decimal numbers separated by one or more space characters and ended by a carriage return. Examples are provided on the web site and with the downloadable tool.

• Select a Window Size (Figure 1A). The hypothetical mass peaks derived from the peptide sequence are matched with the peaks from the mass spec by allowing a window of error. This option defines the width of this window of error.

• Select the Experiment Type (Figure 1A), that is, whether the data came from an MS2 or MS3 experiment. If this is not selected, an MS2 experiment will be assumed. The effect of selecting MS3 is that the dehydration values will be used for #, @ and ^ symbols (-18.0105 Da) not phosphorylation (+79.9799 Da).
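The input conventions above (site markers, 'M*', and the MS2 or MS3 mass shift) can be made concrete with a short Python sketch. It is not taken from PhosCalc: the function names are hypothetical, only singly charged b and y ions are generated, and the residue, water and proton masses are standard monoisotopic values supplied here as an assumption. The modification masses are the ones quoted in the text.

```python
from itertools import combinations

# Standard monoisotopic residue masses in Da (assumed; not listed in the text).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
# Shifts quoted in the text; HPO3 is usually tabulated as 79.9663 Da.
SITE_SHIFT = {"MS2": 79.9799, "MS3": -18.0105}
MET_OXIDATION = 15.99
SITE_MARKERS = "@#^"


def parse_peptide(annotated):
    """Split an annotated peptide into residues, site count and oxidised Met indices."""
    residues, oxidised, n_sites = [], set(), 0
    for symbol in annotated:
        if symbol in SITE_MARKERS:
            n_sites += 1  # markers only say how many sites to place
        elif symbol == "*":
            oxidised.add(len(residues) - 1)  # '*' follows the oxidised M
        else:
            residues.append(symbol)
    return residues, n_sites, oxidised


def site_variants(residues, n_sites):
    """Every way of placing n_sites modifications on S, T or Y (0-based indices)."""
    candidates = [i for i, aa in enumerate(residues) if aa in "STY"]
    return list(combinations(candidates, n_sites))


def by_ions(residues, sites, oxidised=(), experiment="MS2"):
    """Singly charged b and y ion m/z values for one phosphosite variant."""
    masses = []
    for i, aa in enumerate(residues):
        mass = RESIDUE_MASS[aa]
        if i in sites:
            mass += SITE_SHIFT[experiment]
        if i in oxidised:
            mass += MET_OXIDATION
        masses.append(mass)
    b = [sum(masses[:i]) + PROTON for i in range(1, len(masses))]
    y = [sum(masses[-i:]) + WATER + PROTON for i in range(1, len(masses))]
    return b, y


residues, n_sites, oxidised = parse_peptide("FQS#EEQQQTEDELQDK")
variants = site_variants(residues, n_sites)  # Ser3 and Thr9, 0-based 2 and 8
b_ms2, y_ms2 = by_ions(residues, variants[0], oxidised, "MS2")
b_ms3, y_ms3 = by_ions(residues, variants[0], oxidised, "MS3")
```

For this 16-residue peptide the sketch yields 15 b ions and 15 y ions per variant, and the marker position in the input does not restrict which sites are evaluated.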

Figure 2: Example of results of PhosCalc for the peptide FQSEEQQQTEDELQDK with annotated MS2 spectrum from Sequest. A) PhosCalc output. B) Spectrum from a correct assignment. C) The same spectrum as 'B' with an incorrect assignment. Peaks marked in blue and marked with empty circles are matched to b or y ions by Sequest and * marks those peaks that do not match with the incorrect position of phosphorylation. For clarity only singly charged b and y ions without further loss of OH or NH3 groups are annotated.

[Panels B and C: MS2 spectrum from 1031.69 m/z, intensity against m/z (400 to 2000), annotated for FQ[S]EEQQQTEDELQDK (B) and FQSEEQQQ[T]EDELQDK (C); the [M+2H-H3PO4]2+ ion at 982.6 m/z is labelled.]


When analysing data from 2 to 100 peptides in the web-tool (Figure 1B), a file is used to provide the peptide sequences and their associated dta files. The Peptide + Spectra file should consist of two tab separated columns. The first column should contain the peptide sequence and potential phosphorylation site information, formatted as described here; the second column should contain the name of the corresponding spectrum file. The file can easily be created using MS Excel or another spreadsheet program and saved as a tab delimited text file. To prevent the need to upload each .dta file individually, a zip archive of the .dta files is used. To create a zip archive of the .dta files listed in the Peptide + Spectra file on a Windows based computer, numerous commercially available archiving programs such as the Winzip program [9] can be used. On other operating systems such as MacOS X and Linux variants, a version of zip should be installed by default and the user should refer to the relevant documentation. Note that only zip archives can be decompressed by the server and other archive types will not work.
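As an alternative to a spreadsheet and a separate archiving program, the two upload files can be produced with the Python standard library. The sketch below is an assumption-laden illustration: the function name and file names are hypothetical, the .dta files are stored at the top level of the archive (the text does not say whether the server requires this), and the spectra written in the demonstration are placeholders, not real data.

```python
import csv
import tempfile
import zipfile
from pathlib import Path


def write_batch_inputs(pairs, dta_dir, list_path, zip_path):
    """Write the two-column Peptide + Spectra file and zip the listed .dta files."""
    if not 1 <= len(pairs) <= 100:
        raise ValueError("the web tool accepts up to 100 peptides per submission")
    dta_dir = Path(dta_dir)
    with open(list_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for peptide, dta_name in pairs:
            writer.writerow([peptide, dta_name])
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for _, dta_name in pairs:
            # arcname keeps only the file name, so no directories are archived
            archive.write(dta_dir / dta_name, arcname=dta_name)


# Demonstration in a temporary directory with placeholder peak lists.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    pairs = [("YNS#DTPEGVNSNWQR", "spec1.dta"),
             ("FQS#EEQQQTEDELQDK", "spec2.dta")]
    for _, name in pairs:
        (tmp / name).write_text("503.4 120.0\n747.4 310.5\n")
    write_batch_inputs(pairs, tmp, tmp / "peptides.txt", tmp / "spectra.zip")
    listed = (tmp / "peptides.txt").read_text()
    with zipfile.ZipFile(tmp / "spectra.zip") as archive:
        archived = sorted(archive.namelist())
```

Writing both files from the same list of pairs keeps the spectrum names in the second column in step with the contents of the archive.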

When run, the calculator will return the following results (Figure 2A):

• phosphosite variant: a list of phosphosite variants of the provided sequence, with the phosphorylation sites considered in square brackets

• ions: the number of ions predicted from the peptide sequence

• ions matched: the number of predicted ions whose mass matched the masses in the dta file

• p-values: the likelihood of this number of matches (defined in formula 1)

• score: phosphorylation site score (defined in formula 2).

Efficacy of PhosCalc

To demonstrate the utility and limitations of PhosCalc, we analysed a sample of the well characterised phosphoprotein, bovine beta-casein (Sigma). A sample of bovine casein protein was digested with trypsin and analysed with an LTQ mass spectrometer. Peptides were identified using SEQUEST (ThermoFisher) on Bioworks 3.1 (full MS details, SEQUEST search parameters and peptide scores are available from the PhosCalc website). Phosphopeptides were identified from casein alpha-S1, beta-casein variant CnH and casein alpha-S2. These are products of a small related gene family and their inclusion in the preparation from Sigma was expected; this was useful as it provided a range of well characterised phosphopeptides. The observed phosphopeptides of casein αS1, αS2 and β are detailed in Supplementary Table 1. All dta files and Sequest scores can be downloaded from the PhosCalc website.

Peptide FQSEEQQQTEDELQDK has only two possible phosphorylation sites and they are well separated (Ser3 and Thr9). The output of PhosCalc clearly shows that the MS2 and MS3 spectra strongly support the known phosphorylation site at Ser3 (Figures 2A and 3A). The first column shows the site being assessed, the second reports the number of predicted b and y ions evaluated (30) and the third column contains the predicted b or y ions matched (20 or 11 for Ser3 or Thr9, respectively). Finally the p-value and score are also reported. So that experienced MS users can assess how the numbers produced by PhosCalc compare with visual representations of spectra, two annotated versions of the same spectrum are provided, one for the correct Ser3 assignment and one for the false Thr9 assignment (Figures 2B and 2C). In this case, as the spectrum is good and the putative phosphorylation sites are well separated, several y ions (from y8 to y14; covering the portion SEEQQQT of the peptide) do not match if the phosphorylation is located on Thr9. Such matching is very clear in the MS3 spectra annotated in Figures 3B and 3C. While the differences are clear in this example, it is not trivial to make these comparisons manually and PhosCalc can assist in identification of those spectra that do provide sufficient evidence to discriminate between alternative positions.

Of all the phosphopeptides identified, KTVDMESTEVFTK provides the most detailed example because this peptide should contain only one 'true' phosphorylation site and has three alternative positions. The peptide was also observed with an oxidised methionine residue (Figures 4B and 4D) and with a mis-cleavage (K) (Figures 4A and 4C). Figure 4A shows that only with the unoxidised form of TVDMESTEVFTK can both MS2 and MS3 spectra distinguish between the correct phosphorylation site at Ser6 and Thr7. We recommend that at least two peaks should be differential between alternative positions before the phosphorylation site can be unambiguously identified. This correlates to a difference of three orders of magnitude between position scores. That it is often hard to precisely support one phosphorylation site does not demonstrate a weakness of PhosCalc but rather the limitations of the mass spectra provided. As the two main alternative sites are next to each other, one could only expect very few ions to be discriminatory between alternative sites.

Table 1: The sensitivities and specificities based on the distributions of PTM score for phosphorylated and non-phosphorylated sites in known phosphopeptides.

Spectrum type    Sensitivity (%)    Specificity (%)    PTM score cut-off
MS2              99                 82                 83.97
MS2              95                 82.5               85.98
MS2              90                 82.47              86.99
MS3              99                 39                 39.9
MS3              95                 43.5               56.35
MS3              90                 48                 76.95

To assess the ability of PhosCalc to distinguish between phosphorylated and non-phosphorylated sites from experimental data, we utilised the spectra of phosphopeptides from Arabidopsis thaliana which had been previously identified by other researchers [10-12]. All spectra were generated on an LTQ (ThermoFisher Corp.) using the conditions detailed previously. The spectra can be downloaded as dta files from the PhosCalc website. For these data an error tolerance of 0.4 Da was utilised. A total of 50 sites were analysed with PhosCalc and the PTM score of known phospho-sites was compared with those of potential sites. Figure 5 shows the PTM score distributions of each set. A clear and significant difference can be seen between the PTM score distributions of non-phosphorylated and phosphorylated sites in both MS2 (p < 2.51 × 10⁻⁴) and MS3 (p < 0.049), showing that the implementation is capable of distinguishing between true phosphorylated sites and non-phosphorylated sites.

We have also compared scores generated by PhosCalc with scores generated by Ascore on sample datasets and find that they are largely equivalent.

Figure 3: Example of results of PhosCalc for the peptide FQSEEQQQTEDELQDK with annotated MS3 spectrum from Sequest. A) PhosCalc output. B) Spectrum from a correct assignment. C) The same spectrum as 'B' with an incorrect assignment. Peaks marked in blue and marked with empty circles are matched to b or y ions by Sequest and * marks those peaks that do not match with the incorrect position of phosphorylation. For clarity only singly charged b and y ions without further loss of OH or NH3 groups are annotated, with the exception of the intense y14 2+ ion.

[Panels B and C: MS3 spectrum from 1031.69 m/z then 982.64 m/z, intensity against m/z (400 to 2000), annotated for FQ[S]EEQQQTEDELQDK (B) and FQSEEQQQ[T]EDELQDK (C); the y14 2+ ion at 845.0 m/z is labelled.]


To guide the user to a useful PTM score cut-off, we calculated sensitivities and specificities based on the distributions of PTM score for phosphorylated and non-phosphorylated sites in known phosphopeptides (Table 1). The implementation of the algorithm works well on MS2 spectra: we are able to obtain 99% sensitivity at a specificity of 82% using a PTM score of 83.97 or higher. The implementation also works with MS3 spectra, although with lower specificity, allowing a specificity of 48% at a sensitivity of 90% with a PTM score of 76.95 or higher.
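For users who want to derive a cut-off from their own curated data, sensitivity and specificity at a given PTM score can be computed as in the sketch below. The function name and the score lists are hypothetical, and a site is assumed to be called phosphorylated when its score is at or above the cut-off ("or higher" in the text); the paper does not describe its own procedure in more detail.

```python
def sensitivity_specificity(true_site_scores, false_site_scores, cutoff):
    """Return (sensitivity %, specificity %) when calling sites with score >= cutoff."""
    true_positives = sum(score >= cutoff for score in true_site_scores)
    true_negatives = sum(score < cutoff for score in false_site_scores)
    sensitivity = 100.0 * true_positives / len(true_site_scores)
    specificity = 100.0 * true_negatives / len(false_site_scores)
    return sensitivity, specificity


# Made-up scores for known phosphorylated and known non-phosphorylated sites.
known_sites = [120.0, 95.0, 88.0, 60.0]
other_sites = [10.0, 30.0, 85.0, 5.0]
sens, spec = sensitivity_specificity(known_sites, other_sites, cutoff=83.97)
```

Repeating the calculation over a range of cut-offs on a curated in-house set gives a table comparable in form to Table 1.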

Previously published studies have not used this algorithm in isolation; rather, it has been used in conjunction with other measures such as MASCOT scores [2]. These pipelines assign different confidence thresholds depending on the study and type of MS. We advise that users should implement additional scoring criteria, particularly regarding the sequence assignment, and that PhosCalc scores and cut-offs should be chosen with care.

The PhosCalc software is a fast and simple tool for reliably identifying phosphorylation sites in mass spectrometer data. PhosCalc should find utility in laboratories carrying out phosphorylation site analyses at any scale. By using our empirical sensitivity/specificity estimations and PTM score cut-offs or those used in other studies or by comparing with PTM scores in previously curated data sets from in-house examinations, the software can be used to speed up or automate decisions on phosphorylation site identity. With low-mass accuracy data, it should be noted that when putative phosphorylation sites are close to each other on the peptide, or if the mass spectrum contains few peaks of reasonable intensity in the area of interest, there may not be enough information (from that spectrum) to discriminate between alternatives. It is important to be aware of the limitations of the spectra obtained and explicit about the levels of confidence in a particular phosphorylation site. The strength of PhosCalc is to enable users to rapidly identify those spectra which provide strong evidence for a specific phosphorylation site, even from low-mass accuracy data.

Figure 4: PhosCalc output for variations of KTVDMESTEVFTK.

Availability and Requirements

Project name: PhosCalc

Project home page: http://www.ayeaye.tsl.ac.uk/PhosCalc

Operating system(s): Platform independent

Programming language: Perl

Other requirements: For the download version, Perl 5.6 or higher, Perl Math module, also under GPL and provided with PhosCalc download

License: GPL 3

Restrictions to use by non-academics: none

Abbreviations

Amu: Atomic mass units; Da: Dalton; PTM: Post-translational modification.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

DM developed the implementation of the algorithm and the web tool and carried out the statistical analyses, MAB assisted in making the web-tool and implemented and maintains the infrastructure upon which it runs, DJS participated in the design of the implementation and AMEJ conceived of the study, participated in its design and carried out manual curation of peptide phosphorylation site data to allow verification of the implementation.

Acknowledgements

We are grateful to the Gatsby Charitable Foundation for supporting the Sainsbury Laboratory.

Figure 5: Comparison of PTM score distributions generated by PhosCalc. Grey boxes indicate distributions of PTM score in sites that are known not to be phosphorylated, pink boxes indicate distributions of PTM score in sites that are known to be phosphorylated.


References

1. Steen H, Mann M: The ABC's (and XYZ's) of peptide sequencing. Nat Rev Mol Cell Biol 2004, 5:699-711.

2. Matrix Science [http://www.matrixscience.com/]

3. Olsen JV, Mann M: Improved peptide identification in proteomics by two consecutive stages of mass spectrometric fragmentation. Proc Natl Acad Sci USA 2004, 101:13417-13422.

4. MSQuant at Sourceforge.net [http://sourceforge.net/projects/msquant/]

5. Olsen JV, Blagoev B, Gnad F, Macek B, Kumar C, Mortensen P, Mann M: Global, In Vivo, and Site-Specific Phosphorylation Dynamics in Signaling Networks. Cell 2006, 127:635-648.

6. Ascore [http://ascore.med.harvard.edu/ascore.php]

7. Beausoleil SA, Villén J, Gerber SA, Rush J, Gygi SP: A probability-based approach for high-throughput protein phosphorylation analysis and site localization. Nat Biotechnol 2006, 24:1226-1227.

8. PhosCalc [http://www.ayeaye.tsl.ac.uk/PhosCalc]

9. WinZip [http://www.winzip.com]

10. Nuhse TS, Stensballe A, Jensen ON, Peck SC: Large-scale analysis of in vivo phosphorylated membrane proteins by immobilized metal ion affinity chromatography and mass spectrometry. Mol Cell Proteomics 2003, 22:22.

11. Nuhse TS, Stensballe A, Jensen ON, Peck SC: Phosphoproteomics of the Arabidopsis plasma membrane and a new phosphorylation site database. Plant Cell 2004, 16:2394-2405.

12. Niittylä T, Fuglsang AT, Palmgren MG, Frommer WB, Schulze WX: Temporal analysis of sucrose-induced phosphorylation changes in plasma membrane proteins of Arabidopsis. Mol Cell Proteomics 2007, 10:1711-1726.
