
This journal is © The Royal Society of Chemistry 2013 Mol. BioSyst.

Cite this: DOI: 10.1039/c3mb25534d

Rapid mass spectrometric determination of disulfide connectivity in peptides and proteins†

Moitrayee Bhattacharyya,‡ Kallol Gupta,‡ Konkallu Hanumae Gowd§ and Padmanabhan Balaram*

Disulfide crosslinks are ubiquitous in natural peptides and proteins, providing rigidity to polypeptide scaffolds. The assignment of disulfide connectivity in multiple crosslinked systems is often difficult to achieve. Here, we show that rapid unambiguous characterisation of disulfide connectivity can be achieved through direct mass spectrometric CID fragmentation of the disulfide intact polypeptides. The method requires a direct mass spectrometric fragmentation of the native disulfide bonded polypeptides and subsequent analysis using a newly developed program, DisConnect. Technical difficulties involving direct fragmentation of proteins are surmounted by an initial proteolytic nick and subsequent determination of the structures of these proteolytic peptides through DisConnect. While the connectivity in proteolytic fragments containing one cystine is evident from the MS profile alone, those with multiple cystines are subjected to subsequent mass spectrometric fragmentation. The wide applicability of this method is illustrated using examples of peptide hormones, peptide toxins, proteins, and disulfide foldamers of a synthetic analogue of a marine peptide toxin. The method, coupled with DisConnect, provides an unambiguous, straightforward approach, especially useful for the rapid screening of the disulfide crosslink fidelity in recombinant proteins, determination of disulfide linkages in natural peptide toxins and characterization of folding intermediates encountered in oxidative folding pathways.

Introduction

The disulfide bond is the most widely observed covalent crosslink in naturally occurring peptides and proteins. This post-translational modification limits the range of accessible conformational states, confers thermal stability and enhances resistance to proteolysis.1–3 Disulfide crosslinks are commonly found in secreted proteins, antibodies, key growth factors and a wide range of polypeptide toxins that constitute the venoms of diverse organisms, ranging from snakes to marine cone snails.4–10 Recent genomic evidence suggests that disulfide bonds also occur in the intracellular proteins of archaea, contrary to the commonly held perception that such proteins lack such crosslinks.11 While databases are flooded with amino acid sequences of polypeptides, most often derived from genomic sequences, a tribute to the pace and power of DNA sequencing, the determination of cysteine pairing schemes in a multiple disulfide bonded molecule is considerably less efficient. The number of disulfide bond isomers rapidly increases with the number (n) of disulfide bonded Cys residues, the general formula being $n!/[(n/2)!\,2^{n/2}]$ (e.g., n = 4, 3 isomers; n = 6, 15 isomers; n = 8, 105 isomers and so on). In many cases these crosslinks provide the structural scaffolds that are essential for their recognition at specific receptor sites, making connectivity determination a key step in establishing structure–function relationships. Also in synthetic disulfide bonded polypeptides, produced either chemically or through recombinant DNA technology, the identity of the cysteine crosslinks in the native and synthetic structures is essential.
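The isomer count quoted above can be checked with a few lines of Python. This is a minimal sketch for illustration only; the function name `disulfide_isomers` is hypothetical and is not part of DisConnect.

```python
from math import factorial

def disulfide_isomers(n_cys: int) -> int:
    """Number of ways to pair n_cys disulfide bonded Cys residues:
    n! / [(n/2)! * 2^(n/2)]."""
    if n_cys < 0 or n_cys % 2:
        raise ValueError("n_cys must be a non-negative even number")
    half = n_cys // 2
    # Integer division is exact here because the quotient is always an integer.
    return factorial(n_cys) // (factorial(half) * 2 ** half)

for n in (4, 6, 8):
    print(n, disulfide_isomers(n))
```

The loop prints the three cases given in the text (n = 4, 6 and 8).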

The early determination of S–S pairing in proteins was a heroic effort, using separation of proteolytic fragments by electrophoresis in one dimension followed by performic acid oxidation and paper chromatographic separation in the other.12 More recently disulfide connectivity in proteins can be unambiguously determined by X-ray diffraction. However, experimental difficulties in obtaining single crystals limit the general applicability of the method, especially for natural peptides.

Molecular Biophysics Unit, Indian Institute of Science, Bangalore-560012, India.

E-mail: [email protected]; Fax: +91-80-236060683/+91-80-23600535;

Tel: +91-80-22932337

† Electronic supplementary information (ESI) available: Supplementary Table S1, Fig. S1–S13, supplementary methods and supplementary ref. 1–9. See DOI: 10.1039/c3mb25534d

‡ Both these authors contributed equally.

§ Current address: Undergraduate Program, Indian Institute of Science, Bangalore-560012, India.

Received 20th November 2012, Accepted 7th February 2013

DOI: 10.1039/c3mb25534d

www.rsc.org/molecularbiosystems

Molecular BioSystems

PAPER

Downloaded by Indian Institute of Science on 08 March 2013. Published on 08 February 2013 on http://pubs.rsc.org | doi:10.1039/C3MB25534D


Nuclear Overhauser effects across disulfide bonds have been used in NMR studies,13 but ambiguity in specific resonance assignments for the CβH2 protons of cysteine residues restricts their broad use.14 Similarly, the success of the protocols of stepwise reduction and subsequent sequencing (mass spectrometric or Edman) is largely restricted to smaller peptides and is limited by the difficulties in achieving homogeneous, partially reduced states.15–17

We have recently described an approach to disulfide connectivity assignments using fragmentation of disulfide bonded peptides.18 This method relies on the unique modes of fragmentation of disulfide bonds to yield dehydroalanine (AD), persulfides and thioaldehydes, with a lesser amount of the latter, under positive ion CID conditions.18–27 However, the manual analysis of the fragmentation patterns to identify product ions, which are key to the assignment of disulfide connectivity, is non-trivial, owing to the complexity of cystine fragmentation and internal losses observed under collision induced dissociation (CID) conditions. Also, difficulties in fragmenting large polypeptides and the subsequent complexity in data analysis limit the applicability of this method.

For proteins, numerous efforts have been made to determine disulfide connectivity through MS profiling of proteolytic digests, the critical prerequisite for success being the requirement to sequester each cystine (disulfide bonded Cys pair) crosslink in a single proteolytic fragment.7,28–32 This is not always achieved in practice, illustrating the limitations of the method. Considerable efforts have been made in the computational assessment of disulfide connectivity from mass spectral data, which resulted in the early development of the DISULPHIDE algorithm.33 Subsequently, the Fenyo disulfide algorithm34 and its modified version were also developed.35 But these methods again rely on the proteolytic separation of cystine containing peptide fragments. Subsequent advancements allow assignment of the Biemann type (b & y) ions present in MS/MS spectra of disulfide crosslinked peptides, which are generated through the cleavage of the peptide bonds residing outside the disulfide loop.36–41 Singh and co-workers have recently developed an interesting approach along with an automation protocol termed MS2DB.39,40,42 The algorithm calculates all the b and y type ions (Biemann type ions) for peptides containing multiple chains that are held together by disulfide bonds. Such approaches only allow annotation of the Biemann type (b & y) ions, generated through the cleavage of the peptide bonds residing outside the disulfide loop, present in the MS/MS spectra of disulfide crosslinked peptides. Assignment of such ions may be adequate for identifying the proteolytic peptides through the identification of the residues residing outside the disulfide loop. But determination of disulfide connectivity from CID MS/MS spectra of peptides with multiple disulfide bonds requires a complete assignment of all the ions through the investigation of all the modes of fragmentation. Indeed, in the present study the ions that are diagnostic of a particular connectivity arise through further complex modes of fragmentation, as shown in Fig. 1 and discussed in the Results and Discussion section. Some of these existing methods are summarized in Table S1 (ESI†).

A critical feature of the disulfide crosslink is its ability to fragment under CID conditions. A disulfide bond can fragment in two probable pathways, yielding four possible residue masses for a resultant modified-Cys residue.18,27 Further, subsequent cleavage of the peptide bonds within the disulfide loop is also observed, promoting internal losses of residues. The assignment of product ions derived from these specific modes of cleavage for disulfide bonded polypeptides under mass spectrometric conditions (summarized in Results and Discussion and Fig. 1) needs to be used in the development of a general procedure for determination of disulfide pairing schemes in polypeptides containing multiple disulfide bonds.

Here, we show that unambiguous determination of disulfide topology can be achieved through direct tandem mass spectrometric CID fragmentation in an ion trap instrument, which generates fragment ions bearing the signature of a particular S–S connectivity, and subsequent rapid assignments of these fragment ions using a newly developed algorithm, DisConnect (open source, freely distributed software). DisConnect uses chemical information on modes of disulfide fragmentation, contributing to its robustness and efficiency. The method is applicable, in principle, to both peptides and proteins, irrespective of size, sequence and disulfide topology. This report demonstrates the application of the methodology to proteins, peptide hormones and peptide toxins. Further, the characterization of all the disulfide foldamers (isomers with different disulfide connectivity) of a synthetic analog of a peptide toxin illustrates the potential of the method in the analysis of folding mixtures.

Fig. 1 Different fragmentation pathways of disulfide containing peptides under CID conditions. Paths A and B show the two probable modes of disulfide fragmentation under CID conditions, generating four probable residue masses. Path A is prevalent under positive ion CID conditions. Paths C–E show probable events involving backbone amide bond fragmentation.


Results and discussion

Modes of fragmentation

Under CID conditions, the disulfide bonds and the backbone amide bonds are the two probable sites of fragmentation for disulfide bonded polypeptides. The disulfide bonds can fragment in two possible pathways, yielding dehydroalanine (69 Da) and cysteinepersulfide (135 Da) (Path A, Fig. 1) or cysteine (103 Da) and cysteinethioaldehyde (101 Da) (Path B, Fig. 1). Path A is prevalent under positive ion CID conditions. For each fragment ion structure, these modified Cys residue masses are indicated in the DisConnect output by the notation M1 | M2 | M3 | M4, where M1–M4 correspond to the total residue masses of dehydroalanine, cysteinepersulfide, cysteinethioaldehyde and cysteine in the ion, respectively. For example, 138 | 270 | 0 | 0 implies that there are two dehydroalanines (2 × 69 = 138) and two cysteinepersulfides (2 × 135 = 270) in that fragment ion. Alternatively, all the native disulfide crosslinks may be retained in the fragment ion if the number of cysteines present is equal to that in the parent peptide (depicted as ‘All Cys Connected’).
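The M1 | M2 | M3 | M4 notation can be decoded back into residue counts by dividing each field by the corresponding nominal residue mass. The sketch below is an illustration written for this text, not DisConnect code; the names `NOMINAL_MASS` and `parse_modified_cys` are hypothetical, and only the four nominal masses quoted above are taken from the paper.

```python
# Nominal residue masses (Da) in the order used by the M1 | M2 | M3 | M4 notation.
NOMINAL_MASS = {
    "dehydroalanine": 69,
    "cysteinepersulfide": 135,
    "cysteinethioaldehyde": 101,
    "cysteine": 103,
}

def parse_modified_cys(notation: str) -> dict:
    """Convert a string such as '138 | 270 | 0 | 0' into residue counts."""
    totals = [int(field) for field in notation.split("|")]
    if len(totals) != len(NOMINAL_MASS):
        raise ValueError("expected four '|'-separated mass totals")
    counts = {}
    for (name, mass), total in zip(NOMINAL_MASS.items(), totals):
        if total % mass:
            raise ValueError(f"{total} is not a multiple of {mass} ({name})")
        counts[name] = total // mass
    return counts

print(parse_modified_cys("138 | 270 | 0 | 0"))

# Both pathways conserve the nominal mass of a cystine pair (2 x 103 - 2 = 204 Da):
# Path A gives 69 + 135, Path B gives 103 + 101.
assert 69 + 135 == 103 + 101 == 2 * 103 - 2
```

For the example in the text, the parser reports two dehydroalanines and two cysteinepersulfides.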

The fragmentation at the backbone amide bonds may again yield two types of ions (as summarized for a model sequence in Fig. 1). While dissociation of the amide bonds outside the disulfide loops generates conventional b and y type ions (Path C, Fig. 1), fragmentation within the disulfide loop dissects the peptide into chains that are held together by disulfide bridge/s [the symbol ‘‘|’’ separates these peptide chains in our representation] (Path D, Fig. 1). In general, for a fragment ion with n peptide chains, there should be at least (n − 1) disulfide bridge/s. For example, in the model peptide shown in Fig. 1, the cleavage at the R–I bond dissects the peptide chain into two that are held together by one disulfide bond. Owing to its m/z value being identical to that of the precursor, this ion gets further activated under ion trap CID conditions. This can either generate the b and y ions from individual peptide chains (Path E, Fig. 1), or trigger a dissociation at the disulfide bond (Path A/B, as previously discussed), resulting in four probable residue masses for each cysteine residue. Understandably, for polypeptides with multiple disulfide bonds, these modes of fragmentation together may generate a highly complicated spectral pattern.

The process of determination of disulfide connectivity relies on the success of sequestration of individual cystines (disulfide bonded cysteine pair), mass spectrometric gas phase fragmentation and solution phase proteolysis being the two available methods. In favourable cases, gas phase CID fragmentation alone can perform the task. For larger systems, especially for proteins, which may be difficult to fragment in an intact state, the problem can be sized down through an initial proteolysis. The disulfide connectivity for the cystines exclusively present in a proteolytic fragment is evident from the MS mapping alone. However, unambiguous disulfide connectivity assignments for proteolytic fragments that contain multiple cystines require further cleavages. These subsequent cleavage/s may be achieved through CID MSn experiments. The combination of both of these processes permits determination of the overall disulfide connectivity. A brief outline of this procedure is provided in Scheme 1 and a further detailed flow-chart is provided in Fig. S1 (ESI†). In both cases, the MSn fragment ions and/or the proteolytic ions can be queried through DisConnect for their structure assignments.

It is to be noted that throughout the manuscript we have explicitly used ion trap CID fragmentation modes. Ion trap CID, in essence, is resonance activation, i.e. molecules with an m/z value the same as that of the isolated precursor get activated. During this process, for molecules with multiple disulfide bonds, an initial fragmentation at the peptide bonds within the disulfide loop can produce daughter ions with m/z values the same as those of the original precursor. Hence further activation occurs under these CID conditions, facilitating successive activation.

Assignment of disulfide connectivity in natural peptides using DisConnect

The application of this method in the rapid determination of disulfide folds in natural peptides is illustrated below using the examples of a peptide hormone and a peptide toxin from marine cone snails.

Scheme 1 The flowchart of the working protocol of DisConnect.

Uroguanylin (NDDCELCVNVACTGCL)

Uroguanylin is a 16 residue, two disulfide bonded peptide hormone which is an antagonist of the guanylyl cyclase receptor GC–C. Two major fragment ions are obtained upon CID fragmentation of the singly charged species (m/z 1667.7) of the peptide (Fig. 2a). The structures of these ions, derived through DisConnect, are shown in Fig. 2b. While the C-terminal loss of Leu gives rise to the ion at m/z 1536.5 (b15), the N-terminal loss of the [N–D–D] tri-peptide yields an ion at m/z 1323.5 (y13) (Fig. 2c). Here, through the first generation of CID fragmentation (MS2), none of the cystine pairs could be separated. For an unambiguous disulfide assignment, the y13 ion is subjected to further CID fragmentation (MS3, inset to Fig. 2a). Besides the intense H2O loss from the parent ion (m/z 1305.5), two other fragment ions corresponding to m/z 1192.4 and 961.4 are observed. Subsequently, the structures for these ions are obtained through DisConnect by querying them against the structure of the y13 ion (Fig. 2b). The pathway of generation of these MS3 ions from their precursor is summarized in Fig. 2c. The ion at m/z 1192.4 originates through the loss of a C-terminal Leu residue from the y13 ion. The structure of the ion at m/z 961.4, which holds the key to the complete disulfide connectivity determination, can be generated from the parent ion by multiple backbone cleavages. An initial cleavage of the C1–E bond dissociates the y13 ion into two peptide chains that are held together by two disulfide bonds. This product and the parent ion (y13) have identical masses, permitting further activation under the ion trap CID fragmentation conditions. Subsequent cleavage at the A–C3 bond can break the product further into three peptide chains, with the mass of the overall ion still being identical to that of the parent ion (y13). Finally, a cleavage of the G–C4 bond generates the ion at m/z 961.4.

The presence of two peptide chains in this ion (m/z 961.4) necessitates the two cysteine residues, one from each chain, to be disulfide bonded. This directly establishes the C2–C4 connectivity. Thus, the overall disulfide connectivity in uroguanylin is established as C1–C3/C2–C4 (schematically summarized in Fig. 2c). It should be noted that the widely used Biemann–Roepstorff nomenclature for assigning fragment ions in MS/MS spectra43 cannot be easily applied to describe all the disulfide linked fragment ions. Calvete et al. have previously proposed a nomenclature system in which peptide chains are denoted by Greek letters and the positions of cleavage by the numbers that follow.44 Although in this manuscript we have shown chemical structures for each ion, this previously proposed strategy, which can be further modified to indicate modified Cys residue types, can provide an alternative approach.
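The m/z values quoted for uroguanylin can be reproduced from standard monoisotopic residue masses. The following is a minimal sketch under stated assumptions: the peptide has a free N-terminus and a free acid C-terminus, each disulfide removes two hydrogens, and the helper names (`residues`, `singly_protonated_mz`) are hypothetical. It is a sanity check on the arithmetic, not the DisConnect algorithm.

```python
# Standard monoisotopic residue masses (Da) for the residues in uroguanylin.
RESIDUE = {"G": 57.02146, "A": 71.03711, "V": 99.06841, "T": 101.04768,
           "C": 103.00919, "L": 113.08406, "N": 114.04293,
           "D": 115.02694, "E": 129.04259}
WATER, PROTON, HYDROGEN = 18.01056, 1.00728, 1.00783

def residues(seq: str) -> float:
    """Sum of residue masses for a one-letter sequence."""
    return sum(RESIDUE[aa] for aa in seq)

def singly_protonated_mz(seq: str, n_disulfides: int) -> float:
    """[M + H]+ for a peptide; each disulfide bond removes two H atoms."""
    return residues(seq) + WATER - 2 * HYDROGEN * n_disulfides + PROTON

precursor = singly_protonated_mz("NDDCELCVNVACTGCL", 2)  # text: m/z 1667.7
b15 = precursor - residues("L") - WATER                  # C-terminal Leu lost; text: 1536.5
y13 = precursor - residues("NDD")                        # N-terminal N-D-D lost; text: 1323.5
y13_minus_water = y13 - WATER                            # text: 1305.5
y13_minus_leu = y13 - residues("L") - WATER              # text: 1192.4

for label, mz in [("precursor", precursor), ("b15", b15), ("y13", y13),
                  ("y13 - H2O", y13_minus_water), ("y13 - Leu", y13_minus_leu)]:
    print(f"{label}: {mz:.2f}")
```

The computed values agree with the reported peaks to within roughly 0.1 Da, which is a reasonable expectation for ion trap data.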

Fig. 2 (a) CID MS2 spectrum of the singly charged species (m/z 1667.7) of uroguanylin. The inset shows the MS3 spectrum of the ion at m/z 1323.5. The chemical structures of the major fragment ions are also shown. (b) DisConnect derived structure for each of the fragment ions. For every structure, Cys residues with indeterminate connectivity are indicated by a wavy line. In structures that establish a particular connectivity, the corresponding Cys residues are shown bonded. Here, the MS3 product ion at m/z 961.4 contains one Cys in each of the two peptide chains, necessitating a disulfide crosslink between them. (c) Schematic summary of the analysis. The three dimensional fold is obtained from an NMR structure (PDB ID: 1UYA) and represented using PyMOL.54

Ar1446 (CCRLACGLGCHOCC*; O, hydroxyproline; *, amidated C terminus)

Ar1446 is a potential neurotoxic peptide obtained from the venom of a marine cone snail, with three disulfide bonds and 15 probable disulfide scaffolds (Fig. S2, ESI†). Upon trypsin digestion, a mass increase of 18 Da establishes the presence of Arg within a disulfide loop (Fig. S3, ESI†). This eliminates the possibility of C1–C2 connectivity (foldamers [1]–[3]), leaving 12 possible isomers (Fig. S4, ESI†). The CID MS2 spectrum of the doubly charged trypsin nicked molecule (Fig. 3a) aids further assignment. The structures of the major ions are determined through DisConnect (Fig. S5, ESI†). The ion at m/z 622.7²⁺ is shown to consist of two peptide chains. In order to hold these two peptide chains together, C2 must be connected to one of the three Cys residues in the second chain (C3/C4/C5) (Fig. 3a). Similarly, for the ion at m/z 531.1, C1/C2 is connected to C6. Since C2 is connected to one of C3/C4/C5, C6 must be connected to C1 in the ion at m/z 531.1. This establishes one of the three cysteine pairings (C1–C6).
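The counting behind this elimination (15 scaffolds, 12 once C1–C2 is excluded, and three once C1–C6 is fixed) can be reproduced by enumerating all pairings of six cysteines. This is an illustrative sketch only; the generator name `pairings` is hypothetical and the filtering mirrors the reasoning in the text rather than any DisConnect routine.

```python
def pairings(items):
    """Yield every way of splitting `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

cys = ["C1", "C2", "C3", "C4", "C5", "C6"]
all_scaffolds = list(pairings(cys))

# Trypsin nicking shows Arg lies inside a loop, which excludes a C1-C2 bridge.
after_trypsin = [p for p in all_scaffolds if ("C1", "C2") not in p]

# The MS2 ions then fix the C1-C6 pairing.
after_ms2 = [p for p in after_trypsin if ("C1", "C6") in p]

print(len(all_scaffolds), len(after_trypsin), len(after_ms2))
for scaffold in after_ms2:
    print(scaffold)
```

The three scaffolds that survive both filters are the candidates that remain to be distinguished by MS3.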

For the four remaining Cys residues, there are three possible connectivity patterns: C2–C3/C4–C5 [structure I], C2–C5/C3–C4 [structure II] and C2–C4/C3–C5 [structure III] (Fig. 3b). To resolve this connectivity pattern, we analysed the MS3 fragmentation spectrum of the MS2 ion at m/z 622.7²⁺ (inset of Fig. 3a). Besides the 34 Da loss, corresponding to a H2S molecule, two other peaks are observed at m/z 594.2²⁺ and 554.1²⁺, which correspond to losses of Gly and His, respectively. The internal loss of His requires cleavage of the C4–H and H–O peptide bonds. In structure II (C2–C5/C3–C4 connectivity) these bonds are not inside a disulfide loop. Hence a backbone fragmentation at either of these two bonds would result in a separation of the peptide chain into two ions, in contrast to an internal loss of His.

Similarly, the internal Gly loss (m/z 594.2²⁺) cannot take place from the C2–C3/C4–C5 connectivity [structure I]. Fragmentation at the C-terminal or N-terminal peptide bonds of either of the two Gly residues would not result in an internal loss of Gly but in a separation of the peptide chain into two ions. Hence it is only C2–C4/C3–C5 (structure III) which is consistent with the internal loss of both Gly and His residues. This establishes the overall connectivity of Ar1446 as C1–C6/C2–C4/C3–C5.
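For a doubly charged ion, a neutral loss of mass m shifts the peak by m/2 on the m/z axis, which is how the two MS3 peaks are matched to residue losses. The short sketch below checks that arithmetic with standard monoisotopic masses; the function name `mz_after_neutral_loss` is hypothetical.

```python
# Standard monoisotopic masses (Da) of the neutral species discussed in the text.
GLY, HIS, H2S = 57.02146, 137.05891, 33.98772

def mz_after_neutral_loss(mz: float, charge: int, loss: float) -> float:
    """m/z of the product ion after a neutral loss, with the charge state unchanged."""
    return mz - loss / charge

parent_mz, charge = 622.7, 2
gly_loss = mz_after_neutral_loss(parent_mz, charge, GLY)
his_loss = mz_after_neutral_loss(parent_mz, charge, HIS)
h2s_loss = mz_after_neutral_loss(parent_mz, charge, H2S)

print(f"Gly loss: {gly_loss:.1f}")
print(f"His loss: {his_loss:.1f}")
print(f"H2S loss: {h2s_loss:.1f}")
```

The Gly loss lands at about m/z 594.2 and the His loss within 0.1 of the reported 554.1 peak, consistent with the internal Gly loss at 594.2 cited in the text.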

These results highlight the utility of this method in the determination of disulfide connectivity in various classes of polypeptides through direct fragmentation of the native molecules. Characterization of novel natural peptides, especially peptide toxins, has been largely facilitated by the application of MS methodologies.5,45–47 This method presents a rapid means for the determination of disulfide folds that are responsible for the recognition at the receptor sites.

Dissection of the disulfide folding problem using DisConnect

Disulfide bonded peptides are also frequent targets for chemical synthesis in view of the high pharmacological activity of peptide toxins.48,49 Characterization of distinct disulfide foldamers provides a direct means for monitoring the kinetics of disulfide bond formation and also the product distribution at thermodynamic equilibrium.50,51 The methodology described above also provides a rapid way of characterising disulfide bonded isomers in foldameric mixtures. This is exemplified by the efficient discrimination of all possible disulfide foldamers of a synthetic peptide (GNWCCSARVCC), which is an analogue of a natural peptide toxin (Mo1277) from a marine cone snail (Conus monile),52 without a priori knowledge of their disulfide topology.

The linear peptide, containing four free Cys residues, was folded under DMSO oxidising buffer conditions. The folding reaction was quenched using formic acid and the foldamers were separated by HPLC analysis of the folding mixture (Fig. S6, ESI†). These separated molecules, termed F-1 to F-3, represent the equilibrium population of the oxidative folding pathway of the dodecameric peptide. Subsequent mass spectral analysis established the formation of all three foldamers (peak 1, foldamer F-1; peak 2, foldamer F-2; peak 3, foldamer F-3; Fig. S6, ESI†). Disulfide connectivity in these foldamers was established through the CID MSn experiments performed on the individual HPLC fractions (Fig. 4a–c and DisConnect outputs summarized in Fig. S7, ESI†).

Fig. 3 (a) CID MS2 spectrum of the tryptic Ar1446 (m/z 733.3²⁺). Inset shows the MS3 spectrum of the ion at m/z 622.7²⁺. The structures of the major ions are also written and the key structures are highlighted in a box. For every structure, Cys residues with indeterminate connectivity are indicated with wavy lines. In the structures where a particular Cys connectivity is evident, the connected Cys residues are joined through a dashed line. (b) Schematic analysis of the determination of the disulfide connectivity of Ar1446.

Among the fragment ions, m/z 1026.3 and 840.2 are common to all foldamers and were queried using DisConnect. The ion at m/z 1026.3 arises from a loss of the N-terminal [G–N] fragment resulting in a y9 ion, and a terminal loss of [G–N–W] yields the ion at m/z 840.2 (y8) (Fig. 4a–c). Further inspection reveals the presence of an ion at m/z 1098.3 only in the MS2 spectra of F-1 and F-3, but not in that of F-2. This ion corresponds to an internal loss of Val from the intact parent peptide ion. This internal loss requires an initial cleavage at the R–V or V–C3 bond. For both C1–C3/C2–C4 and C1–C4/C2–C3 connectivity, this will result in a dissociation of the parent ion into two peptide chains, with overall masses identical to the original precursor. Subsequent activation of this ion facilitates cleavage of the V–C3 or R–V bond, resulting in an internal loss of Val. As explained above, this is possible only when the Val is inside a disulfide loop, as otherwise the cleavage of the R–V or V–C3 bonds would yield b and y type ions (Fig. 4e). Hence the connectivity C1–C2/C3–C4 is ruled out for peptide fractions F-1 and F-3. F-2 can thus be assigned to this connectivity. Additional evidence in favour of this conclusion comes from the ions at m/z 975.3, 876.3 and 720.2, which are exclusive to the MS2 spectrum of F-2. The b ion type structures of these fragment ions can only arise from C1–C2/C3–C4 connectivity, where these residues are outside the disulfide loop. This establishes the connectivity of F-2 as C1–C2/C3–C4.
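The argument that Val can be lost internally only when it sits inside a disulfide loop can be phrased as a graph connectivity test: treat residues as nodes, peptide and disulfide bonds as edges, remove the residue, and ask whether the remainder still forms one connected ion. The sketch below applies this to GNWCCSARVCC for the three possible pairings. It is a simplified model written for illustration (the name `internal_loss_possible` is hypothetical, and the chemistry of the cleavage itself is not modelled).

```python
def internal_loss_possible(n_residues, disulfides, lost):
    """True if removing residue `lost` (1-based) leaves a single connected ion.

    `disulfides` is a list of (i, j) residue positions joined by S-S bonds.
    """
    nodes = [i for i in range(1, n_residues + 1) if i != lost]
    adjacency = {i: set() for i in nodes}
    # Backbone peptide bonds, skipping the two bonds flanking the lost residue.
    for i in range(1, n_residues):
        if i != lost and i + 1 != lost:
            adjacency[i].add(i + 1)
            adjacency[i + 1].add(i)
    # Disulfide bridges (the lost residue is assumed not to be a bridged Cys).
    for a, b in disulfides:
        adjacency[a].add(b)
        adjacency[b].add(a)
    # Depth-first search from the first remaining residue.
    seen, stack = set(), [nodes[0]]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(adjacency[node] - seen)
    return len(seen) == len(nodes)

seq = "GNWCCSARVCC"
c1, c2, c3, c4 = [i + 1 for i, aa in enumerate(seq) if aa == "C"]
val = seq.index("V") + 1

foldamers = {
    "C1-C2/C3-C4": [(c1, c2), (c3, c4)],
    "C1-C3/C2-C4": [(c1, c3), (c2, c4)],
    "C1-C4/C2-C3": [(c1, c4), (c2, c3)],
}
results = {name: internal_loss_possible(len(seq), bonds, val)
           for name, bonds in foldamers.items()}
print(results)
```

Only the C1–C2/C3–C4 pairing fails the test, in line with the absence of the m/z 1098.3 ion from the F-2 spectrum.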

A distinction between the peptide fractions F-1 and F-3 is achieved by the MS3 fragmentation of the ion at m/z 636.2 for both the foldamers. At this stage a choice between two alternative disulfide pairing schemes, C1–C3/C2–C4 and C1–C4/C2–C3, is available for assignment. DisConnect provides three possible structures for the ion at m/z 636.2. Fig. 4d shows the MS3 spectrum of this ion from F-3. These fragment ions are queried against all three probable structures of 636.2 (Fig. 4d). As shown, it is only structure I (C2SARV|C4) that can explain the generation of the queried MS3 product ions. The key ion permitting this assignment is at m/z 537.2. The terminal loss of the Val residue, which results in this ion, cannot take place from the other two structures of 636.2. Structure I, assigned to F-3, implies the C2–C4 connectivity. This establishes the overall C1–C3/C2–C4 connectivity for F-3.

This assigns the C1–C4/C2–C3 connectivity to the remaining foldamer F-1. This assignment is further confirmed by the analysis of the MS3 spectrum of 636.2 from F-1, as shown in Fig. S8b (ESI†). In this case, all the fragment ions are explainable from the alternative structure (III) of 636.2. Fig. 4e summarizes the analysis of the MS2 and the MS3 spectra.

Here in Fig. 4d, at first glance, the generation of the ion at m/z 450.1 through a subsequent loss from the product at m/z 537.2 may appear unreasonable under the ion trap fragmentation mechanism, which only activates ions with the same m/z as the precursor. But it should be noted that this loss of the C-terminal C4 residue in the dehydroalanine form [1 Da (N-term proton) + 69 Da (residue mass of dehydroalanine) + 17 Da (C-term OH group) = 87 Da] is through a side chain thiol loss from the m/z 537.2 ion. Such side chain thiol losses, resulting in a neutral loss of 66 or 34 Da, are prevalent in disulfide containing peptides, even under ion trap CID conditions (ref. 18 and 22). In our case,

Fig. 4 (a)–(c) CID MS2 spectra of the (M + 2H)2+ species (m/z 599.2) of foldamers F-1, F-2 and F-3, respectively. (d) MS3 fragmentation spectrum of the ion at m/z 636.2, obtained by isolating the ion from the MS2 spectrum of F-3. DisConnect derived structures of the key MS3 ions for the parent ion structure 636.2 (III) are also shown in the figure. DisConnect outputs of these MS3 fragment ions from the other two structures of 636.2 (I, II), which cannot account for the neutral loss of Val, are also shown. (e) Schematic analysis of the determination of the disulfide connectivity for the foldamers.


since the disulfide bond holds the peptide chains together, this side chain thiol loss is accompanied by the loss of an amino acid.
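The mass bookkeeping in this argument can be checked with a few lines of Python. This is only an illustrative sketch: it uses the nominal (integer) masses quoted in the text, and it assumes that the ions at m/z 537.2 and 450.1 are singly charged, which the text does not state explicitly. The function names are hypothetical.

```python
# Nominal (integer) masses in Da, as quoted in the text.
N_TERM_H = 1       # N-terminal proton
DHA_RESIDUE = 69   # dehydroalanine residue, C3H3NO
C_TERM_OH = 17     # C-terminal OH group

# Nominal masses of the side chain neutral losses mentioned in the text.
H2S = 2 * 1 + 32       # 34 Da
H2S2 = 2 * 1 + 2 * 32  # 66 Da


def dehydroalanine_terminal_loss() -> int:
    """Mass lost when a C-terminal Cys leaves in the dehydroalanine form."""
    return N_TERM_H + DHA_RESIDUE + C_TERM_OH


def neutral_loss(parent_mz: float, product_mz: float, charge: int = 1) -> float:
    """Neutral mass difference between two ions of the same charge (assumed)."""
    return round((parent_mz - product_mz) * charge, 1)


print(dehydroalanine_terminal_loss())  # 87
print(neutral_loss(537.2, 450.1))      # 87.1
```

The observed difference of 87.1 agrees with the 87 Da expected for the dehydroalanine-form loss within the stated mass accuracy.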

Assignment of disulfide connectivity in proteins using DisConnect

The method is also applicable to proteins, where direct fragmentation may be difficult under CID conditions, through an initial proteolysis and subsequent tandem MS experiments. In some cases, proteolysis alone may separate all the single disulfide containing fragments. But this approach requires a favourable positioning of Cys residues and the proteolytic sites, severely limiting its general applicability. A general method integrating both solution and gas phase fragmentation is described below.

α-Lactalbumin

The tryptic profile of α-lactalbumin, a 123 residue protein with four disulfide bonds, exhibits the presence of 3 non-cysteine [1–3] and 3 cystine [4–6] containing peptides (Fig. 5a and Fig. S9a–c, ESI†). Structures for each of these proteolytic fragments are obtained through DisConnect. Inspection of these structures illustrates that [4] and [5] have one cystine each, thus unambiguously assigning two of the four disulfide bonds.

Peptide [6] (Fig. 5a), containing four cysteines, holds the key to the assignment of the two remaining S–S linkages. The presence of a large number of Asp residues in this peptide warranted subsequent proteolysis by Asp–N for further reduction of charge and size ([7], inset to Fig. 5b), facilitating mass spectrometric fragmentation. An unambiguous determination of S–S connectivity in [7] is achieved through subsequent CID fragmentation (Fig. 5b). The structures of some of the key ions and the corresponding DisConnect output are shown in Fig. 5b–c. The structure of the fragment ion at m/z 1072.4 (2+) consists of two peptide chains, containing one Cys each (C2 and C4). This necessitates a disulfide bond between C2 and C4, establishing an overall disulfide crosslink pattern of C1–C3/C2–C4.

Further proof of this connectivity comes from an analysis of the structure of the ion at m/z 668.4. It is worth mentioning that there are two probable structures for 668.4 and 581.4 and a unique possibility for 515.4. Interestingly, 515.4 is related to the ion at m/z 581.4 by a loss of H2S2 (66 Da) and can only occur from structure (II) of 581.4 (the H2S2 loss is not possible from the other structures as all the Cys residues are disulfide linked). Subsequently, this structure of 581.4 can only arise from structure (I) of 668.4. The parent–product relationship among these ions is further established by the MS3 fragmentation of 668.4 that shows the presence of both 581.4 and 515.4

Fig. 5 (a) ESI-MS of the three cysteine containing tryptic peptides of lactalbumin. (b) CID MS2 spectra of the (M + 3H)3+ species of peptide [7] generated through successive proteolysis by trypsin and Asp-N (m/z 966.8, MS shown in inset). (c) Schematic summary of the analysis. The structures of the key ions, derived through DisConnect, are also shown. The three dimensional fold is obtained from an X-ray structure (PDB ID: 1F6S) and represented using PyMOL (ref. 54).


(Fig. S9d, ESI†). The structure of 668.4 demands a disulfide link between C1 and C3. Hence, the connectivity of [7] is C1–C3/C2–C4 (here the numbering of Cys residues ‘Ci’ is according to their order in the proteolytic peptide [7] sequence) and the overall S–S connectivity of α-lactalbumin is Cys1–Cys8/Cys2–Cys7/Cys3–Cys5/Cys4–Cys6 (here the numbering of Cys residues ‘Cysi’ is according to their order in the protein primary sequence). This analysis is schematically summarized in Fig. 5c.

It is worth noting that a possibility of thiol disulfide interchange may arise during proteolysis. Since DisConnect matches the experimentally obtained masses against peptides generated from all possible disulfide connectivity patterns among the available Cys residues, such events will be reflected in the mapping of the proteolytic masses through DisConnect. Controlling the pH of proteolysis is an effective means of suppressing thiol disulfide exchange reactions, which are generally facile at pH values approaching the pKa of free thiols (~8.0). Hence, in cases where thiol disulfide interchange reactions are observed, proteolysis at much lower pH (Glu–C at pH 4.0 or trypsin at pH 6.5) would prevent such artifacts. Alternatively, in molecules with free thiols, a thiol-selective protection can be used under the proteolytic conditions (ref. 16).

Further applications to proteins are illustrated with other examples in the ESI† (Fig. S10–S13, ESI†).
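The pH dependence can be made concrete with the Henderson–Hasselbalch relation. The sketch below assumes that the propensity for thiol disulfide exchange tracks the fraction of free thiol present as thiolate, and it takes the pKa of 8.0 quoted above; the function name is hypothetical and the numbers are only indicative.

```python
def thiolate_fraction(ph: float, pka: float = 8.0) -> float:
    """Fraction of a free thiol that is deprotonated (thiolate) at a given pH.

    Henderson-Hasselbalch: [S-] / ([SH] + [S-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pH values mentioned in the text: near the pKa, trypsin at 6.5, Glu-C at 4.0.
for ph in (8.0, 6.5, 4.0):
    print(f"pH {ph}: thiolate fraction = {thiolate_fraction(ph):.4f}")
```

Under these assumptions half of the thiol is ionised at pH 8.0, roughly 3% at pH 6.5, and about 0.01% at pH 4.0, which is consistent with the recommendation to digest at lower pH.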

Conclusions

The key to disulfide connectivity determination is the ability to sequester fragments containing single disulfide bonds through MS fragmentation. Owing to the complexity of the probable modes of fragmentation, manual interpretation of the data is often prohibitively tedious. DisConnect allows a rapid assignment of the probable structures of the fragment ions. Any ambiguity arising from multiple probable structures for a fragment ion can be resolved by querying the next generation fragment ions against all the probable structures of the precursor. However, it is worth emphasizing that only a specific set of fragment ions is critical to the determination of disulfide connectivity. The method requires a direct MS fragmentation of the native disulfide bonded polypeptides and subsequent analysis using DisConnect. Technical difficulties involving direct fragmentation of proteins can be surmounted by initial proteolysis and subsequent MSn strategies. To summarize, the described methodology, in conjunction with the algorithm DisConnect, is a general tool for rapid disulfide connectivity assignment in peptides and proteins. The approach is of specific utility in unambiguous determination of disulfide linkages in peptide toxins and in establishing native cysteine connectivity in synthetic polypeptides, and can be valuable in characterizing non-native disulfide bonded intermediates that are formed in the oxidative folding pathways of disulfide linked polypeptides (ref. 53). Besides assigning the fragmentation spectra of disulfide intact polypeptides, DisConnect is likely to have a direct application in proteomic identification of disulfide bonded proteins, without requiring the conventional steps of chemical reduction and alkylation. The program DisConnect can be downloaded from: http://mbu.iisc.ernet.in/~pbgrp/DisConnect.htm

Materials and methods

Proteolytic digestion

To 5 ml of 100 mM ammonium bicarbonate buffer (pH 7.8), containing 1 mg of α-lactalbumin, trypsin was added at a 100 : 1 (protein : protease) ratio. The reaction was allowed to proceed for 8 hours at 37 °C. Further Asp–N digestion was performed by adding the protease to this tryptic digest solution at a 30 : 1 (protein : protease) ratio. For Asp–N proteolysis, the reaction was allowed to proceed for 6 hours at 37 °C. α-Lactalbumin (UniProt_id P00711) was obtained from Sigma (St. Louis, MO, USA). Trypsin and Asp–N were obtained from Promega (Madison, WI, USA).

Mass spectrometric protocol

Proteolytic peptides were analyzed through LC-MS analysis, performed using an H2O–acetonitrile (with 0.1% formic acid) solvent system, using an HCT-Ultra ETD II (Bruker Daltonics, Bremen, Germany) ion trap mass spectrometer coupled to an Agilent 1100 HPLC. The LC gradient was typically set from 90% H2O to 90% acetonitrile in a linear gradient over a 50 minute runtime, at a flow rate of 0.2 ml min⁻¹. During the LC-MS run the spectra were averaged over four scans; the dry gas and the nebulizer were set at 10 l min⁻¹ and 30 psi, respectively. The mass spectrometric experiments on the three foldamers of desbromo Mo1277, Ar1446 and uroguanylin were performed through direct injection of the peptides. During the direct injection, the sample was infused through an injector pump (Cole Parmer, Vernon Hill, USA) at a flow rate of 2 ml min⁻¹. For both the LC-MS and direct injection experiments, the CID experiments were performed through collision with He gas, varying the fragmentation amplitude (Vp–p) between 1 and 3.

Protocol of synthesis of foldamers

The linear peptide was synthesized using standard Fmoc chemistry. The folding reaction was carried out by dissolving 5 nmol of linear peptide in 200 mL of oxidation buffer containing 100 mM NH4HCO3 (pH 8.0) and 10% dimethyl sulfoxide (DMSO). The progress of the reaction was monitored using mass spectrometry and quenched after 24 h by acidification with formic acid (10% final concentration). The reaction mixture was subjected to a C18 analytical column, peptides were eluted over a linear gradient of 20–26% acetonitrile and fractions were detected at 226 nm.

DisConnect: algorithm for automated assignment of disulfide connectivity

For the rapid determination of disulfide crosslinks, we introduce the freely distributed open source program package DisConnect that identifies the probable structures of specific proteolytic peptides and CID MSn fragment ions for multiple


disulfide bonded molecules. Besides the peptide bond cleavages that yield Biemann type ions (b and y), DisConnect also considers fragmentation within the disulfide loop and at the disulfides (see Results and discussion and Fig. 1).

Description of the algorithm

I. Analysis of proteolytic fragments. The input polypeptide sequence is subjected to an in silico specific proteolytic digestion (with an option of a large number of specific proteases and their combinations, as well as user defined proteases). The m/z values of the theoretical linear peptides that do not contain Cys are directly matched against the experimentally obtained MS profile. The theoretical Cys containing linear proteolytic peptides are used to generate all chemically possible combinations of intermolecular disulfide bonded peptides. The occurrence of intra-molecular disulfide bonds within a proteolytic peptide is also considered in the algorithm. The m/z values of these theoretically obtained Cys containing combinations of peptides are matched against the experimentally obtained masses (MS profile) to infer the probable structures of the proteolytic fragment ions.
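The first stage can be sketched as follows. This is a minimal illustration and not the DisConnect code: it assumes a simplified trypsin rule (cleavage C-terminal to every K or R, no missed cleavages, no proline exception), it only enumerates pairs of Cys containing peptides, and both the function names and the test sequence are hypothetical.

```python
from itertools import combinations


def tryptic_digest(sequence: str) -> list[str]:
    """Cleave after every K or R (simplified trypsin rule, an assumption)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def split_by_cys(peptides: list[str]) -> tuple[list[str], list[str]]:
    """Separate peptides without Cys (matched directly) from Cys containing ones."""
    no_cys = [p for p in peptides if "C" not in p]
    with_cys = [p for p in peptides if "C" in p]
    return no_cys, with_cys


def candidate_pairs(with_cys: list[str]) -> list[tuple[str, str]]:
    """Every pair of Cys containing peptides is a candidate inter-peptide link."""
    return list(combinations(with_cys, 2))


peptides = tryptic_digest("GCAKDLRCTWKACFR")  # hypothetical sequence
no_cys, with_cys = split_by_cys(peptides)
print(no_cys)                     # ['DLR']
print(candidate_pairs(with_cys))  # three candidate disulfide linked pairs
```

A fuller treatment would also enumerate combinations of more than two chains and intra-molecular disulfides, as the description above requires.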

II. Analysis of the MSn fragment ions

a. Combination generating code. The input sequence was coarse-grained by re-writing each contiguous stretch of non-Cys residues as a single entity ‘Pi’, interspersed by the Cys residues. For example, an input sequence CALDGXCGHTCC is represented as C1P1XC2P2C3C4, where P1 = ALDG, P2 = GHT and Ci = Cys (multiple peptide chains are separated by X). All chemically possible combinations of C1P1XC2P2C3C4 are computed (combinations containing X are filtered out) to theoretically generate all possible MSn fragments.
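The coarse-graining step can be reproduced with a short routine. This is a sketch of the representation described above, not the authors' implementation; the function name and the returned data layout are assumptions.

```python
def coarse_grain(sequence: str, separator: str = "X"):
    """Rewrite a sequence as Cys tokens (Ci), non-Cys stretches (Pi) and chain breaks.

    Returns the token list and a mapping from each Pi to its residues.
    """
    tokens: list[str] = []
    segments: dict[str, str] = {}
    cys_count = 0
    run = ""  # current stretch of non-Cys residues

    def flush() -> None:
        nonlocal run
        if run:
            name = f"P{len(segments) + 1}"
            segments[name] = run
            tokens.append(name)
            run = ""

    for residue in sequence:
        if residue == "C":
            flush()
            cys_count += 1
            tokens.append(f"C{cys_count}")
        elif residue == separator:
            flush()
            tokens.append(separator)
        else:
            run += residue
    flush()
    return tokens, segments


tokens, segments = coarse_grain("CALDGXCGHTCC")
print("".join(tokens))  # C1P1XC2P2C3C4
print(segments)         # {'P1': 'ALDG', 'P2': 'GHT'}
```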

Theoretical fragments that contain only Cys residues (Cys) do not undergo any further combinatorics. For those with only non-Cys residues (i.e. only Pi), all contiguous combinations of the residues are taken. For example, for a theoretical fragment ALDG, the possible combinations are A, L, D, G, AL, LD, DG, ALD, and LDG. In theoretical fragments containing both Ci and Pi, every Pi should be covalently linked with at least one of its original neighboring Cys residues (C(i−1)/C(i+1)) for chemical feasibility of obtaining such a fragment mass spectrometrically. For example, both P1P2 and P1C3 are filtered out whereas P2C3 and C1P1P2C3 are retained. The segments C1P1 and P2C3 in C1P1P2C3 are held together through the C1–C3 disulfide bond.
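The neighbor rule can be expressed as a small filter over the coarse-grained tokens. The sketch below is an assumption-laden simplification: it only checks that at least one original Cys neighbor of each Pi is present in the fragment, and it does not model any of the other chemical filters. The names are hypothetical.

```python
# Coarse-grained form of the example sequence CALDGXCGHTCC.
ORIGINAL = ["C1", "P1", "X", "C2", "P2", "C3", "C4"]


def cys_neighbours(tokens: list[str]) -> dict[str, list[str]]:
    """Map every Pi to the Cys tokens directly adjacent to it in the sequence."""
    neighbours: dict[str, list[str]] = {}
    for i, token in enumerate(tokens):
        if not token.startswith("P"):
            continue
        adjacent = []
        if i > 0 and tokens[i - 1].startswith("C"):
            adjacent.append(tokens[i - 1])
        if i + 1 < len(tokens) and tokens[i + 1].startswith("C"):
            adjacent.append(tokens[i + 1])
        neighbours[token] = adjacent
    return neighbours


def is_feasible(fragment: list[str], tokens: list[str] = ORIGINAL) -> bool:
    """Keep a fragment only if each Pi retains one of its original Cys neighbours."""
    neighbours = cys_neighbours(tokens)
    return all(
        any(cys in fragment for cys in neighbours[token])
        for token in fragment
        if token.startswith("P")
    )


for fragment in (["P1", "P2"], ["P1", "C3"], ["P2", "C3"], ["C1", "P1", "P2", "C3"]):
    print("".join(fragment), is_feasible(fragment))
```

With the example sequence, P1 is bonded only to C1 (its other side is the chain break X), so P1P2 and P1C3 are rejected while P2C3 and C1P1P2C3 pass, matching the cases listed above.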

Once these computed fragments were generated and initially filtered from the coarse grain representation, we substituted the corresponding sequence of each ‘Pi’ in these combinations. Simple chemical logic is applied to filter out the chemically implausible combinations of Pi to be appended to these coarse-grained combinations, hence making the program time effective. In a particular combination, if Pi is present with a Ci on the right, then the possible combinations for its residues can only be of the y ion type (C to N-terminus). Similarly, for a Pi with a Ci on the left, only b ion type combinations of residues are possible (N to C-terminus). For C1P1P2C3, the possible combinations for P1 are A, AL, ALD, and ALDG whereas for P2, the possibilities are T, HT, and GHT. So the total number of combinations for C1P1P2C3 will be 4 × 3, i.e., 12.
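The count of 12 for C1P1P2C3 follows from pairing every b type truncation of P1 with every y type truncation of P2. A minimal sketch, with hypothetical function names:

```python
from itertools import product


def b_type(segment: str) -> list[str]:
    """Prefixes of a segment held by a Cys on its left (N to C-terminus)."""
    return [segment[:k] for k in range(1, len(segment) + 1)]


def y_type(segment: str) -> list[str]:
    """Suffixes of a segment held by a Cys on its right (C to N-terminus)."""
    return [segment[-k:] for k in range(1, len(segment) + 1)]


p1_options = b_type("ALDG")  # ['A', 'AL', 'ALD', 'ALDG']
p2_options = y_type("GHT")   # ['T', 'HT', 'GHT']

combinations_c1p1p2c3 = list(product(p1_options, p2_options))
print(len(combinations_c1p1p2c3))  # 12
```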

However, for a Pi flanked by two neighboring Ci, the probable modes of combination for the residues will be more complex. Here, both b and y ion type fragments, along with those generated from contiguous loss of internal residues, are feasible. Mathematically, if Pi is n residues long, then there will be n b ion type and n y ion type fragments, adding up to (2n − 1) distinct fragments because the full-length segment is common to both series. Moreover, loss of internal residues ranging from length 1 to n − 2 needs to be considered. For example, for a combination C1P1C2, the probable combinations for P1 will be A, AL, ALD, ALDG, G, DG, LDG, AG, ADG and ALG. However, for combinations such as C1P1C3, the possibilities for P1 can only be of the b ion type, as here P1 is covalently linked by a peptide bond only to C1 and no loss of internal residues is chemically possible.
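The enumeration for a segment flanked by two Cys residues can be sketched as below. The function name is hypothetical, and the routine is written for segments without repeated residues; with repeats, some internal-loss products would appear more than once and would need de-duplication.

```python
def flanked_combinations(segment: str) -> list[str]:
    """Fragments of a Pi that is peptide-bonded to a Cys on both sides.

    Combines b type prefixes, y type suffixes and contiguous internal losses
    of length 1 to n - 2 that keep the first and last residue in place.
    """
    n = len(segment)
    prefixes = [segment[:k] for k in range(1, n + 1)]
    suffixes = [segment[-k:] for k in range(1, n)]  # full length is already a prefix
    internal = []
    for length in range(1, n - 1):          # lost stretch: 1 .. n-2 residues
        for start in range(1, n - length):  # never removes a terminal residue
            internal.append(segment[:start] + segment[start + length:])
    return prefixes + suffixes + internal


print(flanked_combinations("ALDG"))
# ['A', 'AL', 'ALD', 'ALDG', 'G', 'DG', 'LDG', 'ADG', 'ALG', 'AG']
```

For ALDG (n = 4) this gives 2n − 1 = 7 terminal truncations plus 3 internal-loss products, the same ten entries listed in the text.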

b. Mass calculation code. For every possible theoretical combination, the sum of the residue masses was calculated. The mass calculation takes into account all permutations of the modified-Cys residue masses generated by disulfide bond fragmentation (Fig. 1, Paths A and B). The program offers two modes for mass calculation: Smart mode, in which only Path A derived modified-Cys residue masses are considered, and Rigorous mode, which takes into account all four modified-Cys residue masses (Paths A and B). The C-terminal status of the polypeptide is also taken into consideration. The experimentally obtained m/z values for each of the MSn fragment ions are compared and matched with these calculated residue mass values.
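The matching step can be illustrated for the simplest case of an unmodified linear peptide with a free C-terminus. This sketch uses standard monoisotopic residue masses and does not model the modified-Cys masses of Paths A and B, which are defined in Fig. 1 and are not reproduced here. Treating the 0.2 Da mass window used in this report as a symmetric tolerance is an assumption, and the function names are hypothetical.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def linear_mz(sequence: str, charge: int = 1) -> float:
    """m/z of an unmodified linear peptide carrying `charge` protons."""
    neutral = sum(MONO[residue] for residue in sequence) + WATER
    return (neutral + charge * PROTON) / charge


def match(observed_mz: float, candidates: list[str],
          charge: int = 1, window: float = 0.2) -> list[str]:
    """Candidates whose calculated m/z lies within the window of the observed value."""
    return [
        seq for seq in candidates
        if abs(linear_mz(seq, charge) - observed_mz) <= window
    ]


print(round(linear_mz("ALDG"), 3))     # 375.187
print(match(375.2, ["ALDG", "GHT"]))   # ['ALDG']
```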

c. Using fragmentation chemistry. For chemical feasibility of a combination, n discontinuous peptide segments (peptide chains) must be bridged by (n − 1) disulfide bridge/s. Hence multi-segment combinations that did not have 2(n − 1) Cys residues, with at least one Cys in each segment, were filtered out. An extensive calculation and filtration based on the detailed cystine fragmentation chemistry was performed on the remaining combinations.
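The counting rule can be written as a simple predicate. The sketch reads "2(n − 1) Cys residues" as a minimum (at least that many), which is an interpretation rather than something the text states, and it ignores the detailed cystine fragmentation chemistry applied afterwards. The function name is hypothetical.

```python
def segments_can_be_bridged(segments: list[str]) -> bool:
    """Check whether n peptide segments could be held together by n - 1 disulfides.

    Requires at least one Cys in every segment and at least 2 * (n - 1) Cys
    residues overall (the 'at least' reading is an assumption).
    """
    n = len(segments)
    if n <= 1:
        return True  # a single chain needs no bridging disulfide
    cys_counts = [segment.count("C") for segment in segments]
    return all(count >= 1 for count in cys_counts) and sum(cys_counts) >= 2 * (n - 1)


print(segments_can_be_bridged(["CALDG", "GHTC"]))    # True: 2 segments, 2 Cys
print(segments_can_be_bridged(["CALDG", "GHT"]))     # False: one segment has no Cys
print(segments_can_be_bridged(["CA", "GC", "TC"]))   # False: 3 segments need 4 Cys
print(segments_can_be_bridged(["CAC", "GC", "TC"]))  # True
```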

d. Ion trap data filtering. A filter has been implemented to exclude the structures that are incompatible with ion trap CID fragmentation. It may be emphasized that in an ion trap, only ions that have m/z values similar to that of the original precursor ion are activated. Hence successive losses through peptide bond cleavages are not mechanistically feasible. This criterion is encoded in the program. It should be noted that in disulfide bonded peptides, cleavage of a peptide bond within a disulfide loop only dissociates the precursor polypeptide into two peptide chains. But these peptide chains are still held together through the disulfide bond and hence have an m/z value identical to that of the original precursor. So in this case, successive peptide bond activation under ion trap CID conditions is feasible. It is only when an initial peptide bond cleavage results in a loss of residue/s that subsequent losses from this product ion are not expected.

In this report, all the structures of the fragment ions are derived using the ‘smart’ mode of DisConnect with a mass


window of 0.2 Da and ion trap data filtering. A flowchart summarizing the workflow of DisConnect is shown in Scheme 1. A further detailed flowchart, and input/output formats, are provided in the ESI† (ESI† Methods and Fig. S1).

Acknowledgements

The authors are grateful to Prof. K. S. Krishnan (NCBS, Bangalore, India) for initiating the conus venom project that raised the disulfide connectivity problem. The authors thank Prof. Sandhya Visweswariah (IISc, Bangalore) for the uroguanylin sample. KG and MB acknowledge CSIR, Govt of India, for fellowships. KHG acknowledges the INSPIRE faculty fellowship, DST, Govt of India. The mass spectrometric facility and the conus peptide project are funded by DBT, Govt of India.

References

1 J. M. Thornton, J. Mol. Biol., 1981, 151, 261–287.
2 R. Wetzel, L. J. Perry, W. A. Baase and W. J. Becktel, Proc. Natl. Acad. Sci. U. S. A., 1988, 85, 401–405.
3 D. Fass, Annu. Rev. Biophys., 2012, 41, 43–59.
4 L. J. England, J. Imperial, R. Jacobsen, A. G. Craig, J. Gulyas, M. Akhtar, J. Rivier, D. Julius and B. M. Olivera, Science, 1998, 281, 575–578.
5 B. M. Ueberheide, D. Fenyo, P. F. Alewood and B. T. Chait, Proc. Natl. Acad. Sci. U. S. A., 2009, 106, 6910–6915.
6 C. J. Bohlen, A. T. Chesler, R. Sharif-Naeini, K. F. Medzihradszky, S. Zhou, D. King, E. E. Sanchez, A. L. Burlingame, A. I. Basbaum and D. Julius, Nature, 2011, 479, 410–414.
7 J. Li, K. Shefcheck, J. Callahan and C. Fenselau, Protein Sci., 2010, 19, 174–182.
8 N. Zhang, G. Wu, M. J. Chalmers and S. J. Gaskell, Peptides, 2004, 25, 951–957.
9 H. Terlau and B. M. Olivera, Physiol. Rev., 2004, 84, 41–68.
10 N. L. Daly and D. J. Craik, Curr. Opin. Chem. Biol., 2011, 15, 362–368.
11 P. Mallick, D. R. Boutz, D. Eisenberg and T. O. Yeates, Proc. Natl. Acad. Sci. U. S. A., 2002, 99, 9679–9684.
12 J. R. Brown and B. S. Hartley, Biochem. J., 1966, 101, 214–228.
13 J. Boisbouvier, M. Blackledge, A. Sollier and D. Marion, J. Biomol. NMR, 2000, 16, 197–208.
14 H. Liu, M. A. Boudreau, J. Zheng, R. M. Whittal, P. Austin, C. D. Roskelley, M. Roberge, R. J. Andersen and J. C. Vederas, J. Am. Chem. Soc., 2010, 132, 1486–1487.
15 W. R. Gray, Protein Sci., 1993, 2, 1732–1748.
16 J. Wu, Y. Yang and J. T. Watson, Protein Sci., 1998, 7, 1017–1028.
17 R. J. Clarke, H. Fischer, S. T. Nevin, D. J. Adams and D. J. Craik, J. Biol. Chem., 2006, 281, 23254–23263.
18 K. Gupta, M. Kumar and P. Balaram, Anal. Chem., 2010, 82, 8313–8319.
19 M. F. Bean and S. A. Carr, Anal. Biochem., 1992, 201, 216–226.
20 S. Thakur and P. Balaram, Rapid Commun. Mass Spectrom., 2007, 21, 3420–3426.
21 H. I. Kim and J. L. Beauchamp, J. Am. Soc. Mass Spectrom., 2009, 20, 157–166.
22 M. Mentinova, H. Han and S. A. McLuckey, Rapid Commun. Mass Spectrom., 2009, 23, 2647–2655.
23 H. P. Gunawardena, R. A. O’Hair and S. A. McLuckey, J. Proteome Res., 2006, 5, 2087–2092.
24 J. Qin and B. T. Chait, Anal. Chem., 1997, 69, 4002–4009.
25 P. E. Barran, N. C. Polfer, D. J. Campopiano, D. J. Clarke, P. R. R. Langridge-Smith, R. J. Langley, J. R. W. Govan, A. Maxwell, J. R. Dorin, R. P. Millar and M. T. Bowers, Int. J. Mass Spectrom., 2005, 240, 273–284.
26 D. Bilusich, C. S. Brinkworth and J. H. Bowie, Rapid Commun. Mass Spectrom., 2004, 18, 544–552.
27 M. Mormann, J. Eble, C. Schwoppe, R. M. Mesters, W. E. Berdel and J. Peter-Katalinic, Anal. Bioanal. Chem., 2008, 392, 831–838.
28 H. R. Morris and P. Pucci, Biochem. Biophys. Res. Commun., 1985, 126, 1122–1128.
29 R. Yazdanparast, P. Andrews, D. L. Smith and J. E. Dixon, Anal. Biochem., 1986, 153, 348–353.
30 W.-Y. Kao, J. Qin, K. Fushitani, S. S. Smith, T. A. Gorr, C. K. Riggs, J. E. Knapp, B. T. Chait and A. F. Riggs, Proteins: Struct., Funct., Bioinf., 2006, 63, 174–187.
31 J. D. Tipton, J. D. Carter, J. D. Mathias, M. R. Emmett, G. E. Fanucci and A. G. Marshall, Anal. Chem., 2009, 81, 7611–7617.
32 A. G. C. Neves-Ferreira, J. Perales, J. W. Fox, J. D. Shannon, D. L. Makino, R. C. Garratt and G. B. Domont, J. Biol. Chem., 2002, 277, 13129–13137.
33 C. Caporale, C. Sepe, C. Caruso, P. Pucci and V. Buonocore, FEBS Lett., 1996, 393, 241–247.
34 D. Fenyo, Bioinformatics, 1997, 13, 617–618.
35 R. Craig, O. Krokhin, J. Wilkins and R. C. Beavis, J. Proteome Res., 2003, 2, 657–661.
36 P. R. Baker and K. R. Clauser, Protein Prospector (MS-Bridge): http://prospector.ucsf.edu.
37 PROWL (PeptideMap Module): http://prowl.rockefeller.edu.
38 H. Xu, L. Zhang and M. A. Freitas, J. Proteome Res., 2007, 7, 138–144.
39 W. Murad, R. Singh and TY. Yen, BMC Bioinf., 2011, 12(suppl 1), S12.
40 T. Lee, R. Singh, TY. Yen and M. Bacher, Proc. IEEE. Int. Symp. Computer-Based Med. Syst. CBMS, 2007, 397–402.
41 R. Singh, Briefings Funct. Genomics Proteomics, 2008, 7, 157–172.
42 T. Lee, R. Singh, TY. Yen and M. Bacher, Proc. Comput. Syst. Bioinf. CSB, 2007, 41–51.
43 K. Biemann, Methods Enzymol., 1990, 193, 886–887.
44 J. J. Calvete, M. Jurgens, C. Marcinikiewicz, A. Romero and M. Schrader, Biochem. J., 2000, 345, 573–581.
45 J. A. Jakubowski and J. V. Sweedler, Anal. Chem., 2004, 76, 6541–6547.
46 L. L. Tayo, B. Lu, L. J. Cruz and J. R. Yates, J. Proteome Res., 2010, 9, 2292–2301.


47 C. A. Elliger, T. A. Richmond, Z. N. Lebaric, N. T. Pierce, J. V. Sweedler and W. F. Gilly, Toxicon, 2011, 57, 311–322.
48 A. M. Steiner, K. J. Woycechowsky, B. M. Olivera and G. Bulaj, Angew. Chem., Int. Ed., 2012, 51, 5580–5584.
49 A. A. Tietze, D. Tietze, O. Ohlenschlager, E. Leipold, F. Ullrich, T. Kuhl, A. Mischo, G. Buntkowsky, M. Gorlach, S. H. Heinemann and D. Imhof, Angew. Chem., Int. Ed., 2012, 51, 4058–4061.
50 J. J. Ewbank and T. E. Creighton, Nature, 1991, 350, 518–520.
51 B. van den Berg, E. W. Chung, C. V. Robinson, P. L. Mateo and C. M. Dobson, EMBO J., 1999, 18, 4794–4803.
52 S. S. Nair, C. L. Nilsson, M. R. Emmett, T. M. Schaub, K. H. Gowd, S. S. Thakur, K. S. Krishnan, P. Balaram and A. G. Marshall, Anal. Chem., 2006, 78, 8082–8088.
53 K. K. Khoo, K. Gupta, B. R. Green, M.-M. Zhang, M. Watkins, B. M. Olivera, P. Balaram, D. Yoshikami, G. Bulaj and R. S. Norton, Biochemistry, 2012, 51, 9826–9835.
54 The PyMOL Molecular Graphics System, Version 1.5.0.4, Schrodinger, LLC.


Electronic Supplementary Material for Molecular BioSystems

Rapid Mass Spectrometric Determination of Disulfide Connectivity in Peptides and Proteins

Moitrayee Bhattacharyya1#, Kallol Gupta1#, Konkallu Hanumae Gowd1$, Padmanabhan Balaram1*

1Molecular Biophysics Unit, Indian Institute of Science, Bangalore-560012, India

# Both the authors contributed equally

$Current address: Undergraduate Program, Indian Institute of Science, Bangalore-560012, India

* Address for reprint requests:

Prof. P. Balaram

Molecular Biophysics Unit, Indian Institute of Science

Bangalore – 560012, India

Fax: 91-80-236060683/ 91-23600535

Phone: 91-80-22932337

E-mail- [email protected]


Electronic Supplementary Material (ESI) for Molecular BioSystems. This journal is © The Royal Society of Chemistry 2013


Supplementary Table S1: Summary of the mass spectrometric methods used for determination of disulfide connectivity

Columns: Mass spectrometric methodology for disulfide crosslink determination; Main features; Disadvantages; DisConnect perspective

1. Morris R. et al. (1985)
Main features: Comparison of MS profiles of the proteolytic digests of native and reduced protein identifies disulfide bonded peptides.
Disadvantages: (1) Manual assignments can be prohibitively tedious for large proteins. (2) Unambiguous assignments cannot be made if multiple cystines are present in a proteolytic fragment.

2. Yazdanparast R. et al. (1986, 1987)
Main features: Mass shift in gas phase Xe-assisted reduction of the proteolytic fragments of native protein identifies disulfide bonded peptides.
Disadvantages: Same as [1].

3. Caporale C. et al. (1996)
Main features: All possible combinations of theoretically generated linear peptide fragments, containing no more than 3 cysteines, are matched against the experimental proteolytic MS profile to infer the structure of the S-S bonded peptides.
Disadvantages: (1) Unambiguous assignments cannot be made if multiple cystines are present in a proteolytic fragment. (2) Cannot identify a proteolytic fragment containing more than 3 cystines.

4. Fenyö D. et al. (1997)
Main features: All possible combinations of theoretical proteolytic fragments are matched against the experimental proteolytic MS profile to infer the structure of the S-S bonded peptides.
Disadvantages: (1) Unambiguous assignments cannot be made if multiple cystines are present in a proteolytic fragment.

5. Craig R. et al. (2003) (Protein Disulfide Linkage Modeller)
Main features: Implementation of a software system based on the Fenyö disulfide assignment algorithm.
Disadvantages: (1) Unambiguous assignments cannot be made if multiple cystines are present in a proteolytic fragment.

6. ProteinProspector (MS-Bridge) (Baker, P.R. and Clauser, K.R., http://prospector.ucsf.edu)
Main features: Theoretically generated proteolytic fragments are combined and matched against the MS profile to obtain probable structures of S-S bonded peptides.
Disadvantages: (1) Unambiguous assignments cannot be made if multiple cystines are present in a proteolytic fragment.

7. MS2DB (Murad et al. (2011))
Main features: MS/MS assignments of disulfide intact peptides, annotating the Biemann type ions.
Disadvantages: (1) Complete analysis of all the modes of fragmentation cannot be achieved. Hence non Biemann-type ions, arising through fragmentation at the disulfide and inside the S-S loops, cannot be annotated.

8. Gupta K. et al. (2010)
Main features: MSn analyses of intact disulfide bond containing native peptides provide S-S connectivity.
Disadvantages: (1) Tedious manual interpretation of the data. (2) Probable experimental difficulties in direct fragmentation of larger proteins.

DisConnect perspective:
• DisConnect generates specific proteolytic fragments. Non-cysteine containing fragments are matched directly against the proteolytic MS profile, whereas for cysteine containing fragments all possible combinations are queried.
• Ambiguities arising from multiple structure hits for an experimental m/z value can be solved by querying its MS2 fragment ions against the probable hits, using DisConnect-Pep.
• For proteolytic fragments with multiple cystines, disulfide connectivity is determined by MSn analysis, using DisConnect.


Supplementary Figures:

Fig. S1: Detailed flowchart description of DisConnect.


Fig. S2: 15 possible disulfide foldamers of Ar1446.

Fig. S3: Mass spectra of (a) native and (b) trypsin-digested Ar1446.


Fig. S4: 12 possible disulfide foldamers of tryptic Ar1446

Fig. S5: DisConnect output for the major fragment ions present in the CID MS2 spectrum of the tryptic Ar1446.

Fig. S6: (a) HPLC profile of the linear peptide. (b) HPLC profile of the oxidized mixture. The HPLC fractions give identical mass spectra, corresponding to the peptide with two disulfide bonds, which establishes the three peaks as three foldamers.

Fig. S7: DisConnect predicted structures of the key fragment ions observed in the MS2 spectra of des bromo Mo1277.

Fig. S8: (a) CID MS3 spectrum of the ion at m/z 975.3 yielding the ion at m/z 876.3, establishing a parent-product ion relationship between these ions. (b) CID MS3 spectrum of the ion at m/z 636.1 from Foldamer 1 of des bromo Mo1277. All the fragment ions can be explained only by structure 636.1(c).

Fig. S9: (a)-(c) Non-cysteine-containing tryptic peptides of α-lactalbumin. (d) MS3 spectrum of the MS2 ion at m/z 668.4. The presence of the ions at m/z 581.4 and 515.4 confirms structure I for the ion at m/z 668.4.

Fig. S10: Tryptic peptides 1-8 of lysozyme. Peptides 1-5 contain no cysteine. Peptides 6 and 7 each contain one pair of cysteines, whereas peptide 8 contains two pairs.

Fig. S11: CID MS2 spectrum of peptide 9, obtained by Asp-N digestion of the tryptic digest. Inset (a) shows the MS profile of the peptide. Expansion of the region around m/z 1382.6 shows the coexistence of a doubly and a singly charged species. The structure of the ion at m/z 1382.69 establishes the C1-C3 connectivity.

Fig. S12: DisConnect predicted possible structure(s) of the major fragment ions present in the MS2 spectrum of peptide 9.

Fig. S13: Scheme for the generation of the proteolytic peptides and the determination of disulfide connectivity in lysozyme.

Lysozyme: Hen egg white lysozyme is a 129-residue protein that contains eight cysteines, which form four disulfide bonds. Four disulfide bonds can be formed between eight cysteine residues in 105 possible ways. The intact native protein was digested with trypsin and the digest was analyzed by LC-MS/MS. The MS profile of the tryptic digest shows 8 peptides, whose structures were obtained using DisConnect (Fig. S10). Peptides [1]-[5] contain no cysteines, whereas peptides [6]-[8] do. Inspection of these structures reveals that peptides [6] and [7] each contain a single pair of cysteines. The two cysteines in each of these peptides must therefore be disulfide bonded to each other, which determines two of the four disulfide bonds. To find the connectivity of the other two disulfide bonds, we turn to peptide [8], which contains the remaining two disulfide bonds. The presence of two Asp residues in this peptide allows further proteolytic digestion by Asp-N, which yields a shorter proteolytic peptide (peptide [9], Fig. S11(a)). Fig. S11 and S12 show the CID fragmentation spectrum of the (M+3H)³⁺ species of [9] and the structures of the corresponding fragment ions. The structure of the singly charged ion at m/z 1382.69 unambiguously brings out the connectivity pattern: it contains one Cys residue in each of its segments (C1 and C3), thus dictating a C1-C3 connectivity. Interestingly, the isotopic distribution of this ion overlaps with that of a doubly charged ion at m/z 1383.60; expansion of the m/z region establishes the existence of two ions. Similarly, the structure of the doubly charged ion at m/z 788.91 establishes the C2-C4 connectivity. The cysteine connectivity in [9] is therefore C1-C3/C2-C4. This connectivity is further supported by the two probable structures of the ion at m/z 1190.61, both of which require a disulfide bond between C1 and C3. This also illustrates that it is not essential to determine the structure of every fragment ion with absolute certainty in order to determine the disulfide connectivity. The overall disulfide connectivity in lysozyme is thus Cys1-Cys8, Cys2-Cys7, Cys3-Cys5 and Cys4-Cys6 (the Cys residues are numbered as Cysi, where i denotes the order in which the cysteine occurs in the overall protein sequence; the analysis is summarized schematically in Fig. S13).
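The count of 105 possible connectivities for eight cysteines follows from the number of ways to pair 2n items, (2n)!/(2^n n!). The sketch below checks this count and enumerates the pairings explicitly; the function names are hypothetical and the code is only an illustration of the combinatorics, not part of DisConnect.

```python
from math import factorial


def count_pairings(n_cys):
    """Number of ways to pair n_cys cysteines into n_cys/2 disulfide bonds."""
    if n_cys % 2:
        raise ValueError("an even number of cysteines is required")
    n = n_cys // 2
    return factorial(2 * n) // (2 ** n * factorial(n))


def enumerate_pairings(cys):
    """Yield every complete disulfide pairing of the labels in `cys`."""
    if not cys:
        yield []
        return
    first, rest = cys[0], cys[1:]
    # Pair the first cysteine with each remaining one, then pair the leftovers.
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in enumerate_pairings(remaining):
            yield [(first, partner)] + tail


if __name__ == "__main__":
    print(count_pairings(8))  # eight cysteines, as in lysozyme
    for pairing in enumerate_pairings(["C1", "C2", "C3", "C4"]):
        print(pairing)
```

For a peptide with four cysteines, such as peptide [9], the enumeration gives three candidate patterns, of which C1-C3/C2-C4 is the one supported by the fragment ions discussed above.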

SI Materials and Methods

A. Experimental Protocol

Protocol of synthesis of foldamers

The linear peptide was synthesized using standard Fmoc chemistry, as described previously. The folding reaction was carried out by dissolving 5 nmol of linear peptide in 200 μL of oxidation buffer containing 100 mM NH4HCO3 (pH 8.0) and 10% dimethyl sulfoxide (DMSO). The progress of the reaction was monitored by mass spectrometry, and the reaction was quenched after 24 h by acidification with formic acid (10% final concentration). The reaction mixture was loaded onto a C18 analytical column; peptides were eluted with a linear gradient of 20-26% acetonitrile and fractions were detected at 226 nm.

B. Description of input/output format of DisConnect

I. Analyses of proteolytic fragments: Input/Output Format

The primary inputs are the entire polypeptide sequence, the experimental m/z values with their charge states, and a choice of protease. For multiple peptide chains, an X is inserted between every two chains. Paste the protein sequence, ending with a * symbol, in the file "prot_seq". Put the m/z values and charge states of the queried ions, in the format 'm/z value'space'charge state' (e.g. 1020.2 3, where the m/z value is 1020.2 and the charge state is 3), in the file 'peak_mass_ms'. Depending on the resolution of the experimental setup, the mass error range and mass type (monoisotopic/average) can be chosen. The user can also choose the number of miscleavages. The outputs are the theoretical peptide structures whose masses lie within the user-specified error range of the queried values, together with their corresponding (M+H)+ values and the number of Cys residues present. In the output for probable structures, discontinuous peptide chains (hereafter referred to as segments) are separated by a ‘|’. For chemical feasibility, such segments must be held together by S-S bonds. DisConnect also provides an option to study the structures of ions arising through probable neutral losses. This option is useful when a key product ion is not observed, but an ion derived from it by subsequent neutral loss is present.
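The input conventions described above can be illustrated with a short parsing sketch. It is not the DisConnect parser: the function names are hypothetical, the ions are assumed to be protonated species (M+zH)z+, and the proton mass constant is an assumption introduced here to convert a queried m/z value to the corresponding (M+H)+ value.

```python
PROTON = 1.007276  # assumed proton mass in Da


def parse_sequence(text):
    """Parse 'prot_seq'-style content: sequence ends with '*', chains are separated by 'X'."""
    seq = "".join(text.split())  # drop whitespace and line breaks
    if not seq.endswith("*"):
        raise ValueError("sequence must end with a '*' symbol")
    return seq[:-1].split("X")


def parse_peaks(text):
    """Parse 'peak_mass_ms'-style content: one 'm/z charge' pair per line."""
    peaks = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        mz, charge = line.split()
        peaks.append((float(mz), int(charge)))
    return peaks


def to_mh(mz, charge):
    """Convert the m/z of an (M+zH)z+ ion to its (M+H)+ value."""
    return mz * charge - (charge - 1) * PROTON


def within_tolerance(calculated, observed, tol):
    """True if the calculated value lies within +/- tol of the observed one."""
    return abs(calculated - observed) <= tol


if __name__ == "__main__":
    print(parse_sequence("GVCSFXRLTCY*"))
    for mz, z in parse_peaks("1020.2 3\n636.1 1\n"):
        print(mz, z, round(to_mh(mz, z), 4))
```

Comparing every candidate on the (M+H)+ scale, as in this sketch, keeps a single tolerance value meaningful across charge states; whether DisConnect applies the error range to m/z or to mass is not stated in the text.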

II. Analysis of the MSn fragment ions: Input/Output Format

The user enters the sequence of the polypeptide using the standard one-letter codes. For peptides containing multiple chains, like insulin, a letter X should be entered between the two chains (e.g. if a peptide contains two polypeptide chains with sequences GVCSF and RLTCY, then the input is GVCSFXRLTCY). For convenience, if these peptides result from proteolytic digestion, an input file named "inp_MSn_match_MS_protein name.out" is created in the Result_MS folder; it contains the peptide sequence in the format required for the MS2 analysis. The user can copy the respective sequence from there and paste it. For MSn analysis, the format of the complete sequence is as above. For the daughter ion (the ion undergoing MSn fragmentation), the input sequence is given inside the Resut_MSn folder, in the file termed input_for_MSn_rigorous/smart_Entered complete/fragment sequence. Copy the corresponding structure of the ion undergoing MSn fragmentation from this file. The m/z values of the MSn fragment ions, with their charge states, are also user inputs. Put the m/z values of the fragment ions, in the format 'm/z value'space'charge state' (e.g. 1020.2 3, where the m/z value is 1020.2 and the charge state is 3), in the file 'peak_mass_ms2' (for MS2) or 'peak_mass_msn' (for MSn). Depending on the resolution of the experimental setup, the user can tune the mass error range and mass type (monoisotopic/average). An array of other input choices is available to the user. The output of the program contains the probable structure(s) of each MSn fragment ion. In the output for probable structures, discontinuous peptide chains (hereafter referred to as segments) are separated by a ‘|’. For chemical feasibility, such segments must be held together by S-S bonds. For Cys-containing outputs that have iterative Cys residues (number of Cys > 2[n-1], n being the number of segments), the possible residue mass arrangements of the Cys are also shown. Note that the output shows the calculated m/z values within the user-specified mass error range. In the present study, performed on an ion trap mass spectrometer, we used a lenient m/z cutoff of 0.2 Da while querying both the proteolytic and MSn fragments. All the program outputs shown were derived using the ‘smart’ mode of DisConnect.
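The segment notation and the Cys-count criterion described above can be made concrete with a small sketch. The function name `cys_summary` and the example structures are hypothetical. The `feasible` flag encodes an assumption drawn from the statement that segments must be held together by S-S bonds: n segments need at least n-1 disulfides, hence at least 2(n-1) cysteines, with at least one Cys in every segment when n > 1.

```python
def cys_summary(structure):
    """Summarize a '|'-separated fragment structure.

    Returns the number of segments, the number of Cys residues, whether the
    segments could be held together by S-S bonds (assumed criterion), and
    whether the stated condition 'number of Cys > 2(n-1)' holds.
    """
    segments = structure.split("|")
    n = len(segments)
    n_cys = sum(seg.count("C") for seg in segments)
    min_cys = 2 * (n - 1)  # n-1 disulfides, two cysteines each
    every_segment_has_cys = all("C" in seg for seg in segments)
    feasible = n == 1 or (n_cys >= min_cys and every_segment_has_cys)
    return {
        "segments": n,
        "cys": n_cys,
        "feasible": feasible,
        "extra_cys": n_cys > min_cys,
    }


if __name__ == "__main__":
    # Hypothetical structures written in the output notation described above.
    for s in ["GVC|TCY", "GVCSC|TCY", "GVC|TAY"]:
        print(s, cys_summary(s))
```

In this sketch, a structure flagged with `extra_cys` is one for which more than one arrangement of the cysteines is conceivable, which corresponds to the case where the program is described as listing the possible arrangements.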

References:

(1) H. R. Morris and P. Pucci, Biochem. Biophys. Res. Commun., 1985, 126, 1122-1128.

(2) (a) R. Yazdanparast, P. Andrews, D. L. Smith and J. E. Dixon, Anal. Biochem., 1986, 153, 348-353; (b) R. Yazdanparast, P. Andrews, D. L. Smith and J. E. Dixon, J. Biol. Chem., 1987, 262, 2507-2513.

(3) C. Caporale, C. Sepe, Caruso, P. Pucci and V. Buonocore, FEBS Lett., 1996, 393, 241-247.

(4) D. Fenyo, Bioinformatics, 1997, 13, 617-618.

(5) R. Craig, O. Krokhin, J. Wilkins and R. C. Beavis, J. Proteome Res., 2003, 2, 657-661.

(6) P. R. Baker and K. R. Clauser, http://prospector.ucsf.edu

(7) W. Murad, R. Singh and T.-Y. Yen, BMC Bioinf., 2011, 12(Suppl 1), S12.

(8) K. Gupta, M. Kumar and P. Balaram, Anal. Chem., 2010, 82, 8313-8319.
