Protein Ladder Sequencing

Brian T. Chait, Rong Wang, Ronald C. Beavis, Stephen B. H. Kent*

B. T. Chait and R. Wang, The Rockefeller University, New York, NY 10021.
R. C. Beavis, Department of Physics, Memorial University of Newfoundland, St. John's, Newfoundland, Canada A1B 3X7.
S. B. H. Kent, The Scripps Research Institute, La Jolla, CA 92037.
*To whom correspondence should be addressed.

A new approach to protein sequencing is described. It consists of two steps: (i) ladder-generating chemistry, the controlled generation from a polypeptide chain by wet chemistry of a family of sequence-defining peptide fragments, each differing from the next by one amino acid; and (ii) data readout, a one-step readout of the resulting protein sequencing ladder by matrix-assisted laser-desorption mass spectrometry. Each amino acid was identified from the mass difference between successive peaks, and the position in the data set defined the sequence of the original peptide chain. This method was used to directly locate a phosphoserine residue in a phosphopeptide. The protein ladder sequencing method lends itself to very high sample throughput at very low per cycle cost.

Direct experimental determination of the amino acid sequence of a polypeptide chain usually gives partial sequence data only. Partial amino acid sequence data may be used to identify isolated proteins (1), and are useful in cloning genes (2). The complete amino acid sequence of a protein is most often determined by nucleic acid sequencing at the cDNA level. However, posttranslational modifications (3) must be characterized at the polypeptide level.

Most direct sequence determination of peptides and proteins is done by automated Edman degradation (4), in which a two-part chemical reaction is used to remove one amino acid at a time from the amino terminal. After release, each amino acid derivative is converted to a stable form and is then identified by analytical reverse-phase high-performance liquid chromatography. Currently such sequencing is limited to less than ~50 residues per day (5). Also, most posttranslational modifications are not identified. Thus, there is a great need for more rapid and versatile protein sequencing methods (6).

The recent advent of matrix-assisted laser-desorption mass spectrometry (LDMS) (7) and the development of improved matrix materials (8) have facilitated the accurate measurement of the mass of intact polypeptide chains. Subpicomole amounts of total sample can be analyzed in seconds with a mass accuracy of up to 1 part in 10,000 (9). Thus the polypeptide itself can be analyzed more readily, with greater speed, sensitivity, and precision, than the amino acid derivative released by stepwise sequencing (10).

We describe a new principle in protein sequencing that combines multiple steps of wet degradation chemistry with a final, single-step mass spectrometric (MS) readout of the amino acid sequence. First, a sequence-defining concatenated set of peptide fragments, each differing from the next by a single residue, is chemically generated in a controlled fashion. Second, matrix-assisted LDMS is used to read out the complete fragment set in a single operation, as a "protein sequencing ladder" data set.

A concatenated set of peptide fragments can be generated in a controlled fashion (11) by carrying out rapid stepwise degradation in the presence of a small amount of terminating agent, a procedure we call "ladder-generating chemistry" (Fig. 1).
A small proportion of peptide chain blocked at the amino terminus is generated at each cycle. A predetermined number of cycles is performed without intermediate separation or analysis of the released amino acid derivatives. The resulting mixture is read out in a single operation by matrix-assisted LDMS (12). The mass spectrum contains molecule ions corresponding to each terminated polypeptide species present. The mass differences between consecutive peaks each correspond to an amino acid residue (13), and their order of occurrence in the data set defines the sequence of amino acids in the original peptide chain (14).

SCIENCE · VOL. 262 · 1 OCTOBER 1993

We sequenced the 14-residue peptide [Glu1]fibrinopeptide B (15) to illustrate the new method. Eight cycles of manual ladder-generating chemistry were carried out (16), and the resulting product mixture of terminated peptides read out (17) by matrix-assisted LDMS (Fig. 2). All the major components present in the mass spectrum were readily identified, and the data could be simply interpreted to give the sequence of the eight amino-terminal residues of the peptide. The two consecutive peaks with the highest mass differ by 129.1 daltons, identifying the amino-terminal amino acid as a Glu residue (calculated residue mass 129.1). The identities of the next seven residues were read off in a similar fashion (18).

Several features of the protein ladder sequencing experiment are immediately apparent. The mass accuracy obtained (9) was sufficient to unambiguously distinguish Asp [calculated residue mass 115.1] (13) and Asn (calculated residue mass 114.1); Glu [calculated residue mass 129.1] was also identified with sufficient accuracy to distinguish it from Gln [calculated residue mass 128.1].
The arbitrary ratio of degradation-to-terminating reagents and the minimal reaction conditions employed have yielded a simple, useful sequencing ladder. No effort was made to optimize coupling or cleavage yields in the chemical degradation because the accuracy of protein ladder sequencing is unaffected by the relative abundance, over a wide range, of individual terminated fragments. Obtaining high reaction yields is not critical, and the degradation protocols can be simple and fast. In contrast, extreme (prolonged and forcing) reaction conditions are used in the standard stepwise Edman degradation (19).

Fig. 1. Protein ladder sequencing principle exemplified by the generation of a set of sequence-determining fragments from an intact peptide chain with controlled ladder-generating chemistry. A stepwise degradation (32) is carried out with a small amount of terminating agent present in the coupling step. In this case, 5% phenylisocyanate (PIC) was added to the phenylisothiocyanate (PITC). The phenylcarbamyl (PC) peptides formed are stable to the trifluoroacetic acid (TFA) used to cyclize and cleave the terminal amino acid (AA) from the phenylthiocarbamyl (PTC) peptide. Successive cycles of ladder-generating chemistry are performed without intermediate isolation or analysis of released amino acid derivatives. Finally, the mixture of PC peptides is read out in one step by matrix-assisted LDMS.
[Scheme: AA1-AA2-AA3-AA4-AA5-...-AAn is coupled with PITC + 5% PIC to give PTC-AA1-AA2-...-AAn and PC-AA1-AA2-...-AAn. Acid (TFA) then yields ATZ(AA1) + AA2-AA3-...-AAn, with PC-AA1-AA2-...-AAn unchanged. Further cycles follow without separation. After m cycles the mixture contains PC-AA1-...-AAn, PC-AA2-...-AAn, PC-AA3-...-AAn, and so on: the ladder sequence data.]
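The readout principle described above, together with the assignment procedure outlined in note 18, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used in this work: the function names, the 0.3-dalton tolerance, and the top-of-ladder mass of 1689.9 are hypothetical; the residue masses for Ala, Arg, His, and Leu/Ile are standard average values that are consistent with Table 1 but are not quoted in the surviving text; and Lys is left out because note 13 states that its side chain is modified by the chemistry.

```python
# Average residue masses in daltons. Gly, Ser, Pro, Val, Thr, Asn, Asp, Gln,
# Glu, Met, Phe, Tyr, and Trp are quoted in the article; Ala, Leu/Ile, His,
# and Arg are standard values added for illustration (assumption).
RESIDUE_MASS = {
    "G": 57.1, "A": 71.1, "S": 87.1, "P": 97.1, "V": 99.1, "T": 101.1,
    "L/I": 113.2, "N": 114.1, "D": 115.1, "Q": 128.1, "E": 129.1,
    "M": 131.2, "H": 137.1, "F": 147.2, "R": 156.2, "Y": 163.2, "W": 186.2,
}


def make_ladder(top_mass, residues):
    """Build a synthetic ladder: each peak is one residue lighter than the last."""
    peaks = [top_mass]
    for residue in residues:
        peaks.append(peaks[-1] - RESIDUE_MASS[residue])
    return peaks


def read_ladder(peaks, tol=0.3):
    """Assign a residue to each difference between adjacent peaks.

    Peaks are read from high to low mass, so the calls run from the amino
    terminus. A difference with no table entry within `tol` is called "?".
    """
    ordered = sorted(peaks, reverse=True)
    calls = []
    for high, low in zip(ordered, ordered[1:]):
        diff = high - low
        name, mass = min(RESIDUE_MASS.items(), key=lambda kv: abs(kv[1] - diff))
        calls.append(name if abs(mass - diff) <= tol else "?")
    return calls


# Eight cycles on a peptide starting Glu-Gly-Val-Asn-Asp-Asn-Glu-Glu give
# nine peaks and eight differences. The top mass is illustrative only.
peaks = make_ladder(1689.9, "EGVNDNEE")
print("".join(read_ladder(peaks)))  # EGVNDNEE
```

Because only differences between adjacent peaks are used, the absolute top mass does not affect the calls; the tolerance would in practice be set from the mass accuracy of the instrument.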

Fig. 2. Protein ladder sequencing of [Glu1]fibrinopeptide B (15). The peptide, of sequence Glu1-Gly-Val-Asn-Asp5-Asn-Glu-Glu-Gly-Phe10-Phe-Ser-Ala-Arg14, was subjected to eight cycles of ladder-generating chemistry (Fig. 1) (16). The matrix-assisted LDMS readout (17) of the resulting sequence-defining set of fragments is shown in two forms: A standard intensity versus mass (33) plot; the data are plotted from high to low mass, so that the amino acid sequence reads from the amino terminal. The upper horizontal lines show the different-length blocked peptide species present and their relation to the MS data.

A second example illustrates the ladder sequence analysis of both phosphorylated and unphosphorylated forms of a 16-residue peptide containing a Ser residue (20). After 10 cycles of ladder-generating chemistry on each form of the peptide (21), the two separate sequence-defining fragment mixtures were each read out in a single matrix-assisted LDMS experiment (Fig. 3). The protein ladder sequencing method directly identified and located a Ser(Pi) at position five in the peptide (22). There was no detectable loss of phosphate from the phosphoserine residue, which has been regarded as the most sensitive and unstable of the phosphorylated amino acids (23).
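The arithmetic behind the Ser(Pi) assignment (note 22 and the caption of Fig. 3) can be written out as a short check. The sketch below is illustrative only: the function name and the 0.5-dalton tolerance are assumptions, and the masses are the values quoted in the article.

```python
SER = 87.1                 # Ser residue mass in daltons (note 22)
PHOSPHATE = 80.0           # increment for -PO3H2 in place of a proton (note 22)
SER_P = SER + PHOSPHATE    # Ser(Pi) residue mass, 167.1 daltons


def call_position(diff, tol=0.5):
    """Classify a measured mass difference as Ser, Ser(Pi), or unknown.

    `tol` is an assumed tolerance in daltons, not a value from the article.
    """
    if abs(diff - SER_P) <= tol:
        return "Ser(Pi)"
    if abs(diff - SER) <= tol:
        return "Ser"
    return "unknown"


observed = 166.7  # mass difference reported at position five (Fig. 3 caption)
print(call_position(observed))     # Ser(Pi)
print(round(observed - SER, 1))    # 79.6, close to the 80.0 increment
```

The same comparison applied to a difference near 87.1 daltons returns "Ser", which is how an unmodified residue at the same position would be read.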

The inability to directly identify, locate, and quantify phosphorylated residues is a major shortcoming of standard sequencing methods and has imposed major limitations on currently important areas of biological research, such as mechanisms of signal transduction. Protein ladder sequencing has general application to the direct identification of posttranslational modifications present in a peptide chain being sequenced. A modified amino acid residue that is stable (23) to the conditions used in the ladder-generating chemistry reveals itself as an additional mass difference at the site of the covalent modification. Frequently, this will lead to unambiguous identification of the chemical nature of the posttranslational modification (3). The utility of protein ladder sequencing in this regard would apply even to large modifying entities, such as carbohydrate moieties in glycopeptides.

To explore the capabilities and limitations of the ladder sequencing readout by matrix-assisted LDMS, measurements were carried out on sets of sequence-defining unblocked synthetic peptides. This set of peptides was obtained during the course of a total chemical synthesis of the 99-amino acid monomer polypeptide chain of the human immunodeficiency virus-1 (HIV-1) protease (24). The target sequence was assembled by solid-phase synthesis in stepwise fashion from the resin-bound carboxyl-terminal residue Phe99. Samples of peptide resin were taken after addition of each amino acid, from residue 98 to residue 33. The different-length peptide resins were pooled in two batches of more than 30 consecutive samples, and the two mixtures were separately deprotected and cleaved (25). The resulting sets of sequence-defining fragments with masses up to 7400 dal-

Fig. 3. (left) Protein ladder sequencing of the 16-residue synthetic peptide: Leu-Arg-Arg-Ala-Ser(Pi)-Gly-Leu-Ile-Tyr-Asn-Asn-Pro-Leu-Met-Ala-Arg.amide. (A) Phosphorylated peptide. (B) Unphosphorylated peptide. Each peptide sample was subjected to 10 cycles of ladder-generating chemistry. Data defining the 11 amino-terminal residues (21) are shown. The Ser(Pi) residue was characterized by a mass difference of 166.7 daltons (Ser, calculated residue mass 87.1; Ser(Pi) calculated residue mass 167.1) observed in position five. There is no evidence for loss of phosphate (35).

Fig. 4. (right) Extended MS readout of sequence-defining sets of polypeptide fragments. Consecutive samples, after each amino acid addition, were taken during stepwise solid-phase assembly of the 99-residue monomer sequence of HIV-1 protease (24). After release from the solid support and deprotection, pooled peptide samples corresponding to residues 67 to 99 and 33 to 66 were analyzed by matrix-assisted LDMS (17). Observed mass differences for each amino acid residue are given in Table 1.

Table 1. Measured mass differences between adjacent peaks of the protein sequencing ladders shown in Fig. 4. The deviation from the calculated value is given in parentheses; Aba, α-amino-n-butyric acid.

Amino acid   Δ Mass (daltons)     Amino acid   Δ Mass (daltons)
Leu33        113.3 (0.1)          Asp60        114.8 (-0.3)
Glu34        129.7 (0.6)          Gln61        128.7 (0.6)
Glu35        129.5 (0.4)          Ile62        113.2 (0.0)
Met36        130.8 (-0.4)         Pro63         97.0 (-0.1)
Asn37        115.0 (0.9)          Val64         99.4 (0.3)
Leu38        112.4 (-0.8)         Glu65        128.6 (-0.5)
Pro39         97.9 (0.8)          Ile66        113.3 (0.1)
Gly40         56.1 (-0.9)         Aba67         84.9 (-0.2)
Lys41        128.1 (0.0)          Gly68         57.0 (0.0)
Trp42        186.4 (0.2)          His69        137.3 (0.2)
Lys43        128.2 (0.0)          Lys70        127.8 (-0.4)
Pro44         97.1 (0.0)          Ala71         71.4 (0.3)
Lys45        128.0 (-0.2)         Ile72        113.4 (0.2)
Met46        131.9 (0.7)          Gly73         56.8 (-0.2)
Ile47        112.6 (-0.6)         Thr74        101.1 (0.0)
Gly48         57.9 (0.9)          Val75         99.2 (0.1)
Gly49         56.3 (-0.7)         Leu76        113.1 (-0.1)
Ile50        112.4 (-0.8)         Val77         99.1 (0.0)
Gly51         57.6 (0.6)          Gly78         57.1 (0.1)
Gly52         57.5 (0.5)          Pro79         97.2 (0.1)
Phe53        147.3 (0.1)          Thr80        101.1 (0.0)
Ile54        112.5 (-0.7)         Pro81         97.1 (0.0)
Lys55        128.9 (0.8)          Val82         99.2 (0.1)
Val56         99.0 (-0.1)         Asn83        113.8 (-0.3)
Arg57        156.2 (0.0)          Ile84        113.4 (0.2)
Gln58        128.4 (0.3)          Ile85        113.1 (0.0)
Tyr59        162.6 (-0.6)         Gly86         57.1 (0.0)
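Note 27 describes how an ambiguous mass difference can be resolved from its unambiguous neighbor, using the Pro39/Gly40 pair of Table 1 as the worked case. The sketch below restates that arithmetic in Python. The function names are hypothetical, and the comment that adjacent differences share a peak is our reading of the note rather than a statement taken from it.

```python
PRO, VAL, GLY = 97.1, 99.1, 57.1   # calculated residue masses (note 27)


def corrected_neighbor(ambiguous, anchor_measured, anchor_calculated):
    """Shift an ambiguous difference by the error seen on its neighbor.

    Adjacent differences presumably share one peak, so an error that
    lowers one difference raises the other by about the same amount.
    """
    shift = anchor_calculated - anchor_measured   # +1.0 for Gly40
    return ambiguous - shift


def closest(diff, candidates):
    """Return the candidate residue whose mass is nearest to `diff`."""
    return min(candidates, key=lambda name: abs(candidates[name] - diff))


# Residue 39 measured 97.9; residue 40 measured 56.1 and must be Gly (57.1).
corrected = corrected_neighbor(97.9, 56.1, GLY)
print(round(corrected, 1))                            # 96.9
print(closest(corrected, {"Pro": PRO, "Val": VAL}))   # Pro
```

The corrected value of 96.9 lies 0.2 dalton from Pro and 2.2 daltons from Val, which matches the assignment given in note 27.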

Fig. 5. High-sensitivity protein ladder sequencing readout demonstrated by serial dilution (1 to 1000) of the sample used in Fig. 2. No more than ~25 fmol total peptide was present in the mass spectrometer, that is,

128.2; Met (M), 131.2; Phe (F), 147.2; Pro (P), 97.1; Ser (S), 87.1; Thr (T), 101.1; Trp (W), 186.2; Tyr (Y), 163.2; and Val (V), 99.1. The isomeric residues Leu and Ile have identical mass and cannot be directly distinguished. Lys and Gln are readily distinguished by the modified Lys side-chain ε-amino group formed in the chemistry steps.

14. The most efficient and accurate techniques for biopolymer sequencing are those that involve the readout of a sequence-defining data set in a single experimental operation. The data set can be examined as a whole, and anomalies can be detected and resolved. For example, as originally practiced, DNA sequencing by either (dideoxy) chain-termination [F. Sanger, S. Nicklen, A. R. Coulson, Proc. Natl. Acad. Sci. U.S.A. 74, 5463 (1977)] or chain-fragmentation [A. M. Maxam and W. Gilbert, ibid., p. 560] involved one-step readout followed by simultaneous inspection of the complete sequence-defining data set.

15. [Glu1]Fibrinopeptide B was purchased from Sigma (St. Louis, MO). The reported sequence was: Glu1-Gly-Val-Asn-Asp5-Asn-Glu-Glu-Gly-Phe10-Phe-Ser-Ala-Arg14. Matrix-assisted LDMS gave a mass of 1570.6 daltons (calculated, 1570.8 daltons) and showed high purity of the starting peptide.

16. A mixture of phenylisothiocyanate (PITC) plus 5% v/v phenylisocyanate (PIC) was used in the coupling step. Phenylisocyanate reacts with the α-NH2 group of a polypeptide chain to yield an Nα-phenylcarbamyl-peptide, which is stable to the conditions of degradation. A variation of manual Edman chemistry was used [G. E. Tarr, Methods Enzymol. 47, 335 (1977)]. All reactions were carried out in the same 0.5-ml polypropylene microfuge tube under a blanket of dry nitrogen. Peptide (200 pmol to 10 nmol) was dissolved in 20 µl of pyridine/water (1:1 v/v; pH 10.1); 20 µl of coupling reagent containing PITC/PIC/pyridine/hexafluoroisopropanol [HFIP] (20:1:76:4 v/v) was added to the reaction vial. After reaction at 50°C for 3 min, the coupling reagents and nonpeptide coproducts were extracted by adding 300 µl of heptane:ethyl acetate (10:1 v/v) and gentle vortexing. The phases were separated by centrifugation, and the upper phase was aspirated and discarded. This washing procedure was repeated once, followed by washing twice with heptane/ethyl acetate (2:1 v/v). The remaining solution containing the peptide products was dried on a vacuum centrifuge. The cleavage step was carried out by addition of 20 µl of anhydrous trifluoroacetic acid (TFA) to the dry residue in the reaction vial and reaction at 50°C for 2 min, followed by drying on a vacuum centrifuge. Coupling-wash-cleavage steps were repeated for a predetermined number of cycles. The low molecular weight derivatives released at each cycle were not separated and analyzed. Finally, the total product mixture was subjected to an additional treatment with PIC to convert any remaining unblocked peptides to their phenylcarbamyl derivatives. The sample was dissolved in 20 µl of trimethylamine/water (25% wt/wt) in pyridine (1:1 v/v); 20 µl of PIC/pyridine/HFIP (1:76:4 v/v) was added to the reaction vial. The coupling reaction was carried out at 50°C for 5 min. The reagents were extracted as described above.

17. The product mixture was dissolved in 0.1% aqueous TFA/acetonitrile (2:1, v/v). A 1-µl aliquot (~250 pmol total peptide, assuming no losses) was mixed with 9 µl of α-cyano-4-hydroxycinnamic acid (5 g/liter in 0.1% TFA/acetonitrile, 2:1 v/v), and 1.0 µl of this mixture of total peptide products (25 pmol) and matrix was applied to the probe tip and dried in a stream of air at room temperature. Mass spectra were acquired in positive ion mode with a time-of-flight LDMS instrument constructed at The Rockefeller University [R. C. Beavis and B. T. Chait, Rapid Commun. Mass Spectrom. 3, 233 (1989); Anal. Chem. 62, 1836 (1990)]. The spectra resulting from 200 15-mJ pulses at a wavelength of 355 nm were acquired over 80 s and added to give a mass spectrum of the protein sequencing ladder. Masses were calculated with matrix peaks of known mass as calibrants.

18. Residue assignment was made by computer in an interactive fashion. First, the intact molecule ion was selected by the user. The program then searched the data for a lower mass peak corresponding to the removal of a single residue. The mass differences between the adjacent peaks were calculated and compared with a look-up table of known residue masses. If the mass difference was within set tolerances, the residue was assigned from the table; otherwise the user was asked to label the peak as an unknown, or reject it.

19. M. W. Hunkapiller, K. Granlund-Moyer, N. W. Whiteley, in Methods of Protein Microcharacterization. A Practical Handbook, J. E. Shively, Ed. (Humana, Clifton, NJ, 1986), pp. 223-247; J. E. Shively, R. J. Paxton, T. D. Lee, Trends Biochem. Sci. 14, 246 (1989).

20. The peptide was prepared by highly optimized peptide synthesis [M. Schnolzer, P. Alewood, A. Jones, D. Alewood, S. B. H. Kent, Int. J. Peptide Protein Res. 40, 180 (1992)]. The sequence was: LRRASGLIYNNPLMAR.amide. Matrix-assisted LDMS gave a mass of 1844.3 daltons (calculated, 1844.2 daltons) and showed high purity of the starting peptide. The phosphorylated form was prepared by enzymatic reaction with 3',5'-cyclic AMP-dependent protein kinase. The phosphopeptide had a mass of 1924.2 daltons (calculated, 1924.2 daltons) and showed high purity.

21. Although only 10 cycles of ladder-generating chemistry were performed, sequence-defining fragments corresponding to 11 residues were observed, apparently because of a small amount of premature cleavage [W. A. Schroeder, Methods Enzymol. 25, 298 (1972); G. E. Tarr, ibid. 47, 335 (1977)]. This side reaction, a potential problem for standard Edman methods, has no deleterious effect on the ladder sequencing approach.

22. Serine has a residue mass of 87.1 daltons; addition of -PO3H2 in place of a proton results in an additional mass increment of 80.0 daltons, for a Ser(Pi) residue mass of 167.1 daltons.

23. The ladder-generating chemistry used here has no conversion step, and is therefore considerably milder than the Edman (degradation + conversion) chemistry. The serine phosphate within the peptide chain is stable to Edman chemistry [D. B. Rylatt and P. Cohen, FEBS Lett. 98, 71 (1979)]. However, the conversion step typically involves 1 M HCl in methanol or 25% trifluoroacetic acid in water for 10 min at 55° to 65°C, conditions that cause extensive decomposition of Ser(Pi) [C. G. Proud, D. B. Rylatt, S. J. Yeaman, P. Cohen, ibid. 80, 435 (1977)] and other acid-sensitive residues [R. E. H. Wettenhall, R. Aebersold, L. E. Hood, Methods Enzymol. 201, 186 (1991)]. Furthermore, standard extraction techniques used for existing Edman methods do not recover the polar phosphorylated amino acid derivatives for analysis, even where they are stable to the chemistry used [R. Aebersold, J. D. Watts, H. D. Morrison, J. E. Bures, Anal. Biochem. 199, 51 (1991)].

24. R. C. deL. Milton, S. C. F. Milton, S. B. H. Kent, Science 256, 1445 (1992).

25. Stepwise solid-phase peptide synthesis was carried out as described on a modified Applied Biosystems 430A instrument [M. Schnolzer, P. Alewood, A. Jones, D. Alewood, S. B. H. Kent, Int. J. Peptide Protein Res. 40, 180 (1992)]. Samples (~1 µmol each) of butyloxycarbonyl (Boc)-peptide-resins were taken under instrument control after each coupling step. These samples were pooled in batches corresponding to residues 67 to 99, and 33 to 66. The pooled Boc-peptide-resins were deprotected and cleaved as described, extracted into aqueous acetic acid, and lyophilized. Aliquots were used for readout experiments.

26. Measurements were also obtained for peptide fragments between 7500 and 11,000 daltons (corresponding to residues 32 to 1), but the quality of the data was not sufficiently high for extended stretches of unambiguous sequence determination.

27. Although higher mass accuracy is desirable and may be achieved in the future, in many cases it is already possible to resolve ambiguities in the 3500- to 7500-dalton range by looking at the mass differences on either side of the residue in question. For example, residue 40 has a measured mass of 56.1 daltons and residue 39 has a measured mass of 97.9 daltons. This latter value has an uncertainty of roughly ±1 dalton, and thus could correspond to either Pro (97.1 daltons) or Val (99.1 daltons). There is no such ambiguity in the identity of residue 40, which must correspond to Gly (calculated mass 57.1 daltons). Correcting the measured mass difference for Gly40 by +1.0 leads to a corrected mass difference for residue 39 of 97.9 - 1.0 = 96.9. This corresponds closely to the mass of a Pro residue. Similar corrections can be applied to the pairs Val64/Glu65, Gln58/Tyr59, Ile54/Lys55, and Asn37/Leu38. In this way, the accessible length of peptide can be extended to more than 60 residues.

28. Some limitations of the protein ladder sequencing method are shared with existing chemical sequencing methods. For example, amino-terminal-blocked samples will not be amenable to the ladder-generating chemistries.

29. P. Tempst and L. Riviere, Anal. Biochem. 183, 290 (1989); S. C. Wong, C. Grimley, A. Padua, J. H. Bourell, W. J. Henzel, in Techniques in Protein Chemistry IV, R. H. Angeletti, Ed. (Academic Press, New York, 1993), pp. 371-378.

30. The utility of a variety of other chemical reaction systems to generate sets of sequence-defining peptide fragments has been investigated. In particular, low-level side reactions or chronic incomplete reaction in the stepwise degradation chemistry can be used to generate useful ladder data sets. Further, sequencing of samples immobilized on a solid support reduces handling losses and increases recoveries, and is particularly useful for small amounts of total sample. In this way, data have been obtained from total sample amounts as little as 10 pmol. The sequence-defining fragment sets were generated by accelerated automated Edman degradation, carried out on an ABI 471A, of a peptide sample immobilized on an ion exchange membrane. Multiple samples have been processed simultaneously. After up to 10 cycles of degradation, the peptide products were extracted from the membrane, and an aliquot (5%) used for MS readout (R. Wang, in preparation).

31. A potential application of the protein ladder principle includes carboxyl-terminal sequence analysis, where existing stepwise chemistries [A. S. Inglis, Anal. Biochem. 195, 183 (1991)] have proven inadequate because of low reaction yields that lead to confusing overlaps within the first few cycles of degradation.

32. P. Edman, Arch. Biochem. 22, 475 (1949); Acta Chem. Scand. 4, 283 (1950).

33. Strictly, mass-to-charge ratio. However, under the conditions used, all significant components were singly charged species. The additional intense peak * at 1552.6 daltons corresponds to the blocked peptide [(pyrrolidonecarboxylic acid)1]fibrinopeptide B, an artifact formed from cyclization of the amino-terminal Glu. Such side reactions are readily identifiable and do not interfere with the sequence determination.

34. The digitized time-intensity data were converted to this graphical form with an 8-bit gray scale.

35. Loss of phosphate by hydrolysis (-80 daltons) or by elimination to form dehydro-alanine (-98 daltons) was not detected. A low-level side reaction, unrelated to the modified amino acid, was observed in these experiments; this gave rise to a series of peaks at -93 daltons relative to the main series. The origin of this artifact is under investigation.

36. Supported by grants from the National Institutes of Health (NIH RR00862 and GM38274), and by funds from the Markey Foundation. We thank R. deL. Milton and S. Milton for the HIV-1 PR-derived peptide-resin samples, and S. Walker for synthesis of the unphosphorylated peptide used in Fig. 3B.

11 May 1993; accepted 27 July 1993
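The sample amounts quoted in note 17 and in the caption of Fig. 5 follow from simple dilution arithmetic, sketched below. The function name is hypothetical, and the calculation assumes (as note 17 does) no handling losses and that the diluted sample was applied in the same way as the undiluted one.

```python
from fractions import Fraction


def applied_amount_pmol(aliquot_pmol, aliquot_ul, matrix_ul, applied_ul):
    """Amount of peptide in the volume applied to the probe tip, in pmol."""
    total_volume_ul = aliquot_ul + matrix_ul
    return Fraction(aliquot_pmol) * Fraction(applied_ul) / total_volume_ul


# Note 17: a 1-µl aliquot (~250 pmol) is mixed with 9 µl of matrix solution,
# and 1.0 µl of the mixture is applied to the probe tip.
applied = applied_amount_pmol(250, 1, 9, 1)
print(applied)          # 25 (pmol)

# Fig. 5: a 1-to-1000 dilution of that sample, expressed in fmol.
FMOL_PER_PMOL = 1000
DILUTION = 1000
diluted_fmol = applied * FMOL_PER_PMOL / DILUTION
print(diluted_fmol)     # 25 (fmol)
```

Exact fractions are used here only to avoid rounding noise in the illustration; the quoted amounts are themselves approximate.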
