Contents lists available at ScienceDirect
Data in Brief
Data in Brief 6 (2016) 68–76
journal homepage: www.elsevier.com/locate/dib
Data article
Glycosylation characterization of therapeutic mAbs by top- and middle-down mass spectrometry
Bao Quoc Tran a, Christopher Barton b, Jinhua Feng b, Aimee Sandjong b, Sung Hwan Yoon a, Shivangi Awasthi a, Tao Liang a, Mohd M. Khan a, David P.A. Kilgour a, David R. Goodlett a,*, Young Ah Goo a,*
a Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Baltimore, MD, USA
b MedImmune, LLC, Gaithersburg, MD, USA
Article info
Article history:
Received 20 October 2015
Received in revised form 4 November 2015
Accepted 10 November 2015
Available online 24 November 2015
Abstract
A reference monoclonal antibody IgG1 and a fusion IgG protein were analyzed by top- and middle-down mass spectrometry with multiple fragmentation techniques, including electron transfer dissociation (ETD) and matrix-assisted laser desorption ionization in-source decay (MALDI-ISD), to investigate the heterogeneity of glycosylated protein species. Specifically, glycan structure, sites, relative abundance levels, and termini structural conformation were investigated by use of Fourier transform ion cyclotron resonance (FT-ICR) or high performance liquid chromatography electrospray ionization (HPLC-ESI) linked to an Orbitrap. Incorporating a limited enzymatic digestion by the immunoglobulin G-degrading enzyme of Streptococcus pyogenes (IdeS) with MALDI-ISD analysis extended sequence coverage of the internal region of the proteins without pre-fractionation. The data in this article is associated with the research article published in Journal of Proteomics (Tran et al., 2015) [1].
DOI of original article: http://dx.doi.org/10.1016/j.jprot.2015.10.021
Corresponding authors. E-mail addresses: [email protected] (D.R. Goodlett), [email protected] (Y.A. Goo).
http://dx.doi.org/10.1016/j.dib.2015.11.031
2352-3409/© 2015 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
B.Q. Tran et al. / Data in Brief 6 (2016) 68–76 69
Specifications Table

| Item | Description |
|---|---|
| Subject area | Biology, Chemistry |
| More specific subject area | Top- and middle-down mass spectrometric analysis of protein species and proteoforms |
| Type of data | Table, Figure |
| How data was acquired | Mass spectrometry data were collected on a 12 T FT-ICR XR (Bruker) and an LTQ-Orbitrap Elite (Thermo Scientific) |
| Data format | MS spectra |
| Experimental factors | Top- and middle-down analyses with multiple fragmentation techniques, including electron transfer dissociation (ETD) and matrix-assisted laser desorption ionization in-source decay (MALDI-ISD), for characterization of a reference monoclonal antibody IgG1 and a fusion IgG protein |
| Experimental features | Glycan profiling, including structure, sites, relative abundance levels, and termini structural conformation, was investigated |
| Data source location | Baltimore and Gaithersburg, Maryland, USA |
| Data accessibility | Data is provided within this article |

Value of the data
• Application of top-down and middle-down mass spectrometry for comprehensive characterization of glycosylated protein species from recombinant IgG and IgG-fusion proteins.
• Detection of glycan structure, sites, and relative abundance levels.
• Investigation of termini structural conformation: N-terminal glutamine (Q) to pyroglutamate (pyroGlu, a.k.a. pE) conversion and C-terminal lysine truncation.
• Improvement of protein sequencing by MALDI-ISD and ETD in combination with middle-down analysis using the enzyme IdeS.
1. Data
Glycosylation profiling of recombinant IgG and IgG-fusion proteins and sequence coverage information.
2. Experimental design, materials and methods
The recombinant human mAb (IgG1) and IgG1-fusion protein were manufactured using standard cell culture, purification and formulation processes [2]. The IgG1 has two identical light chains and two identical heavy chains, with a molecular weight of 150 kDa and an N-glycosylation site at N297. The IgG fusion protein, approximately 90 kDa, comprised two chains, each with a human protein domain fused onto the N-terminus of a human IgG1 constant domain (Fc). The protein has three N-glycosylation sites, at N76, N108 and N207.
3. Analysis of the IgG protein
The IgG protein was analyzed 1) in reduced form by MALDI-ISD in a 12 T Fourier transform ion cyclotron resonance (FT-ICR XR) mass spectrometer (Bruker, Bremen, Germany), and 2) by LC–MS/MS on an Orbitrap Elite (Thermo Scientific, San Jose, CA, USA) after IdeS digestion and dithiothreitol (DTT) reduction, which together generate three smaller fragments: light chain, Fd, and Fc/2.
Fig. 1. MALDI-ISD mass spectrum of the DTT-reduced IgG1 mAb. Underlined amino acid residues were identified after phase correction by c or z+2 ions with a mass error threshold of 50 ppm, covering (A) 37.6% of the light chain and (B) 14.3% of the heavy chain.
Fig. 2. LC-MS/MS mass spectrum of the IdeS/DTT-treated IgG mAb. The charge state distributions of the light chain (LC) and Fd domain are shown. Inset shows the full mass spectrum collected after accumulation of 20 scans over an m/z range of 600–3500 at a resolving power of 120,000.
Fig. 3. (A) ETD mass spectrum of precursor m/z 1446 (charge 16+), searched against the IgG1 mAb light chain sequence with a mass tolerance of 100 ppm. The ETD mass spectrum was produced from accumulation of 49 scans. Identified c- and z-fragment ion series are indicated by diamonds in the spectrum, and the relevant sequence identification is shown in the sequence. (B) ETD mass spectrum of precursor m/z 1512 (charge 17+), searched against the Fd sequence of the IgG1 mAb with a mass tolerance of 100 ppm. The ETD mass spectrum was produced from accumulation of 50 scans. Identified c- and z-fragment ion series are indicated by diamonds in the spectrum, and the relevant sequence identification is shown in the sequence. N-terminal glutamine (Q) to pyroglutamate (pyroGlu) conversion was confirmed with a precursor mass error of 3.9 ppm (0.10 Da).
Panel (A), light chain [1-213], raw file 11062014_011-qb.raw (axes: intensity versus m/z). Mono: 23113.3039346; Average: 23127.83128. Total c ions: 19; z ions: 15.
1 DIQMTQSPST LSASVGDRVT ITCSASSRVG
31 YMHWYQQKPG KAPKLLIYDT SKLASGVPSR
61 FSGSGSGTEF TLTISSLQPD DFATYYCFQG
91 SGYPFTFGGG TKVEIKRTVA APSVFIFPPS
121 DEQLKSGTAS VVCLLNNFYP REAKVQWKVD
151 NALQSGNSQE SVTEQDSKDS TYSLSSTLTL
181 SKADYEKHKV YACEVTHQGL SSPVTKSFNR
211 GEC
Panel (B), Fd [1-239], raw file 11062014_015-qb.raw (axes: intensity versus m/z). Mono: 25672.8062256; Average: 25689.230331. Total c ions: 33; z ions: 17. Q1: −17.026549 Da.
1 QVTLRESGPA LVKPTQTLTL TCTFSGFSLS
31 TAGMSVGWIR QPPGKALEWL ADIWWDDKKH
61 YNPSLKDRLT ISKDTSKNQV VLKVTNMDPA
91 DTATYYCARD MIFNFYFDVW GQGTTVTVSS
121 ASTKGPSVFP LAPSSKSTSG GTAALGCLVK
151 DYFPEPVTVS WNSGALTSGV HTFPAVLQSS
181 GLYSLSSVVT VPSSSLGTQT YICNVNHKPS
211 NTKVDKRVEP KSCDKTHTCP PCPAPELLG
Intact protein mass measurement for detection of protein species of the IgG was performed using ESI FT-ICR MS, as demonstrated in the related study [1].
3.1. MALDI-ISD analysis of reduced protein
The IgG protein at 1 mg/mL in 50 mM ammonium bicarbonate buffer was reduced to light and heavy chains by 20 mM dithiothreitol (DTT, product 43815, Sigma-Aldrich, St. Louis, MO, USA) at 80 °C for 15 min. The reduced protein sample was buffer exchanged to 5% acetonitrile (ACN)/0.1% formic acid (FA) using a 10 kDa molecular weight cut-off (MWCO) centrifugal filter (product UFC201024, Merck Millipore, Tullagreen, Carrigtwohill, Ireland) and then diluted to 17 mg/mL in 50% ACN/0.1% FA. 1 µL of sample and 1 µL of 1,5-diaminonaphthalene (1,5-DAN, product 56451, Fluka, St. Louis, MO, USA) saturated in ACN were spotted on a stainless steel MALDI plate. The MALDI-ISD experiment was performed with a transient domain of 1 mega-word data points (estimated resolving power of 270,000 at m/z 400) covering the m/z range 400–10,000. MALDI-ISD mass spectra after phase correction [3] were processed to identify c- and z+2 ions from the light chain and heavy chain with a mass tolerance of 50 ppm using DataAnalysis 4.2 and Biotools 3.2 software (Bruker) (Fig. 1).
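The fragment assignment step can be illustrated with a minimal sketch. It is not the DataAnalysis/Biotools procedure used for the data; it assumes singly protonated ions, monoisotopic masses, and cysteines present as free thiols after DTT reduction. The function names and the two-peak list are hypothetical; the test sequence is the first ten residues of the light chain shown in Fig. 3A.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
PROTON = 1.007276     # mass of a proton (Da)
HYDROGEN = 1.007825   # mass of a hydrogen atom (Da)
WATER = 18.010565     # H2O (Da)
AMMONIA = 17.026549   # NH3 (Da)


def c_ion_mz(sequence, n):
    """m/z of the singly protonated c ion holding the first n residues."""
    return sum(RESIDUE_MASS[aa] for aa in sequence[:n]) + AMMONIA + PROTON


def z_plus_2_ion_mz(sequence, n):
    """m/z of the singly protonated z+2 ion holding the last n residues."""
    residues = sum(RESIDUE_MASS[aa] for aa in sequence[-n:])
    return residues + WATER - AMMONIA + 2 * HYDROGEN + PROTON


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_peaks(sequence, peaks, tolerance_ppm=50.0):
    """Return (label, theoretical m/z, observed m/z) for peaks within tolerance."""
    matches = []
    for n in range(1, len(sequence)):
        candidates = (
            (f"c{n}", c_ion_mz(sequence, n)),
            (f"z{n}+2", z_plus_2_ion_mz(sequence, n)),
        )
        for label, theoretical in candidates:
            for observed in peaks:
                if abs(ppm_error(observed, theoretical)) <= tolerance_ppm:
                    matches.append((label, theoretical, observed))
    return matches


# Hypothetical peak list: one peak close to c5 and one unrelated peak.
matches = match_peaks("DIQMTQSPST", [606.2950, 700.0])
for label, theoretical, observed in matches:
    print(label, round(theoretical, 4), round(ppm_error(observed, theoretical), 2))
```

Sequence coverage of the kind reported in Fig. 1 would then follow from the set of residue positions covered by the matched c and z+2 ions.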
3.2. LC-MS/MS analysis of IdeS-digested protein
The IgG protein was digested with IdeS (product A0-FR1-020, Genovis AB, Lund, Sweden) at a ratio of 1 unit of enzyme per 1 µg protein for 2 h at 37 °C, followed by DTT treatment. Approximately 14 µg of the IdeS/DTT-treated sample was separated on a 100 mm × 20 mm column packed with Jupiter 5 µm C5 particles (product 04A-4052, Phenomenex, Torrance, CA, USA) using a fast gradient of 5–50% ACN/0.1% FA in 15 min at a flow rate of 0.3 mL/min. The light chain and Fd domain were detected in a range of 350–3500 m/z at a resolving power of 120,000 in positive ion mode on the Orbitrap Elite (Fig. 2). Precursor masses were selected for ETD fragmentation. ETD spectra were acquired with 5 microscans at a resolving power of 120,000 and processed by MASH Suite version 1.0 software [4]. The fragment ions and corresponding sequence coverage are shown in Fig. 3A for the light chain and Fig. 3B for the Fd domain. Glutamine (Q) to pyroglutamate (pyroGlu) conversion was confirmed on the N-terminus of the heavy chain (Fig. 3B).
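As a rough consistency check on the precursor selection, the sketch below converts the average masses printed in the Fig. 3 sequence panels into m/z values for the stated charge states, and converts the 0.10 Da precursor error into ppm. It assumes the printed values are neutral masses; the function names are illustrative and not part of MASH Suite.

```python
PROTON = 1.007276  # mass of a proton (Da)


def mz_from_mass(neutral_mass, charge):
    """m/z of an ion carrying `charge` protons."""
    return (neutral_mass + charge * PROTON) / charge


def ppm_from_da(error_da, mass_da):
    """Convert an absolute mass error (Da) into a relative error (ppm)."""
    return error_da / mass_da * 1e6


# Average masses as printed in the Fig. 3 panels (assumed to be neutral masses).
light_chain_mz = mz_from_mass(23127.83128, 16)   # light chain, 16+
fd_mz = mz_from_mass(25689.230331, 17)           # Fd domain, 17+

# 0.10 Da error relative to the Fd monoisotopic mass printed in Fig. 3B.
pyroglu_ppm = ppm_from_da(0.10, 25672.8062256)

print(round(light_chain_mz, 2), round(fd_mz, 2), round(pyroglu_ppm, 1))
```

Under these assumptions the two values fall at the nominal precursors m/z 1446 and m/z 1512, and the 0.10 Da error corresponds to about 3.9 ppm, in line with the Fig. 3 caption. The Q1 shift of −17.026549 Da listed for the Fd domain equals the mass of NH3, the loss expected for glutamine cyclization to pyroglutamate.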
Fig. 4. Fourier transform ion cyclotron resonance (FT-ICR) mass spectrum of the intact fusion protein peaks (charge 38+). The mass spectrum was produced from accumulation of 1650 scans collected in a time domain of 128 K word data points. Major peaks demonstrate a heterogeneous combination of glycosylation, predominantly by sialic acids (NANA). Additional information regarding the glycan nomenclature and structure is provided in Fig. 5.
4. Analysis of the IgG fusion protein
The fusion protein was analyzed 1) by accurate mass measurement in the FT-ICR for protein species detection, 2) by MALDI-ISD in the FT-ICR following DTT reduction, and 3) following IdeS cleavage, by accurate mass measurement in the FT-ICR or by ETD MS/MS in the Orbitrap Elite.
4.1. Accurate mass measurement of intact protein
The fusion protein sample was buffer exchanged to 5% ACN and 0.1% TFA using Amicon 50 kDa MWCO centrifugal filters (product UFC505024, Merck Millipore). Since the fusion protein sample was stored in buffer with detergent, detergent removal was performed with the Pierce detergent removal kit (product 87778, Thermo Scientific, Rockford, IL, USA). The detergent-free protein sample was diluted to a final concentration of 10 mg/mL in 50% ACN/0.1% FA and introduced to electrospray ionization (ESI)-FT-ICR for intact protein mass measurement and detection of protein species. The experiment was carried out in positive ion mode on the 12 T FT-ICR at a flow rate of 2 µL/min. Mass spectra were collected in magnitude mode with a transient domain of 128,000 word data points to cover the m/z range 400–4000, with an accumulation of 1650 scans. Fig. 4 shows the protein species detected in the m/z region 2365–2450, with high heterogeneity of glycosylation. Corresponding glycan nomenclature and structures are shown in Fig. 5.
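To relate the m/z region shown in Fig. 4 to protein mass, the following sketch converts the window edges to neutral masses at charge 38+ and estimates the m/z spacing expected when a single monosaccharide is added. The monosaccharide values are standard monoisotopic residue masses; the function names are illustrative, and monoisotopic rather than average masses are assumed for simplicity.

```python
PROTON = 1.007276  # mass of a proton (Da)

# Monoisotopic residue masses (Da) of common N-glycan building blocks.
GLYCAN_RESIDUE_MASS = {
    "Hex": 162.052824,   # hexose (galactose, mannose)
    "Fuc": 146.057909,   # fucose
    "NANA": 291.095417,  # N-acetylneuraminic acid (sialic acid)
}


def neutral_mass(mz, charge):
    """Neutral mass of an ion observed at `mz` carrying `charge` protons."""
    return (mz - PROTON) * charge


def mz_spacing(delta_mass, charge):
    """m/z shift produced by a mass difference at a given charge state."""
    return delta_mass / charge


window_low = neutral_mass(2365.0, 38)
window_high = neutral_mass(2450.0, 38)
nana_spacing = mz_spacing(GLYCAN_RESIDUE_MASS["NANA"], 38)
hex_spacing = mz_spacing(GLYCAN_RESIDUE_MASS["Hex"], 38)

print(round(window_low, 1), round(window_high, 1))
print(round(nana_spacing, 2), round(hex_spacing, 2))
```

At 38+ the window spans roughly 89.8 to 93.1 kDa, consistent with the approximately 90 kDa fusion protein, and each added sialic acid shifts a peak by about 7.7 m/z units.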
4.2. MALDI-ISD analysis of reduced protein
The protein sample was diluted to 1 mg/mL in 50 mM ammonium bicarbonate buffer and reduced with 20 mM DTT at 80 °C for 15 min. The reduced sample was then subjected to detergent removal. The detergent-free protein was diluted in 50% ACN/0.1% FA to a final concentration of 7 mg/mL. 1 µL of sample and 1 µL of DAN matrix were spotted on a stainless steel MALDI plate. MALDI-ISD mass spectra were recorded over the m/z range 400–10,000 with a transient domain of 1 mega-word data points and an estimated resolving power of 270,000 (at m/z 400). MALDI-ISD spectra were phase corrected and processed using DataAnalysis and Biotools software. Fragment ions were identified with a mass tolerance of 50 ppm, and the sequence coverage of the fusion protein is shown in Fig. 6.
Fig. 5. The glycan nomenclature and structure.
Fig. 6. MALDI-ISD mass spectrum of the DTT-reduced fusion protein. The underlined sequence indicates the amino acid residues identified by phase-corrected c and/or z+2 ions, which gave an overall sequence coverage of 25.3% of the full sequence.
Table 1. Glycosylated protein species detected by Fourier transform ion cyclotron resonance (FT-ICR) accurate mass measurement of the DTT/IdeS-treated fusion protein from the 17+ charge state. Highlighted in bold are the most abundant glycoforms detected on the C-terminal fragment of the protein. Additional information regarding the glycan nomenclature and structure is provided in Fig. 5.

| Potential glycosylated protein species | Observed m/z | Charge (z) | Observed monoisotopic mass [M+H]+ | Theoretical monoisotopic mass [M+H]+ | Mass error (Da) | Mass error (ppm) |
|---|---|---|---|---|---|---|
| G0 | 1479.64124 | 17+ | 25137.77353 | 25139.397044 | −1.62 | −64.58 |
| G0f | 1488.28997 | 17+ | 25284.83767 | 25282.454953 | 2.38 | 94.24 |
| G1f | 1497.76750 | 17+ | 25445.88614 | 25444.507776 | 1.38 | 54.17 |
| G2f | 1507.29477 | 17+ | 25607.90766 | 25606.560599 | 1.35 | 52.61 |
| G1f+NANA | 1514.88175 | 17+ | 25736.94787 | 25735.063193 | 1.88 | 73.23 |
| G2f+NANA | 1524.41738 | 17+ | 25899.02107 | 25897.656016 | 1.37 | 52.71 |
| G2f+NANA2 | 1541.54188 | 17+ | 26190.11349 | 26188.751433 | 1.36 | 52.01 |
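The mass error columns of Table 1 can be recomputed from the observed and theoretical [M+H]+ values. The sketch below is a simple check, with values transcribed from the table as printed and an illustrative function name; it is not the software used to generate the table.

```python
# Rows transcribed from Table 1:
# (glycoform, observed [M+H]+ in Da, theoretical [M+H]+ in Da).
TABLE_1 = [
    ("G0", 25137.77353, 25139.397044),
    ("G0f", 25284.83767, 25282.454953),
    ("G1f", 25445.88614, 25444.507776),
    ("G2f", 25607.90766, 25606.560599),
    ("G1f+NANA", 25736.94787, 25735.063193),
    ("G2f+NANA", 25899.02107, 25897.656016),
    ("G2f+NANA2", 26190.11349, 26188.751433),
]


def mass_errors(observed, theoretical):
    """Return the mass error as (Da, ppm), observed minus theoretical."""
    error_da = observed - theoretical
    error_ppm = error_da / theoretical * 1e6
    return error_da, error_ppm


errors = {name: mass_errors(obs, theo) for name, obs, theo in TABLE_1}

for name, (error_da, error_ppm) in errors.items():
    print(f"{name:<10} {error_da:+.2f} Da  {error_ppm:+.2f} ppm")
```

The recomputed values agree with the printed columns to two decimal places, which confirms that the ppm error is expressed relative to the theoretical mass.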
Fig. 7. LC-MS mass spectrum of the IdeS/DTT-treated fusion protein. The charge state distribution of glycosylated protein species from the C-terminal fragment is shown. The mass spectrum was produced from accumulation of 20 scans over an m/z range of 600–3500 at a resolving power of 120,000.
4.3. ETD MS/MS analysis of IdeS-digested protein
The fusion protein sample was diluted to 1 mg/mL in 50 mM ammonium bicarbonate solution and digested with the enzyme IdeS at a ratio of 1 enzyme unit to 1 µg protein for 2 h at 37 °C. 20 mM DTT was added to cleave disulfide bonds at 80 °C for 15 min. The sample was further concentrated using a 10 kDa MWCO filter and subjected to detergent removal. The sample buffer was exchanged to 5% ACN/0.1% FA and the sample was diluted to a final concentration of approximately 7 mg/mL in 50% ACN/0.1% FA. IdeS digestion and DTT reduction cut the fusion protein to half its size, producing N-terminal and C-terminal fragments.
Accurate mass measurement of the IdeS/DTT-treated sample was performed by ESI FT-ICR. Mass spectra were collected in magnitude mode with a transient domain of one mega-word data points to cover the m/z range 400–4000. Seven glycoforms were identified from the C-terminal fragment at multiple charge states. Table 1 lists the glycosylated protein species detected from the 17+ charge peaks; their glycan nomenclature and structures are shown in Fig. 5.
The sample was also directly infused into the Orbitrap Elite. Data were collected in positive ion mode over the 350–3500 m/z range at a resolving power of 120,000 (Fig. 7). Precursor masses of the glycoforms were isolated for ETD fragmentation. ETD mass spectra were acquired with 5 microscans at a resolving power of 120,000 and processed by MASH Suite for sequence coverage analysis. Manual interpretation of the ETD spectra confirmed that glycosylation by G0f or G1f occurred at site N207 and that the protein has a C-terminal lysine truncation (Fig. 8).
Fig. 8. LC-MS/MS ETD mass spectra of (A) precursor m/z 1266 and (B and C) precursor m/z 1055, searched against the C-terminal fragment of the DTT/IdeS-treated fusion protein. Identified c- and z-fragment ion series are indicated by diamonds in the mass spectra, and the relevant sequence identifications are shown in the associated sequences to the right of each mass spectrum. Detection of the glycosylated protein species G0f and G1f was confirmed on N207 (in green) with C-terminal lysine truncation.
Acknowledgments
The data collection was supported in part by a collaborative grant from MedImmune, LLC (2014-2312 RSA), and the University of Maryland, Baltimore, School of Pharmacy Mass Spectrometry Center (SOP1841-IQB2014).
Appendix A. Supplementary material
Supplementary data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.dib.2015.11.031.
References
[1] B.Q. Tran, C. Barton, J. Feng, A. Sandjong, S.H. Yoon, S. Awasthi, T. Liang, M.M. Khan, D.P.A. Kilgour, D.R. Goodlett, Y.A. Goo, Comprehensive glycosylation profiling of IgG and IgG-fusion proteins by top-down MS with multiple fragmentation techniques, J. Proteomics (2015), pii: S1874-3919(15)30164-0, http://dx.doi.org/10.1016/j.jprot.2015.10.021 [Epub ahead of print].
[2] P.A. Marichal-Gallardo, M.M. Álvarez, State-of-the-art in downstream processing of monoclonal antibodies: process trends in design and validation, Biotechnol. Prog. 28 (2012) 899–916. http://dx.doi.org/10.1002/btpr.1567.
[3] D.P.A. Kilgour, R. Wills, Y. Qi, P.B. O’Connor, Autophaser: an algorithm for automated generation of absorption mode spectra for FT-ICR MS, Anal. Chem. 85 (2013) 3903–3911. http://dx.doi.org/10.1021/ac303289c.
[4] H. Guner, P.L. Close, W. Cai, H. Zhang, Y. Peng, Z.R. Gregorich, Y. Ge, MASH Suite: a user-friendly and versatile software interface for high-resolution mass spectrometry data interpretation and visualization, J. Am. Soc. Mass Spectrom. 25 (2014) 464–470. http://dx.doi.org/10.1007/s13361-013-0789-4.