Date posted: 03-Jan-2016
High Throughput and Large Scale Proteomics Analysis
Austin Yang, Ph.D., Department of Pharmaceutical Sciences, University of Southern California
Overview
1. Shotgun proteomics and ESI mass spectrometry
2. Proteomic data mining and data visualization
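[Figure: abundance of 12,000 proteins by class, as concentration and copies per cell, with the ranges covered by 2-D gel and shotgun proteomics]
Cytoskeletal proteins: mM, 1 x 10^9 copies/cell
Metabolism: 0.1 mM, 1 x 10^8
Ribosomes: 10 µM, 1 x 10^7
Kinases, cyclins: 1 to 0.1 µM, 1 x 10^6 to 1 x 10^5
Transcription factors: 10 nM, 1 x 10^4
Synaptic markers: 0.1 nM, 1 x 10^3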
Are We Ready for Mammalian Proteomics ?
Advantages of Proteomics Using LC-MS/MS
• No pre-selection of biased targets (hypothesis-free, open approach)
• Protein variants are detected simultaneously
• Protein isolation and detection are on a small scale (~ 10 fmol from complex mixtures – subcellular fractions, whole cells, or tissue)
• Obtain sequence information of peptides (not just masses) and can sequence ~4,000 proteins in a single experiment
Liquid Chromatography Quadrupole Ion Trap Tandem Mass Spectrometer
Electrospray vs Nanospray
Splitless Nano-Liquid Chromatography
Five Independent Loop Injections
SCX (NH4OAc)   RP #1   RP #2
100 mM         MSM     Wash
200 mM         Wash    MS
300 mM         MS      Wash
400 mM         Wash    MSM
500 mM         MSS     Wash
600 mM         Wash    MSM
700 mM         MSM     Wash
800 mM         Wash    MSS
900 mM         MS      Wash
1000 mM        Wash    MS
10-cycle MudPIT Analysis
[Figure: SCX column with 0-500 mM NH4OAc salt steps (100, 200, 300, 400, 500 mM) feeding RP #1 and RP #2]
1,000-2,000 Sequencing Attempts in 60 Minutes
20,000 MS/MS spectra/day
Multidimensional Protein Identification Technology (MudPIT)
Digested protein complexes
Isotope-Coded Affinity Tags (ICAT)
Electrospray Ionization (ESI)
Ions in solution
Ions in gaseous phase
LC
Spray tip
Ion source opening for the MS
Theoretical CID of a Tryptic Peptide
[Figure: CID of the tryptic peptide F L G K (sequence also drawn reversed as K G L F), shown with + and ++ charge labels, into b1, b2, b3 and y1, y2, y3 fragment ions. The MS/MS spectrum (relative intensity vs. m/z) shows daughter ions and non-dissociated parent ions (464.29).]
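The fragment ladder for F L G K can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes standard monoisotopic residue masses and singly charged b and y ions, and the function names `parent_mz` and `b_y_ions` are made up for this example rather than taken from the slides.

```python
# Monoisotopic residue masses (Da) for the residues used in this sketch.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "V": 99.06841, "L": 113.08406,
    "I": 113.08406, "M": 131.04049, "F": 147.06841, "K": 128.09496,
}
WATER = 18.010565   # H2O carried by the intact peptide and by y ions
PROTON = 1.007276   # mass of one charge-carrying proton


def parent_mz(peptide, charge=1):
    """m/z of the intact (non-dissociated) peptide at the given charge."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def b_y_ions(peptide):
    """Singly charged b (N-terminal) and y (C-terminal) fragment m/z values."""
    ions = {}
    for i in range(1, len(peptide)):
        ions[f"b{i}"] = sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + PROTON
        ions[f"y{i}"] = (
            sum(RESIDUE_MASS[aa] for aa in peptide[-i:]) + WATER + PROTON
        )
    return ions


if __name__ == "__main__":
    print(f"parent [M+H]+ = {parent_mz('FLGK'):.2f}")
    for name, mz in sorted(b_y_ions("FLGK").items()):
        print(f"{name} = {mz:.2f}")
```

Under these assumptions the singly protonated parent of F L G K comes out at about 464.29, which matches the parent ion value on the slide.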
SequestQueue (6,000 dta x50 = 300,000 ms/ms scans)
Data Mining through SEQUEST and PAULA
Database Search Time
• Yeast ORFs (6,351 entries) 52 sec: 0.104 sec/s
• Non-redundant protein (100k entries) 3500 min:
• EST (100K entries, 3-frames) 5-10,000 min:
SEQUEST Algorithm
Step 1. Determine the parent ion molecular mass from the experimental MS/MS spectrum.
Step 2. The 500 peptides with masses closest to that of the parent ion are retrieved from a protein database. The computer generates a theoretical MS/MS spectrum for each peptide sequence (SEQ 1, 2, 3, 4, …).
Step 3. The experimental spectrum is compared with each theoretical spectrum, and correlation scores are assigned.
Step 4. Scores are ranked, and protein identifications are made based on these cross-correlation scores.
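The four steps can be sketched in a few functions. This is a toy illustration, not the SEQUEST implementation: peaks are reduced to presence or absence in unit-width m/z bins, intensities are ignored, and `xcorr_like` only mimics the idea of a cross-correlation score (overlap at zero shift minus the average overlap at nearby shifts). All function names and the ±75-bin window are assumptions made for this example.

```python
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "V": 99.06841, "L": 113.08406,
    "I": 113.08406, "M": 131.04049, "F": 147.06841, "K": 128.09496,
}
WATER = 18.010565
PROTON = 1.007276


def peptide_mass(seq):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def theoretical_bins(seq):
    """Step 2: unit-width m/z bins of singly charged b and y ions."""
    bins = set()
    for i in range(1, len(seq)):
        b = sum(RESIDUE_MASS[aa] for aa in seq[:i]) + PROTON
        y = sum(RESIDUE_MASS[aa] for aa in seq[i:]) + WATER + PROTON
        bins.update((round(b), round(y)))
    return bins


def xcorr_like(exp_bins, theo_bins, max_shift=75):
    """Step 3: overlap at zero shift minus mean overlap at other shifts."""
    def overlap(shift):
        return sum(1 for b in theo_bins if b + shift in exp_bins)

    shifts = [s for s in range(-max_shift, max_shift + 1) if s != 0]
    background = sum(overlap(s) for s in shifts) / len(shifts)
    return overlap(0) - background


def search(parent_mass, exp_mzs, database, n_candidates=500):
    """Steps 1-4: pick candidates by mass, score each, rank best first."""
    candidates = sorted(
        database, key=lambda seq: abs(peptide_mass(seq) - parent_mass)
    )[:n_candidates]
    exp_bins = {round(mz) for mz in exp_mzs}
    scored = [(xcorr_like(exp_bins, theoretical_bins(s)), s) for s in candidates]
    return sorted(scored, reverse=True)
```

Calling `search` with the parent mass and fragment m/z values of one spectrum returns `(score, sequence)` pairs, best match first. Isobaric sequences such as FLGK and LFGK pass the mass filter together and are separated only by the fragment comparison.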
ZSA-charge assignment
Unified Scoring Function
One spectrum TWO protein identifications
Spectrum A was searched against the NCBI human database, and macrophage inhibitory factor was identified.
The same spectrum was then searched against the non-redundant database, and bovine G-protein gamma was identified. Because the primary amino acid sequence of human G-protein gamma is almost identical to the bovine sequence, this protein was later identified as human G-protein gamma. The initial false identification was caused by a missing entry for human G-protein gamma in the human database. The sequence was later re-entered into the human database, and a third search yielded the correct identification.
Fragment ions that match both sequences are indicated by *. Spectrum B has two additional ions matched to G-protein gamma.
Mol Cell Proteomics. 2003 Jul;2(7):428-42.
Distribution of Xcorr from correctly and incorrectly identified peptides
X-correlation vs Peptide length
Distribution of Xcorr vs Charge State
F-score and probability-based peptide assignment
Identification of modified LRP in APP/PS1 Transgenic Mice
Tg Peptide
A) 1. (Q9WV18) Gamma-aminobutyric acid type B receptor, subunit 1 precursor (GABA-B-R1)
2. (NP_032102.1) gamma-aminobutyric acid (GABA-A) receptor, subunit rho 2
3. (NP_034382.1) gamma-aminobutyric acid A receptor, gamma 1
4. (NP_033733.1) cholinergic receptor, nicotinic, epsilon polypeptide; acetylcholine receptor
5. (NP_150372.1) cholinergic receptor, muscarinic 3, cardiac; AChR M3
6. (S28058) serotonin receptor 5
7. (NP_031903.1) dopamine receptor 3; D3 receptor
8. (Q60934) Glutamate receptor, ionotropic kainate 1 precursor (Glutamate receptor 5)
9. (I49696) glutamate receptor chain B (version flip)
B) 1. (NP_038589.1) 5-hydroxytryptamine (serotonin) receptor 3A
2. (P30545) Alpha-2B adrenergic receptor (Alpha-2B adrenoceptor)
3. (NP_032195.1) glutamate receptor, ionotropic, NMDA1 (zeta 1)
4. (NP_032198.1) glutamate receptor, ionotropic, NMDA2D (epsilon 4); GluRepsilon4
5. (I49696) glutamate receptor chain B (version flip)
C) 1. (NP_034428.1) glycine receptor, beta subunit
2. (JC4262) glutamate transporter 2
Neurotransmitter Receptors
Proteomic Data Visualization and Future Directions
• information overload
• data integration
• ease of visualization
Network for NMDA and glutamate receptors
Network for NMDA and glutamate receptors (Zoom-in)
Scoring Algorithm for Spectral Analysis
Raw Unidentified Spectra (~10,000-100,000)
Identified Sequence
SEQUEST
SALSA
• SALSA is a tool for identifying MS-MS spectra in Xcalibur analysis files that display specific user-defined characteristics. Because these characteristics correspond to structural features of a peptide, SALSA allows the user to selectively locate MS-MS spectra of specific peptides or their variant or modified forms.
SALSA Overview
[Figure: spectrum features searched by SALSA: product ion, neutral loss (mass difference), charged loss, and ion series (example residues T W D G A)]
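To make the idea of screening spectra for user-defined characteristics concrete, here is a minimal sketch covering two of the feature types named above, product ions and neutral losses. It is not the SALSA scoring scheme, which the slides do not spell out: the score here is simply the summed relative intensity of matching peaks, precursors are assumed singly charged, and every name and threshold is hypothetical.

```python
def find_peak(peaks, target_mz, tol=0.5):
    """Most intense (m/z, intensity) peak within tol of target_mz, or None."""
    hits = [p for p in peaks if abs(p[0] - target_mz) <= tol]
    return max(hits, key=lambda p: p[1]) if hits else None


def score_spectrum(precursor_mz, peaks, product_ions=(), neutral_losses=(),
                   tol=0.5):
    """Sum the relative intensities of peaks that match requested features."""
    base = max(intensity for _, intensity in peaks)  # base peak intensity
    targets = list(product_ions)
    # A neutral loss appears at precursor m/z minus the lost mass
    # (assumes a singly charged precursor).
    targets += [precursor_mz - loss for loss in neutral_losses]
    score = 0.0
    for target in targets:
        hit = find_peak(peaks, target, tol)
        if hit is not None:
            score += hit[1] / base
    return score


def screen(spectra, min_score, **features):
    """Names of spectra whose feature score reaches min_score."""
    return [
        name
        for name, (precursor_mz, peaks) in spectra.items()
        if score_spectrum(precursor_mz, peaks, **features) >= min_score
    ]
```

A call such as `screen(spectra, 0.5, neutral_losses=[64.0])` would keep only spectra with a reasonably intense peak 64 m/z units below the precursor; the value 64.0 is an arbitrary illustration here.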
Construction of SALSA ruler GAIIGLMGGVV
GA
GAI
GAII
GAIIG
GAIIGL
GAIIGLM
GAIIGLMG
GAIIGLMGG
GAIIGLMGGV
GAIIGLMGGVV
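One way to read the ruler is as a ladder of sequence prefixes whose spacing equals successive residue masses, which can then be slid along a spectrum to find where the most marks line up with peaks. The sketch below follows that reading under stated assumptions: monoisotopic residue masses, a ladder starting at the two-residue prefix as in the listing above, and helper names (`prefixes`, `ruler_offsets`, `best_alignment`) invented for this example.

```python
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "V": 99.06841, "L": 113.08406,
    "I": 113.08406, "M": 131.04049,
}


def prefixes(sequence, start=2):
    """Sequence prefixes that form the ladder: GA, GAI, GAII, ..."""
    return [sequence[:n] for n in range(start, len(sequence) + 1)]


def ruler_offsets(sequence, start=2):
    """Mass offset (Da) of each ladder mark relative to the first mark."""
    offsets = [0.0]
    for aa in sequence[start:]:
        offsets.append(offsets[-1] + RESIDUE_MASS[aa])
    return offsets


def best_alignment(peak_mzs, offsets, tol=0.5):
    """Slide the ruler so its first mark sits on each peak in turn.

    Returns (marks matched, anchor m/z) for the best position.
    """
    best = (0, None)
    for anchor in peak_mzs:
        matched = sum(
            1
            for off in offsets
            if any(abs(mz - (anchor + off)) <= tol for mz in peak_mzs)
        )
        if matched > best[0]:
            best = (matched, anchor)
    return best
```

Anchoring on every peak keeps the sketch short; a variant that tolerates a fixed mass shift on part of the ladder would be needed to follow modified forms of the peptide.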
Methionine Oxidation: +16 amu (one oxygen atom)
GAIIGLMVGGVV
GAIIGLMVGGVV: +7 amu
[Figure: annotated MS/MS spectra of G A I I G L M V G G V V with labeled fragment ions y3, y5, y6*, y7*, y8*, y9*, b4, b6, b7*, b9*, b11*, [b11*]+2, and b12*. Panels A and B each show [Aβ29-40]+1, [Aβ29-40+1O]+1, and [Aβ29-40+2O]+1.]
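The expected positions of the unmodified and oxidized Aβ29-40 species follow directly from the +16 amu shift per oxygen. The sketch below assumes monoisotopic masses (so one oxygen is 15.9949 rather than a nominal 16) and the sequence GAIIGLMVGGVV for Aβ29-40; the function name `mz_with_oxygens` is hypothetical.

```python
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "V": 99.06841, "L": 113.08406,
    "I": 113.08406, "M": 131.04049,
}
WATER = 18.010565
PROTON = 1.007276
OXYGEN = 15.994915  # monoisotopic mass of one oxygen atom


def mz_with_oxygens(peptide, n_oxygen=0, charge=1):
    """m/z of a peptide carrying n_oxygen added oxygen atoms."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    neutral += n_oxygen * OXYGEN
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    for n in (0, 1, 2):
        label = "[Aβ29-40]+1" if n == 0 else f"[Aβ29-40+{n}O]+1"
        print(f"{label}: {mz_with_oxygens('GAIIGLMVGGVV', n):.2f}")
```

At charge +1 the three species are spaced one oxygen mass apart, so their relative peak intensities can be compared side by side in a single survey scan.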
Quantification of Methionine Oxidation
Absolute Quantification Analysis