High Throughput and Large Scale Proteomics Analysis

Austin Yang, Ph.D.
Department of Pharmaceutical Sciences, University of Southern California

Overview

1. Shotgun proteomics and ESI mass spectrometry

2. Proteomic data mining and data visualization

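[Figure: protein abundance scale, 12,000 proteins. Concentration and copies/cell: mM, 1 x 10^9 (cytoskeletal proteins); 0.1 mM, 1 x 10^8 (metabolism); 10, 1 x 10^7; 1, 1 x 10^6; 0.1, 1 x 10^5; 10 nM, 1 x 10^4 (transcription factors); 0.1 nM, 1 x 10^3 (synaptic markers). Ribosomes, kinases, and cyclins are placed between the metabolism and transcription factor levels. Method labels: 2-D gel, shotgun proteomics.]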

Are We Ready for Mammalian Proteomics?

Advantages of Proteomics Using LC-MS/MS

• No pre-selection of biased targets (hypothesis-free, open approach)

• Protein variants are detected simultaneously

• Protein isolation and detection work on a small scale (~10 fmol from complex mixtures: subcellular fractions, whole cells, or tissue)

• Peptide sequence information is obtained (not just masses), and ~4,000 proteins can be sequenced in a single experiment

Liquid Chromatography Quadrupole Ion Trap Tandem Mass Spectrometer

Electrospray vs Nanospray

Splitless Nano-Liquid Chromatography

Five Independent Loop Injections

SCX (NH4OAc)    RP #1    RP #2
100 mM          MSM      Wash
200 mM          Wash     MS
300 mM          MS       Wash
400 mM          Wash     MSM
500 mM          MSS      Wash
600 mM          Wash     MSM
700 mM          MSM      Wash
800 mM          Wash     MSS
900 mM          MS       Wash
1000 mM         Wash     MS

10-cycle MudPIT Analysis

[Figure: SCX column eluted with 0-500 mM NH4OAc in steps of 100, 200, 300, 400, and 500 mM onto RP #1 and RP #2]

1,000-2,000 Sequencing Attempts in 60 Minutes

20,000 MS/MS spectra/day

Multidimensional Protein Identification Technology (MudPIT)

[Figure: MudPIT analysis of digested protein complexes]

Isotope-Coded Affinity Tags (ICAT)

Electrospray Ionization (ESI)

[Figure: ESI schematic. Labels: LC, spray tip, ions in solution, ions in gaseous phase, ion source opening for the MS]

Theoretical CID of a Tryptic Peptide

[Figure: CID of the tryptic peptide F L G K into the fragment ions b1, b2, b3 and y1, y2, y3. The MS/MS spectrum plots relative intensity against m/z and shows the daughter ions together with the non-dissociated parent ions (464.29).]
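The fragment ladder for F L G K can be worked out by hand, and a minimal sketch of that arithmetic follows. It assumes standard monoisotopic residue masses and singly charged b and y ions; the function names `fragment_ions` and `parent_mz` are illustrative and do not come from the slides.

```python
# Monoisotopic residue masses (Da) for the four residues in FLGK.
RESIDUE_MASS = {"F": 147.06841, "L": 113.08406, "G": 57.02146, "K": 128.09496}
WATER = 18.010565   # H2O, added to the intact peptide and to y ions
PROTON = 1.007276   # charge carrier


def fragment_ions(peptide):
    """Return singly charged b-ion and y-ion m/z lists (b1.., y1..)."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)           # N-terminal fragment
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)  # C-terminal fragment
    return b_ions, y_ions


def parent_mz(peptide, charge=1):
    """Return the m/z of the intact peptide at the given charge state."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


b_ions, y_ions = fragment_ions("FLGK")
print(round(parent_mz("FLGK"), 2))  # 464.29
for n, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{n} = {b:.2f}   y{n} = {y:.2f}")
```

The singly protonated parent comes out at 464.29, which matches the parent ion value shown on the slide.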

SequestQueue (6,000 dta x 50 = 300,000 MS/MS scans)

Data Mining through SEQUEST and PAULA

Database Search Time

• Yeast ORFs (6,351 entries) 52 sec: 0.104 sec/s

• Non-redundant protein (100k entries) 3500 min:

• EST (100K entries, 3-frames) 5-10,000 min:

SEQUEST Algorithm

Step 1. Determine the parent ion molecular mass from the experimental MS/MS spectrum.

Step 2. The 500 peptides with masses closest to that of the parent ion are retrieved from a protein database. The computer generates a theoretical MS/MS spectrum for each peptide sequence (SEQ 1, 2, 3, 4, …).

Step 3. The experimental spectrum is compared with each theoretical spectrum, and correlation scores are assigned.

Step 4. Scores are ranked, and protein identifications are made based on these cross-correlation scores.
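The four steps can be illustrated with a toy search. The sketch below is a simplification and not the SEQUEST implementation: a shared-peak count stands in for the cross-correlation score, the database is a hypothetical list of four short peptides, and all function names are illustrative.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.010565, 1.007276


def neutral_mass(peptide):
    """Step 1 helper: neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def theoretical_peaks(peptide):
    """Step 2: singly charged b- and y-ion m/z values for a candidate."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    peaks = []
    for i in range(1, len(masses)):
        peaks.append(sum(masses[:i]) + PROTON)
        peaks.append(sum(masses[-i:]) + WATER + PROTON)
    return peaks


def retrieve_candidates(database, parent_mass, limit=500):
    """Step 2: keep the `limit` peptides closest in mass to the parent."""
    return sorted(database, key=lambda p: abs(neutral_mass(p) - parent_mass))[:limit]


def shared_peak_score(experimental, peptide, tol=0.5):
    """Step 3 stand-in: count theoretical peaks found in the spectrum."""
    return sum(
        any(abs(t - e) <= tol for e in experimental)
        for t in theoretical_peaks(peptide)
    )


def identify(experimental, parent_mass, database):
    """Step 4: rank candidates by score, best first."""
    candidates = retrieve_candidates(database, parent_mass)
    scored = [(p, shared_peak_score(experimental, p)) for p in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical database; the "experimental" spectrum is simulated from FLGK.
database = ["AAAK", "GKFL", "LFGK", "FLGK"]
spectrum = theoretical_peaks("FLGK")
ranked = identify(spectrum, neutral_mass("FLGK"), database)
print(ranked[0])  # ('FLGK', 6)
```

GKFL, LFGK, and FLGK share the same parent mass, so mass alone cannot separate them; only the comparison of fragment peaks in Step 3 ranks FLGK first.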

ZSA-charge assignment

Unified Scoring Function

One Spectrum, Two Protein Identifications

Spectrum A was searched against the NCBI human database, and macrophage inhibitory factor was identified.

The same spectrum was then searched against the non-redundant database, and bovine G-protein gamma was identified. Because the primary amino acid sequence of human G-protein gamma is almost identical to the bovine sequence, this protein was later identified as human G-protein gamma. The initial false identification was due to a missing entry for human G-protein gamma in the human database. The sequence was later re-entered into the human database, and a third search yielded the correct identification.

Fragment ions that match both sequences are indicated by *. Spectrum B has two additional ions matched to G-protein gamma.

Mol Cell Proteomics. 2003 Jul;2(7):428-42.

Distribution of Xcorr from correctly and incorrectly identified peptides

X-correlation vs Peptide length

Distribution of Xcorr vs Charge State

F-score and probability-based peptide assignment

Identification of modified LRP in APP/PS1 Transgenic Mice

Tg Peptide

A) 1. (Q9WV18) Gamma-aminobutyric acid type B receptor, subunit 1 precursor (GABA-B-R1)

2. (NP_032102.1) gamma-aminobutyric acid (GABA-A) receptor, subunit rho 2

3. (NP_034382.1) gamma-aminobutyric acid A receptor, gamma 1

4. (NP_033733.1) cholinergic receptor, nicotinic, epsilon polypeptide; acetylcholine receptor

5. (NP_150372.1) cholinergic receptor, muscarinic 3, cardiac; AChR M3

6. (S28058) serotonin receptor 5

7. (NP_031903.1) dopamine receptor 3; D3 receptor

8. (Q60934) Glutamate receptor, ionotropic kainate 1 precursor (Glutamate receptor 5)

9. (I49696) glutamate receptor chain B (version flip)

B) 1. (NP_038589.1) 5-hydroxytryptamine (serotonin) receptor 3A

2. (P30545) Alpha-2B adrenergic receptor (Alpha-2B adrenoceptor)

3. (NP_032195.1) glutamate receptor, ionotropic, NMDA1 (zeta 1)

4. (NP_032198.1) glutamate receptor, ionotropic, NMDA2D (epsilon 4); GluRepsilon4

5. (I49696) glutamate receptor chain B (version flip)

C) 1. (NP_034428.1) glycine receptor, beta subunit

2. (JC4262) glutamate transporter 2

Neurotransmitter Receptors

Proteomic Data Visualization and Future Directions

• information overload

• data integration

• ease of visualization

Network for NMDA and glutamate receptors

Network for NMDA and glutamate receptors (Zoom-in)

Scoring Algorithm for Spectral Analysis

Raw Unidentified Spectra (~10,000-100,000)

Identified Sequence

SEQUEST

SALSA

http://www.pharmacy.arizona.edu/faculty/liebler/salsa.shtml

• SALSA is a tool for identifying MS-MS spectra in Xcalibur analysis files that display specific user-defined characteristics. Because these characteristics correspond to structural features of a peptide, SALSA allows the user to selectively locate MS-MS spectra of specific peptides or their variant or modified forms.

SALSA Overview

[Figure: spectral features used by SALSA: product ion, neutral loss, charged loss, and ion series (mass differences, for example T W D G A)]
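To make the idea of user-defined spectral characteristics concrete, here is a minimal sketch of two such checks: a neutral loss from the precursor and an ion series whose peak spacings spell T W D G A. This is not SALSA's scoring function; the peak list, the 0.5 Da tolerance, and the function names are assumptions made for illustration.

```python
# Monoisotopic residue masses (Da) for the motif residues only.
RESIDUE_MASS = {
    "T": 101.04768, "W": 186.07931, "D": 115.02694,
    "G": 57.02146, "A": 71.03711,
}


def has_peak(peaks, mz, tol=0.5):
    """True if any peak lies within `tol` of the requested m/z."""
    return any(abs(p - mz) <= tol for p in peaks)


def has_neutral_loss(peaks, precursor_mz, loss, tol=0.5):
    """True if a peak sits at precursor m/z minus the neutral loss."""
    return has_peak(peaks, precursor_mz - loss, tol)


def find_ion_series(peaks, motif, tol=0.5):
    """Return start peaks followed by peaks spaced by the motif residues."""
    starts = []
    for start in peaks:
        mz = start
        for aa in motif:
            mz += RESIDUE_MASS[aa]
            if not has_peak(peaks, mz, tol):
                break
        else:  # every spacing in the motif was found
            starts.append(start)
    return starts


# Simulated peak list (hypothetical): noise, a TWDGA ladder starting at
# m/z 200.0, and a water-loss peak from a precursor at m/z 900.0.
peaks = [150.0, 420.3, 200.0]
mz = 200.0
for aa in "TWDGA":
    mz += RESIDUE_MASS[aa]
    peaks.append(mz)
peaks.append(900.0 - 18.010565)

print(find_ion_series(peaks, "TWDGA"))              # [200.0]
print(has_neutral_loss(peaks, 900.0, 18.010565))    # True
```

A spectrum that passes such checks can be flagged for inspection even when a database search assigns it no sequence, which is the use case the slide describes for variant or modified peptides.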

Construction of SALSA ruler: GAIIGLMGGVV

GA
GAI
GAII
GAIIG
GAIIGL
GAIIGLM
GAIIGLMG
GAIIGLMGG
GAIIGLMGGV
GAIIGLMGGVV
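The ruler above is a ladder of N-terminal prefixes, so the spacing between neighboring rungs is the mass of the residue that was added. A minimal sketch of that construction follows, using the sequence exactly as written on the slide; the function names are illustrative, and the masses are standard monoisotopic residue masses.

```python
# Monoisotopic residue masses (Da) for the residues in the ruler sequence.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "I": 113.08406,
    "L": 113.08406, "M": 131.04049, "V": 99.06841,
}


def ruler_prefixes(sequence, min_length=2):
    """Return the N-terminal prefixes that form the rungs of the ruler."""
    return [sequence[:n] for n in range(min_length, len(sequence) + 1)]


def ruler_spacings(sequence, min_length=2):
    """Return (residue, mass) for each step between neighboring rungs."""
    return [(aa, RESIDUE_MASS[aa]) for aa in sequence[min_length:]]


prefixes = ruler_prefixes("GAIIGLMGGVV")
spacings = ruler_spacings("GAIIGLMGGVV")
for prefix in prefixes:
    print(prefix)
for residue, mass in spacings:
    print(f"+{residue}  {mass:.3f}")
```

Because the ruler stores spacings rather than absolute m/z values, the same ladder can be matched against a spectrum whose peaks are shifted, for example by a modification elsewhere in the peptide.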

Methionine Oxidation: +16 amu (one oxygen atom)

[Figure: MS/MS spectra (m/z axis) for GAIIGLMVGGVV and GAIIGLMVGGVV: +7 amu, annotated with fragment ions y3, y5, y6*, y7*, y8*, y9*, b4, b6, b7*, b9*, b11*, b12*, and [b11*]+2. Panels A and B mark the species [Aβ29-40]+1, [Aβ29-40+1O]+1, and [Aβ29-40+2O]+1.]
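The three Aβ29-40 species differ only by the number of added oxygen atoms, so their expected m/z values follow from simple arithmetic. The sketch below assumes the sequence GAIIGLMVGGVV shown with the spectra, singly charged ions, and the monoisotopic oxygen mass (15.994915 Da, the nominal 16 amu on the slide); the function name is illustrative.

```python
# Monoisotopic residue masses (Da) for the residues in GAIIGLMVGGVV.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "I": 113.08406,
    "L": 113.08406, "M": 131.04049, "V": 99.06841,
}
WATER = 18.010565
PROTON = 1.007276
OXYGEN = 15.994915  # one oxygen atom, about 16 amu


def oxidized_mz(peptide, oxygens=0, charge=1):
    """Return m/z of the peptide carrying `oxygens` added oxygen atoms."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    neutral += oxygens * OXYGEN
    return (neutral + charge * PROTON) / charge


species = {
    "[Aβ29-40]+1": oxidized_mz("GAIIGLMVGGVV", oxygens=0),
    "[Aβ29-40+1O]+1": oxidized_mz("GAIIGLMVGGVV", oxygens=1),
    "[Aβ29-40+2O]+1": oxidized_mz("GAIIGLMVGGVV", oxygens=2),
}
for label, mz in species.items():
    print(f"{label}: {mz:.2f}")
```

Fragment ions that contain the methionine shift by the same amount per oxygen, while fragments without it keep their original m/z.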

Quantification of Methionine Oxidation

Absolute Quantification Analysis
