Introduction to Protein Chemistry October 2013 Gustavo de Souza IMM, OUS.

Post on 13-Jan-2016


Introduction to Protein Chemistry

October 2013

Gustavo de Souza, IMM, OUS

Relevance of the Proteome

«The recipe of life»

Chocolate cake:
- Egg
- Flour
- Sugar
- Baker’s yeast
- Chocolate

Biological relevance lies in how genes are expressed and translated into proteins, not in whether the genes are present.

Amino acid structure

AA side chains

Protein Translation

Peptide Bond

Primary Structure

>sp|F2Z333|CA233_HUMAN Fibronectin type-III domain-containing transmembrane protein C1orf233

MRAPPLLLLLAACAPPPCAAAAPTPPGWEPTPDAPWCPYKVLPEGPEAGGGRLCFRSPAR

GFRCQAPGCVLHAPAGRSLRASVLRNRSVLLQWRLAPAAARRVRAFALNCSWRGAYTRFP

CERVLLGASCRDYLLPDVHDSVLYRLCLQPLPLRAGPAAAAPETPEPAECVEFTAEPAGM

QDIVVAMTAVGGSICVMLVVICLLVAYITENLMRPALARPGLRRHP
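The entry above is in FASTA format: a header line starting with ">" followed by the primary structure in one-letter amino acid code. A minimal Python sketch of reading such a record is given here; the function name `parse_fasta_record` is hypothetical, and the header is assumed to follow the `db|accession|entry_name` layout seen in the entry above.

```python
from collections import Counter

# The FASTA entry from the slide, copied verbatim.
FASTA = """>sp|F2Z333|CA233_HUMAN Fibronectin type-III domain-containing transmembrane protein C1orf233
MRAPPLLLLLAACAPPPCAAAAPTPPGWEPTPDAPWCPYKVLPEGPEAGGGRLCFRSPAR
GFRCQAPGCVLHAPAGRSLRASVLRNRSVLLQWRLAPAAARRVRAFALNCSWRGAYTRFP
CERVLLGASCRDYLLPDVHDSVLYRLCLQPLPLRAGPAAAAPETPEPAECVEFTAEPAGM
QDIVVAMTAVGGSICVMLVVICLLVAYITENLMRPALARPGLRRHP
"""


def parse_fasta_record(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("a FASTA record must start with '>'")
    header = lines[0][1:]          # drop the leading '>'
    sequence = "".join(lines[1:])  # sequence lines are simply concatenated
    return header, sequence


header, sequence = parse_fasta_record(FASTA)
# Assumed header layout: database|accession|entry_name, then a free-text description.
database, accession, entry_name = header.split()[0].split("|")
composition = Counter(sequence)

print(accession, entry_name, "length:", len(sequence))
print("most frequent residues:", composition.most_common(3))
```

The residue counts in `composition` give a first impression of the chemical make-up of the chain before any folding is considered.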

Folding

Primary Structure - Folding

Proteins can adopt only a limited number of different protein folds

Secondary Structure

Tertiary Structure

Quaternary Structure

Primary to Quaternary

What is a «protein sample» in proteomics?

RNA-binding protein modules

Take home message

1. Proteins are the functionally active molecules in a cell.
2. They possess a high degree of chemical and structural heterogeneity.
3. This heterogeneity affects how a protein sample can be analyzed.

Challenges in Protein and Proteomic Analysis

October 2013

Gustavo de Souza, IMM, OUS

A dangerous idea…

One gene, one protein

Homo sapiens

Complexity of Protein Samples in Eukaryotes

A less dangerous idea

One gene, several proteins (let’s say an average of 5 per gene)

Homo sapiens

Complexity of Protein Samples in Eukaryotes

PTMs

(modifications that control conformation changes in histones)

An even less dangerous idea

One protein, possibly 8 modification sites

Homo sapiens
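The three "ideas" amount to a simple multiplication, sketched below in Python. The gene count of 20,000 is an assumption used only for illustration (it is not stated on the slides), and each modification site is assumed to be independently either modified or unmodified, which gives 2^8 combinations per protein.

```python
# Rough counting of distinct protein species under the three "ideas".
GENES = 20_000           # assumption for illustration, not from the slides
PROTEINS_PER_GENE = 5    # "let's say average 5 per gene"
MODIFICATION_SITES = 8   # "possibly 8 modification sites"


def count_species(genes, proteins_per_gene=1, modification_sites=0):
    """Number of distinct species if every site is either modified or not."""
    return genes * proteins_per_gene * 2 ** modification_sites


one_gene_one_protein = count_species(GENES)
one_gene_some_proteins = count_species(GENES, PROTEINS_PER_GENE)
with_modifications = count_species(GENES, PROTEINS_PER_GENE, MODIFICATION_SITES)

print(f"one gene, one protein:   {one_gene_one_protein:>12,}")
print(f"one gene, some proteins: {one_gene_some_proteins:>12,}")
print(f"with 8 binary PTM sites: {with_modifications:>12,}")
```

These are upper bounds under the stated assumptions; not every combination needs to exist in a real cell.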

But in reality…

One specific cell does NOT express all genes at once!

- Several transcriptomics studies indicate that the cells under study contain ~14,000 transcripts at any given time

Homo sapiens

Proteome Dynamics

The genome is a relatively static element of an organism; the proteome changes according to cell type, developmental stage, response to stress, etc.

Proteome dynamics within the same cell

The proteome can change in response to even the slightest stimulus within a cell

Proteome chemical heterogeneity

DNA - Negatively charged molecule

Has the same physico-chemical features regardless of its nucleotide sequence, its tissue source, its donor source, the species of the donor, etc.

Amino acid structure

AA side chains

Proteome chemical heterogeneity

Membrane proteins

Proteome dynamic range

Genome: for the most part, individual genes are present in equimolar amounts in a DNA molecule

Transcription/Translation

Protein concentration within a cell is unique to each individual protein

Difference between the most and least abundant molecule = dynamic range

Proteome dynamic range

Geiger et al., MCP 2012

The dynamic range of a proteome is estimated to be around 10^8 (in serum it is believed to be over 10^10)

Difference between the most and least abundant proteins
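Dynamic range is usually quoted as a ratio, or as the number of orders of magnitude it spans. A minimal Python sketch, using made-up copy numbers per cell (the protein labels and values are hypothetical, chosen only to illustrate the calculation):

```python
import math


def dynamic_range(abundances):
    """Ratio between the most and the least abundant detected species."""
    detected = [a for a in abundances if a > 0]
    if not detected:
        raise ValueError("no non-zero abundances given")
    return max(detected) / min(detected)


# Hypothetical copies per cell, for illustration only.
copies_per_cell = {
    "cytoskeletal protein": 5e7,
    "chaperone": 2e6,
    "metabolic enzyme": 3e5,
    "transcription factor": 50,
}

ratio = dynamic_range(copies_per_cell.values())
print(f"dynamic range: {ratio:.0e} ({math.log10(ratio):.0f} orders of magnitude)")
```

An instrument whose own dynamic range is narrower than this ratio will miss the least abundant species in a single measurement.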

[Figure: protein abundance by protein GO classification]

Cytoskeleton (actin, tubulin, vimentin)
Chaperones (hsp60, hsp70, calreticulin)
Mitochondria (respiratory chain)
Metabolism (glycolysis, ribosomal)
Structure
Nucleus (histones)
Organelles
Signalling pathway proteins, transcription factors, etc.

Proteome dynamic range

Instrumentation

Aebersold & Mann, Nature 2003

Instrumentation

- Instruments with different hardware generate different types of raw data.

- Different vendors developed different file formats, each needing its own library to read the files.

- This led to the development of a wide range of specialized software using specific computational protocols.

- Lack of a standard routine.

Take home message

1. Proteomic composition is at least 6x more complex than the genomic composition of a cell, if only the number of entities is considered.

2. It is an ever-changing feature, limited by spatial and temporal constraints.

3. Chemical properties and dynamic range have a significant impact on the success rate of identification using proteomic methods.

4. Instrumentation and analysis are not standardized.

Introduction to Mass Spectrometry

Interpreting peptide/protein data

October 2013

Gustavo de Souza, IMM, OUS

3D Quadrupole ion trap

Linear Quadrupole ion trap

Let’s talk about… physics

What is it?

- An instrument that measures the mass-to-charge ratio (m/z) of ions (or ionized molecules).

a) Ionization must generate ions in the gas phase

b) The detected ion signal is proportional to the ion's abundance in the sample

c) MS operates at extremely low pressures (vacuum)

- In principle, any molecule can be ionized: small organic/inorganic chemicals (less than 300 Da), average-sized peptides or DNA fragments, intact proteins.

Mass Spectrometry Scheme

Inlet → Ion Source → Mass Analyzer → Detector

Inlet: LC
Ion source: MALDI, ES
Mass analyzer: Time-of-Flight, Quadrupole, Ion Trap

Ion Intensity = Ion abundance

Isotopes

Normally observed in nature.

Mass difference = 1 Da

What to expect from a mass spectrum

[Figure: schematic mass spectrum; x-axis m/z, y-axis intensity]

Avogadro number = 6.022 × 10^23 /mol

[Figure: FTMS full-scan spectrum (100331_Gustavo_Tuberculosis_179rif_Rep1_07, scan #2435, RT 38.32, NL 4.95E5, FTMS + p NSI Full ms [300.00-2000.00]); x-axis m/z 1033.0 to 1037.5, y-axis relative abundance; labeled peaks at m/z 1034.49, 1034.99, 1035.49 and 1035.99]

- Isotopes (12C, 13C, 14N, 15N)

Peptide mass spectrum

How is a sample ionized?

- Electron ionization
- Chemical ionization
- Fast Atom/Ion Bombardment
- Field desorption
- Plasma Desorption
- Laser Desorption and MALDI
- Thermospray
- Electrospray
- Atmospheric pressure chemical ionization

Matrix Assisted Laser Desorption Ionization

Peptide spectrum on MALDI

Protein spectrum on MALDI

A little history…

1985 – First use: up to a 3 kDa peptide could be ionized

1987 – Method to ionize intact proteins (up to 34 kDa) described

Instruments have no sequence capability

1989 – ESI is used for biomolecules (peptides)

Sequence capability, but low sensitivity

1994 – Term «Proteome» is coined

1995 – LC-MS/MS is implemented

«Gold standard» of proteomic analysis

A little history…

- Laborious
- Low reproducibility
- Time consuming
- Low sensitivity
- Limited number of identifications

ESI

Gradient elution: 200 nl/min
Sample loading: 500 nl/min
Column (75 µm)/spray tip (8 µm)
15 cm
Reverse-phase C18 beads, 3 µm
Platinum wire, 2.0 kV
No precolumn or split

Fenn et al., Science 246:64-71, 1989.

Electrospray Ionization

ESI: multiply charged ions

Peptides: a few positive charges (+), e.g. on -NH2 groups

Proteins: many positive charges (+)

[Figure: schematic spectrum (x-axis m/z, y-axis intensity) with peaks labeled 500.5 (+2), 334.0 (+3) and 250.75 (+4), and a mass label of 1000 Da]
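The charge-state series follows from m/z = (M + z × m_proton) / z. A minimal Python sketch is given below; the function name `mz_of` is hypothetical. The slide's values are reproduced exactly if the 1000 Da label is read as the singly protonated ion (neutral mass 999 Da) and the proton is taken as a nominal 1 Da. Both are assumptions about how the figure was drawn.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz_of(neutral_mass, charge, proton_mass=PROTON_MASS):
    """m/z of a molecule of the given neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * proton_mass) / charge


# Assumption: 1000 Da is the [M+H]+ ion, so the neutral mass is 999 Da,
# and a nominal proton mass of 1 Da is used as on the slide.
neutral = 999.0
for z in (1, 2, 3, 4):
    nominal = mz_of(neutral, z, proton_mass=1.0)
    accurate = mz_of(neutral, z)
    print(f"z = +{z}: nominal m/z {nominal:8.2f}, with accurate proton mass {accurate:9.4f}")
```

The same molecule therefore appears at several m/z values in an ESI spectrum, one per charge state.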

[Figure: FTMS full-scan spectrum (100331_Gustavo_Tuberculosis_179rif_Rep1_09, scan #3828, RT 56.72, NL 1.53E7, FTMS + p NSI Full ms [300.00-2000.00]); x-axis m/z 400 to 2000, y-axis relative abundance; labeled peaks at m/z 343.21, 483.80, 557.31, 578.64, 653.33, 709.06, 728.39, 766.72, 867.95, 921.51, 1063.09, 1149.57, 1227.11, 1346.65, 1453.23, 1682.72 and 1891.35]

[Figure: zoom of the same scan (#3828, NL 2.36E6); x-axis m/z 1148.0 to 1152.0, y-axis relative abundance; labeled peaks at m/z 1149.07, 1149.57, 1150.07, 1150.57, 1151.08, 1151.57 and 1152.08]

[Figure: zoom of the same scan (#3828, NL 1.53E7); x-axis m/z 765.0 to 769.5, y-axis relative abundance; labeled peaks at m/z 766.38, 766.72, 767.05, 767.38, 767.72, 768.05 and 768.39, with minor peaks at 764.90, 765.39, 766.04 and 767.43]

0.5 Da (+2) 0.33 Da (+3)

Mr = 2297.14 Da
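The reasoning on this slide can be sketched in Python: the spacing between isotope peaks gives the charge (about 1/z), and the charge together with m/z gives the mass. The peak lists are the labeled peaks of the two zoomed spectra; the function names are hypothetical. The slide's Mr of 2297.14 Da is close to what the most intense peak of each cluster gives, which suggests it was computed from that peak with a nominal 1 Da proton. That reading is an assumption.

```python
ISOTOPE_SPACING = 1.00335  # Da, mass difference between 13C and 12C
PROTON_MASS = 1.007276     # Da


def charge_from_isotopes(peaks):
    """Estimate the charge state from the mean spacing of isotope peaks."""
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(ISOTOPE_SPACING / mean_spacing)


def neutral_mass(mz, charge):
    """Neutral mass of an ion observed at `mz` carrying `charge` protons."""
    return charge * (mz - PROTON_MASS)


cluster_a = [1149.07, 1149.57, 1150.07, 1150.57]  # spacing ~0.5 Da
cluster_b = [766.38, 766.72, 767.05, 767.38]      # spacing ~0.33 Da

for cluster in (cluster_a, cluster_b):
    z = charge_from_isotopes(cluster)
    first_peak = neutral_mass(cluster[0], z)    # lowest labeled isotope peak
    second_peak = neutral_mass(cluster[1], z)   # most intense peak on the slide
    print(f"z = +{z}: {first_peak:.2f} Da (first peak), {second_peak:.2f} Da (second peak)")
```

Both clusters give the same mass within a few hundredths of a dalton, which is how two peaks at very different m/z values are recognized as the same peptide.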

Peptides on ESI

ESI of intact protein

How is an ion mass measured?

Time-of-flight

How is an ion mass measured?

Quadrupoles (RF)

How is an ion mass measured?

Orbitraps

Tandem Mass Spectrometry

Inlet → Ion Source → Mass Analyzer → Detector

In tandem MS, the mass analyzer stage becomes: Mass Analyzer → Collision cell → Mass Analyzer

Data Dependent Acquisition

MS1 (or MS)

MS2 (or MS/MS)

[Figure: ion at m/z 899.013 followed from the MS1 scan to the MS2 scan]
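In data-dependent acquisition the instrument picks precursors for MS2 from what it has just seen in MS1. A common scheme is "top N": fragment the N most intense ions, skipping those fragmented recently. A minimal Python sketch is given below; the function name, the intensities and the tolerance are hypothetical, and real acquisition software applies further criteria (charge state, intensity thresholds, exclusion timing).

```python
def select_precursors(ms1_peaks, top_n=3, excluded=(), tolerance=0.01):
    """Pick the m/z values of the `top_n` most intense MS1 peaks.

    ms1_peaks: list of (mz, intensity) tuples from one MS1 scan.
    excluded:  m/z values fragmented recently (dynamic exclusion list).
    tolerance: m/z window, in Th, used to match against the exclusion list.
    """
    candidates = [
        (mz, intensity)
        for mz, intensity in ms1_peaks
        if not any(abs(mz - ex) <= tolerance for ex in excluded)
    ]
    ranked = sorted(candidates, key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _ in ranked[:top_n]]


# Hypothetical MS1 scan: (m/z, intensity). Intensities are made up.
scan = [(899.013, 9.0e5), (766.72, 1.5e6), (653.33, 4.0e5), (578.64, 2.0e5)]

first_cycle = select_precursors(scan, top_n=2)
second_cycle = select_precursors(scan, top_n=2, excluded=first_cycle)
print("first cycle: ", first_cycle)
print("second cycle:", second_cycle)
```

The exclusion list is what lets less intense ions get their turn in later cycles.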

Important Parameters in MS

- Resolution
- Sensitivity
- Dynamic range…

[Figure: paired m/z spectra of a 2+ ion]

High resolution in MS

1. mass accuracy

Expected mass

Observed mass
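Mass accuracy is normally reported as the relative difference between observed and expected mass, in parts per million (ppm). A minimal Python sketch with hypothetical masses:

```python
def ppm_error(observed, expected):
    """Relative mass error in parts per million."""
    return (observed - expected) / expected * 1e6


def within_tolerance(observed, expected, tolerance_ppm):
    """True if the observed mass matches the expected mass within the tolerance."""
    return abs(ppm_error(observed, expected)) <= tolerance_ppm


expected_mass = 1000.000  # hypothetical expected mass, Da
observed_mass = 1000.002  # hypothetical measured mass, Da

error = ppm_error(observed_mass, expected_mass)
print(f"error: {error:.1f} ppm")
print("matches at 5 ppm:", within_tolerance(observed_mass, expected_mass, 5))
print("matches at 1 ppm:", within_tolerance(observed_mass, expected_mass, 1))
```

The tighter the tolerance an instrument can support, the fewer candidate peptides match a measured mass by chance.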

High resolution in MS

1. mass accuracy

[Figure: four panels of mass error (ppm) versus mass (Da)]

Ion trap (LTQ) mass accuracy: Av. = 65.8 ppm ± 71.5
qTOF mass accuracy (QSTAR): Av. = 16.5 ppm ± 11.2
FTICR MS SIM (LTQ-FT, 50K) and FTICR MS (LTQ-FT, 500K): Av. = 2.1 ppm ± 1.9; Av. = 0.68 ppm ± 0.47

High resolution in MS

2. Peak separation

[Figure: RT versus m/z maps with 2+ and 3+ ion signals]
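Resolving power is commonly defined as R = m / Δm, where Δm is the peak width at half maximum (FWHM). A minimal Python sketch of what that means for peak separation follows; the "separation at least one FWHM" criterion is a simplification chosen for illustration, and the resolving powers are example values.

```python
def fwhm_at(mz, resolving_power):
    """Peak width (FWHM) at a given m/z for resolving power R = m / delta_m."""
    return mz / resolving_power


def is_resolved(mz_a, mz_b, resolving_power):
    """Simplified criterion: peaks are resolved if at least one FWHM apart."""
    width = fwhm_at(max(mz_a, mz_b), resolving_power)
    return abs(mz_a - mz_b) >= width


# Two neighbouring isotope peaks of a 3+ ion, about 0.34 Th apart.
peak_a, peak_b = 766.38, 766.72

for resolving_power in (1_000, 50_000):
    width = fwhm_at(peak_b, resolving_power)
    verdict = "resolved" if is_resolved(peak_a, peak_b, resolving_power) else "not resolved"
    print(f"R = {resolving_power:>6,}: FWHM {width:.4f} Th, {verdict}")
```

Without resolved isotope peaks the charge state cannot be read from their spacing, which is one reason high resolution matters for peptide work.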

LC-MS/MS

With all we (hopefully) learned so far

1) Use a strong detergent for cell lysis and protein solubilization (SDS, Triton, NP40, Tween)

2) LysC (cuts at the C-terminal side of K) and/or trypsin (cuts at the C-terminal side of K and R)

ADFFFSTTHAASRMSHHHGTYYPPHKRFSDDDDT

ADFFFSTTHAASRMSHHHGTYYPPHKFSDDDT

Positive charges (+) on Arg and Lys
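The cleavage rules from step 2 can be tried on the example sequence with a short Python sketch. The function name `digest` is hypothetical. The option of not cutting when the next residue is proline is a commonly used refinement and is not stated on the slide; it has no effect on this particular sequence.

```python
def digest(sequence, cleave_after, blocked_by="P"):
    """Cut `sequence` C-terminal to any residue in `cleave_after`.

    A site is skipped when the following residue is in `blocked_by`
    (assumed proline rule; pass "" to disable it).
    """
    peptides = []
    start = 0
    for i in range(len(sequence) - 1):  # never cut after the last residue
        residue, next_residue = sequence[i], sequence[i + 1]
        if residue in cleave_after and next_residue not in blocked_by:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])   # whatever remains is the last peptide
    return peptides


protein = "ADFFFSTTHAASRMSHHHGTYYPPHKRFSDDDDT"

tryptic = digest(protein, cleave_after="KR")  # trypsin: after K and R
lysc = digest(protein, cleave_after="K")      # LysC: after K only

print("Trypsin:", tryptic)
print("LysC:   ", lysc)
```

Every peptide except the last ends in K or R, so each carries a basic residue at its C-terminus in addition to the N-terminal amine.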

3) Nano-LC (300 nL/min)

5) Quadrupole-Orbitrap (QExactive)

Mobile phase

A = 5% organic solvent in water
B = 95% organic solvent in water

C18 column, 25 cm long

[Figure: LC setup with mobile phases A and B; time axis with a 20 s interval marked; ion at m/z 899.013]

MS1 (or MS)

MS2 (or MS/MS)

Quadrupole

Orbitrap

From Michalski et al., MCP 10, 2011.

172,800

Take home message

- Great diversity of hardware and principles. Different forms of ionization and mass measurement.

- For protein ID, the mass of an intact peptide and the masses of its fragments are enough to provide an identification.

- Mass spectrometry is used to analyze the molecular mass of molecules.