Essential Guide to Reading Biomedical Papers (Recognising and Interpreting Best Practice)

CH24 09/26/2012 12:15:0 Page 215

24 Experimental Proteomics

Thierry Le Bihan

School of Biological Sciences, University of Edinburgh, UK

24.1 Basic ‘how-to-do’ and ‘why-do’ section

24.1.1 What is proteomics used for?

Proteomics can be defined as the analysis of:

1. how much of a given protein or proteins is expressed (under specific conditions);

2. how proteins interact with each other; and

3. the nature and dynamics of their modifications (post-translational

modifications).

Although it is often considered that the proteome is to proteins what the genome

is to genes, several major divergences emerge, due to the differences both in their

nature and in their respective roles.

24.1.2 How do proteomics and genomics differ?

A major difference between genomics (see Primer 23) and proteomics is that there is

no protein equivalent to the polymerase chain reaction used for DNA amplification

(PCR; Primer 20). Moreover, cellular proteins have expression levels that vary over

a very large dynamic range (i.e. very low to very high levels), which is a significant

challenge to proteomic techniques. Essentially, the dynamic range of a mass

spectrometer must be able to cover at least four to five orders of magnitude in

protein abundance, knowing that, within a simple cell like yeast, this difference can

reach more than six orders of magnitude. At the extreme, this can mean attempting

Essential Guide to Reading Biomedical Papers: Recognising and Interpreting Best Practice, First Edition.

Edited by Phil Langton.

© 2013 by John Wiley & Sons, Ltd. Published 2013 by John Wiley & Sons, Ltd.


to detect a single molecule (copy) of one protein in a cell that may contain over a

million copies of several other proteins and smaller quantities of many hundreds

of others.

This difference, between what can be technically achieved in terms of detection

and what is actually present within a sample, is why the literature often describes

what is detected as simply the ‘tip of the iceberg’. Having such a vast dynamic range to be

covered is probably one of the most important challenges to tackle in the proteomics

field, and it ultimately defines the chosen experimental design. In addition, proteins

are continuously being synthesized and degraded, which adds the important variable

of ‘time’ to the protein detection equation. This characteristic is not captured at the

mRNA level and explains why, in some cases, a poor correlation between mRNA

and protein level exists.

This last example clearly supports the notion that for protein quantitation, it is

better to use a direct approach (i.e. protein measurement) instead of an indirect

approach based on mRNA transcript measurement.

Although proteomics does have its limitations, it has some advantages over other

techniques as well. For example, it provides the ability to acquire, relatively quickly,

either a global or a specific snapshot of the protein composition within a sample.

This ‘snapshot’ can be obtained either with or without prior knowledge of cellular or

tissue-specific protein abundance.

To some extent, some of the above-mentioned challenges in proteomics have

been tackled by significant improvements in sample preparation and separation

techniques, in the mass spectrometer itself, and in data analysis methods and

bioinformatics.

24.2 Important considerations

24.2.1 Sample preparation and separation techniques

A typical proteomic mass spectrometry-based analysis is roughly based on protein

or peptide separation, followed by analysis using a mass spectrometer with the

peptide sequence often being confirmed from an already available database of

known protein sequences.

In a typical proteomic approach (often referred to as ‘bottom-up’), the protein

samples are digested with a protease such as trypsin. The peptides extracted are then

analyzed using a mass spectrometer, where peptide masses are measured (in MS

mode) and selected peptides are then isolated and fragmented by collisional

activation in tandem MS (often described as MSMS or MS2).
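The digestion and MS-mode mass measurement described above can be sketched in a few lines. This is a minimal illustration, not a replacement for a search engine: it assumes the common simplified trypsin rule (cleave after K or R unless the next residue is P), ignores missed cleavages and modifications, and uses a hypothetical example sequence. The function names are ours. It needs Python 3.7 or later, where `re.split` can split on empty matches.

```python
import re

# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water is added per peptide (free N- and C-termini)


def tryptic_digest(sequence):
    """Split after K or R, except when the next residue is P (simplified rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide, in Da."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


if __name__ == "__main__":
    # Hypothetical sequence, chosen only to show a K-P site that is not cleaved.
    for peptide in tryptic_digest("SAMPLERKPEPTIDEK"):
        print(peptide, round(monoisotopic_mass(peptide), 4))
```

Each theoretical mass produced this way is what a database search would compare against the masses measured in MS mode.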

The peptide sequence can be deduced from the mass/charge of its different

fragments. Although a bottom-up approach is appropriate for protein identification

in complex mixtures, not all the peptides will be identified in MS mode. Furthermore,

not all of the peptide fragments generated by ionization will be detected by

the mass spectrometer, due to some of the physico-chemical properties of the


peptide itself. Therefore, the incomplete protein coverage which is typical of a

bottom-up approach often leaves gaps in the information obtained for a given

protein (e.g. information may be missing regarding a protein’s post-translational

modification, splice isoform, specific site mutations). In addition, information about

the mature protein sequence or its potential degradation may not be captured. All of

these pieces of missing information can be important elements of a protein’s

biological function.

In most cases, bottom-up approaches fall into two main categories, both of which

are designed to reduce the complexity of the sample being analyzed:

1. Protein separation by 2-Dimensional gel-electrophoresis (2DE): Protein

mixtures are first separated in two orthogonal dimensions. Typically, proteins

are first separated according to their isoelectric point (net charge). Next they are

transferred to an SDS-PAGE gel (see Primer 15 for more detail), where they are

separated again, this time based on their molecular weight. Although this is

both a robust and valuable approach, it is slowly being displaced from the

‘mainstream’ proteomic field as other techniques become more popular.

Nevertheless, the 2DE approach is often used in cases where an organism’s

genome has not been fully sequenced (e.g. in the fields of agronomy and

environmental sciences). Following separation on gel, the proteins are visualized

by the use of specific protein stains. The gel area containing a given

protein is then excised from the rest of the gel, and the gel spot is subjected to

proteolytic digestion and the peptides analysed by MS. Even though 2DE is a

well-established approach, it is characterized by a very limited dynamic range

and often poor transfer of hydrophobic membrane proteins from one dimension

to the other.

2. Multidimensional chromatography: Another approach, developed in order to

tackle the complexity challenge, is the multidimensional protein identification

technology (MudPIT). This approach is based on the protease digestion of the

entire protein extract. The resulting peptide mixture is separated first on a strong

cation-exchange column and, subsequently, each fraction is separated by

reversed-phase chromatography coupled to mass spectrometry. The MudPIT approach

is now considered to be a minimum standard for any shotgun proteomic study. In

studies where each sample is fractionated to generate between 3 and 24 fractions,

each of those fractions can take between 60 and 120 minutes on average to be

analyzed by LC-MS. Therefore, each sample in a given shotgun proteomic study may

require a complete day of LC-MS time to be analyzed. The ability of MudPIT to dig

deeper into a given proteome comes at the cost of the time invested in a given study.
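The instrument-time cost of fractionation quoted above is simple arithmetic, sketched here for the ends of the stated ranges. The function name and the intermediate case of 12 fractions are illustrative choices; column equilibration and blank runs are ignored.

```python
def lcms_hours(n_fractions, minutes_per_fraction):
    """Total LC-MS acquisition time for one sample, in hours."""
    return n_fractions * minutes_per_fraction / 60.0


if __name__ == "__main__":
    # (fractions, minutes per fraction): low end, an intermediate case, high end.
    for n_fractions, minutes in [(3, 60), (12, 120), (24, 120)]:
        hours = lcms_hours(n_fractions, minutes)
        print(f"{n_fractions} fractions x {minutes} min = {hours:.0f} h per sample")
```

Multiplying the per-sample figure by the number of conditions and replicates shows quickly why deep fractionation and high replicate numbers compete for the same instrument time.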

In addition to these main categories, there are specialist technical approaches that

reflect a desire to understand a variety of specific modifications that are made to

proteins and which have functional significance.


For example, post-translational modifications (PTMs) are ‘alterations’ of

specific amino acids within a protein sequence which change the properties of the

protein. Protein structure, activity, function and stability are all influenced by

various forms of PTMs. A few examples of post-translational modifications and

their effects on proteins are:

• Acetylation for protein stability and regulation.

• S-nitrosylation and phosphorylation for signal transduction.

• Ubiquitination for proteolysis and protein sorting.

• Disulphide bond formation for stability or redox sensing.

Due to their low abundance and, in some cases, their instability, detecting any

PTM using one of the previously mentioned methods mostly relies on chance and is

very inefficient. A more pragmatic approach consists of enriching for the PTM in a

sufficiently large quantity, which is only possible for a few PTMs.

The various methods for PTM enrichment can be divided into several classes.

They are:

1. chemical/physical affinity, such as immobilized metal affinity chromatography,

which is based on coordinate binding between a metal ion (Fe, Ga) and a

phosphate group on a peptide;

2. a depletion-based approach such as a CNBr column which, under specific

conditions, will capture free N-terminal peptides (acetylated N-terminal

peptides which do not bind to the column are found in the flow-through);

3. immunoaffinity purification techniques such as those that have been developed

for ubiquitin and phosphotyrosine, for example;

4. other ‘hybrid’ or ‘tandem’ methods which are a combination of the above-

mentioned methods.

A typical post-translational modification analysis could have the following

pattern: PTM modified peptides are enriched using one of the methods described

above and then analysed by mass spectrometry. The experimental peptide masses

observed in MS mode are matched against theoretical peptides with and

without the mass difference associated with a given PTM. For a PTM that is

stable during fragmentation (for example, acetylation), the peptide fragments in

MSMS mode in the presence of a collision gas (collision-induced dissociation,

CID) in a similar manner to its unmodified counterpart, except that a mass shift

is observed for the modified amino acid.
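The MS-mode matching step described above, comparing an observed peptide mass with theoretical masses with and without a PTM mass difference, can be sketched as follows. The two mass shifts are the standard monoisotopic values for acetylation and phosphorylation. The function name, the 10 ppm tolerance and the example masses are illustrative assumptions, and real search engines consider many more modifications and score the MSMS spectrum as well.

```python
# Monoisotopic mass shifts in Da for two common PTMs.
PTM_SHIFT = {
    "acetyl": 42.010565,   # addition of C2H2O
    "phospho": 79.966331,  # addition of HPO3
}


def match_ptm(observed_mass, unmodified_mass, tol_ppm=10.0):
    """Return the candidate states whose theoretical mass fits the observation."""
    candidates = {"unmodified": unmodified_mass}
    for name, shift in PTM_SHIFT.items():
        candidates[name] = unmodified_mass + shift
    hits = []
    for name, theoretical in candidates.items():
        error_ppm = abs(observed_mass - theoretical) / theoretical * 1e6
        if error_ppm <= tol_ppm:
            hits.append(name)
    return hits


if __name__ == "__main__":
    # Hypothetical peptide of neutral mass 1000.0 Da observed 79.966 Da heavier.
    print(match_ptm(1079.9663, 1000.0))
```

A match at this level says only that the peptide mass is consistent with the PTM; locating the modified residue relies on the fragment mass shift seen in MSMS mode.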



Finally, in the case of peptide PTM enrichment, interpretation of the relative

comparisons between samples can be ambiguous, as one cannot distinguish

between variations in the levels of protein expression and the degree of PTM of

a given protein.

24.2.2 The mass spectrometer: different ionization modes and instrument design

Ionization modes To this point, we have described how to reduce sample

complexity (2DE, MudPIT) and how to ensure the sample is compatible with

the MS analysis (protease digestion). In this section, we describe the two main

approaches used for protein/peptide ionization and briefly introduce the different

types of mass spectrometers. Proteins and peptides are non-volatile polar molecules

requiring a soft ionization method for their transfer into a gaseous phase. The two

main approaches used to achieve this are MALDI and ESI:

1. Matrix-assisted laser desorption ionization (MALDI). This involves mixing

the sample with a matrix which absorbs laser energy that is transferred to the

peptides. The laser heat simultaneously induces both the desorption of matrix

and the transfer of singly positively charged ions of peptides into the gas phase.

Some of the known drawbacks of this approach are the generation of single-

charge ions that are often difficult to fragment for sequencing purposes, and

variation of the signal intensity due to the sample preparation.

2. Electrospray ionization (ESI). In this approach, ions are generated from a

solution and are produced by applying a high voltage (2–6 kV) between the

end of the solution separation device (commonly an HPLC column) and the

inlet of the mass spectrometer. Under these conditions, an electrically charged

droplet is created, which results in the formation and desolvation of analyte-

solvent droplets. Under these ionization conditions, peptides often carry

multiple charges and there is some interdependence between ion intensity,

the ion concentration and the flow rate.
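Because ESI tends to produce multiply charged ions while MALDI mostly produces singly charged ones, the same peptide appears at different m/z values depending on the ionization mode. A minimal sketch of the relationship for protonated ions in positive mode; the function name and the example mass are illustrative.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz(neutral_mass, charge):
    """m/z of a peptide carrying `charge` protons (positive-ion mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical peptide with a neutral monoisotopic mass of 2000.0 Da.
    for z in (1, 2, 3):
        print(f"z = {z}: m/z = {mz(2000.0, z):.4f}")
```

Higher charge states bring large peptides into the m/z range where the instrument performs best, which is one reason multiply charged ESI ions are convenient for sequencing.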

Mass spectrometers Specific types of proteomic applications are better suited to

specific types of mass spectrometers. The main characteristics that differ by

instrument design are:

• mass accuracy;

• resolving power;

• sensitivity (limit of detection, LOD);



• sampling rate;

• dynamic range.

These parameters are summarized in Figure 24.1. In general, for most proteomic

applications, hybrid or tandem mass spectrometers are often used because

information regarding the exact peptide mass needs to be extracted (obtained

in MS mode), as well as the isolation and fragmentation which is performed in

MSMS mode.
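Two of the characteristics listed above are usually reported as simple ratios. The sketch below assumes the common conventions of expressing mass accuracy as an error in parts per million (ppm) and resolving power as m/z divided by the peak width at half maximum (FWHM). The chapter does not state which definitions it uses, so treat both as assumptions.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Mass measurement error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def resolving_power(peak_mz, fwhm):
    """Resolving power as m/z over the peak width at half maximum."""
    return peak_mz / fwhm


if __name__ == "__main__":
    # Hypothetical peak: theoretical m/z 500.0000, observed 500.0025, FWHM 0.01.
    print(f"error: {ppm_error(500.0025, 500.0):.1f} ppm")
    print(f"resolving power: {resolving_power(500.0, 0.01):.0f}")
```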

Figure 24.1 Visual representation of the different major characteristics that distinguish the different mass spectrometers. (a): mass accuracy. (b): resolving power. (c and d): sensitivity and limit of detection (LOD). (e): dynamic range. Courtesy of Dr. Thierry Lebihan. A full colour version of this figure appears in the colour plate section.



In general, MS instruments fall into three broad types:

1. Time of flight (TOF). These instruments are based on the principle that, for

a given charge state, the time ions will take to reach the detector (under

a potential) increases with their mass, in proportion to the square root of the

mass-to-charge ratio (i.e. low mass ions will reach the detector before high

mass ions). TOF instruments

are often used in a hybrid configuration with a quadrupole (Q-Q-TOF), as

well as in tandem (TOF-TOF), or even in a triple-TOF configuration. These

instruments have a good mass accuracy and good quantitation capability.

Furthermore, they are often used in discovery proteomics (Q-TOF), although

the triple TOF from AB-Sciex, according to the vendor’s claim, has been

designed as a single platform instrument for both discovery and targeted

quantitative proteomics.

2. Quadrupole-based instruments. A quadrupole is a structure composed of

four parallel rods, where a radio frequency (RF) quadrupole field is

generated which stabilizes the path of an ion having a given m/z ratio.

This RF field can be adjusted incrementally, allowing the analysis of a wider

m/z range. As for the TOF-based instrument, some of the quadrupole-based

instruments are found in a combination of three quadrupoles, where the first

and third ‘quads’ are mass filters and the second quadrupole is a collision

chamber (ion fragmentation). These instruments have a lower mass accuracy

and a lower resolving power than a TOF. Depending on how they are used,

they can have a high dynamic range, and they are often used in targeted

proteomics.

3. Ion traps. There are several types of ion traps. They are:

• The three-dimensional quadrupole ion trap, which is based on the same

principle as the quadrupole mass spectrometer described above. In this case,

however, the ions are instead trapped and consecutively ejected.

• The linear ion trap (LIT) differs slightly from the 3-D quadrupole ion trap

because it uses a two-dimensional instead of a three-dimensional quadrupole

field. This allows the trapping of a higher number of ions (and increases

the dynamic range of the instrument).

• The Orbitrap, in which ions are trapped in an electrostatic field and rotate

around a central electrode. Ions are characterized by two different oscillations;

one is around the electrode, while the other is a back-and-forth

movement along the electrode. The latter generates an image current which

is dependent on the mass-to-charge ratios. As a result, Orbitrap instruments

have high mass accuracy and sensitivity, as well as a better dynamic range,

compared to the two previous ion traps.



• Fourier transform ion cyclotron resonance instruments share some common

features with the Orbitrap. In this case, the ions are trapped and

oscillate within a magnetic field instead. Although they are more demanding

instruments to operate, they have high mass accuracy and sensitivity as well as a good

dynamic range.
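The time-of-flight principle in item 1 above follows from equating the energy gained in the accelerating potential, zeU, with the kinetic energy of the ion, which gives a flight time t = L * sqrt(m / (2zeU)) over a drift length L. A minimal sketch for an idealized linear TOF: the 20 kV potential and 1 m flight tube are illustrative assumptions, not values from the text, and reflectrons and delayed extraction are ignored.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in coulombs
DALTON = 1.66053906660e-27  # one dalton in kilograms


def flight_time_us(mass_da, charge, accel_volts=20000.0, tube_m=1.0):
    """Idealized flight time in microseconds for an ion in a linear TOF."""
    mass_kg = mass_da * DALTON
    # z * e * U = 0.5 * m * v**2, solved for the velocity v.
    velocity = math.sqrt(2.0 * charge * E_CHARGE * accel_volts / mass_kg)
    return tube_m / velocity * 1e6


if __name__ == "__main__":
    for mass in (1000.0, 2000.0, 4000.0):
        print(f"{mass:.0f} Da, 1+: {flight_time_us(mass, 1):.2f} microseconds")
```

Quadrupling the mass doubles the flight time, so lighter ions arrive first and the arrival time can be converted back into m/z.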

Figure 24.2 illustrates the strengths of each type of mass spectrometer. None is

perfect.

24.2.3 Issues of quantitation

Proteomics has moved from global analysis of organisms to the possibility of

providing more information about proteins – for example, protein abundance,

either in a relative manner (i.e. comparing control vs. disease state or drug

treatment) or absolute quantitation. However, it is quite surprising to see

quantitative measurements still being reported without any form of confidence

in the measurements (either a standard deviation or a p value associated with the

group comparisons). Quantitative proteomics often involves a sample preparation

component as well as a bioinformatics one. In this section, we will concentrate on

sample preparation.

Relative quantitation can be divided into three main groups:

1. In vivo metabolic labelling.

2. In vitro labelling.

3. Label-free.

Figure 24.2 Main types of mass spectrometers and their characteristics. Courtesy of Dr. Thierry Lebihan. A full colour version of this figure appears in the colour plate section.

Labelling refers to the use of a reagent composed of light isotopes (based on 12C and 14N) in combination with a counterpart containing stable heavy isotopes (commonly based on 13C and 15N). Peptides from one sample in an experiment are labelled with the light isotope, and peptides from another sample are labelled with the heavy isotope. Both of the now-labelled peptide samples are mixed together and are similar enough to behave alike throughout the overall procedure, whereas, for a given peptide, the two forms (light and heavy) will be separated at the mass detection level. The peak intensity ratio provides information about the relative abundance of the corresponding protein.
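The peak intensity ratio mentioned above can be sketched as follows. Summarizing a protein by the median of its peptide ratios is one common choice rather than something prescribed by the text. The function names and intensities are hypothetical, and real pipelines also handle missing peaks and noise.

```python
import math
import statistics


def heavy_light_ratio(light_intensity, heavy_intensity):
    """Return the heavy/light ratio and its log2 for one peptide pair."""
    if light_intensity <= 0 or heavy_intensity <= 0:
        raise ValueError("both intensities must be positive")
    ratio = heavy_intensity / light_intensity
    return ratio, math.log2(ratio)


def protein_ratio(peptide_pairs):
    """Median heavy/light ratio over (light, heavy) pairs from one protein."""
    return statistics.median(
        heavy_light_ratio(light, heavy)[0] for light, heavy in peptide_pairs
    )


if __name__ == "__main__":
    # Hypothetical (light, heavy) peak intensities for three peptides.
    pairs = [(1.0e6, 2.1e6), (4.0e5, 7.6e5), (2.5e6, 5.2e6)]
    print(f"protein heavy/light ratio: {protein_ratio(pairs):.2f}")
```

Working on the log2 scale makes a doubling (+1) and a halving (-1) symmetric, which is convenient for the statistics discussed later in the chapter.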

In vivo labelling strategies These can be performed by supplying cells with

labelled amino acids (SILAC) or, in the case of autotrophic organisms, by using 15N

(through nitrate or ammonium salts) or 13C (glucose, acetate and even CO2). Stable

Isotope Labelling by Amino acids in Cell culture (SILAC) has become a gold

standard in the field of quantitative proteomics. Several amino acids can be used, but

a ‘classical’ SILAC experiment is often based on a medium containing 13C6-arginine

and 13C6-lysine. In this way, all tryptic peptides should have at least one labelled

amino acid.
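With 13C6-arginine or 13C6-lysine, each labelled residue is heavier by six times the 13C to 12C mass difference, so the light and heavy forms of a peptide appear as a pair separated by that shift divided by the charge. A minimal sketch of the expected spacing; the function names are illustrative, and peptides with missed cleavages would carry more than one labelled residue.

```python
C13_MINUS_C12 = 1.0033548  # mass difference between 13C and 12C in Da


def silac_shift(n_labelled_carbons=6):
    """Mass shift in Da for one residue carrying n 13C atoms."""
    return n_labelled_carbons * C13_MINUS_C12


def pair_spacing_mz(charge, n_labelled_residues=1):
    """Expected m/z spacing between the light and heavy peptide forms."""
    return n_labelled_residues * silac_shift() / charge


if __name__ == "__main__":
    print(f"shift per labelled residue: {silac_shift():.4f} Da")
    for z in (1, 2, 3):
        print(f"z = {z}: spacing = {pair_spacing_mz(z):.4f} m/z units")
```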

In vitro labelling strategies These are quite numerous in the field of mass

spectrometry-based proteomics. One of the very first approaches developed was the

Isotope-Coded Affinity Tag (ICAT). This method is based on the modification of

Cys-containing peptides with reagents of a different isotopic composition that yield

a pair of ions 8 Da apart, followed by affinity enrichment of the labelled peptides. Several

other in vitro labelling approaches have been developed, mostly targeting primary

amine modification (that is, the peptide N-terminus and the lysine side-chain).

Several issues with in vitro labelling have been identified, such as possible side

reactions with amino acids other than the ones being targeted or incomplete

reactions. Another main drawback of these labelling strategies is that they suffer

from an increased sample complexity after the different samples have been mixed

together (a reminder that complexity is the main challenge in the proteomic field). It

is also difficult in some cases, if not impossible, to compare more than 2–8 samples

at the same time. This imposes some constraints in terms of experimental design

using these labelling strategies.

Label-free approaches A label-free differential approach compares LC-MS datasets

based on relative peptide peak intensities, or by comparing the number of

spectra acquired. A label-free quantitation strategy has fewer limitations in terms of

the number of runs to compare. Using this method, either a few runs or up to one

hundred runs can be technically compared.

Although label-free quantitation is a robust quantitative method, its performance

depends on temporal LC-MS alignment of the different runs, which can be

challenging. Since none of the samples being analyzed are encoded by isotopic

labelling or mixed at any level, label-free quantitation is also strongly dependent on

the reproducibility of the overall platform from the sample preparation to the LC-

MS analysis. As more and more label-free platforms are being made available

and are priced in order to make them accessible to standard proteomics labs, this

approach will undoubtedly increase in popularity. A comparison of the three main

quantitative proteomics strategies, and the point at which samples can be combined, is

illustrated in Figure 24.3.



Figure 24.3 Typical quantitative mass spectrometry workflows. The left part is a generic process presented for a bottom-up proteomics analysis. Mixed red and blue lines indicate when the two samples are normally combined together. The longer the process is run in parallel prior to mixing the sample, the more technical variations are introduced, which can affect both samples independently. Courtesy of Dr. Thierry Lebihan. A full colour version of this figure appears in the colour plate section.

24.3 Required controls

24.3.1 Control definition

Depending on the type of experiments (immunoprecipitation or global proteomic

survey), as well as the number of replicates and fractions per sample replicate, the

type of control and the experimental design can vary significantly. For an

immunoprecipitation experiment, where antibodies are attached to beads and protein

complexes are captured, it is important to use a well-defined control (see Primers 13

and 14). The nature of the control can be, for example, using a non-specific antibody

with the same type of samples as used for the experiments. If the effect of a

perturbation is studied, then performing the immunoprecipitation enrichment both

prior to the perturbation and after could highlight significant differences. As

the performance of the mass spectrometer will vary with time, and some

experiments may require several days of mass spectrometry time, it can be

important to randomize the order in which the samples are run.
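Randomizing the run order, as recommended above, prevents instrument drift over several days from lining up with the experimental groups. A minimal sketch using a seeded shuffle so that the order can be reported and reproduced; the sample names are hypothetical, and more elaborate blocked designs are not covered.

```python
import random


def randomized_run_order(samples, seed=0):
    """Return the samples in a reproducible random acquisition order."""
    order = list(samples)  # copy, so the caller's list is left untouched
    random.Random(seed).shuffle(order)
    return order


if __name__ == "__main__":
    samples = [f"{group}_{rep}" for group in ("control", "treated")
               for rep in (1, 2, 3)]
    for position, name in enumerate(randomized_run_order(samples, seed=42), 1):
        print(position, name)
```

Recording the seed alongside the acquisition log makes it possible to check later whether any trend in the data follows the run order rather than the biology.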

24.3.2 Sample normalization

Normalization has to happen at least at the following two steps:

1. Protein assay: to ensure that the same amount of sample is analyzed by LC-MS.



2. At the LC-MS trace: each sample signal output has to be normalized for proper

quantitation.
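The text does not say how the LC-MS signal should be normalized. One simple and widely used option, shown here purely as an assumption, is to scale each run so that its median peptide intensity matches a common target. It presumes that most peptides do not change between samples. The function name and the intensities are hypothetical.

```python
import statistics


def median_normalize(runs):
    """Scale each run so that all runs share the same median intensity.

    `runs` maps a run name to a list of peptide intensities.
    """
    medians = {name: statistics.median(values) for name, values in runs.items()}
    target = statistics.mean(list(medians.values()))
    return {
        name: [value * target / medians[name] for value in values]
        for name, values in runs.items()
    }


if __name__ == "__main__":
    # Hypothetical runs; run_b was loaded at roughly twice the amount.
    runs = {"run_a": [1.0e5, 2.0e5, 3.0e5], "run_b": [2.0e5, 4.0e5, 6.0e5]}
    for name, values in median_normalize(runs).items():
        print(name, [round(v) for v in values])
```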

In the next section, I discuss in more detail the importance of replicates and their

nature.

24.4 Common problems or errors in literature and pitfalls in execution or interpretation

New and cutting-edge techniques in mass spectrometry are often limited to the

inventor’s research laboratories (and their close collaborators). Moreover, the newly

introduced methods are often verified using rather simple models (e.g. BSA digests,

or casein digests as a model for phospho-protein enrichment or analysis). In the best

cases, the results obtained along these lines of experimentation should be considered

simply as interesting proofs of principle. Their performance cannot safely be

extrapolated to more realistic samples; it needs to be proven experimentally on

relevant samples.

Even with the current improvements in instrumentation, recent MS-based

techniques can only cover a fraction of a given proteome. Any small increase in

proteome coverage is often accomplished by a concomitant increase in the amount

of sample fractionation, which is a time-consuming operation. In these circumstances,

the choice of digging deeper into a given proteome is often made at the

expense of acquiring a higher number of replicates. Ultimately, this has an impact on

the conclusions drawn from the observed data.

How to define the number of replicates, as well as the type of replicates (technical

versus biological), is important (see Primers 2 and 3). A technical replicate consists

of repeating the analysis of the same sample several times. Such an approach

allows for the evaluation of the variability inherent to the technique being applied.

While this is useful, it only gives information about the technique used.

Ideally, no biological interpretation should be drawn in these

circumstances.
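The variability revealed by technical replicates is often summarized as a coefficient of variation (CV): the standard deviation of repeated measurements divided by their mean. The text does not name a specific metric, so the CV is offered here as one common choice, and the intensities are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


if __name__ == "__main__":
    # Hypothetical intensities of one peptide in three injections of one sample.
    technical_replicates = [9.0e5, 1.0e6, 1.1e6]
    cv = coefficient_of_variation(technical_replicates)
    print(f"technical CV: {cv:.1f}%")
```

A biological difference smaller than the technical CV cannot be separated from measurement noise, which is why this figure describes the technique and not the biology.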

The definition of biological replicates in the proteomic context has not been

adequately addressed as of yet, and is currently broadly defined. For example, a

different flask of the same culture can be considered a biological replicate.

Furthermore, different mice of the same genetic background are also considered

to be biological replicates. It is essential to define the question needing to be

answered by the proteomic study before deciding upon the experimental design.

Only then can an appropriate experimental design and adequate sampling be

performed in order to allow for the correct conclusions to be drawn (see Primers 2

and 3).

A high degree of similarity between the different ‘biological replicates’ (i.e. simply

different flasks of the same cells cultured in exactly the same way or same mouse


genotype) will probably allow for the observation and reporting of nice, well-

defined trends. However, the conclusions drawn should ideally be limited to the

observed specific culture conditions. Using two different mouse genotypes also

generates a wide range of different background proteins which, under certain

circumstances, render data analysis almost impossible.

In the case of the number of replicates to consider in a study, triplicates are often

arbitrarily considered ‘safe’. However, a more robust approach, such as the use of a

power analysis, which allows for the evaluation of an adequate sample size, should

be considered. This step can be important, as the sample size may be either too low

or potentially too high (rarely the case in the proteomic field). If too low, the

experiment will ultimately lack the precision to give reliable answers in relation to

the questions. On the other hand, if the sample size is too large, then resources and

time will be unnecessarily lost, with little information gained. A power analysis can

also be an extremely useful tool for determining ratio cut-offs between two groups

being compared, instead of using the often employed and arbitrarily defined ratio

cut-off of 1.5–2. In time, some standardization rules, not defined arbitrarily, will

emerge in the field of quantitative proteomics. For example, proper power analysis

should be mandatory in order to justify the chosen experimental design.
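As a rough illustration of the power analysis recommended above, the sketch below estimates the number of replicates per group needed to detect a given fold change in a two-group comparison. It uses the textbook normal approximation n = 2 * ((z_alpha + z_beta) * sd / delta)^2 on the log2 scale. The function name, the default alpha of 0.05 and power of 0.8, and the example standard deviation are assumptions. For small n a t-distribution-based calculation gives slightly larger numbers, and multiple-testing correction is ignored.

```python
import math
from statistics import NormalDist


def replicates_per_group(fold_change, sd_log2, alpha=0.05, power=0.8):
    """Approximate replicates per group for a two-sided two-group comparison.

    fold_change: smallest ratio worth detecting (for example 1.5 or 2).
    sd_log2: standard deviation of log2 intensities within a group.
    """
    delta = abs(math.log2(fold_change))
    if delta == 0:
        raise ValueError("fold_change must differ from 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sd_log2 / delta) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical within-group variability of 0.5 log2 units.
    for fc in (2.0, 1.5):
        n = replicates_per_group(fc, sd_log2=0.5)
        print(f"fold change {fc}: about {n} replicates per group")
```

Run the other way round, with n fixed by the available instrument time, the same relationship gives the smallest fold change that can be detected reliably, a less arbitrary basis for a ratio cut-off than a fixed 1.5 or 2.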

The reader would perhaps have appreciated a clear and definitive description of

which methods should be used for mass spectrometry-based proteomic studies.

However, different proteomic platforms provide solutions to different proteomic

questions. Several major challenges still exist, while new ones are continuously

emerging. Therefore, the overall state of the field of proteomics is far from being

stagnant. In this primer, we haven’t proposed a single solution; however, we

have raised some key points to consider when reading or writing a proteomics

research paper.

With time, a proteomic experiment will become easier to perform and more

reproducible. Digging down into a proteome will hopefully become attainable, with

a greater ease, and thereby allow researchers to explore other dimensions of

biological significance, such as time series, several perturbations and a higher

number of biological replicates. Under those circumstances, the field of proteomics

will be able to realize its full potential and deliver on its tremendous promise.

24.5 Complementary techniques

A transcriptomic study offers several advantages over a proteomics one, as it is

slightly easier to perform; often, a more complex experimental design and a higher

number of replicates can be accommodated. However, as mentioned earlier, there can be

temporal differences between the levels of proteins and their corresponding mRNA. The

transcriptomic approach is often done with prior knowledge, whereas a proteomics

approach can be used as a non-hypothesis-driven approach.

Protein array, based on antibody capture, is a technology which is developing fast

and will soon be a serious competitor to the classical proteomic approaches; the



expression and purification of diverse populations of proteins will get easier, and

there will be a larger bank of antibodies available.

Primer 13: Immunocytochemistry

Primer 14: Immunoprecipitation

Primer 15: Immunoblotting

Primer 19: Viral vector transgenesis

Primer 26: Genetically modified models

Acknowledgements

SynthSys is a Centre for Integrative Systems Biology (CISB) funded by BBSRC and

EPSRC, reference BB/D019621/1.

Further reading and resources

Several good reviews on the topic exist, as well as good web links.

Domon, B. & Aebersold, R. (2006). Mass spectrometry and protein analysis. Science 312,

214–217.

Elliott, M.H., Smith, D.S., Parker, C.E. & Borchers, C. (2009). Current trends in quantitative

proteomics. Journal of Mass Spectrometry 44(12), 1637–1660.

A tutorial on the internet: http://www.i-mass.com/guide/tutorial.html

Some tools for proteomics analysis: http://expasy.org/proteomics
