Page 1

EBI is an Outstation of the European Molecular Biology Laboratory.

MS Identification

Dr. Juan Antonio VIZCAINO

PRIDE Group coordinator

PRIDE team, Proteomics Services Group

PANDA group

European Bioinformatics Institute

Hinxton, Cambridge

United Kingdom

Page 2

EBI Bulgaria Roadshow
Rotterdam, 12 June 2012

Juan A. Vizcaíno

Overview …

• Search engines: peptide identification

• Protein inference

• De novo and spectral searches

• Choosing the right protein sequence DB

• You need to learn many things…

Page 3

It should not be a black box…

From: Lilley et al., Proteomics, 2011

Page 4

MS proteomics: Shot-gun/bottom-up approaches

[Figure: bottom-up workflow with the labels proteins, peptides, MS analysis, fragmentation, MS/MS analysis, sequence database and PROTOCOL; the spectra are plotted as m/z against % intensity.]

Page 5

PMF IDENTIFICATION

Page 6

Peptide Mass Fingerprinting (MS)

[Figure: MS analysis spectrum (m/z against % intensity) used for Peptide Mass Fingerprinting (PMF); the protein MW is also indicated.]

- Each peak in the spectrum represents a peptide (or a mixture of peptides)

- Information about the mass and charge

Rarely used at present, except in gel-based approaches (in which case the molecular weight of the protein is known)
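The idea behind PMF can be sketched in a few lines: digest a candidate protein in silico, compute the peptide masses, and count how many observed peaks they explain. This is only a minimal illustration, not how any particular PMF tool works. The trypsin rule (cleave after K or R unless followed by P), the singly charged ions, the 0.2 tolerance and all function names are assumptions made for the example.

```python
import re

# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide
PROTON = 1.00728   # charge carrier for a 1+ ion


def tryptic_peptides(protein):
    """Split after K or R, except when the next residue is P (assumed rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def mz_singly_charged(peptide):
    """m/z of the unmodified peptide carrying one proton."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def pmf_score(protein, observed_mz, tol=0.2):
    """Number of observed peaks explained by a tryptic peptide of the protein."""
    theoretical = [mz_singly_charged(p) for p in tryptic_peptides(protein)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed_mz)
```

Ranking every protein in a database by such a count gives a crude fingerprint match; real tools weight the matches statistically.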

Page 7

Peptide Mass Fingerprinting (MS) on the web

Aldente (Phenyx): http://www.expasy.org/tools/aldente/

ASCQ_ME: https://www.genopole-lille.fr/logiciel/ascq_me/

Bupid: http://zlab.bu.edu/Amemee/

Mascot: http://www.matrixscience.com/search_form_select.html

MassSearch: http://www.cbrg.ethz.ch/services/MassSearch

MS-Fit (Protein Prospector): http://prospector.ucsf.edu/prospector/mshome.htm

PepMAPPER: http://www.nwsr.manchester.ac.uk/mapper/

Profound (Prowl): http://prowl.rockefeller.edu/prowl-cgi/profound.exe

XProteo: http://xproteo.com:2698/

Page 8

MS/MS IDENTIFICATION

Page 9

[Figure: MS analysis spectrum (Peptide Mass Fingerprinting, PMF) and, after fragmentation, MS/MS analysis spectrum; both are plotted as m/z against % intensity.]

MS/MS: peptide sequence information (on top of mass and charge)

Page 10

Three types of MS/MS identification

• Protein database based comparison: database sequence → theoretical spectrum, which is compared with the experimental spectrum

• Sequential comparison (de novo approaches): experimental spectrum → de novo sequence, which is compared with the database sequence

• Spectral comparison: the experimental spectrum is compared with an experimental spectrum from a spectral library

Modified from: Eidhammer, Flikka, Martens, Mikalsen – Wiley 2007

Pages 11-12

MS proteomics: peptide IDs and protein IDs

[Figure: a collection of MS/MS spectra (m/z against % intensity) alongside the label "proteins".]

Page 13

MS proteomics: peptide IDs and protein IDs

[Figure: MS/MS spectra (m/z against % intensity) are passed to a search engine, which uses a sequence database (UniProt, IPI, RefSeq) to assign peptides; proteins are shown at the other end. Example sequences shown:]

TDMDNQIVVSDYAQMDR

LFDQAFGLPRAKPLMELIER

DESTNVDMSLAQRDIVVQETMEDIDK

NGMFFSTYDRGTAGNALMDGASQL

Page 14

SEARCH ENGINES

Page 15

Search engines

Sequence database matching

[Figure: experimental spectra on one side; on the other, proteins from the sequence database (UniProt, IPI, RefSeq) give peptides, which give theoretical spectra. All spectra are plotted as m/z against % intensity. Example sequences shown:]

TDMDNQIVVSDYAQMDR

LFDQAFGLPRAKPLMELIER

DESTNVDMSLAQRDIVVQETMEDIDK

NGMFFSTYDRGTAGNALMDGASQL

VDMSLAQRDIVVQETMEDIDK

Page 16

Search engines

How good is the correlation?

- Scores are generated by search engines

- Usually the best match is kept

[Figure: theoretical spectra compared with experimental spectra, both on an m/z axis from 800 to 2400.]
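A minimal sketch of the comparison described above: build a theoretical spectrum for each candidate peptide and keep the candidate that explains the most experimental peaks. The shared-peak count is a deliberately naive stand-in for a real search engine score. Singly charged b and y ions only, the 0.5 tolerance and the function names are all assumptions of this example.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_mz(peptide):
    """Singly charged b- and y-ion m/z values of an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)          # N-terminal fragment
        y_ions.append(sum(masses[i:]) + WATER + PROTON)  # C-terminal fragment
    return b_ions, y_ions


def shared_peak_count(peptide, peaks_mz, tol=0.5):
    """Number of theoretical fragments that have an experimental peak nearby."""
    b_ions, y_ions = fragment_mz(peptide)
    return sum(any(abs(p - t) <= tol for p in peaks_mz) for t in b_ions + y_ions)


def best_match(candidates, peaks_mz):
    """Keep the candidate peptide with the highest shared-peak count."""
    return max(candidates, key=lambda pep: shared_peak_count(pep, peaks_mz))
```

In practice the candidates would be the database peptides whose mass matches the precursor, and the score would also use peak intensities.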

Page 17

Search engines

Taken from Nesvizhskii, J Proteomics, 2010

Page 18

Search engines

Taken from Nesvizhskii, J Proteomics, 2010

Page 19

The most popular algorithms

• MASCOT (Matrix Science): http://www.matrixscience.com

• SEQUEST (Scripps, Thermo Fisher Scientific): http://fields.scripps.edu/sequest

• X!Tandem (The Global Proteome Machine Organization): http://www.thegpm.org/TANDEM

• OMSSA (NCBI): http://pubchem.ncbi.nlm.nih.gov/omssa/

Page 20

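Overall concept of scores and cut-offs

[Figure: score distributions of incorrect identifications and correct identifications, divided by a threshold score. Incorrect identifications scoring above the threshold are false positives; correct identifications scoring below it are false negatives.]

Adapted from: www.proteomesoftware.com – Wiki pages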
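The trade-off in the figure can be made concrete with a small counting sketch. It assumes we somehow know which identifications are correct, which is only true for benchmark data; the function name and the input layout are hypothetical.

```python
def confusion_at_threshold(scored_ids, threshold):
    """Count outcomes for a list of (score, is_correct) pairs at one cut-off."""
    accepted = [(s, ok) for s, ok in scored_ids if s >= threshold]
    rejected = [(s, ok) for s, ok in scored_ids if s < threshold]
    return {
        "accepted": len(accepted),
        # incorrect identifications that still pass the threshold
        "false_positives": sum(1 for _, ok in accepted if not ok),
        # correct identifications lost below the threshold
        "false_negatives": sum(1 for _, ok in rejected if ok),
    }
```

Raising the threshold lowers the false positives at the price of more false negatives, which is the behaviour the slide illustrates.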

Page 21

Playing with probabilistic cut-off scores

[Figure: chart of false positives (axis 0% to 6%) and identifications (axis 0% to 100%) at cut-offs of p=0.05, p=0.01, p=0.005 and p=0.0005, ordered towards higher stringency.]

Page 22

SEQUEST

• Very well established search engine

• Can be used for MS/MS (PFF) identifications

• Based on a cross-correlation score (includes experimental peak height)

• Published core algorithm (patented, licensed to Thermo Fisher Scientific)

• Provides preliminary (Sp) score, rank, cross-correlation score (XCorr), and score difference between the top two ranks (deltaCn, ΔCn)

• Thresholding is up to the user, and is commonly done per charge state

• Many extensions exist to perform a more automatic validation of results

XCorr = CrossCorr / avg(AutoCorr, offset = -75 to 75)

deltaCn = (XCorr1 - XCorr2) / XCorr1
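The two ingredients of the XCorr, the direct comparison and the background over offsets from -75 to 75, can be sketched on binned spectra as follows, together with deltaCn. This is a rough illustration under assumptions: both spectra are equal-length lists of binned intensities, the background is taken as the mean correlation of the experimental spectrum against shifted copies of the theoretical one, and the function names are made up. It is not SEQUEST's actual implementation.

```python
def correlation(exp, theo, offset):
    """Dot product of exp with theo shifted by `offset` bins."""
    n = len(exp)
    return sum(exp[i] * theo[i + offset] for i in range(n) if 0 <= i + offset < n)


def direct_and_background(exp, theo, max_offset=75):
    """Return (correlation at offset 0, mean correlation over all offsets)."""
    offsets = range(-max_offset, max_offset + 1)
    background = sum(correlation(exp, theo, k) for k in offsets) / len(offsets)
    return correlation(exp, theo, 0), background


def delta_cn(xcorr_best, xcorr_second):
    """Relative score gap between the best and the second-best candidate."""
    return (xcorr_best - xcorr_second) / xcorr_best
```

A good match gives a direct value well above the background, and a deltaCn close to 1 means the runner-up candidate scored far worse.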

Page 23

Search engines: Sequest

The XCorr is high if the direct comparison is significantly greater than the background.

The deltaCn measures how good the XCorr is relative to the next best match.

Page 24

• Very well established search engine

• Can do MS (PMF) and MS/MS (PFF) identifications

• Based on the MOWSE score

• Unpublished core algorithm (trade secret)

• Predicts an a priori threshold score that identifications need to pass

• From version 2.2, Mascot allows integrated decoy searches

• Provides rank, score, threshold and expectation value per identification

• Customizable confidence level for the threshold score

Search engines: Mascot

Page 25

Search engines: Mascot

www.matrixscience.com

Page 26

Search engines: X!Tandem

• Open source search engine

• Can be used for MS/MS experiments

• Based on a hyperscore that only takes into account b and y ions

• Published core algorithm, and it is freely available

• Fast and able to handle PTMs in an iterative fashion

• Used as an auxiliary search engine

by-Score = sum of intensities of peaks matching B-type or Y-type ions

HyperScore = by-Score × Ny! × Nb!
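The hyperscore formula above translates directly into code. The sketch assumes the peak matching has already been done, so the inputs are simply the intensities of the peaks assigned to b ions and to y ions; the function name is hypothetical.

```python
from math import factorial


def hyperscore(matched_b, matched_y):
    """by-Score times Nb! times Ny!, from matched b- and y-ion intensities."""
    by_score = sum(matched_b) + sum(matched_y)
    return by_score * factorial(len(matched_b)) * factorial(len(matched_y))
```

Because of the factorials the score grows very quickly with the number of matched ions, so it is usually handled on a logarithmic scale.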

Page 27

Search engines: OMSSA

• Open source search engine

• Can be used for MS/MS experiments

• Relies on a Poisson distribution

• Published core algorithm, and it is freely available

• Provides an expectancy score, similar to the BLAST E-value

• Very good performance in comparison with the others

• Used as an auxiliary search engine
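How a Poisson model leads to an E-value can be sketched as follows. The assumptions are loud ones: the number of random peak matches is taken to follow a Poisson distribution with a known mean, and the E-value is taken as the tail probability multiplied by the number of candidate peptides tested. How OMSSA actually estimates the mean is not covered here, and the function names are hypothetical.

```python
from math import exp


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    term = exp(-lam)          # P(X = 0)
    below = 0.0               # accumulates P(X < k)
    for i in range(k):
        below += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return 1.0 - below


def e_value(matches, expected_random_matches, n_candidates):
    """Expected number of candidates reaching `matches` by chance alone."""
    return n_candidates * poisson_tail(matches, expected_random_matches)
```

As with BLAST, a small E-value means the match is unlikely to be random given the size of the search space.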

Page 28

MS proteomics: peptide IDs and protein IDs

[Figure: MS/MS spectra (m/z against % intensity) are passed to a search engine, which uses a sequence database (UniProt, IPI, RefSeq) to assign peptides; proteins are shown at the other end. Example sequences shown:]

TDMDNQIVVSDYAQMDR

LFDQAFGLPRAKPLMELIER

DESTNVDMSLAQRDIVVQETMEDIDK

NGMFFSTYDRGTAGNALMDGASQL

So far, we have actually identified peptides, not proteins

Page 29

MS proteomics: peptide IDs and protein IDs

[Figure: peptides mapped to proteins.]

Peptides:

TDMDNQIVVSDYAQMDRTW

LFDQAFGLPRAKPLMELIER

DESTNVDMSLAQRDIVVQETMEDIDK

NGMFFSTYDRGTAGNALMDGASQL

Proteins:

IPI00302927
IPI00025512
IPI00002478
IPI00185600
IPI00014537
IPI00298497
IPI00329236
IPI00002232

Protein Inference is complex!!

Page 30

PROTEIN INFERENCE

Page 31

Intermezzo: Protein inference

The minimal and maximal explanatory sets

Minimal set (Occam)

peptide: a b c d
proteins:
prot X: x x
prot Y: x
prot Z: x x x

Maximal set (anti-Occam)

peptide: a b c d
proteins:
prot X: x x
prot Y: x
prot Z: x x x

The Truth
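The two extremes can be sketched in code. The minimal (Occam) set is approximated here with a greedy set cover, which is a common heuristic rather than a guaranteed optimum, and the maximal (anti-Occam) set simply reports every protein with at least one peptide. The function names are hypothetical, and the example mapping in the usage note is invented for illustration.

```python
def minimal_protein_set(protein_to_peptides):
    """Greedy approximation of the smallest protein set explaining all peptides."""
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        # pick the protein explaining most still-unexplained peptides;
        # sorting the names first makes ties deterministic
        best = max(sorted(protein_to_peptides),
                   key=lambda prot: len(protein_to_peptides[prot] & uncovered))
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return chosen


def maximal_protein_set(protein_to_peptides):
    """Every protein that is matched by at least one identified peptide."""
    return sorted(prot for prot, peps in protein_to_peptides.items() if peps)
```

For a made-up mapping such as `{"X": {"a", "b"}, "Y": {"c"}, "Z": {"b", "c", "d"}}`, the greedy minimal set is Z and X, while the maximal set reports X, Y and Z.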

Page 32

Intermezzo: Protein inference

Slide from J. Cottrell, Matrix Science

Pages 33-41

Protein inference

[Figure, built up over nine slides: diagram with elements labelled A, B, C and D; the final build adds the label "Unambiguous peptide".]

Page 42

OTHER APPROACHES TO PERFORM MS/MS IDENTIFICATION

Page 44

De novo approaches

Example of a manual de novo interpretation of an MS/MS spectrum

No database is needed any more to extract a sequence!

Algorithms

Lutefisk
Sherenga
PEAKS
PepNovo
…

References

Dancik 1999, Taylor 2000
Fernandez-de-Cossio 2000
Ma 2003, Zhang 2004
Frank 2005, Grossmann 2005
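Manual de novo reading boils down to matching the gaps between consecutive peaks of one ion series to amino acid masses. The sketch below does only that step, assuming the peaks have already been assigned to a single series (for example the b ions) and are singly charged. The 0.02 tolerance and the function name are assumptions, and the listed algorithms are far more elaborate.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_sequence_ladder(ion_mz, tol=0.02):
    """Turn the gaps between consecutive same-series peaks into residues."""
    peaks = sorted(ion_mz)
    residues = []
    for low, high in zip(peaks, peaks[1:]):
        gap = high - low
        hits = [aa for aa, mass in RESIDUE_MASS.items() if abs(gap - mass) <= tol]
        # isobaric residues (L and I) cannot be told apart by mass alone
        residues.append("/".join(hits) if hits else "?")
    return residues
```

A gap that matches no residue is reported as "?", which in real spectra usually points to a missing peak or a modification.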

Page 46

Spectral searching

• Concept: to compare experimental spectra with other experimental spectra.

• Many spectral libraries are publicly available (for instance, from NIST).

• Custom ‘search engines’ have been developed:
  • SpectraST (TPP)
  • X!Hunter (GPM)

• It has been claimed that these searches are more sensitive than sequence database approaches.
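Comparing one experimental spectrum with another is often done with a normalised dot product of the binned spectra. The sketch below shows that idea only; it is not the scoring used by SpectraST or X!Hunter, and the 1.0 bin width, the library layout (a dict from peptide name to peak list) and the function names are assumptions.

```python
from math import sqrt


def bin_spectrum(peaks, bin_width=1.0):
    """Sum (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(spec_a, spec_b, bin_width=1.0):
    """Normalised dot product of two binned spectra (1.0 means same shape)."""
    a = bin_spectrum(spec_a, bin_width)
    b = bin_spectrum(spec_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_library_match(query, library):
    """Name of the library spectrum most similar to the query."""
    return max(library, key=lambda name: cosine_similarity(query, library[name]))
```

Because library spectra carry real fragment intensities, this kind of comparison can use information that a purely sequence-derived theoretical spectrum lacks.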

Page 47

Spectral searching (2)

http://peptide.nist.gov/

Page 48

COMBINING DIFFERENT SEARCH APPROACHES

Page 49

Multi-stage peptide identification strategy

Taken from Nesvizhskii, J Proteomics, 2010

Goal: “Squeeze” your good quality experimental spectra

Page 50

PROTEIN SEQUENCE DATABASES

Page 51

What is needed from a protein database

1. Comprehensive (whatever is not in the DB will not be included in your results).

2. Not too redundant at the protein sequence level
- Protein inference gets easier
- It is not very good if the database is too big.

3. Quality of annotation

4. Stability of identifiers

Page 52

a) UniProt Knowledgebase (UniProtKB): SWISS-PROT (manually curated)/ TrEMBL.

b) NCBI non-redundant database: It compiles all protein sequences available from the following databases: ‘GenBank’ translations, the Protein Data Bank (PDB), UniProtKB/Swiss-Prot, PIR and PRF.

c) Ensembl: Genomics centric resource. Integration of the information with genomics is easy.

d) IPI (International Protein Index): It has been discontinued (9/2012). Different builds for different species (Human, Mouse, Cow, Rat, Zebrafish, Dog, Arabidopsis).

e) Model organisms DBs (for instance, TAIR for Arabidopsis).

Main databases used

Page 53

- If the species is not well represented in the protein databases, there is a much stronger need to search ESTs or genomic databases.

- The search engine will translate each nucleotide sequence in all six reading frames.

- ESTs are not suitable for PMF approaches (incomplete proteins).

- The alternative is to filter comprehensive databases like UniProt by species or genus, or to use a protein DB from a close organism.

Databases for non-model organisms
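Searching ESTs or genomic sequence means translating each nucleotide sequence in six reading frames: three on the forward strand and three on the reverse complement. A minimal sketch, assuming the standard genetic code, DNA input made of A, C, G and T, and "*" for stop codons; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Complement each base, then reverse the strand."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna):
    """Frames 1-3 of the forward strand, then frames 1-3 of the reverse strand."""
    dna = dna.upper()
    strands = (dna, reverse_complement(dna))
    return [translate(strand[frame:]) for strand in strands for frame in range(3)]
```

Each translation can then be cut at the stop codons and digested like an ordinary protein entry, at the cost of a much larger search space.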

Page 54

- Since each database has a different focus, the databases can vary in terms of completeness, degree of redundancy, and quality of annotations.

- Larger, more inclusive protein databases take longer to search.

- For the bigger resources, searching may also result in more false-positive identifications and reduced statistical significance (the probability of a random match is higher).

Importance of choosing the right DB

Page 55

POST-VALIDATION OF RESULTS

Page 56

- Concepts of peptide and protein FDR

- Decoy databases

- Software such as PeptideProphet, ProteinProphet, …

- Influence of PTMs in the search

- Scoring of PTM positioning

…..

Other concepts that would be nice to learn…
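As a taste of the first two concepts, the target-decoy idea can be sketched in a few lines: search a database that also contains decoy (for example reversed) sequences and use the decoy hits above a score threshold to estimate how many target hits are wrong. The simple decoys/targets ratio used here is one common estimator among several, and the input layout and function name are assumptions.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate the FDR from a list of (score, is_decoy) pairs.

    Uses the ratio of decoy hits to target hits at or above the threshold.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0
```

Lowering the threshold until the estimate reaches, say, 1% gives a data-driven cut-off instead of a fixed score.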

Page 57

Recommended reading….

Nesvizhskii, J Proteomics, 2010

and many more…

Page 58

Conclusions

• Approaches to perform peptide and protein identification

• Sequence database based approaches: search engines

• The protein inference problem

• Importance of choosing the right protein database

• Many things to be learnt…

Page 59

Remember: it should not be a black box…

From: Lilley et al., Proteomics, 2011

Page 60

And still… we haven’t touched quantification at all

From: Vaudel et al., Proteomics, 2010

Page 61

Questions?

