Date post: | 31-Dec-2015 |
Upload: | daniel-horton |
Mass Spectrometry I
Basic Data Processing
Mass spectrometry
• A mass spectrometer measures molecular masses.
• The mass unit is called the dalton (Da), which is 1/12 of the mass of a carbon-12 atom, and is about the mass of one hydrogen atom.
• If there is a mixture of different molecules in a sample, all the masses are measured simultaneously. So you get a spectrum.
Some Pictures
MALDI-R, Q-Tof Micro, FT-ICR, LTQ-Orbitrap
Each peak corresponds to a different type of molecule in your sample
[Figure: mass spectrum; x-axis m/z from 1000 to 3000, y-axis relative intensity from 0 to 100%. Labeled peaks include 1179.41, 1265.62, 1324.60, 1477.62, 1759.93, 1974.94, 2179.87, 2355.11, 2465.20, 2789.22 and 3103.43.]
peak list
m/z      intensity
...
2789.22  3597.0
2790.22  5018.0
2791.23  4406.0
2792.23  2868.0
2793.23  1234.0
...
Three Components of an MS
• A typical mass spectrometer contains
  – Ionizer (ion source)
  – Mass analyzer
  – Detector
• The ion source charges the molecules to be measured.
  – Charge can be negative but is often positive.
  – Two common types: MALDI and ESI.
  – John B. Fenn & Koichi Tanaka, 2002 Nobel Prize in Chemistry for Electrospray and MALDI
• The mass analyzer separates ions according to the mass-to-charge ratio (m/z) of the ions.
  – Ion trap, TOF, Quadrupole, FTICR.
• The detector detects the ions.
Matrix Assisted Laser Desorption/Ionization
Formation of singly charged ions
Sample is co-crystallized with matrix (solid)
Koichi Tanaka, Nobel Prize 2002
Ionization (1): MALDI
Other ionization methods exist.
Mass Analyzer (1) – TOF
• Time of Flight.
[Diagram: a positive ion (+) accelerated between the + and - electrodes toward the detector]
Time of flight is proportional to sqrt(m/z)
Other mass analyzers exist.
Drift region (D)
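The square-root relation follows from energy conservation: an ion of mass m and charge z accelerated through a potential U gains kinetic energy z*e*U = m*v^2/2, so over a drift length D the flight time is t = D * sqrt(m / (2*z*e*U)). A minimal Python sketch of that relation, assuming an idealized linear instrument; the 20 kV voltage and 1 m drift length are illustrative values, not taken from the slides:

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def flight_time(mass_da, charge, voltage=20_000.0, drift_length=1.0):
    """Flight time in seconds for an idealized linear TOF analyzer.

    voltage (volts) and drift_length (meters) are assumed, illustrative
    parameters; a real instrument adds delays this sketch ignores.
    """
    mass_kg = mass_da * DALTON
    # Kinetic energy gained in the acceleration region: z * e * U.
    energy = charge * ELEMENTARY_CHARGE * voltage
    velocity = math.sqrt(2.0 * energy / mass_kg)
    return drift_length / velocity


# Quadrupling m/z should double the flight time.
t1 = flight_time(1000.0, 1)
t4 = flight_time(4000.0, 1)
print(t4 / t1)
```

Under these assumptions, two ions with the same m/z (for example 1000 Da at charge 1 and 2000 Da at charge 2) arrive at the same time, which is why the analyzer reports m/z rather than mass.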
MALDI Time-of-flight
Putting Them Together
MALDI TOF
Average time in TOF: 10^-7 sec; average speed: 1-2 x 10^5 km/h
MALDI-TOF Linear
Mass range = 800-200,000 Da
Sensitivity and accuracy decrease rapidly with size!
MALDI-TOF Linear vs Reflectron Mode
Reflectron gives much better resolution for mass < 6,000
• Linear = poor resolution due to velocity variation of ions with the same m/z
• Reflectron = contact lens for a near-sighted machine!
Protein “identification” with intact mass
• We measure the intact mass of the protein.
• Then search in the protein database to find a protein with the same mass.
• Good idea, but there are too many proteins with the same mass.
• In the rest of the lecture we study more sophisticated methods and why protein ID is important.
Complications
isotopes
widened peaks
profile
Centroiding
Another example with lower resolution
Isotopes
Chemical Composition of Living Matter
27 of 92 natural elements are essential.
Elements in biomolecules (organic matter): H, C, N, O, P, S
These elements represent approximately 92% of dry weight.
Organic Matter Organized in "building blocks"
amino acids → polypeptides (proteins)
monosaccharides → starch, glycogen
nucleic acids → DNA, RNA
Back to Basics…
element  nominal mass  exact mass  percent abundance  average mass
C        12            12.00000    98.9%
         13            13.00335    1.1%               12.011
H        1             1.00783     99.98%
         2             2.0141      0.02%              1.008
O        16            15.99491    99.8%
         18            17.9992     0.2%               15.999
N        14            14.00307    99.63%
         15            15.00011    0.37%              14.0067
S        32            31.97207    94.93%
         33            32.97146    0.76%
         34            33.96787    4.29%              exercise
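The average mass of an element is the abundance-weighted mean of its isotope masses. A minimal Python sketch, using the carbon and nitrogen rows of the table as input; the function name and data layout are illustrative choices, and dividing by the summed abundance is an assumption made so that rows not adding up to exactly 100% still work:

```python
# Isotope data copied from the table: (exact mass, percent abundance).
ISOTOPES = {
    "C": [(12.00000, 98.9), (13.00335, 1.1)],
    "N": [(14.00307, 99.63), (15.00011, 0.37)],
}


def average_mass(isotopes):
    """Abundance-weighted mean of the isotope masses."""
    total_abundance = sum(abundance for _, abundance in isotopes)
    weighted_sum = sum(mass * abundance for mass, abundance in isotopes)
    return weighted_sum / total_abundance


for element, isotopes in ISOTOPES.items():
    print(element, round(average_mass(isotopes), 4))
```

The same function can be applied to the three sulfur rows to check the exercise.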
Mass (Weights) of Atoms and Molecules
Ethyl acetate C4H8O2
4 C12: 4 x 12.00000 = 48.00000
8 H1: 8 x 1.00783 = 8.06264
2 O16: 2 x 15.99491 = 31.98982
Nominal Mass: 48 + 8 + 32 = 88
Monoisotopic Mass: 48.00000 + 8.06264 + 31.98982 = 88.05246
Average Mass: 4 x 12.011 + 8 x 1.008 + 2 x 15.999 = 48.044 + 8.064 + 31.998 = 88.106
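The three kinds of mass differ only in which per-atom value is summed. A minimal Python sketch for ethyl acetate; the monoisotopic values are the exact masses of the lightest isotopes, and the average values are standard atomic weights rounded to three decimals (an assumption, so the last digits may differ from other tables):

```python
# Per-element values: (nominal mass, monoisotopic mass, average mass).
ELEMENTS = {
    "C": (12, 12.00000, 12.011),
    "H": (1, 1.00783, 1.008),
    "O": (16, 15.99491, 15.999),
}


def molecule_masses(composition):
    """composition maps element symbol -> atom count, e.g. {"C": 4, "H": 8, "O": 2}.

    Returns (nominal, monoisotopic, average) masses in daltons.
    """
    nominal = sum(ELEMENTS[e][0] * n for e, n in composition.items())
    mono = sum(ELEMENTS[e][1] * n for e, n in composition.items())
    average = sum(ELEMENTS[e][2] * n for e, n in composition.items())
    return nominal, mono, average


nominal, mono, average = molecule_masses({"C": 4, "H": 8, "O": 2})
print(nominal, round(mono, 4), round(average, 3))
```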
Mass or Molecular Weight of molecules
Amino Acids
• There are 20 amino acids. All have the same basic structure but with different side chains:
• Examples: side chain group
H
Glycine, or Gly, or G
Arginine, or Arg, or R
All the 20 Structures
* Picture copied from Dr. R.J. Huskey’s website: http://www.people.virginia.edu/~rjh9u/aminacid.html
Peptides and Proteins
H
Glycine, or Gly, or G
Arginine, or Arg, or R
GR
peptide bonds
N-terminal C-terminal
Exact Mass of Amino Acid Residues in Proteins
Gly G 57.02150
Ala A 71.03720
Gln Q 128.05860
Lys K 128.09500
Glu E 129.04270
Note: Leu (L) = Ile (I) = 113.08410
Mass of Amino Acid Residues
Amino Acid Table (source: IONSOURCE.COM)
AA   Code  Mono.        AA   Code  Mono.
Gly  G     57.021464    Asp  D     115.02694
Ala  A     71.037114    Gln  Q     128.05858
Ser  S     87.032029    Lys  K     128.09496
Pro  P     97.052764    Glu  E     129.04259
Val  V     99.068414    Met  M     131.04048
Thr  T     101.04768    His  H     137.05891
Cys  C     103.00919    Phe  F     147.06841
Leu  L     113.08406    Arg  R     156.10111
Ile  I     113.08406    CMC        161.01467
Asn  N     114.04293    Tyr  Y     163.06333
-    -     -            Trp  W     186.07931
Cysteine
Proteins are often treated so that cysteine becomes carboxyamidomethyl cysteine (CamC) or carboxymethyl cysteine (CmC) in order to break the disulphide bonds.
CamC = 160.03
Mass of Peptides and Proteins
Ala-Ser-Phe (ASF)
tripeptide MW = 71.04 + 87.03 + 147.07 + 18.01 = 323.15 (three residue masses plus one water, 18.01, for the two termini)
More precisely: monoisotopic mass 323.1481, average mass 323.3490
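The peptide mass calculation generalizes to any sequence: sum the residue masses and add one water for the free N- and C-termini. A minimal Python sketch using the monoisotopic residue masses from the amino acid table; the function name is an illustrative choice and modified residues such as CamC are not handled:

```python
# Monoisotopic residue masses (daltons), keyed by one-letter code.
RESIDUE_MONO = {
    "G": 57.021464, "A": 71.037114, "S": 87.032029, "P": 97.052764,
    "V": 99.068414, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04048, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MONO = 18.010565  # H2O: 2 x 1.007825 + 15.994915


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MONO[aa] for aa in sequence) + WATER_MONO


print(round(peptide_mass("ASF"), 4))
```

Because Leu and Ile have identical residue masses, sequences that differ only by an L/I swap give the same result.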
In a mass spectrum
323.15 324.15 325.15
Deconvolution adds all the isotopic peaks to the monoisotopic peak, so later processing does not need to worry about the isotopes.
Monoisotopic peak
isotope peaks
Check the difference
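Folding isotope peaks into the monoisotopic peak can be sketched as a single pass over the sorted peak list. This is a simplified illustration, not a production algorithm: it assumes the charge is already known, that clusters do not overlap, and that any peak about 1.00335/charge above the previous one belongs to the same cluster; the tolerance value is an arbitrary choice.

```python
ISOTOPE_SPACING = 1.00335  # approx. mass difference between 13C and 12C


def deisotope(peaks, charge=1, tol=0.02):
    """Collapse isotope clusters in a list of (m/z, intensity) pairs.

    Each cluster is reported at the m/z of its first (monoisotopic) peak
    with the summed intensity of all its isotope peaks.
    """
    expected_gap = ISOTOPE_SPACING / charge
    result = []
    prev_mz = None
    for mz, intensity in sorted(peaks):
        if prev_mz is not None and abs((mz - prev_mz) - expected_gap) <= tol:
            # Same cluster: add the intensity to the monoisotopic peak.
            mono_mz, total = result[-1]
            result[-1] = (mono_mz, total + intensity)
        else:
            # Gap does not match the isotope spacing: start a new cluster.
            result.append((mz, intensity))
        prev_mz = mz
    return result


# Peak list from the example spectrum earlier in the slides.
peaks = [(2789.22, 3597.0), (2790.22, 5018.0), (2791.23, 4406.0),
         (2792.23, 2868.0), (2793.23, 1234.0)]
print(deisotope(peaks))
```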
ESI and Multiply Charged Ions
Electrospray
Electrospray Ionization: Formation of Charged Droplets
Formation of multiply charged ions
Ionization (2) – ESI
Multiply Charged Ions
• The same molecules may be charged differently, and therefore form a few peaks in the spectrum.
Peaks with charge 1: 323.15, 324.15, 325.15
Peaks with charge 2: 162.08, 162.58, 163.08
For proteins/peptides with positive charges, the charge is obtained by adding protons (each of which has a mass of approx. 1 dalton). As a result, a molecule with mass M and charge Z will have a peak at (M+Z)/Z.
Peaks on the m/z axis: (M+1)/1, (M+2)/2, (M+3)/3
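The (M+Z)/Z rule is easy to tabulate. A minimal Python sketch that uses the proton mass instead of the rounded value 1; the neutral mass of 1000.0 Da is a made-up example value:

```python
PROTON_MASS = 1.007276  # daltons; the slides round this to 1


def mz_of(mass, charge):
    """m/z of a molecule of neutral mass `mass` carrying `charge` protons."""
    return (mass + charge * PROTON_MASS) / charge


for z in (1, 2, 3):
    print(z, round(mz_of(1000.0, z), 3))
```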
How to determine charge states?
• Isotope ions when resolution is enough.
• Check different charge states when resolution is not enough.
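When the isotope peaks are resolved, their spacing on the m/z axis is about 1/Z, so the charge can be read off the gap. A minimal Python sketch, assuming the input m/z values all belong to one isotope cluster; the function name is illustrative:

```python
ISOTOPE_SPACING = 1.00335  # approx. mass difference between 13C and 12C


def charge_from_isotopes(mz_values):
    """Estimate the charge from the mean gap between adjacent isotope peaks."""
    mz_sorted = sorted(mz_values)
    gaps = [b - a for a, b in zip(mz_sorted, mz_sorted[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return round(ISOTOPE_SPACING / mean_gap)


print(charge_from_isotopes([323.15, 324.15, 325.15]))
print(charge_from_isotopes([162.08, 162.58, 163.08]))
```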
Exercise
395.73
396.22
397.24
Exercise
Exercise
(A) “Multi-charge envelope” (B) After “Charge-deconvolution algorithm”
Labeled peaks: 1541.9, 1413.2, 1304.7, 1211.9
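When isotopes are not resolved, two neighbouring peaks of a multi-charge envelope are enough to solve for the charge. If the peak at the higher m/z carries Z protons and its lower neighbour carries Z+1, then Z*(m_high - p) = (Z+1)*(m_low - p), so Z = (m_low - p) / (m_high - m_low), with p the proton mass. A minimal Python sketch under exactly that assumption (adjacent charge states of the same molecule), applied to two of the labeled peaks; the function name is illustrative:

```python
PROTON_MASS = 1.007276  # daltons


def charge_and_mass(mz_high, mz_low):
    """Charge of the mz_high peak and the neutral mass of the molecule.

    Assumes mz_low is the same molecule carrying exactly one more proton.
    """
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    mass = z * (mz_high - PROTON_MASS)
    return z, mass


z, mass = charge_and_mass(1541.9, 1413.2)
print(z, round(mass, 1))
```

Repeating the calculation for each adjacent pair and averaging the resulting masses gives a more robust estimate than any single pair.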
Baseline
Baseline correction
Convex Hull Method
convex
not convex
Convex Hull
• A convex hull (here, the lower one) is a chain of line segments through data points such that all the data points lie on or above the lines and their extensions.
How to calculate convex hull?
• Stack S contains all the data points that form the convex hull so far.
• Data point D[i] = (D[i].x, D[i].y).
Algorithm:
1. S.push(D[0]); S.push(D[1])
2. for i from 2 to n
   2.1 while S has at least two points and D[i], S.top(), S.secondtop() are concave
       2.1.1 S.pop();
   2.2 S.push(D[i]);
3. return S
S.top()
S.secondtop() D[i]
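The stack algorithm can be sketched in Python as follows. This is one possible reading of the slides, with assumptions stated: the data points are sorted by increasing x with distinct x values, "concave" is taken to mean that S.top() lies on or above the line from S.secondtop() to D[i], and the baseline is the piecewise-linear interpolation of the hull. The function names are illustrative.

```python
def cross(a, b, c):
    """Cross product of vectors a->b and a->c; positive means a left turn."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def lower_convex_hull(points):
    """points: list of (x, y) sorted by increasing x. Returns the lower hull."""
    stack = []
    for p in points:
        # Pop while the top of the stack lies on or above the line from the
        # second-from-top point to p (the "concave" case).
        while len(stack) >= 2 and cross(stack[-2], stack[-1], p) <= 0:
            stack.pop()
        stack.append(p)
    return stack


def subtract_baseline(points):
    """Subtract the interpolated lower hull from each intensity."""
    hull = lower_convex_hull(points)
    corrected = []
    j = 0  # index of the hull segment hull[j] -> hull[j + 1] covering x
    for x, y in points:
        while j < len(hull) - 2 and x > hull[j + 1][0]:
            j += 1
        (x0, y0), (x1, y1) = hull[j], hull[j + 1]
        baseline = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        corrected.append((x, y - baseline))
    return corrected


spectrum = [(0.0, 1.0), (1.0, 3.0), (2.0, 0.5), (3.0, 2.0), (4.0, 1.0)]
print(lower_convex_hull(spectrum))
print(subtract_baseline(spectrum))
```

Points on the hull end up with intensity zero after correction, and every other point keeps only its height above the baseline.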
Analyze the convex hull algorithm
• Correctness
  – The algorithm finishes.
  – The output is a convex hull.
  – The proof will be included in an assignment.
• Time complexity
  – O(n) time.
  – Proof: each point is added to the stack exactly once and removed from it at most once, so the total number of stack operations is O(n).
Summary of spectrum preprocessing
• Baseline correction
• Centroiding
• Charge recognition and deconvolution
• Noise removal