Chapter 3: Amino Acids, Peptides, and Proteins
Dr. ClowerChem 4202
Outline (part I)
Sections 3.1 and 3.2
Amino Acids
Chemical structure
Acid-base properties
Stereochemistry
Non-standard amino acids
Formation of Peptide Bonds
The building blocks of proteins Also used as single molecules in biochemical
pathways 20 standard amino acids (-amino acids) Two functional groups:
carboxylic acid group amino group on the alpha () carbon
Have different side groups (R) Properties dictate behavior of AAs
Amino Acids
R side chain
| H2N— C —COOH
|H
Both the –NH2 and the –COOH groups in an amino acid
undergo ionization in water.
At physiological pH (7.4), a zwitterion forms Both + and – charges
Overall neutral
Amphoteric Amino group is protonated
Carboxyl group is deprotonated
Soluble in polar solvents due to ionic character
Structure of R also influence solubility
Zwitterions
Classification of Amino Acids
Classify by structure of R Nonpolar
Polar
Aromatic
Acidic
Basic
Nonpolar Amino Acids
Hydrophobic, neutral, aliphatic
Polar Amino Acids
Hydrophilic, neutral, typically H-bond
Disulfide Bonds
Formed from oxidation of cysteine residues
Aromatic Amino Acids
Bulky, neutral, polarity depend on R
Acidic and Basic Amino Acids
Acidic R group = carboxylic
acid Donates H+ Negatively charged
Basic R group = amine Accepts H+
Positively charged His ionizes at pH 6.0
Remember H3PO4 (multiple pKa’s)
AAs also have multiple pKa’s due to multiple ionizable
groups
Acid-base Properties
pK1 ~ 2.2(protonated below 2.2)
pK2 ~ 9.4(NH3
+ below 9.4)
pKR
(when applicable)
Table 3-1
Note 3-letter and 1-letter
abbreviations
Amino acid organization chart
pH and Ionization
Consider glycine:
Note that the uncharged species never forms
H3N CH C
H
OH
O
H3N CH C
H
O
O
H2N CH C
H
O
O
OH-
H3O+
OH-
H3O+
Glycine ion at acidic pH
(charge = 1+)
Zwitterion of glycine (charge = 0)
Glycine ion at basic pH
(charge = 1-)
Titration of Glycine
pK1
[cation] = [zwitterion]
pK2
[zwitterion] = [anion]
First equivalence point Zwitterion Molecule has no net charge pH = pI (Isoelectric point)
pI = average of pKa’s = ½ (pK1 + pK2)
pIglycine = ½ (2.34 + 9.60) = 5.97
Animation
pI of Lysine
For AAs with 3 pKa’s, pI = average of two relevant pKa values
Consider lysine (pK1 = 2.18, pK2 = 8.95, pKR = 10.53):
Which species is the isoelectric form?
So, pI = ½ (pK2 + pKR)
= ½ (8.95 + 10.53) = 9.74
Note: pKR is not always higher than pK2 (see Table 3-1 and Fig. 3-12)
H3N CH C
CH2CH2CH2CH2NH3+
OH
O
H3N CH C
CH2CH2CH2CH2NH3+
O
O
H2N CH C
CH2CH2CH2CH2NH3+
O
O
H2N CH C
CH2CH2CH2CH2NH2
O
O
pK1 pK2 pKR
Learning Check
Would the following ions of serine exist at a pH above, below, or at pI?
H3N CH C
CH2
O
O
OH
H3N CH C
CH2
OH
O
OH
H2N CH C
CH2
O
O
OH
Stereochemistry of AAs
All amino acids (except glycine) are optically active
Fischer projections:
D and L Configurations
d = dextrorotatory l = levorotatory D, L = relative to glyceraldehyde
Importance of Stereochemistry
All AA’s found in proteins are L geometry
S enantiomer for all except cysteine
D-AA’s are found in bacteria
Geometry of proteins affects reactivity (e.g
binding of substrates in enzymes)
Thalidomide
Non-standard Amino Acids
AA derivatives Modification of AA after
protein synthesized
Terminal residues or R
groups
Addition of small alkyl
group, hydroxyl, etc.
D-AAs Bacteria
CHEM 2412 Review
Carboxylic acid + amine = ?
Structure of amino acid
R C OH
O
+ H2N R R C NH
O
+ H2Oheat
R
H2N C CO2H
H
R
The Peptide Bond
Chain of amino acids = peptide or protein Amino acid residues connected by peptide bonds Residue = AA – H2O
The Peptide Bond
Non-basic and non-acidic in pH 2-12 range due to delocalization of N lone pair
Amide linkage is planar, NH and CO are anti
C
O
N
H
O
N
HRigid
restricted rotation
Polypeptides
Linear polymers (no branches) AA monomers linked head to tail Terminal residues:
Free amino group (N-terminus) Draw on left
Free carboxylate group (C-terminus) Draw on right
pKa values of AAs in polypeptides differ slightly from pKa values of free AAs
Naming Peptides
Name from the free amine (NH3+)
Use -yl endings for the names of the amino acids The last amino acid with the free carboxyl group (COO-)
uses its amino acid name
(GA)
Amino Acid Ambiguity
Glutamate (Glu/E) vs. Glutamine (Gln/Q) Aspartate (Asp/D) vs. Asparagine (Asn/N) Converted via hydrolysis Use generic abbreviations for either
Glx/Z Asx/B
X = undetermined or nonstandard AA
Write the name of the following tetrapeptide using amino acid names and three-letter abbreviations.
Learning Check
CH CH3
CH3
H3N CH C
O
N
H
CH C
O
N
H
CH C
O
N
H
CH C O-OCH CH2
CH2
S
CH3
CH2
SH
CH3
Learning Check
Draw the structural formula of each of the following peptides.A. Methionylaspartic acid
B. Alanyltryptophan
C.Methionylglutaminyllysine
D.Histidylglycylglutamylalanine
Outline (part II)
Sections 3.3 and 3.4 Separation and purification Protein sequencing
Analysis of primary structure
Protein size
In general, proteins contain > 40 residues Minimum needed to fold into tertiary structure
Usually 100-1000 residues Percent of each AA varies Proteins separated based on differences in
size and composition Proteins must be pure to analyze, determine
structure/function
Factors to control
pH Keep pH stable to avoid denaturation or chemical degradation
Presence of enzymes May affect structure (e.g. proteases/peptidase)
Temperature Control denaturation (0-4°C) Control activity of enzymes
Thiol groups Reactive Add protecting group to prevent formation of new disulfide bonds
Exposure to air, water Denature or oxidize Store under N2 or Ar Keep concentration high
General Separation Procedure
Detect/quantitate protein (assay) Determine a source (tissue) Extract protein
Suspend cell source in buffer Homogenize
Break into fine pieces Cells disrupted Soluble contents mix with buffer Centrifuge to separate soluble and insoluble
Separate protein of interest Based on solubility, size, charge, or binding ability
Solubility
Selectively precipitate protein Manipulate
Concentration of salts Solvent pH Temperature
Concentration of salts
Adding small amount of salt increases [Protein]
Salt shields proteins from each other, less
precipitation from aggregation Salting-in
Salting out Continue to increase [salt] decreases [protein]
Different proteins salt out at different [salt]
Other Solubility Methods
Solvent Similar theory to salting-out Add organic solvent (acetone, ethanol) to interact with
water Decrease solvating power
pH Proteins are least soluble at pI Isoelectric precipitation
Temperature Solubility is temperature dependent
Chromatography
Mobile phase Mixture dissolved in liquid or
solid
Stationary phase Porous solid matrix
Components of mixture
pass through the column
at different rates based on
properties
Types of Chromatography
Paper Stationary phase = filter paper
Same theory as thin layer chromatography (TLC)
Components separate based on polarity
High-performance liquid (HPLC) Stationary phase = small uniform particles, large surface area
Adapt to separate based on polarity, size, etc.
Hydrophobic Interaction Hydrophobic groups on matrix
Attract hydrophobic portions of protein
Types of Chromatography
Ion-exchange Stationary phase =
chemically modified to
include charged groups
Separate based on net
charge of proteins
Anion exchangers Cation groups (protonated
amines) bind anions
Cation exchangers Anion groups (carboxylates)
bind cations
Types of Chromatography
Gel-filtration Size/molecular exclusion
chromatography Stationary phase = gels
with pores of particular size
Molecules separate based on size
Small molecules caught in pores
Large molecules pass through
Types of Chromatography
Affinity Matrix chemically
altered to include a molecule designed to bind a particular protein
Other proteins pass through
UV-Vis Spectroscopy
Absorbance used to
monitor protein
concentrations of each
fraction
= 280 nm Absorbance of aromatic
side groups
Electrophoresis
Migration of ions in an electric field
Electrophoretic mobility (rate of movement) function of
charge, size, voltage, pH
The positively charged proteins move towards the negative
electrode (cathode)
The negatively charged proteins move toward the positive
electrode (anode)
A protein at its pI (neutral) will not migrate in either direction
Variety of supports (gel, paper, starch, solutions)
Protein Sequencing
Determination of primary structure Need to know to determine 3D structure Gives insight into protein function Approach:
Denature protein Break protein into small segments Determine sequences of segments
Animation
End group analysis
Identify number of terminal AAs Number of chains/subunits
Identify specific AA
Dansyl chloride/dabsyl chloride Sanger method (FDNB) Edman degradation (PITC)
Bovine insulin
Dansyl chloride
Reacts with primary amines N-terminus
Yields dansylated polypeptides Dansylated polypeptides
hydrolyzed to liberate the modified dansyl AA
Dansyl AA can be identified by chromatography or spectroscopy (yellow fluorescence)
Useful method when protein fragmented into shorter polypeptides
N
SO2
Cl
+
HN CH
R
C
O
N
SO2
H2N CH
R
C
O
HCl +H3O+
HN CH
R
C
O
OH
N
SO2
+ other free AAs
Dabsyl chloride and FDNB
Same result as dansyl chloride
Dabsyl chloride
1-Fluoro-2,4-dinitrobenzene (FDNB) Sanger method
SN
NN O
O
Cl
Edman degradation
Phenylisothiocyanate (PITC) Reacts with N-terminal AA to produce a phenylthiocarbamyl (PTC) Treat with TFAA (solvent/catalyst) to cleave N-terminal residue Does not hydrolyze other AAs Treatment with dilute acid makes more stable organic compound
Identify using NMR, HPLC, etc. Sequenator (entire process for proteins < 100 residues)
Fragmenting Proteins
Formation of smaller segments to assist with
sequencing
Process: Cleave protein into specific fragments
Chemically or enzymatically
Break disulfide bonds
Purify fragments
Sequence fragments
Determine order of fragments and disulfide bonds
Cleaving Disulfide Bonds
Oxidize with performic acid
Cys residues form cysteic acid
Acid can oxidize other
residues, so not ideal
H C
O
O OH
Cleaving Disulfide Bonds
Reduce by mercaptans (-SH) 2-Mercaptoethanol
HSCH2CH2OH
Dithiothreitol (DTT)
HSCH2CH(OH)CH(OH)CH2SH
Reform cysteine residues
Oxidize thiol groups with
iodoacetete (ICH2CO2-) to
prevent reformation of disulfide
bonds
Hydrolysis
Cleaves all peptide bonds Achieved by
Enzyme Acid Base
After cleavage: Identify using chromatography Quantify using absorbance or fluorescence
Disadvantages Doesn’t give exact sequence, only AAs present Acid and base can degrade/modify other residues Enzymes (which are proteins) can also cleave and affect results
Enzymatic and Chemical Cleavage
Enzymatic Enzymes used to break
protein into smaller peptides
Endopeptidases Catalyze hydrolysis of
internal peptide bonds
Chemical Chemical reagents used to
break up polypeptides Cyanogen bromide (BrCN)
An example
Fundamentals of Protein Fundamentals of Protein StructureStructure
Our life is maintained by Our life is maintained by molecular network systemsmolecular network systems
Molecular network system in a cell
(From ExPASy Biochemical Pathways; http://www.expasy.org/cgi-bin/show_thumbnails.pl?2)
Proteins play key roles in a Proteins play key roles in a living systemliving system
Three examples of protein functions
Catalysis:Almost all chemical reactions in a living cell are catalyzed by protein enzymes.
Transport:Some proteins transports various substances, such as oxygen, ions, and so on.
Information transfer:For example, hormones.
Alcohol dehydrogenas
e oxidizes alcohols to
aldehydes or ketones
Haemoglobin carries oxygen
Insulin controls the
amount of sugar in the
blood
Amino acid: Basic unit of Amino acid: Basic unit of proteinprotein
COO-NH3+ C
R
HAn amino
acid
Different side chains, R,
determin the properties of 20
amino acids.
Amino group Carboxylic acid group
20 20 Amino acidsAmino acids
Glycine (G)
Glutamic acid (E)Asparatic acid (D)
Methionine (M)
Threonine (T)
Serine (S)
Glutamine (Q)
Asparagine (N)
Tryptophan (W)Phenylalanine (F)
Cysteine (C)
Proline (P)
Leucine (L)Isoleucine (I)Valine (V)
Alanine (A)
Histidine (H)Lysine (K)
Tyrosine (Y)
Arginine (R)
White: Hydrophobic, Green: Hydrophilic, Red: Acidic, Blue: Basic
Proteins are linear polymers of Proteins are linear polymers of amino acidsamino acids
R1
NH3+
C CO
H
R2
NH C CO
H
R3
NH C CO
H
R2
NH3+
C COOー
H+
R1
NH3+
C COOー
H+
H2OH2O
Peptide bond
Peptide bond
The amino acid sequence is called
as primary structure A AF
NGG
S TS
DK
A carboxylic acid condenses with an
amino group with the release of a water
Amino acid sequence is encoded Amino acid sequence is encoded by DNA base sequence in a geneby DNA base sequence in a gene
・CGCGAATTCGCG・
・GCGCTTAAGCGC・
DNA molecule
=
DNA base sequence
Amino acid sequence is encoded Amino acid sequence is encoded by DNA base sequence in a geneby DNA base sequence in a gene
Second letterT C A G
Firs
t lette
r
T
TTTPhe
TCT
Ser
TATTyr
TGTCys
T
Th
ird le
tter
TTC TCC TAC TGC CTTA
LeuTCA TAA
StopTGA Stop A
TTG TCG TAG TGG Trp G
C
CTT
Leu
CCT
Pro
CATHis
CGT
Arg
TCTC CCC CAC CGC CCTA CCA CAA
GlnCGA A
CTG CCG CAG CGG G
A
ATTIle
ACT
Thr
AATAsn
AGTSer
TATC ACC AAC AGC CATA ACA AAA
LysAGA
ArgA
ATG Met ACG AAG AGG G
G
GTT
Val
GCT
Ala
GATAsp
GGT
Gly
TGTC GCC GAC GGC CGTA GCA GAA
GluGGA A
GTG GCG GAG GGG G
Gene is protein’s blueprint, Gene is protein’s blueprint, genome is life’s blueprint genome is life’s blueprint
Gene
GenomeDNA
Protein
Gene GeneGene
Gene
GeneGeneGeneGene
GeneGeneGeneGene
GeneGene
Protein Protein
ProteinProtein
Protein
ProteinProtein
Protein
Protein
Protein
Protein
ProteinProtein
Protein
Gene is protein’s blueprint, Gene is protein’s blueprint, genome is life’s blueprint genome is life’s blueprint
Genome
Gene GeneGene
Gene
GeneGeneGeneGene
GeneGeneGeneGene
GeneGene
Protein Protein
ProteinProtein
Protein
ProteinProtein
Protein
Protein
Protein
Protein
ProteinProtein
Protein
Glycolysis network
3 billion base pair => 6 G letters &
1 letter => 1 byte
The whole genome can be recorded in just 10 CD-ROMs!
In 2003, Human genome In 2003, Human genome sequence was deciphered!sequence was deciphered! Genome is the complete set of genes of a living thing.
In 2003, the human genome sequencing was completed. The human genome contains about 3 billion base pairs. The number of genes is estimated to be between 20,000 to
25,000. The difference between the genome of human and that of
chimpanzee is only 1.23%!
Each Protein has a unique Each Protein has a unique structurestructure
Amino acid sequence
NLKTEWPELVGKSVEEAKKVILQDKPEAQIIVLPVGTIVTMEYRIDRVRLFVDKLDNIAE
VPRVGFolding!
Basic structural units of proteins: Basic structural units of proteins: Secondary structureSecondary structure
α-helix β-sheet
Secondary structures, α-helix and β-sheet, have
regular hydrogen-bonding patterns.
Three-dimensional structure of Three-dimensional structure of proteinsproteins
Tertiary structure
Quaternary structure
Hierarchical nature of protein Hierarchical nature of protein structurestructure
Primary structure (Amino acid sequence)↓
Secondary structure ( α-helix, β-sheet )↓
Tertiary structure ( Three-dimensional structure formed by assembly of secondary structures )
↓Quaternary structure ( Structure formed by more
than one polypeptide chains )
Close relationship between Close relationship between protein structure and its functionprotein structure and its function
enzyme
A
B
A
Binding to A
Digestion of A!
enzyme
Matching the shape
to A
Hormone receptor AntibodyExample of enzyme reaction
enzyme
substrates
Protein structure prediction has Protein structure prediction has remained elusive over half a remained elusive over half a centurycentury“Can we predict a protein structure from
its amino acid sequence?”
Now, impossible!
SummarySummary Proteins are key players in our living systems. Proteins are polymers consisting of 20 kinds of amino
acids. Each protein folds into a unique three-dimensional structure
defined by its amino acid sequence. Protein structure has a hierarchical nature. Protein structure is closely related to its function. Protein structure prediction is a grand challenge of
computational biology.