Part I : Introduction to Protein Structure
A/P Shoba Ranganathan, Kong Lesheng
National University of Singapore
Overview
• Why protein structure?
• The basics of protein
• Levels of protein structure
• Structural classification of protein structure
Why protein structure?
• In the factory of living cells, proteins are the workers, performing a variety of biological tasks.
• Each protein has a particular 3-D structure that determines its function: “Structure implies function”.
• Structure is more conserved than sequence.
• Protein structure is central to understanding protein function.
Sequence → Structure → Function
To understand functions, we need structures
α-conotoxin ImI and its three mutants
Rogers et al., 2000, JMB 304, 911
Anfinsen’s thermodynamic hypothesis
“The three-dimensional structure of a native protein in its normal physiological milieu (solvent, pH, ionic strength, presence of other components such as metal ions or prosthetic groups, temperature, etc.) is the one in which the Gibbs free energy of the whole system is lowest; that is, that the native conformation is determined by the totality of interatomic interactions and hence by the amino acid sequence, in a given environment.”
--- Anfinsen’s Nobel lecture, 1972
What drives protein folding?
• Hydrophobic effects
  • Hydrophobic residues tend to be buried inside.
  • Hydrophilic residues tend to be exposed to solvent.
• Hydrogen bonds help to stabilize the structure.
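The tendency of hydrophobic residues to be buried can be explored from sequence alone with a hydropathy profile. The sketch below is illustrative only: the slides do not name a scale, so the Kyte-Doolittle hydropathy values are an assumption brought in here, and the function names and the window size are hypothetical.

```python
# Kyte-Doolittle hydropathy values (an assumed scale, not from the slides).
# Positive values are hydrophobic, negative values are hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average hydropathy over the whole one-letter sequence."""
    values = [KYTE_DOOLITTLE[res] for res in sequence.upper()]
    return sum(values) / len(values)


def hydropathy_profile(sequence, window=5):
    """Mean hydropathy of every full window sliding along the sequence.

    Returns an empty list when the sequence is shorter than the window.
    """
    values = [KYTE_DOOLITTLE[res] for res in sequence.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

Windows with a high positive mean are candidates for buried segments, and windows with a strongly negative mean are candidates for solvent-exposed ones. This is a heuristic, not a structure prediction.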
The basics of protein
• Proteins have one or more polypeptide chains.
• Building blocks: 20 amino acids.
• Lengths range from 10 to 1000 residues.
• Proteins fold into a 3-D shape to perform biological functions.
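Common structure of an amino acid
[Figure: amino acid in its charged form. The backbone consists of the amino group (NH3+), Cα, and the carboxylate group (COO−); the side chain R is attached to Cα.]
• Side chain = H, CH3, … Side-chain atoms are labelled β, γ, δ, ε, ζ, …
• Cα is the chiral center.
• The figure marks the atoms lost during peptide bond formation.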
Aliphatic residues
Aromatic residues
Charged residues
Polar residues
The odd couple
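[Figure: the two atypical residues. One has side chain = H (glycine); the other has side-chain atoms Cβ, Cγ, Cδ drawn in a ring with Cα (proline).]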
The peptide bonds
Coplanar atoms
Backbone torsion angles
Ramachandran / phi-psi plot
[Figure: plot of ψ against φ with labelled regions: α-helix (right-handed), β-sheet, α-helix (left-handed).]
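A backbone torsion angle is the dihedral angle defined by four consecutive atoms. By the usual convention, φ is measured over C(i−1), N(i), Cα(i), C(i) and ψ over N(i), Cα(i), C(i), N(i+1). The following is a minimal sketch of the geometry in plain Python; the function names are hypothetical and coordinates are assumed to be (x, y, z) tuples in any consistent unit.

```python
import math


def sub(a, b):
    """Vector a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    """Scalar product of two 3-D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    """Vector product of two 3-D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion_angle(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond.

    0 degrees is cis (p0 and p3 eclipsed), +/-180 degrees is trans.
    """
    b1 = sub(p1, p0)
    b2 = sub(p2, p1)
    b3 = sub(p3, p2)
    n1 = cross(b1, b2)  # normal to the plane p0, p1, p2
    n2 = cross(b2, b3)  # normal to the plane p1, p2, p3
    y = dot(b2, cross(n1, n2))
    x = math.sqrt(dot(b2, b2)) * dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

Calling `torsion_angle` with the four atoms that define φ or ψ for each residue gives the coordinates of that residue on the phi-psi plot.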
Levels of protein structure
Primary structure
• The amino acid sequence of the polypeptide chains.
Secondary structure
• Local organization of the protein backbone: α-helix, β-strand (which assemble into β-sheets), turn and interconnecting loop.
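Because each secondary structure type occupies its own region of the phi-psi plot, a residue can be given a rough label from its (φ, ψ) pair. The sketch below assigns the nearest of three idealised region centres. The centre values are approximate textbook figures and are an assumption, not read from the slides; the function names are hypothetical, and real assignment methods also use hydrogen-bond patterns.

```python
# Approximate idealised (phi, psi) centres in degrees (assumed values).
REGION_CENTRES = {
    "alpha-helix (right-handed)": (-57.0, -47.0),
    "beta-sheet": (-120.0, 130.0),
    "alpha-helix (left-handed)": (57.0, 47.0),
}


def angular_difference(a, b):
    """Smallest absolute difference between two angles in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def nearest_region(phi, psi):
    """Name of the region centre closest to (phi, psi) on the periodic plot."""
    def distance(centre):
        d_phi = angular_difference(phi, centre[0])
        d_psi = angular_difference(psi, centre[1])
        return (d_phi ** 2 + d_psi ** 2) ** 0.5

    return min(REGION_CENTRES, key=lambda name: distance(REGION_CENTRES[name]))
```

The periodic difference matters because φ = 170° and φ = −170° are only 20° apart on the plot, not 340°.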
The α-helix
• First structure to be predicted (Pauling, Corey, Branson, 1951) and experimentally solved (Kendrew et al., 1958), in myoglobin.
• One of the most closely packed arrangements of residues.
• 3.6 residues per turn
• 5.4 Å per turn
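The two helix parameters above fix the rise and the rotation per residue: 5.4 Å / 3.6 = 1.5 Å and 360° / 3.6 = 100°. A short sketch of that arithmetic follows, assuming an ideal, straight α-helix; the constant and function names are hypothetical.

```python
RESIDUES_PER_TURN = 3.6      # from the slide
PITCH_ANGSTROM = 5.4         # helix advance per full turn, from the slide

# Derived quantities for an ideal helix.
RISE_PER_RESIDUE = PITCH_ANGSTROM / RESIDUES_PER_TURN   # about 1.5 Å
ROTATION_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN        # about 100 degrees


def helix_length(n_residues):
    """Approximate axial length in Å of an ideal helix of n residues."""
    return n_residues * RISE_PER_RESIDUE


def helix_turns(n_residues):
    """Number of turns made by an ideal helix of n residues."""
    return n_residues / RESIDUES_PER_TURN
```

For example, an 18-residue ideal helix makes 5 turns and spans roughly 27 Å along its axis.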
The β-sheet
• Backbone almost fully extended; loosely packed arrangement of residues.
Topologies of β-sheets
Tertiary structure
• Packing of the secondary structure elements into a compact spatial unit.
• “Fold” or domain: this is the level to which structure prediction is currently possible.
Quaternary structure
• Assembly of homomeric or heteromeric protein chains.
• Usually the functional unit of a protein, especially for enzymes.
Structure comparison facts
• Proteins adopt only a limited number of folds.
• Homologous sequences show very similar structures: variations are mainly in non-conserved regions.
• There are striking regularities in the way in which secondary structures are assembled (Levitt & Chothia, 1976).
Structural classification of protein structure
• There are two major databases for protein structural classification: SCOP and CATH.
• They have different classification hierarchies and domain definitions.
SCOP
• http://scop.mrc-lmb.cam.ac.uk/scop/
• Structural Classification Of Proteins database
• Classification is done manually
• All nodes are annotated
SCOP at the top of the hierarchy
The hierarchy in SCOP
Root
Class: 5 classes (All-α, All-β, α/β, α+β, multi-domain)
Fold: same major secondary structures and topological connections
Superfamily: probable common ancestry
Family: clear evolutionary relationship
Protein
CATH
• http://www.biochem.ucl.ac.uk/bsm/cath
• Class-Architecture-Topology-Homologous superfamily
• Manual classification at the Architecture level but automated at the Topology level
The hierarchy in CATH
Class: 3 classes (Mainly-α, Mainly-β, α-β)
Architecture: overall shape as determined by the orientations of secondary structures
Topology: both the overall shape and the connectivity of secondary structures
Homologous Superfamily: share a common ancestor
Sequence: classified based on sequence identity
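The first four CATH levels are commonly written as a dot-separated numeric identifier of the form C.A.T.H. The sketch below splits such an identifier into its levels and lists its ancestors in the hierarchy. The identifier string is used purely as an example, the mapping of class numbers 1 to 3 onto the three class names is an assumption not stated on the slides, and the type and function names are hypothetical.

```python
from typing import List, NamedTuple


class CathNode(NamedTuple):
    class_number: int
    architecture: int
    topology: int
    homologous_superfamily: int


# Assumed numbering of the three classes named on the slide.
CLASS_NAMES = {1: "Mainly-α", 2: "Mainly-β", 3: "α-β"}


def parse_cath_id(cath_id: str) -> CathNode:
    """Split a 'C.A.T.H' identifier into its four numeric levels."""
    parts = cath_id.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected four dot-separated numbers, got {cath_id!r}")
    return CathNode(*(int(p) for p in parts))


def lineage(cath_id: str) -> List[str]:
    """Identifiers of every level from Class down to Homologous Superfamily."""
    parse_cath_id(cath_id)  # validate the format first
    parts = cath_id.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]
```

Two domains that share the first three numbers have the same Topology (fold) but belong to different homologous superfamilies, which mirrors the top-down reading of the hierarchy above.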