The Structure and Function of Proteins Bioinformatics Ch 7.

Post on 28-Mar-2015



The many functions of proteins

• Mechanoenzymes: myosin, actin
• Rhodopsin: allows vision
• Globins: transport oxygen
• Antibodies: immune system
• Enzymes: pepsin, renin, carboxypeptidase A
• Receptors: transmit messages through membranes
• Vitellogenin: molecular velcro

– And hundreds of thousands more…

Complex Chemistry Tutorial

• Molecules are made of atoms!

• There is a lot of hydrogen out there!

• Atoms make a “preferred” number of covalent (strong) bonds
– C – 4
– N – 3
– O, S – 2

• Atoms will generally “pick up” enough hydrogens to “fill their valence capacity” in vivo.
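The valence rule above can be turned into simple arithmetic. The following is a minimal sketch, assuming neutral atoms and counting a double bond as two bonds; the function name `hydrogens_needed` is made up for illustration.

```python
# Preferred number of covalent bonds per element, as listed on the slide.
PREFERRED_BONDS = {"C": 4, "N": 3, "O": 2, "S": 2}

def hydrogens_needed(element: str, bonds_to_other_atoms: int) -> int:
    """Hydrogens a neutral atom would pick up to fill its remaining valence.

    bonds_to_other_atoms counts bond order (a double bond counts as 2).
    """
    remaining = PREFERRED_BONDS[element] - bonds_to_other_atoms
    if remaining < 0:
        raise ValueError("more bonds than the preferred valence allows")
    return remaining

# Alpha carbon: bonded to N, the carboxyl C, and the side chain.
print(hydrogens_needed("C", 3))  # 1
# Carbonyl oxygen (C=O): the double bond already fills its valence.
print(hydrogens_needed("O", 2))  # 0
# Amino nitrogen of a free amino acid: one bond to the alpha carbon.
print(hydrogens_needed("N", 1))  # 2
```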

• Molecules also “prefer” to have a neutral charge

Biochemistry

• In the context of a protein…
– Oxygen tends to exhibit a slight negative charge
– Nitrogen tends to exhibit a slight positive charge
– Carbon tends to remain neutral/uncharged

• Atoms can “share” a hydrogen atom, each making “part” of a covalent bond with the hydrogen
– Oxygen: H-bond donor or acceptor
– Nitrogen: H-bond donor
– Carbon: neither

Proteins are chains of amino acids

• Polymer – a molecule composed of repeating units

Amino acid composition

• Basic amino acid structure:
– The side chain, R, varies for each of the 20 amino acids
– Amino & carboxyl groups, plus the central (α) carbon, make the “backbone” of the amino acid

[Figure: general amino acid structure, labeling the amino group, carboxyl group, and side chain (R)]

The Peptide Bond

• Dehydration synthesis

• Repeating backbone: N–Cα–C–N–Cα–C

– Convention – start at amino terminus and proceed to carboxy terminus


Peptidyl polymers

• A few amino acids in a chain are called a polypeptide. A protein is usually composed of 50 to 400+ amino acids.

• Since part of the amino acid is lost during dehydration synthesis, we call the units of a protein amino acid residues.

[Figure labels: carbonyl carbon, amide nitrogen]

Side chain properties

• Recall that the electronegativity of carbon is at about the middle of the scale for light elements
– Carbon does not make hydrogen bonds with water easily: hydrophobic
– O and N are generally more likely than C to h-bond to water: hydrophilic

• We group the amino acids into three general groups:
– Hydrophobic
– Charged (positive/basic & negative/acidic)
– Polar

The Hydrophobic Amino Acids

Proline severely limits allowable conformations!

The Charged Amino Acids

The Polar Amino Acids

More Polar Amino Acids

And then there’s…

Planarity of the peptide bond

Phi (φ) – the angle of rotation about the N–Cα bond.

Psi (ψ) – the angle of rotation about the Cα–C bond.

The planar bond angles and bond lengths are fixed.

Phi and psi

• φ = ψ = 180° is the extended conformation

• φ: Cα to N–H; ψ: C=O to Cα

[Figure labels: Cα, C=O, N–H]
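Both angles are dihedral (torsion) angles defined by four consecutive backbone atoms: φ by C(i−1), N, Cα, C and ψ by N, Cα, C, N(i+1). The sketch below shows one common way to compute a signed dihedral from 3D coordinates; the helper names and the test points are made up for illustration and are not real protein coordinates.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for rotation about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Keep only the components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

def phi(c_prev, n, ca, c):
    """Rotation about the N-Calpha bond."""
    return dihedral(c_prev, n, ca, c)

def psi(n, ca, c, n_next):
    """Rotation about the Calpha-C bond."""
    return dihedral(n, ca, c, n_next)

# Synthetic points: the outer atoms lie on opposite sides of the central
# bond (a trans arrangement), so the magnitude of the angle is 180 degrees.
print(abs(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))))
```

With real data, the four points would come from the backbone atom coordinates of neighbouring residues in a structure file.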

The Ramachandran Plot

• G. N. Ramachandran – first calculations of sterically allowed regions of phi and psi

• Note the structural importance of glycine

[Figure panels: Observed (non-glycine); Observed (glycine); Calculated]

Primary & Secondary Structure

• Primary structure = the linear sequence of amino acids comprising a protein: AGVGTVPMTAYGNDIQYYGQVT…

• Secondary structure
– Regular patterns of hydrogen bonding in proteins result in two patterns that emerge in nearly every protein structure known: the α-helix and the β-sheet
– The location and direction of these periodic, repeating structures is known as the secondary structure of the protein

The alpha helix (φ ≈ ψ ≈ −60°)

Properties of the alpha helix (φ ≈ ψ ≈ −60°)

• Hydrogen bonds between C=O of residue n and NH of residue n+4

• 3.6 residues/turn

• 1.5 Å/residue rise

• 100°/residue turn
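These numbers are tied together by simple arithmetic: 3.6 residues per turn at 1.5 Å per residue gives a pitch of about 5.4 Å per turn, and 360° / 3.6 gives the 100° rotation per residue. A small sketch of that arithmetic follows; the function names are made up for illustration.

```python
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE = 1.5        # angstroms
TURN_PER_RESIDUE_DEG = 100.0  # degrees, as stated on the slide

# Pitch: distance travelled along the helix axis in one full turn.
PITCH = RESIDUES_PER_TURN * RISE_PER_RESIDUE  # about 5.4 angstroms

def helix_length(n_residues: int) -> float:
    """Approximate length in angstroms of an ideal helix of n residues."""
    return n_residues * RISE_PER_RESIDUE

def wheel_angle(i: int) -> float:
    """Angular position in degrees of residue i when viewed down the axis."""
    return (i * TURN_PER_RESIDUE_DEG) % 360.0

def hbond_pairs(n_residues: int):
    """(n, n+4) index pairs: C=O of residue n bonded to N-H of residue n+4."""
    return [(n, n + 4) for n in range(n_residues - 4)]

print(helix_length(10))  # 15.0
print(wheel_angle(4))    # 40.0
print(hbond_pairs(6))    # [(0, 4), (1, 5)]
```

Plotting `wheel_angle(i)` for each residue gives the helical-wheel view, in which hydrophobic residues clustering on one side suggest an amphipathic helix.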

Properties of α-helices

• 4 to 40+ residues in length
• Often amphipathic or “dual-natured”
– Half hydrophobic and half hydrophilic
– Mostly when surface-exposed
• If we examine many α-helices, we find trends…
– Helix formers: Ala, Glu, Leu, Met
– Helix breakers: Pro, Gly, Tyr, Ser

The beta strand (& sheet) (φ ≈ −135°, ψ ≈ +135°)

Properties of beta sheets

• Formed of stretches of 5-10 residues in extended conformation

• Pleated – each Cα a bit above or below the previous

• Parallel/antiparallel, contiguous/non-contiguous

Parallel and anti-parallel β-sheets

• Anti-parallel is slightly energetically favored

[Figure panels: Anti-parallel; Parallel]

Turns and Loops

• Secondary structure elements are connected by regions of turns and loops
• Turns – short regions of non-α, non-β conformation
• Loops – larger stretches with no secondary structure. Often disordered.
– “Random coil”
– Sequences vary much more than secondary structure regions

Levels of Protein Structure

• Secondary structure elements combine to form tertiary structure
• Quaternary structure occurs in multienzyme complexes
– Many proteins are active only as homodimers, homotetramers, etc.

Secondary Structure Prediction

• Based on backbone flexibility

• Various methods
– Statistical, neural networks, evolutionary computation.
– Conserved aligned sequences as input (degree calculated)
– PHD can get 70-75% accuracy

Chou-Fasman Parameters

Name           Abbrv  P(a)  P(b)  P(turn)  f(i)   f(i+1)  f(i+2)  f(i+3)
Alanine        A      142   83    66       0.06   0.076   0.035   0.058
Arginine       R      98    93    95       0.07   0.106   0.099   0.085
Aspartic Acid  D      101   54    146      0.147  0.11    0.179   0.081
Asparagine     N      67    89    156      0.161  0.083   0.191   0.091
Cysteine       C      70    119   119      0.149  0.05    0.117   0.128
Glutamic Acid  E      151   37    74       0.056  0.06    0.077   0.064
Glutamine      Q      111   110   98       0.074  0.098   0.037   0.098
Glycine        G      57    75    156      0.102  0.085   0.19    0.152
Histidine      H      100   87    95       0.14   0.047   0.093   0.054
Isoleucine     I      108   160   47       0.043  0.034   0.013   0.056
Leucine        L      121   130   59       0.061  0.025   0.036   0.07
Lysine         K      114   74    101      0.055  0.115   0.072   0.095
Methionine     M      145   105   60       0.068  0.082   0.014   0.055
Phenylalanine  F      113   138   60       0.059  0.041   0.065   0.065
Proline        P      57    55    152      0.102  0.301   0.034   0.068
Serine         S      77    75    143      0.12   0.139   0.125   0.106
Threonine      T      83    119   96       0.086  0.108   0.065   0.079
Tryptophan     W      108   137   96       0.077  0.013   0.064   0.167
Tyrosine       Y      69    147   114      0.082  0.065   0.114   0.125
Valine         V      106   170   50       0.062  0.048   0.028   0.053

Chou-Fasman Algorithm

• Identify α-helices
– Find 4 out of 6 contiguous amino acids that have P(a) > 100
– Extend the region until 4 amino acids with P(a) < 100 are found
– Compute ΣP(a) and ΣP(b); if the region is >5 residues and ΣP(a) > ΣP(b), identify as a helix

• Repeat for β-sheets [use P(b)]

• If an α and a β region overlap, the overlapping region is predicted according to ΣP(a) and ΣP(b)
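The helix step can be sketched in a few lines of Python. This is a simplified illustration, not a full Chou-Fasman implementation: it finds only the first nucleation window, extends only toward the carboxy terminus, and skips the β-sheet pass and overlap resolution. The function names are made up; the P(a) and P(b) values are copied from the parameter table.

```python
# residue: (P(a), P(b)), copied from the Chou-Fasman parameter table.
PARAMS = {
    "A": (142, 83), "R": (98, 93), "D": (101, 54), "N": (67, 89),
    "C": (70, 119), "E": (151, 37), "Q": (111, 110), "G": (57, 75),
    "H": (100, 87), "I": (108, 160), "L": (121, 130), "K": (114, 74),
    "M": (145, 105), "F": (113, 138), "P": (57, 55), "S": (77, 75),
    "T": (83, 119), "W": (108, 137), "Y": (69, 147), "V": (106, 170),
}

def p_alpha(residue):
    return PARAMS[residue][0]

def p_beta(residue):
    return PARAMS[residue][1]

def find_nucleus(seq):
    """Start of the first 6-residue window with at least 4 residues having P(a) > 100."""
    for i in range(len(seq) - 5):
        if sum(p_alpha(r) > 100 for r in seq[i:i + 6]) >= 4:
            return i
    return None

def extend_toward_c_terminus(seq, end):
    """Grow the region (end is exclusive) until the next 4 residues all have P(a) < 100."""
    while end < len(seq):
        nxt = seq[end:end + 4]
        if len(nxt) == 4 and all(p_alpha(r) < 100 for r in nxt):
            break
        end += 1
    return end

def first_helix(seq):
    """Return (region, sum P(a), sum P(b), is_helix) for the first candidate, or None."""
    start = find_nucleus(seq)
    if start is None:
        return None
    end = extend_toward_c_terminus(seq, start + 6)
    region = seq[start:end]
    sum_a = sum(p_alpha(r) for r in region)
    sum_b = sum(p_beta(r) for r in region)
    return region, sum_a, sum_b, len(region) > 5 and sum_a > sum_b

print(first_helix("CAENKLDHVRGPTCILFMTWYNDGP"))
# ('CAENKLDHV', 972, 843, True)
```

Note that when fewer than four residues remain, this sketch simply extends to the end of the sequence; that boundary behaviour is an assumption, since the slide does not specify it.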

Chou-Fasman, cont’d

• Identify hairpin turns:
– P(t) = f(i) of the residue * f(i+1) of the next residue * f(i+2) of the following residue * f(i+3) of the residue at position (i+3)
– Predict a hairpin turn starting at positions where:
  • P(t) > 0.000075
  • The average P(turn) for the four residues > 100
  • ΣP(a) < ΣP(turn) > ΣP(b) for the four residues

• Accuracy 60-65%
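The three turn conditions translate directly into code. The sketch below is a minimal illustration with made-up function names; the numbers are copied from the parameter table, and averages are used in place of sums for the third condition, which is equivalent over a fixed window of four residues.

```python
# residue: (P(a), P(b), P(turn), f(i), f(i+1), f(i+2), f(i+3)), from the table.
TABLE = {
    "A": (142, 83, 66, 0.06, 0.076, 0.035, 0.058),
    "R": (98, 93, 95, 0.07, 0.106, 0.099, 0.085),
    "D": (101, 54, 146, 0.147, 0.11, 0.179, 0.081),
    "N": (67, 89, 156, 0.161, 0.083, 0.191, 0.091),
    "C": (70, 119, 119, 0.149, 0.05, 0.117, 0.128),
    "E": (151, 37, 74, 0.056, 0.06, 0.077, 0.064),
    "Q": (111, 110, 98, 0.074, 0.098, 0.037, 0.098),
    "G": (57, 75, 156, 0.102, 0.085, 0.19, 0.152),
    "H": (100, 87, 95, 0.14, 0.047, 0.093, 0.054),
    "I": (108, 160, 47, 0.043, 0.034, 0.013, 0.056),
    "L": (121, 130, 59, 0.061, 0.025, 0.036, 0.07),
    "K": (114, 74, 101, 0.055, 0.115, 0.072, 0.095),
    "M": (145, 105, 60, 0.068, 0.082, 0.014, 0.055),
    "F": (113, 138, 60, 0.059, 0.041, 0.065, 0.065),
    "P": (57, 55, 152, 0.102, 0.301, 0.034, 0.068),
    "S": (77, 75, 143, 0.12, 0.139, 0.125, 0.106),
    "T": (83, 119, 96, 0.086, 0.108, 0.065, 0.079),
    "W": (108, 137, 96, 0.077, 0.013, 0.064, 0.167),
    "Y": (69, 147, 114, 0.082, 0.065, 0.114, 0.125),
    "V": (106, 170, 50, 0.062, 0.048, 0.028, 0.053),
}

def turn_scores(tetra):
    """Return (P(t), avg P(a), avg P(b), avg P(turn)) for a 4-residue window."""
    rows = [TABLE[r] for r in tetra]
    p_t = 1.0
    for k, row in enumerate(rows):
        p_t *= row[3 + k]  # f(i), f(i+1), f(i+2), f(i+3) in turn

    def avg(col):
        return sum(row[col] for row in rows) / 4

    return p_t, avg(0), avg(1), avg(2)

def is_turn(tetra):
    p_t, avg_a, avg_b, avg_turn = turn_scores(tetra)
    return p_t > 0.000075 and avg_turn > 100 and avg_a < avg_turn > avg_b

def predict_turns(seq):
    """Start positions (0-based) of every window that satisfies all three conditions."""
    return [i for i in range(len(seq) - 3) if is_turn(seq[i:i + 4])]

print(is_turn("VRGP"))  # True
```

For VRGP, `turn_scores` gives P(t) of roughly 0.000085 and averages of 79.5, 98.25, and 113.25 for P(a), P(b), and P(turn).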

Chou-Fasman Example

• CAENKLDHVRGPTCILFMTWYNDGP

• CAENKL – Potential helix (4 of 6 residues have P(a) > 100; C and N do not)
– Residues with P(a) < 100: RNCGPSTY
– Extend: When we reach RGPT, we must stop
– CAENKLDHV: ΣP(a) = 972, ΣP(b) = 843
– Declare alpha helix

• Identifying a hairpin turn
– VRGP: P(t) = 0.000085
– Average P(turn) = 113.25
– Avg P(a) = 79.5, Avg P(b) = 98.25

Protein Structure Examples

Views of a protein

[Figure panels: Wireframe; Ball and stick]

Views of a protein

[Figure panels: Spacefill; Cartoon]

CPK colors

Carbon = green, black, or grey

Nitrogen = blue

Oxygen = red

Sulfur = yellow

Hydrogen = white