Post on 06-Feb-2018
Protein-Protein Interactions:
Stability, Function and Landscape
Structural Aspects of Protein-Protein Interactions
Agenda
• Understand the importance of studying protein-protein interactions at the structural level
• Classify the various types of interactions
• Look at one structure-based method for predicting protein-protein interactions
Protein interaction
• Definition: Specific interactions between two or more proteins.
• Examples: Enzyme-inhibitor complex; antibody-antigen complex; receptor-ligand interactions; multiprotein complexes such as ribosomes or RNA polymerases.
Homocomplexes are usually permanent and optimized (e.g., the homodimer cytochrome c′ (1)) (Fig. 1a). Heterocomplexes can also have such properties, or they can be non-obligatory, being made and broken according to the environment or external factors and involve proteins that must also exist independently [e.g., the enzyme–inhibitor complex trypsin with the inhibitor from bitter gourd (2) (Fig. 1b) and the antibody–protein complex HYHEL-5 with lysozyme (3) (Fig. 1c)].
It is important to distinguish between the different types of complexes when analyzing the intermolecular interfaces that occur within them.
Characteristics
Classification: Protein-protein interactions can be arbitrarily classified based on the proteins involved (structural or functional groups) or based on their physical properties (weak and transient, “non-obligate” vs. strong and permanent). Protein interactions are usually mediated by defined domains, hence interactions can also be classified based on the underlying domains.
Universality: All of molecular biology is about protein-protein interactions (Alberts et al. 2002, Lodish et al. 2000). Protein-protein interactions affect all processes in a cell: structural proteins need to interact in order to shape organelles and the whole cell, molecular machines such as ribosomes or RNA polymerases are held together by protein-protein interactions, and the same is true for multi-subunit channels or receptors in membranes.
Specificity distinguishes such interactions from random collisions that happen by Brownian motion in the aqueous solutions inside and outside of cells. Note that many proteins are known to interact although it remains unclear whether certain interactions have any physiological relevance.
Number of interactions: It is estimated that even simple single-celled organisms such as yeast have their roughly 6000 proteins interact by at least 3 interactions per protein, i.e. a total of 20,000 interactions or more. By extrapolation, there may be on the order of ~100,000 interactions in the human body.
The protein-protein interaction network in yeast. An interaction map of the yeast proteome assembled from published interactions. The map contains 1,548 proteins (boxes) and 2,358 interactions (connecting lines).
Homo- and hetero-oligomeric complexes
Protein-protein interactions (PPIs) occur between identical or non-identical chains (i.e. homo- or hetero-oligomers). (A-B)
Oligomers of identical or homologous protein units can be organized in an isologous or heterologous way (Monod et al., 1965) with structural symmetry (Goodsell and Olson, 2000).
An isologous association involves the same surface on both monomers (e.g. Arc repressor and lysin; Figure 1A and C), related by a 2-fold symmetry axis.
In contrast to an isologous association that can only further oligomerize using a different interface (e.g. form a dimer of dimers with three 2-fold axes of symmetry), heterologous assemblies use different interfaces that, without a closed (cyclic) symmetry, can lead to infinite aggregation.
Non-obligate and obligate complexes
As well as composition, two different types of complexes can be distinguished on the basis of whether a complex is obligate or non-obligate. In an obligate PPI, the protomers are not found as stable structures on their own in vivo. Such complexes are generally also functionally obligate; for example, the Arc repressor dimer (Figure 1A) is essential for DNA binding. Many of the hetero-oligomeric structures in the Protein Data Bank involve non-obligate interactions of protomers that exist independently, such as intracellular signalling complexes (e.g. RhoA–RhoGAP; Figure 1D) and antibody–antigen, receptor–ligand and enzyme–inhibitor (e.g. thrombin–rodniin; Figure 1E) complexes. The components of such protein–protein complexes are often initially not co-localized and thus need to be independently stable. However, some homo-oligomers, which by definition are co-localized, can also form non-obligate assemblies (e.g. sperm lysin; Figure 1C).
Transient and permanent complexes
PPIs can also be distinguished based on the lifetime of the complex. In contrast to a permanent interaction that is usually very stable and thus only exists in its complexed form, a transient interaction associates and dissociates in vivo. We distinguish weak transient interactions that feature a dynamic oligomeric equilibrium in solution, where the interaction is broken and formed continuously (e.g. lysin; Figure 1C), and strong transient associations that require a molecular trigger to shift the oligomeric equilibrium. For example, the heterotrimeric G protein (Figure 1F) dissociates into the Gα and Gβγ subunits upon guanosine triphosphate (GTP) binding, but forms a stable trimer with guanosine diphosphate (GDP) bound. Structurally or functionally obligate interactions are usually permanent, whereas non-obligate interactions may be transient or permanent.
Types of protein-protein interactions (PPI)
• Obligate PPI: the protomers are not found as stable structures on their own in vivo
- Obligate homodimer: P22 Arc repressor (DNA-binding)
- Obligate heterodimer: Human cathepsin D (1LYB)
• Non-obligate PPI
- Non-obligate homodimer: Sperm lysin
- Non-obligate heterodimer: RhoA and RhoGAP signaling complex
Types of protein-protein interactions (PPI)
Dissociation constant: Kd = [A][B] / [AB]
• Obligate PPI: usually permanent; the protomers are not found as stable structures on their own in vivo
- Example: obligate heterodimer, human cathepsin D
• Non-obligate PPI
- Permanent (many enzyme-inhibitor complexes), Kd 10⁻⁷ to 10⁻¹³ M. Example: non-obligate permanent heterodimer, thrombin and rodniin inhibitor
- Transient, weak (electron transport complexes), Kd mM-µM. Example: non-obligate transient homodimer, sperm lysin (interaction is broken and formed continuously)
- Transient, intermediate (antibody-antigen, TCR-MHC-peptide, signal transduction PPI), Kd µM-nM
- Transient, strong (require a molecular trigger to shift the oligomeric equilibrium), Kd nM-fM. Example: bovine G protein dissociates into Gα and Gβγ subunits upon GTP binding, but forms a stable trimer with GDP bound
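To make the Kd ranges above more concrete, here is a minimal sketch that solves Kd = [A][B] / [AB] for the concentration of complex, given total concentrations of the two partners. It assumes a simple 1:1 equilibrium A + B ⇌ AB; the function names and the example concentrations are illustrative and not taken from the slides.

```python
import math


def complex_concentration(a_total: float, b_total: float, kd: float) -> float:
    """Equilibrium [AB] for A + B <-> AB, all values in the same units (e.g. M).

    Derived from Kd = [A][B]/[AB] with [A] = a_total - [AB] and
    [B] = b_total - [AB]; the physically meaningful root of the
    resulting quadratic is returned.
    """
    s = a_total + b_total + kd
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / 2.0


def fraction_bound(a_total: float, b_total: float, kd: float) -> float:
    """Fraction of A that is found in the complex at equilibrium."""
    return complex_concentration(a_total, b_total, kd) / a_total


if __name__ == "__main__":
    # Hypothetical example: both partners at 1 µM total concentration.
    for label, kd in [("weak, Kd = 1 mM", 1e-3),
                      ("intermediate, Kd = 1 µM", 1e-6),
                      ("strong, Kd = 1 nM", 1e-9)]:
        print(f"{label}: fraction bound = {fraction_bound(1e-6, 1e-6, kd):.3f}")
```

At equal micromolar concentrations, a weak (mM) interaction leaves almost all protein free, while a strong (nM) interaction leaves almost all of it in the complex, which is why the same pair of proteins can look "transient" or "permanent" depending on concentration.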
Structural features of protein-interaction sites
• The contact area between two proteins is almost always bigger than 1100 Å², with each of the interacting partners contributing at least 550 Å² of complementary surface.
• On average each partner loses about 800 Å² of solvent-accessible surface upon contact, contributed by some 20 amino acid residues of each partner, i.e. the average interface residue covers some 40 Å².
• NACCESS
• The accessible surface area (ASA) of the complexes is calculated using an implementation of the Lee and Richards (1971) algorithm developed by Hubbard (1992). With a probe sphere of radius 1.4 Å, the ASA is defined as the surface mapped out by the centre of the probe as if it were rolled around the van der Waals surface of the protein. The program is used to calculate the ASA of each protomer in the complex and then of the complete complex. The ASA shown in the results table is for a single subunit (chain 1 as designated by the user on the submission form; this subunit is indicated at the top of the table and coloured purple).
• Forces that mediate protein-protein interactions include electrostatic interactions, hydrogen bonds, the van der Waals attraction and hydrophobic effects.
• The average protein-protein interface is not less polar or more hydrophobic than the surface remaining in contact with the solvent. Water is usually excluded from the contact region.
• Non-obligate complexes tend to be more hydrophilic in comparison, as each component has to exist independently in the cell.
• It has been proposed that hydrophobic forces drive protein-protein interactions and hydrogen bonds and salt bridges confer specificity.
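The surface-area figures above follow from simple bookkeeping on ASA values: the area buried in an interface is the ASA of the free protomers minus the ASA of the complex. A minimal sketch of that arithmetic, assuming the three ASA values have already been computed (for example with a program such as NACCESS); the function names and the input numbers are made up for illustration.

```python
def buried_surface_area(asa_a_free: float, asa_b_free: float, asa_complex: float) -> float:
    """Total solvent-accessible area (in Å^2) lost when A and B form a complex."""
    return asa_a_free + asa_b_free - asa_complex


def mean_area_per_interface_residue(buried_per_partner: float, n_interface_residues: int) -> float:
    """Average buried area contributed by one interface residue of a partner."""
    return buried_per_partner / n_interface_residues


if __name__ == "__main__":
    # Hypothetical ASA values in Å^2 for two free protomers and their complex.
    total_buried = buried_surface_area(6000.0, 5000.0, 9400.0)
    per_partner = total_buried / 2.0  # average over the two partners
    print(total_buried, per_partner)
    # About 800 Å^2 per partner spread over some 20 residues gives about 40 Å^2 each.
    print(mean_area_per_interface_residue(per_partner, 20))
```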
Shape: Independent studies showed that 83-84% of interfaces are more or less flat. With few exceptions, the interfaces are approximately circular areas on the protein surface in both permanent and non-obligate complexes. Interfaces in permanent associations tend to be larger, less planar, more highly segmented (in terms of sequence), and closer packed than interfaces in non-obligate associations.
Complementarity: can be measured in terms of “fitting surface shape”. Interfaces in homodimers, enzyme-inhibitor complexes, and permanent heterocomplexes are the most complementary, whilst the antibody-antigen complexes and the non-obligate heterocomplexes are the least complementary.
Secondary structure: In one study the loop interactions contributed, on average, 40% of the interface contacts. In another study (involving 28 homodimers), 53% of the interface residues were α-helical, 22% β-sheet, and 12% αβ, with the rest being coils.
Amino acid composition: Interfaces have been shown to be more hydrophobic than the exterior but less hydrophobic than the interior of a protein. In one study, 47% of interface residues were hydrophobic, 31% polar and 22% charged. Permanent complexes have interfaces that contain hydrophobic residues, whilst the interfaces in non-obligate complexes favour the more polar residues. Site-directed mutagenesis showed that in many cases a large majority (i.e. > 50%) of interface residues can be mutated to alanine with little effect on Kd: i.e. the functional epitope is a subset of the structural epitope.
Clinical relevance and applications of protein-protein interaction analysis
Biologically active proteins such as peptide hormones or antibodies act by interacting with other proteins such as receptors or antigens, respectively. Knowing their interaction sites allows the modification of the activity of such proteins or changing their specificity. In addition, small molecules may be designed that block interactions such as the binding of virus coat proteins to their cellular receptors, thereby blocking infection. Proteins and their interactions are therefore potential drug targets. Sometimes, protein-protein interactions are disadvantageous, such as in insulin that tends to form dimers and hexamers which are less active than monomers. Genetically engineered insulin molecules retain biological activity without oligomerizing.
What Is the Preferred Way for Proteins to Interact?
• An ultimate goal in molecular and cellular biology is to predict the preferred mode of protein associations
- Similar protein structures can associate in different ways
- Different protein structures can associate in similar ways
Binding Is Still Not Entirely Understood!
• We usually observe one or two interaction sites; however, a large portion of the surface is probably involved in binding
• Some associations are stable; others are low affinity
• Binding reactions are often cooperative events
• Binding strength is condition-dependent
Possible reasons:
Interfaces Are Variable
• Different relative contributions of the hydrophobic effect versus electrostatic interactions
• Wide range of motifs, with no prevailing architectures
A Dataset of Protein-Protein Interfaces
• A nonredundant dataset provides diversity
• The clusters allow studies of
- interface structures vs function
- residue conservation
Definition Of Interfaces:
• An interface is the region between two polypeptide chains not covalently linked
• Residue selection is based on how close this residue is to residues of the second chain. If two residues (one from each chain) are in contact, they are interacting residues
• Residues in the vicinity of interacting residues are nearby residues. They provide the structural scaffold of the interfaces
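The definition above can be sketched as a distance test between the atoms of two chains. The slides do not give the distance criteria, so the two cutoffs below (4.0 Å for a contact, 6.0 Å for "nearby") are placeholder assumptions, as are the function names and the toy coordinates; this is an illustration of the idea, not the authors' procedure.

```python
import math
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float, float]
Chain = Dict[int, List[Point]]  # residue number -> atom coordinates


def interacting_residues(chain_a: Chain, chain_b: Chain,
                         contact_cutoff: float = 4.0) -> Tuple[Set[int], Set[int]]:
    """Residues (one set per chain) with any atom pair within contact_cutoff Å."""
    hits_a: Set[int] = set()
    hits_b: Set[int] = set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            if any(math.dist(p, q) <= contact_cutoff for p in atoms_a for q in atoms_b):
                hits_a.add(res_a)
                hits_b.add(res_b)
    return hits_a, hits_b


def nearby_residues(chain: Chain, interacting: Set[int],
                    nearby_cutoff: float = 6.0) -> Set[int]:
    """Non-interacting residues of the same chain close to an interacting residue."""
    nearby: Set[int] = set()
    for res, atoms in chain.items():
        if res in interacting:
            continue
        if any(math.dist(p, q) <= nearby_cutoff
               for other in interacting for p in atoms for q in chain[other]):
            nearby.add(res)
    return nearby


# Toy coordinates (one atom per residue) purely for illustration.
chain_a: Chain = {1: [(0.0, 0.0, 0.0)], 2: [(5.0, 0.0, 0.0)], 3: [(20.0, 0.0, 0.0)]}
chain_b: Chain = {10: [(0.0, 3.0, 0.0)], 11: [(0.0, 30.0, 0.0)]}

contacts_a, contacts_b = interacting_residues(chain_a, chain_b)
scaffold_a = nearby_residues(chain_a, contacts_a)
scaffold_b = nearby_residues(chain_b, contacts_b)
```

The all-against-all loop is quadratic in the number of atoms; for real structures a spatial index (grid or k-d tree) would normally replace it.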
A protein complex forming an interface
Magenta: Interacting residues
Cyan: Nearby residues
Generation of the Dataset of Interfaces
• We started the generation of the dataset by extracting the interfaces between chains from the PDB coordinates
• On July 18, 2002, there were 18,687 entries in the PDB, which included 35,112 single chains (counting all individual chains in dimers, trimers and so on). The dataset of interfaces contains 21,686 two-chain interfaces
An Interface Between Two Chains
Interface Composition: Example
The interface between the two chains: In green, 'nearby' residues and in blue, contact residues in chain A. In red, 'nearby' residues and in magenta, contact residues in chain B.
A magnification of the interface: Balls depict C-alpha atoms. The numbers refer to the residue positions. Green and blue are nearby and contact C-alphas in chain A. Red and magenta are nearby and contact C-alphas in chain B.
Residue Order Independence
[Figure: two interfaces built from residues A, B, C, D and E]
Similar arrangement in space; different sequential order
Representation of Proteins As Sets of Points in the Three Dimensional Space
Each ball is a C-alpha
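As an illustration of this representation, the sketch below reduces PDB-format coordinate lines to one C-alpha point per residue. It relies on the fixed-column layout of PDB ATOM records (atom name in columns 13-16, chain ID in column 22, residue number in columns 23-26, x/y/z in columns 31-54); the function name and the three example records are invented for the demonstration.

```python
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float, float]
ResidueKey = Tuple[str, int]  # (chain ID, residue number)


def ca_points(pdb_lines: Iterable[str]) -> Dict[ResidueKey, Point]:
    """Map each residue to the coordinates of its C-alpha atom."""
    points: Dict[ResidueKey, Point] = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, TER, REMARK, ...
        if line[12:16].strip() != "CA":
            continue  # keep only C-alpha atoms
        key = (line[21], int(line[22:26]))
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        points.setdefault(key, xyz)  # keep the first alternate location only
    return points


# Made-up records in PDB column layout.
EXAMPLE = [
    "ATOM      1  N   MET A   1      37.000  12.000   5.000",
    "ATOM      2  CA  MET A   1      38.428  13.104   6.364",
    "ATOM      9  CA  GLY B   7      40.000  13.104   6.364",
]

points = ca_points(EXAMPLE)
```

Because the result is keyed by chain and residue number rather than stored as an ordered list, later comparisons can treat the protein as an unordered set of points, in line with the residue order independence noted earlier.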
Hot Spots In The Interfaces
• Experimentally, a hot spot is a residue that, when mutated to alanine, gives rise to a distinct drop in the binding constant (tenfold or more)
• All data are deposited in http://www.asedb.org
DeLano, W.L., Unraveling hot spots in binding interfaces: progress and challenges. Curr. Opin. Struct. Biol. 2002
Computationally, Hot Spots Distinguish Between Binding Sites and Exposed Protein Surfaces
(B. Ma, T. Elkayam, H. Wolfson, R. Nussinov, PNAS | May 13, 2003 | vol. 100 | no. 10 | 5772-5777)
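The tenfold threshold in the experimental definition of a hot spot can be translated into a change in binding free energy through the standard relation ΔΔG = RT ln(Kd,mutant / Kd,wild-type). A small sketch of that conversion, assuming a temperature of 298.15 K; the function name is illustrative.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_binding(kd_mutant: float, kd_wildtype: float, temperature_k: float = 298.15) -> float:
    """Change in binding free energy (kcal/mol) caused by a mutation.

    Positive values mean the mutant binds more weakly than the wild type.
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_mutant / kd_wildtype)


if __name__ == "__main__":
    # A tenfold loss of affinity is worth roughly 1.4 kcal/mol at 25 °C.
    print(f"{ddg_binding(10.0, 1.0):.2f} kcal/mol")
```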
Multiple Structure Alignment (MUSTA)
[Figure labels: Interface; Exposed surface; Conserved hot spots]
• Align all structures in each cluster
• Find structurally conserved residues (hot spots)
Clustering
• Clustering is iterative
• At each iteration strict criteria are used
• At the end of the first cycle, the number of interface clusters decreased from 21,686 to 16,446
• Members in each cluster share a connectivity score of at least 0.9 with at least 90% sequence identity
• All have exactly the same number of interface residues on their interfaces

The parameters used during the clustering of the interfaces are:

Cycle | Number of interfaces | Minimal connectivity score | Minimal % amino acid identity | Maximal amino acid size difference between interfaces
A | 21686 → 16446 | 0.9 | 90 | 0
B | 16446 → 9637 | 0.9 | 80 | 3
C | 9637 → 6647 | 0.8 | 50 | 10
D | 6647 → 5332 | 0.7 | 25 | 20
E | 5332 → 4429 | 0.6 | 10 | 40
F | 4429 → 3799 | 0.5 | 0 | 50
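The iterative scheme can be pictured as repeated greedy merging with thresholds that are relaxed from cycle A to cycle F. The sketch below is only a generic illustration: the slides do not define the connectivity score or the merging procedure, so the `similar_at` callback, the greedy strategy and the toy data are assumptions, and the schedule values are the cycle parameters as read from the clustering parameters above.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")

# (cycle, minimal connectivity score, minimal % identity, maximal size difference)
SCHEDULE = [
    ("A", 0.9, 90, 0),
    ("B", 0.9, 80, 3),
    ("C", 0.8, 50, 10),
    ("D", 0.7, 25, 20),
    ("E", 0.6, 10, 40),
    ("F", 0.5, 0, 50),
]


def greedy_cycle(items: Sequence[T], is_similar: Callable[[T, T], bool]) -> List[T]:
    """Keep an item as a new representative only if no kept item is similar to it."""
    kept: List[T] = []
    for item in items:
        if not any(is_similar(item, rep) for rep in kept):
            kept.append(item)
    return kept


def iterative_clustering(
    items: Sequence[T],
    similar_at: Callable[[T, T, float, float, int], bool],
) -> Tuple[List[T], List[int]]:
    """Run one greedy pass per cycle; return final representatives and counts."""
    reps = list(items)
    counts = [len(reps)]
    for _cycle, min_conn, min_ident, max_size_diff in SCHEDULE:
        reps = greedy_cycle(
            reps, lambda a, b: similar_at(a, b, min_conn, min_ident, max_size_diff)
        )
        counts.append(len(reps))
    return reps, counts


# Toy stand-in: each "interface" is just its residue count, and similarity
# only checks the size-difference criterion (hypothetical, for demonstration).
def toy_similar(a: int, b: int, _conn: float, _ident: float, max_size_diff: int) -> bool:
    return abs(a - b) <= max_size_diff


final_reps, sizes_per_cycle = iterative_clustering([100, 100, 102, 108, 125, 160], toy_similar)
```

In a real pipeline `similar_at` would combine a structural comparison, a sequence identity check and the size criterion; here it is left as a parameter precisely because those details are not given.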
Further Filtering
• Each cluster should at least have 5 members
• None of the members should share a sequence similarity score of 50% or higher
(Sequence alignments are done with CLUSTALW)
Cluster Categories
• This filtering reduced the number of clusters from 3799 to 103
- Library construction carried out through pair-wise comparison
• Based on multiple structure alignment, the clusters are divided into two categories
1. Category I: interface clusters which share only ONE similar side. These clusters allow us to address the problem of how a given binding site can bind somewhat different protein surfaces
Cluster Category II
2. Category II: interface clusters which share TWO similar sides
- Type I: Clusters with similar interfaces and similar functions
- Type II: Clusters with similar interfaces but dissimilar functions
Sample List of Some of the Two-chain Interface Clusters
The dataset contains functional dimers, and others such as receptor/ligands, antibody/antigens, enzyme/inhibitors, coat/capsid proteins

# of members | Aligned residues | Members of the cluster (protein complexes in the cluster) | Family name (from SCOP database)
6 | 26 | 1a0nAB, 1aboAC, 1azeAB, 1gcqAC, 1io6AB, 1jegAB | SH3-domain proteins
6 | 41 | 1d3bAB, 1i4k12, 1i8fAG, 1d3bAB, 1i4kZ1, 1i4k12 | SM-LIKE RIBONUCLEOPROTEINS, SNRNP
6 | 67 | 1as4AB, 1c8oAB, 1d5sAB, 1hleAB, 1jjoC, 1paiAB | SERPINS
5 | 111 | 1cydAD, 1e3sAC, 1e92AC, 1hdcAD, 1i01AB | NAD(P)-BINDING PROTEINS; Rossmann-fold domains: Tyrosine-dependent oxidoreductases
6 | 84 | 1aoiAB, 1aoiCD, 1b67AB, 1bh8AB, 1jfiAB, 1tafAB | HISTONE-FOLD PROTEINS
5 | 110 | 1al212, 1aym12, 1bev12, 1cov12, 1hri12 | VIRAL COAT & CAPSID PROTEINS
5 | 93 | 1al223, 1aym23, 1b35BC, 1bev23, 1tme23 | VIRAL COAT & CAPSID PROTEINS
5 | 18 | 1b77AC, 1axcAC, 1axcAE, 1b77AB, 1a2pBC | DNA clamp, Family 1: DNA polymerase processivity factor & Microbial ribonucleases
7 | 61 | 1jh5AB, 1iqaAB, 1d0gAB, 1cdaAB, 1c28AC, 1bziBC, 1a8mAB | APOPTOSIS PROTEINS (Superfamily: TNF-like, Family: TNF-like)
10 | 33 | 1cd0AB, 1a2yAB, 1a6uLH, 1ac6AB, 1akjDE, 1ao7DE, 1a14HL, 1d9kAB, 1fo0AB, 1tvdAB | ANTIBODIES; Immunoglobulin, antibody variable domain like
10 | 67 | 10gsAB, 1axdAB, 1b48AB, 1c72AB, 1f2eAB, 1gnwAB, 1gwcBC, 1jlvAB, 1ljrAB, 1pd212 | TRANSFERASES; Glutathione S-transferases, C-terminal domain
First Type: Interface Clusters With Similar Interfaces; Similar Functions
Glutathione S-transferases
• 10gsAB (chains A, B): Human glutathione S-transferase P1-1 complex with TER117
• 1b48AB (chains A, B): Crystal structure of mGSTA4-4 in complex with GSH conjugate of 4-hydroxynonenal in one subunit and GSH in the other
Type II: Structure and Function
• A well known paradigm states that proteins with similar structures can have different functions
• The type II interface clusters similarly illustrate that interfaces sharing the same cluster can belong to functionally different families
Extending the Structure-Function Paradigm
• The clusters extend and generalize this striking structure-function paradigm
• Not only does it apply to monomers, it further applies to protein-protein interfaces
Extending the Sequence-Structure Postulate
• For monomers it has been well known that different amino acid sequences can fold into similar structures. Since the sequences are different, it is not surprising that the function can also be different
• The clusters illustrate that in all such cases of similar interfaces with different functions, the structures of the monomers are also different
Examples of Cases of Similar Interfaces and Different Functions
In all such cases the monomer structures are different
Interface Clusters With Similar Interfaces and Dissimilar Functions-1
• 1dz1AB (chains A, B): Chromatin structure. Mouse HP1 (M31) C-terminal (shadow chromo) domain
• 1f05AB (chains A, B): Transferase. Structure of human transaldolase
Interface Clusters With Similar Interfaces and Dissimilar Functions-2
• 1axcAC (chains A, C): Complex (DNA-binding protein/DNA). Human PCNA
• 1a2pBC (chains B, C): Ribonuclease. Barnase wildtype structure
Interface Clusters With Similar Interfaces and Dissimilar Functions-3
• 1eboAB (chains A, B): Virus/viral protein. Structure of the Ebola Virus Membrane-fusion
• 1ic2CD (chains C, D): Contractile protein. Tropomyosin molecule
Similar Interfaces; Different Functions
• The similar interfaces - different function can be rationalized:
- Just as in monomer structures, evolution has utilized "good" favorable motifs for many (different!) functions
- Hence, of all the combinatorially possible ways for different monomer structures to associate, they still prefer to interact in similar ways to yield preferred interface architectures
How Can We Use the Dataset of Interfaces for
Prediction of Binding Sites?