Protein-Protein Interactions: Stability, Function and...

Protein-Protein Interactions:

Stability, Function and Landscape

Structural Aspects of Protein-Protein Interactions

Agenda

• Understand the importance of studying protein-protein interactions at the structural level

• Classify the various types of interactions

• Look at one structure-based method for predicting protein-protein interactions

Protein interaction

• Definition

- Specific interactions between two or more proteins.

• Examples

- Enzyme-inhibitor complex; antibody-antigen complex; receptor-ligand interactions; multiprotein complexes such as ribosomes or RNA polymerases.

Homocomplexes are usually permanent and optimized (e.g., the homodimer cytochrome c′ (1)) (Fig. 1a). Heterocomplexes can also have such properties, or they can be non-obligatory, being made and broken according to the environment or external factors and involve proteins that must also exist independently [e.g., the enzyme–inhibitor complex trypsin with the inhibitor from bitter gourd (2) (Fig. 1b) and the antibody–protein complex HYHEL-5 with lysozyme (3) (Fig. 1c)].

It is important to distinguish between the different types of complexes when analyzing the intermolecular interfaces that occur within them.

Characteristics

Classification: Protein-protein interactions can be arbitrarily classified based on the proteins involved (structural or functional groups) or based on their physical properties (weak and transient, "non-obligate" vs. strong and permanent). Protein interactions are usually mediated by defined domains, hence interactions can also be classified based on the underlying domains.

Universality: All of molecular biology is about protein-protein interactions (Alberts et al. 2002, Lodish et al. 2000). Protein-protein interactions affect all processes in a cell: structural proteins need to interact in order to shape organelles and the whole cell, molecular machines such as ribosomes or RNA polymerases are held together by protein-protein interactions, and the same is true for multi-subunit channels or receptors in membranes.

Specificity distinguishes such interactions from random collisions that happen by Brownian motion in the aqueous solutions inside and outside of cells. Note that many proteins are known to interact although it remains unclear whether certain interactions have any physiological relevance.

Number of interactions: It is estimated that even simple single-celled organisms such as yeast have their roughly 6000 proteins interact by at least 3 interactions per protein, i.e. a total of 20,000 interactions or more. By extrapolation, there may be on the order of ~100,000 interactions in the human body.

The protein-protein interaction network in yeast. An interaction map of the yeast proteome assembled from published interactions. The map contains 1,548 proteins (boxes) and 2,358 interactions (connecting lines).
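As a quick sanity check, the figures quoted for the yeast map can be turned into an average number of partners per protein. The sketch below assumes the map is an undirected network in which every connecting line joins two distinct proteins; the function name `mean_degree` is illustrative and not taken from the slides.

```python
def mean_degree(num_proteins: int, num_interactions: int) -> float:
    """Average number of interaction partners per protein in an undirected network."""
    # Each interaction (edge) adds one partner to each of the two proteins it connects.
    return 2 * num_interactions / num_proteins


# Numbers quoted for the yeast interaction map: 1,548 proteins and 2,358 interactions.
yeast_map_degree = mean_degree(1548, 2358)
print(f"{yeast_map_degree:.2f} interactions per protein")  # 3.05 interactions per protein
```

Under that assumption the map gives roughly three partners per protein, which is in line with the "at least 3 interactions per protein" estimate quoted above.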

Homo- and hetero-oligomeric complexes

Protein-protein interactions (PPIs) occur between identical or non-identical chains (i.e. homo- or hetero-oligomers). (A-B)

Oligomers of identical or homologous protein units can be organized in an isologous or heterologous way (Monod et al., 1965) with structural symmetry (Goodsell and Olson, 2000).

An isologous association involves the same surface on both monomers (e.g. Arc repressor and lysin; Figure 1A and C), related by a 2-fold symmetry axis.

In contrast to an isologous association that can only further oligomerize using a different interface (e.g. form a dimer of dimers with three 2-fold axes of symmetry), heterologous assemblies use different interfaces that, without a closed (cyclic) symmetry, can lead to infinite aggregation.

Non-obligate and obligate complexes

As well as composition, two different types of complexes can be distinguished on the basis of whether a complex is obligate or non-obligate. In an obligate PPI, the protomers are not found as stable structures on their own in vivo. Such complexes are generally also functionally obligate; for example, the Arc repressor dimer (Figure 1A) is essential for DNA binding. Many of the hetero-oligomeric structures in the Protein Data Bank involve non-obligate interactions of protomers that exist independently, such as intracellular signalling complexes (e.g. RhoA-RhoGAP; Figure 1D) and antibody-antigen, receptor-ligand and enzyme-inhibitor (e.g. thrombin-rodniin; Figure 1E) complexes. The components of such protein-protein complexes are often initially not co-localized and thus need to be independently stable. However, some homo-oligomers, which by definition are co-localized, can also form non-obligate assemblies (e.g. sperm lysin; Figure 1C).

Transient and permanent complexes

PPIs can also be distinguished based on the lifetime of the complex. In contrast to a permanent interaction that is usually very stable and thus only exists in its complexed form, a transient interaction associates and dissociates in vivo. We distinguish weak transient interactions that feature a dynamic oligomeric equilibrium in solution, where the interaction is broken and formed continuously (e.g. lysin; Figure 1C), and strong transient associations that require a molecular trigger to shift the oligomeric equilibrium. For example, the heterotrimeric G protein (Figure 1F) dissociates into the Gα and Gβγ subunits upon guanosine triphosphate (GTP) binding, but forms a stable trimer with guanosine diphosphate (GDP) bound. Structurally or functionally obligate interactions are usually permanent, whereas non-obligate interactions may be transient or permanent.

Types of protein-protein interactions (PPI)

• Obligate PPI: the protomers are not found as stable structures on their own in vivo

- Obligate homodimer: P22 Arc repressor (DNA-binding)

- Obligate heterodimer: human cathepsin D

• Non-obligate PPI

- Non-obligate homodimer: sperm lysin

- Non-obligate heterodimer: RhoA and RhoGAP signaling complex

• Obligate PPI (usually permanent): the protomers are not found as stable structures on their own in vivo. Example: human cathepsin D.

• Non-obligate PPI, characterized by the dissociation constant Kd = [A][B] / [AB]

- Permanent (many enzyme-inhibitor complexes): Kd 10^-7 to 10^-13 M. Example: the non-obligate permanent heterodimer of thrombin and rodniin inhibitor.

- Transient, weak (electron transport complexes): Kd mM-µM. Example: the non-obligate transient homodimer sperm lysin (interaction is broken and formed continuously).

- Transient, intermediate (antibody-antigen, TCR-MHC-peptide, signal transduction PPI): Kd µM-nM.

- Transient, strong (require a molecular trigger to shift the oligomeric equilibrium): Kd nM-fM. Example: bovine G protein dissociates into Gα and Gβγ subunits upon GTP binding, but forms a stable trimer with GDP bound.
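The dissociation constant Kd = [A][B] / [AB] can be related to the binding free energy and to the fraction of complex formed. The following is a minimal sketch, assuming the standard relation ΔG° = RT ln Kd with a 1 M standard state and a temperature of 298.15 K. The sharp class boundaries at 1 µM and 1 nM are an assumption made for illustration; the slide only gives overlapping ranges (mM-µM, µM-nM, nM-fM), and all function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol, from dG = RT ln(Kd) (1 M standard state)."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def fraction_bound(free_b_molar: float, kd_molar: float) -> float:
    """Fraction of A found in the complex AB at a given free [B], from Kd = [A][B]/[AB]."""
    return free_b_molar / (free_b_molar + kd_molar)


def transient_class(kd_molar: float) -> str:
    """Assign a Kd to the weak / intermediate / strong ranges (boundaries are assumed)."""
    if kd_molar >= 1e-6:      # mM-µM range
        return "weak"
    if kd_molar >= 1e-9:      # µM-nM range
        return "intermediate"
    return "strong"           # nM-fM range


for kd in (1e-4, 1e-7, 1e-11):
    print(f"Kd = {kd:.0e} M: {transient_class(kd)}, dG = {binding_free_energy(kd):.1f} kcal/mol")
```

When the free concentration of B equals Kd, `fraction_bound` returns 0.5, which is the usual operational reading of the dissociation constant.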

Structural features of protein-interaction sites

• The contact area between two proteins is almost always bigger than 1100 Å², with each of the interacting partners contributing at least 550 Å² of complementary surface.

• On average each partner loses about 800 Å² of solvent-accessible surface upon contact, contributed by some 20 amino acid residues of each partner, i.e. the average interface residue covers some 40 Å².

• NACCESS

• The accessible surface area (ASA) of the complexes is calculated using an implementation of the Lee and Richards (1971) algorithm developed by Hubbard (1992). With a probe sphere of radius 1.4 Å, the ASA is defined as the surface mapped out by the centre of the probe as if it were rolled around the van der Waals surface of the protein. The program is used to calculate the ASA of each protomer in the complex and then of the complete complex. The ASA shown in the results table is for a single subunit (chain 1, as designated by the user on the submission form; this subunit is indicated at the top of the table and coloured purple).

• Forces that mediate protein-protein interactions include electrostatic interactions, hydrogen bonds, the van der Waals attraction and hydrophobic effects.

• The average protein-protein interface is not less polar or more hydrophobic than the surface remaining in contact with the solvent. Water is usually excluded from the contact region.

• Non-obligate complexes tend to be more hydrophilic in comparison, as each component has to exist independently in the cell.

• It has been proposed that hydrophobic forces drive protein-protein interactions and hydrogen bonds and salt bridges confer specificity.
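The surface-area figures in the bullets above follow from simple bookkeeping on ASA values: the area buried in an interface is the ASA of the two isolated protomers minus the ASA of the complex. The sketch below assumes the three ASA values have already been computed (for example with a program such as NACCESS). The function names and the input numbers are hypothetical and chosen only to reproduce the averages quoted above.

```python
def buried_surface_area(asa_a: float, asa_b: float, asa_complex: float) -> float:
    """Total solvent-accessible surface (in Å²) lost when protomers A and B form a complex."""
    return asa_a + asa_b - asa_complex


def mean_area_per_residue(buried_per_partner: float, n_interface_residues: int) -> float:
    """Average buried area (in Å²) contributed by one interface residue."""
    return buried_per_partner / n_interface_residues


# Hypothetical ASA values in Å², chosen only to illustrate the bookkeeping.
total_buried = buried_surface_area(asa_a=7200.0, asa_b=6900.0, asa_complex=12500.0)
print(total_buried)                      # 1600.0
print(mean_area_per_residue(800.0, 20))  # 40.0
```

Splitting the total buried area equally between the two partners (here 800 Å² each) is itself an approximation, since the two sides of an interface need not lose exactly the same area.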

Shape: Independent studies showed that 83-84% of interfaces are more or less flat. With few exceptions, the interfaces are approximately circular areas on the protein surface in both permanent and non-obligate complexes. Interfaces in permanent associations tend to be larger, less planar, more highly segmented (in terms of sequence), and closer packed than interfaces in non-obligate associations.

Complementarity: can be measured in terms of “fitting surface shape”. Interfaces in homodimers, enzyme-inhibitor complexes, and permanent heterocomplexes are the most complementary, whilst the antibody-antigen complexes and the non-obligate heterocomplexes are the least complementary.

Secondary structure: In one study the loop interactions contributed, on average, 40% of the interface contacts. In another study (involving 28 homodimers), 53% of the interface residues were α-helical, 22% β-sheet, and 12% αβ, with the rest being coils.

Amino acid composition: Interfaces have been shown to be more hydrophobic than the exterior but less hydrophobic than the interior of a protein. In one study, 47% of interface residues were hydrophobic, 31% polar and 22% charged. Permanent complexes have interfaces that contain hydrophobic residues, whilst the interfaces in non-obligate complexes favour the more polar residues. Site-directed mutagenesis showed that in many cases a large majority (i.e. > 50%) of interface residues can be mutated to alanine with little effect on Kd, i.e. the functional epitope is a subset of the structural epitope.

Clinical relevance and applications of protein-protein interaction analysis

Biologically active proteins such as peptide hormones or antibodies act by interacting with other proteins such as receptors or antigens, respectively. Knowing their interaction sites allows the modification of the activity of such proteins or changing their specificity. In addition, small molecules may be designed that block interactions such as the binding of virus coat proteins to their cellular receptors, thereby blocking infection. Proteins and their interactions are therefore potential drug targets. Sometimes, protein-protein interactions are disadvantageous, such as in insulin that tends to form dimers and hexamers which are less active than monomers. Genetically engineered insulin molecules retain biological activity without oligomerizing.

What Is the Preferred Way for Proteins to Interact?

• An ultimate goal in molecular and cellular biology is to predict the preferred mode of protein associations

- Similar protein structures can associate in different ways

- Different protein structures can associate in similar ways

Binding Is Still Not Entirely Understood!

• We usually observe one or two interaction sites; however, a large portion of the surface is probably involved in binding

• Some associations are stable; others are low affinity

• Binding reactions are often cooperative events

• Binding strength is condition-dependent

Possible reasons:

Interfaces Are Variable

• Different relative contributions of the hydrophobic effect versus electrostatic interactions

• Wide range of motifs, with no prevailing architectures

A Dataset of Protein-Protein Interfaces

• A nonredundant dataset provides diversity

• The clusters allow studies of

- interface structures vs function

- residue conservation

Definition Of Interfaces:

• An interface is the region between two polypeptide chains not covalently linked

• Residue selection is based on how close this residue is to residues of the second chain. If two residues (one from each chain) are in contact, they are interacting residues

• Residues in the vicinity of interacting residues are nearby residues. They provide the structural scaffold of the interfaces
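The definition above can be sketched as a distance-based selection. This is a simplified illustration, not the procedure used to build the dataset: each residue is reduced to a single C-alpha coordinate, and the two cutoffs (`contact_cutoff` of 8.0 Å between chains, `nearby_cutoff` of 6.0 Å within a chain) are assumed values, since the slides do not state the thresholds. The function name and the toy coordinates are hypothetical.

```python
import math


def interface_residues(chain_a, chain_b, contact_cutoff=8.0, nearby_cutoff=6.0):
    """Split residues into interacting and nearby sets for two chains.

    chain_a, chain_b: dicts mapping a residue id to the (x, y, z) of its C-alpha.
    Both cutoffs (in Å) are illustrative assumptions.
    Returns (interacting_a, nearby_a, interacting_b, nearby_b) as sets of residue ids.
    """
    interacting_a, interacting_b = set(), set()
    # Interacting residues: one residue from each chain within the contact cutoff.
    for res_a, pos_a in chain_a.items():
        for res_b, pos_b in chain_b.items():
            if math.dist(pos_a, pos_b) <= contact_cutoff:
                interacting_a.add(res_a)
                interacting_b.add(res_b)

    def nearby(chain, interacting):
        # Nearby residues: same chain, not interacting, but close to an interacting residue.
        return {
            res for res, pos in chain.items()
            if res not in interacting
            and any(math.dist(pos, chain[i]) <= nearby_cutoff for i in interacting)
        }

    return (interacting_a, nearby(chain_a, interacting_a),
            interacting_b, nearby(chain_b, interacting_b))


# Toy coordinates (Å), invented for illustration.
chain_a = {1: (0.0, 0.0, 0.0), 2: (5.0, 0.0, 0.0), 3: (20.0, 0.0, 0.0)}
chain_b = {1: (0.0, 7.0, 0.0), 2: (0.0, 30.0, 0.0)}
print(interface_residues(chain_a, chain_b))  # ({1}, {2}, {1}, set())
```

An all-atom contact test would be closer to the wording "in contact", at the cost of needing full coordinate sets for every residue.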

A protein complex forming an interface

Magenta: interacting residues

Cyan: nearby residues

Generation of the Dataset of Interfaces

• We started the generation of the dataset by extracting the interfaces between chains from the PDB coordinates

• On July 18, 2002, there were 18,687 entries in the PDB, which included 35,112 single chains, including all individual chains in dimers, trimers and so on. The dataset of interfaces contains 21,686 two-chain interfaces

An Interface Between Two Chains

Interface Composition: Example

The interface between the two chains: In green 'nearby' residues and in blue contact residues in chain A. In red 'nearby' residues and in magenta contact residues in chain B.

A magnification of the interface: Balls depict C-alpha. The numbers refer to the residue positions. Green and blue are nearby and contact C-alpha's in chain A. Red and magenta are nearby and contact C-alpha's in chain B.

Residue Order Independence

[Figure: two interfaces between chains A and B, with residues labelled D, E in one and E, D in the other]

Similar arrangement in space; different sequential order

Representation of Proteins As Sets of Points in the Three Dimensional Space

Each ball is a C-alpha
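Treating an interface as an unordered set of C-alpha points means that a comparison should not depend on the order in which the residues are listed. One simple way to illustrate this, which is not the alignment method used for the dataset, is a sorted list of all pairwise distances: it is unchanged when the points are reordered or rigidly moved. The function name, the rounding precision and the coordinates below are assumptions made for the example.

```python
import math
from itertools import combinations


def distance_signature(points, ndigits=1):
    """Sorted pairwise C-alpha distances; invariant to point order and rigid motion."""
    return sorted(round(math.dist(p, q), ndigits) for p, q in combinations(points, 2))


# Toy C-alpha coordinates (Å), invented for illustration.
interface_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (0.0, 5.0, 0.0)]
# Same points listed in a different sequential order.
interface_2 = [interface_1[2], interface_1[0], interface_1[1]]

print(distance_signature(interface_1) == distance_signature(interface_2))  # True
```

Equal signatures are a necessary but not a sufficient condition for two point sets to superimpose, so a check like this can only serve as a quick pre-filter before a real structural alignment.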

• Clustering is iterative

• At each iteration strict criteria are used

• At the end of the first cycle, the number of interface clusters decreased from 21,686 to 16,446

• Members in each cluster share at least 90% connectivity score with at least 90% sequence identity

• All have exactly the same number of interface residues on their interfaces

Clustering

The parameters used during the clustering of the interfaces are:

Cycle | Number of interfaces | Minimal connectivity score | Minimal % amino acid identity | Maximal amino acid size difference between interfaces
A | 21686 → 16446 | 0.9 | 90 | 0
B | 16446 → 9637 | 0.9 | 80 | 3
C | 9637 → 6647 | 0.8 | 50 | 10
D | 6647 → 5332 | 0.7 | 25 | 20
E | 5332 → 4429 | 0.6 | 10 | 40
F | 4429 → 3799 | 0.5 | 0 | 50
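The iterative scheme with progressively relaxed criteria can be sketched as a loop over the six cycles A to F. This is only an illustration under stated assumptions: the greedy rule of merging a cluster into the first earlier cluster whose representative it matches is a simplification chosen here, and `compare` is a hypothetical user-supplied function returning a connectivity score, a percent identity and a size difference. The slides do not specify how these quantities are computed.

```python
# (cycle, minimal connectivity score, minimal % identity, maximal size difference)
SCHEDULE = [
    ("A", 0.9, 90, 0), ("B", 0.9, 80, 3), ("C", 0.8, 50, 10),
    ("D", 0.7, 25, 20), ("E", 0.6, 10, 40), ("F", 0.5, 0, 50),
]


def iterative_clustering(interfaces, compare):
    """Merge interfaces cycle by cycle, relaxing the thresholds each time.

    compare(a, b) must return (connectivity, identity_percent, size_difference).
    The first member of each cluster acts as its representative (an assumption).
    """
    clusters = [[item] for item in interfaces]
    for _cycle, min_conn, min_ident, max_diff in SCHEDULE:
        merged = []
        for cluster in clusters:
            for target in merged:
                conn, ident, diff = compare(target[0], cluster[0])
                if conn >= min_conn and ident >= min_ident and diff <= max_diff:
                    target.extend(cluster)  # join the first matching cluster
                    break
            else:
                merged.append(list(cluster))  # no match: keep as its own cluster
        clusters = merged
    return clusters


# Toy example: "interfaces" are just sizes, and only the size difference varies.
def toy_compare(a, b):
    return (1.0, 100.0, abs(a - b))


print(iterative_clustering([10, 10, 12, 30, 100], toy_compare))
# [[10, 10, 12, 30], [100]]
```

Because early cycles use the strictest thresholds, near-identical interfaces are collapsed first, and looser similarities only merge the representatives that survive.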

Further Filtering

• Each cluster should have at least 5 members

• None of the members should share a sequence similarity score of 50% or higher

(Sequence alignments are done with CLUSTALW)

Cluster Categories

• This filtering reduced the number of clusters from 3799 to 103

- Library construction carried out through pair-wise comparison

• Based on multiple structure alignment, the clusters are divided into two categories

1. Category I: interface clusters which share only ONE similar side. These clusters allow us to address the problem of how a given binding site can bind somewhat different protein surfaces

2. Category II: interface clusters which share TWO similar sides

- Type I: Clusters with similar interfaces and similar functions

- Type II: Clusters with similar interfaces but dissimilar functions

Sample List of Some of the Two-chain Interface Clusters

The dataset contains functional dimers, and others such as receptor/ligands, antibody/antigens, enzyme/inhibitors, coat/capsid proteins

# of members | Aligned residues | Members of the cluster (protein complexes in the cluster) | Family name (from SCOP database)
6 | 26 | 1a0nAB, 1aboAC, 1azeAB, 1gcqAC, 1io6AB, 1jegAB | SH3-domain proteins
6 | 41 | 1d3bAB, 1i4k12, 1i8fAG, 1d3bAB, 1i4kZ1, 1i4k12 | SM-LIKE RIBONUCLEOPROTEINS, SNRNP
6 | 67 | 1as4AB, 1c8oAB, 1d5sAB, 1hleAB, 1jjoC, 1paiAB | SERPINS
5 | 111 | 1cydAD, 1e3sAC, 1e92AC, 1hdcAD, 1i01AB | NAD(P)-BINDING PROTEINS (Rossmann-fold domains: Tyrosine-dependent oxidoreductases)
6 | 84 | 1aoiAB, 1aoiCD, 1b67AB, 1bh8AB, 1jfiAB, 1tafAB | HISTONE-FOLD PROTEINS
5 | 110 | 1al212, 1aym12, 1bev12, 1cov12, 1hri12 | VIRAL COAT & CAPSID PROTEINS
5 | 93 | 1al223, 1aym23, 1b35BC, 1bev23, 1tme23 | VIRAL COAT & CAPSID PROTEINS
5 | 18 | 1b77AC, 1axcAC, 1axcAE, 1b77AB, 1a2pBC | DNA clamp (Family 1: DNA polymerase processivity factor) & Microbial ribonucleases
7 | 61 | 1jh5AB, 1iqaAB, 1d0gAB, 1cdaAB, 1c28AC, 1bziBC, 1a8mAB | APOPTOSIS PROTEINS (Superfamily: TNF-like, Family: TNF-like)
10 | 33 | 1cd0AB, 1a2yAB, 1a6uLH, 1ac6AB, 1akjDE, 1ao7DE, 1a14HL, 1d9kAB, 1fo0AB, 1tvdAB | ANTIBODIES (Immunoglobulin, antibody variable domain like)
10 | 67 | 10gsAB, 1axdAB, 1b48AB, 1c72AB, 1f2eAB, 1gnwAB, 1gwcBC, 1jlvAB, 1ljrAB, 1pd212 | TRANSFERASES (Glutathione S-transferases, C-terminal domain)

First Type: Interface Clusters With Similar Interfaces; Similar Functions

Glutathione S-transferases

• 10gsAB: Human glutathione S-transferase P1 complex with ter117

• 1b48AB: Crystal structure of mGSTA4-4 in complex with GSH conjugate of 4-hydroxynonenal in one subunit and GSH in the other

Type II: Structure and Function

• A well known paradigm states that proteins with similar structures can have different functions

• The type II interface clusters similarly illustrate that interfaces sharing the same cluster can belong to functionally different families

Extending the Structure-Function Paradigm

• The clusters extend and generalize this striking structure-function paradigm

• Not only does it apply to monomers, it further applies to protein-protein interfaces

Extending the Sequence-Structure Postulate

• For monomers it has been well known that different amino acid sequences can fold into similar structures. Since the sequences are different, it is not surprising that the function can also be different

• The clusters illustrate that in all such cases of similar interfaces with different functions, the structures of the monomers are also different

Examples of Cases of Similar Interfaces and Different Functions

In all such cases the monomer structures are different

Interface Clusters With Similar Interfaces and Dissimilar Functions-1

• 1dz1AB: Chromatin structure. Mouse HP1 (M31) C-terminal (shadow chromo) domain

• 1f05AB: Transferase. Structure of human transaldolase

• 1axcAC: Complex (DNA-binding protein/DNA). Human PCNA

• 1a2pBC: Ribonuclease. Barnase wildtype structure

• 1eboAB: Virus/viral protein. Structure of the Ebola virus membrane-fusion

• 1ic2CD: Contractile protein. Tropomyosin molecule

Similar Interfaces; Different Functions

• The similar interfaces - different function can be rationalized:

- Just as in monomer structures, evolution has utilized "good" favorable motifs for many (different!) functions

- Hence, of all the combinatorially possible ways for different monomer structures to associate, they still prefer to interact in similar ways to yield preferred interface architectures

How Can We Use the Dataset of Interfaces for Prediction of Binding Sites?

Hot Spots In The Interfaces

• Experimentally, a hot spot is a residue that, when mutated to alanine, gives rise to a distinct drop in the binding constant (tenfold or more)

• All data are deposited in http://www.asedb.org

DeLano, W.L., Unraveling hot spots in binding interfaces: progress and challenges. Curr. Opin. Struct. Biol. 2002
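The tenfold criterion for a hot spot can be expressed as a change in binding free energy. The sketch below assumes that a tenfold drop in the binding (association) constant corresponds to a tenfold rise in Kd, and uses ΔΔG = RT ln(Kd,mutant / Kd,wild type) at an assumed temperature of 298.15 K, which gives about 1.4 kcal/mol for a tenfold change. The function names and the Kd values are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ddg_binding(kd_wild_type: float, kd_mutant: float, temperature_k: float = 298.15) -> float:
    """Change in binding free energy (kcal/mol) caused by an alanine mutation."""
    return R_KCAL * temperature_k * math.log(kd_mutant / kd_wild_type)


def is_hot_spot(kd_wild_type: float, kd_mutant: float, fold_threshold: float = 10.0) -> bool:
    """True when the mutation weakens binding by at least fold_threshold (tenfold by default)."""
    return kd_mutant / kd_wild_type >= fold_threshold


# Hypothetical Kd values (M) for a wild-type complex and two alanine mutants.
kd_wt = 2e-9
for label, kd_mut in (("mutant 1", 5e-8), ("mutant 2", 4e-9)):
    print(label, is_hot_spot(kd_wt, kd_mut), f"{ddg_binding(kd_wt, kd_mut):.2f} kcal/mol")
```

Working with the Kd ratio rather than with absolute free energies keeps the check independent of the chosen standard state.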

Computationally, Hot Spots Distinguish Between Binding Sites and Exposed Protein Surfaces

(B. Ma, T. Elkayam, H. Wolfson, R. Nussinov, PNAS | May 13, 2003 | vol. 100 | no. 10 | 5772-5777)

Multiple Structure Alignment (MUSTA)

[Figure labels: interface; exposed surface; conserved hot spots]

• Align all structures in each cluster

• Find structurally conserved residues (hot spots)
