Topology of the protein network H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai, Nature 411, 41-42...

Post on 30-Dec-2015


Topology of the protein network

H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai, Nature 411, 41-42 (2001)

Erdős-Rényi model (1960)

- Democratic

- Random

Pál Erdős (1913-1996)

Connect with probability p

p = 1/6, N = 10

$\langle k \rangle = p(N-1) \approx 1.5$
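The "connect with probability p" rule can be tried out directly. The following is a minimal sketch, assuming the usual G(N, p) reading of the model (every pair of nodes is linked independently with probability p); the function names are illustrative and do not come from the slides.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Return the edge list of a random graph on n nodes.

    Assumption: each of the n*(n-1)/2 node pairs is linked
    independently with probability p.
    """
    rng = random.Random(seed)
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]


def mean_degree(n, edges):
    """Average number of links per node (each edge has two ends)."""
    return 2 * len(edges) / n


def expected_mean_degree(n, p):
    """Expected average degree: each node has n-1 possible partners."""
    return p * (n - 1)


if __name__ == "__main__":
    n, p = 10, 1 / 6
    edges = erdos_renyi(n, p, seed=0)
    print("expected <k>:", expected_mean_degree(n, p))
    print("sampled  <k>:", mean_degree(n, edges))
```

A single sampled graph with N = 10 fluctuates noticeably around the expected value; averaging over many samples brings the mean degree close to p(N-1).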

Degree distribution of a random graph

P(k): the probability that a node has k links

$P(k) = \binom{N-1}{k} p^k (1-p)^{N-1-k}$

For large N, P(k) can be replaced by a Poisson distribution:

$P(k) \approx e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$
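To see how quickly the binomial degree distribution approaches the Poisson form, one can compare the two numerically. This is a small sketch using only the standard library; the function names and the choice of the largest pointwise difference as the comparison are illustrative assumptions.

```python
from math import comb, exp, factorial


def binomial_pk(k, n, p):
    """Probability that a node in an n-node random graph has k links."""
    return comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)


def poisson_pk(k, mean_k):
    """Poisson approximation with the same mean degree <k>."""
    return exp(-mean_k) * mean_k**k / factorial(k)


def max_abs_difference(n, mean_k, k_max=20):
    """Largest pointwise gap between the two distributions for k <= k_max."""
    p = mean_k / (n - 1)  # choose p so that <k> = p(N-1)
    return max(
        abs(binomial_pk(k, n, p) - poisson_pk(k, mean_k))
        for k in range(k_max + 1)
    )


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, max_abs_difference(n, 1.5))
```

With the mean degree held fixed, the gap shrinks as N grows, which is the sense in which the Poisson form replaces the binomial for large N.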

Poisson distribution

World Wide Web

Over 3 billion documents

ROBOT: collects all URLs found in a document and follows them recursively

Nodes: WWW documents Links: URL links

R. Albert, H. Jeong, A-L Barabasi, Nature, 401 130 (1999).

Expected: Poisson distribution

Found: $P(k) \sim k^{-\gamma}$

Scale-free Network

Exponential Network

Scale-free model

Barabási & Albert, Science 286, 509 (1999)

$\Pi(k_i) = \frac{k_i}{\sum_j k_j}$

$P(k) \sim k^{-3}$

(1) GROWTH: networks continuously expand by the addition of new nodes. At each step, add a new node with m links.

WWW: addition of new documents. Citation: publication of new papers.

(2) PREFERENTIAL ATTACHMENT: new nodes prefer to link to highly connected nodes. The probability that a new node connects to a node with k links is proportional to k.

WWW: linking to well-known sites. Citation: citing already highly cited papers.
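The two ingredients, growth and preferential attachment, can be combined in a few lines. The following is a minimal sketch, not the authors' code: the seed graph (m unlinked starting nodes, to which the first new node attaches) and the trick of sampling from a list of link endpoints are assumptions made for illustration.

```python
import random
from collections import Counter


def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; every new node brings m links.

    Assumption: the graph starts from m unlinked nodes and the first
    new node attaches to all of them.
    """
    if m < 1 or n <= m:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    endpoints = []            # every node appears once per link end
    targets = list(range(m))  # targets of the first new node
    for new in range(m, n):
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
        # Picking uniformly from `endpoints` selects a node with
        # probability proportional to its current degree k.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)
    return edges


def degree_counts(edges):
    """Map each node to its number of links."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


if __name__ == "__main__":
    deg = degree_counts(barabasi_albert(2000, 2, seed=1))
    print("largest degree:", max(deg.values()))
    print("mean degree:", sum(deg.values()) / len(deg))
```

The largest degree ends up far above the mean, which is the hub structure that a random graph of the same size and density would not be expected to show.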

Other Models

Internet

Metabolic network

The metabolic networks of organisms from all three domains of life are scale-free!

H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Barabasi, Nature, 407 651 (2000)

Archaea Bacteria Eukaryotes

http://www.orgnet.com

Many real world networks have a similar architecture:

Scale-free networks

WWW, Internet (routers and domains), electronic circuits, computer software, movie actors, coauthorship networks, sexual web, instant messaging, email web, citations, phone calls, metabolic, protein interaction, protein domains, brain function web, linguistic networks, comic book characters, international trade, bank system, encryption trust net, energy landscapes, earthquakes, astrophysical network…

Copyright ©2002 American Society for Biochemistry and Molecular Biology

Deane, C. M. (2002) Mol. Cell. Proteomics 1: 349-356

Interacting yeast proteins as detected in several studies

L. Salwinski & D. Eisenberg. Curr. Op. Struct. Biol. 13 (2003)

J.S. Bader et al. Nature Biotechnology 22 (2004)

Filtered Yeast Interactome dataset (Han et al., Nature 430, 2004)

- HT-Y2H projects (5,249 potential interactions; union of the available data sets)

- Co-IP (6,630 potential interactions from two datasets)

- in silico computational predictions of interactions (7,446 potential interactions from the 'von Mering' data set, obtained from the union of gene co-occurrence, gene neighbourhood and gene fusion predictions)

- 'MIPS protein complexes' published singly in the literature (9,597 pairwise interactions between components of complexes)

- MIPS physical interactions (excluding genome-scale experiments: 1,285 interactions)

J-D. Han et al. Nature 430 (2004)

UNIPROT DATABASE filtered by SEG

Viruses

Bacteria

Archaea

Ascomycota

Metazoa

Spermatophyta

BLASTP with the S. cerevisiae sequence as a query

[Figure: BLASTP results for the groups Bacteria, Metazoa, Viruses and Archaea; values 10^-4 and 1 marked]

Top-scoring sequences from each group: pairwise SW homology check with 100 randomisations and Z-score cutoff

A-L. Barabasi & Z.N. Oltvai. Nature Reviews Genetics 5 (2004)

Archaea, Eubacteria, Fungi, Plants, Animals (33/26)

- protein synthesis machinery (40S and 60S ribosomal subunits, translational factors, t-RNA synthetases)

- basic metabolism (e.g. ATP synthesis, Krebs cycle)

- protein folding and degradation (chaperones, proteases)

- domains participating in protein-protein interactions (TPR, WD40)

+Viruses (16/5)

- replication (RNA polymerases, helicases, replication factor C, ribonucleotide reductase)

- protein degradation (19S proteasome)

40S ribosome

Mitochondrial ribosomal 40S subunit

S. Wuchty, Z.N. Oltvai & A-L. Barabasi. Nature Genetics 22 (2003)

Mitochondrial alpha-ketoglutarate dehydrogenase and pyruvate dehydrogenase complexes

Succinate dehydrogenase

General repressor of transcription

Coatomer

Nuclear pore

Anaphase-promoting complex (ubiquitin-protein ligase)

Cyclophilin, heat shock protein (HSP82) and STI1 inhibitor

Eubacteria, Eukaryota | Archaea (4/4)

- catalytic core delta subunit of mitochondrial ATP-ase

- tubulin (BtubA/B - Prosthecobacter)

- mitochondrial subunits of 60S ribosome

- RNA-binding proteins (cleavage factor I)

Mitochondrial ribosomal proteins of the 60S subunit

Central stalk of mitochondrial F1F0 ATP synthase

Subunits of cleavage factor I (HRP1, RNA15), poly-A binding protein (PAB1, SGN1), uncharacterized protein YGR250C

Alpha- and beta-tubulin

Archaea, Eukaryota | Eubacteria (8/8)

- RNA polymerase II (non-catalytic subunits)

- 60S ribosomal subunits (cytoplasmic)

- splicing (archaeal-like LSM proteins)

- exosome 3’-5’ exoribonuclease complex

- 20S proteasome

Protein components of large 60S ribosome subunit

Subunits of RNA polymerase II

Eukaryota | Archaea, Eubacteria, Viruses (24/19)

- vesicle transport and membrane fusion (multicompartmental cell)

- mitochondrial transporters

- regulation of actin cytoskeleton stability

Actin cytoskeleton regulating complex

Actin capping heterodimer

Cofilin-like protein and adenylyl cyclase-associated protein

Mitochondrial inner membrane ATP/ADP translocator and mitochondrial inner membrane transporters (TIM22, TIM9, MRS1)

Mitochondrial transport system

M.C. Rivera & J.A. Lake. Nature 431 (2004)

[Figure labels: 4, 24, 8, 49]

L. Giot et al. Science 302 (2003)

Subunits of 26S proteasome complex

Trehalose-6-phosphate complex

[Figure: log-log plot of n_k versus k, with fitted curves labelled PL and GPL+EC]

27.131065.1 knk

)0.3/exp()3.0(104.2 5.03 kknk

The number of S. cerevisiae proteins

The node degree

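Linear combination of exponential decays method:

$n_k = \sum_{i=0}^{i_{\max}} A_i \exp(-\lambda_i k)$

[Figure: amplitudes A_i (± s.e.) versus decay rates \lambda_i, with components labelled „S” and „F”]

$n_k = A_F \exp(-\lambda_F k) + A_S \exp(-\lambda_S k)$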
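A two-component decay of this kind is easy to explore numerically. The sketch below evaluates a sum of exponential decays and reports how much each component contributes at a given degree k. The decay-rate symbol (written lambda here) and the parameter values are assumptions chosen for illustration; they are not the fitted values from the slides.

```python
from math import exp


def exp_mixture(k, components):
    """n_k as a sum of A_i * exp(-lambda_i * k) over (A_i, lambda_i) pairs."""
    return sum(a * exp(-lam * k) for a, lam in components)


def component_shares(k, components):
    """Fraction of n_k contributed by each component at degree k."""
    total = exp_mixture(k, components)
    return [a * exp(-lam * k) / total for a, lam in components]


# Hypothetical (A, lambda) pairs, NOT the fitted values from the slides:
# a large, quickly decaying "F" term and a small, slowly decaying "S" term.
F = (2000.0, 1.0)
S = (100.0, 0.1)

if __name__ == "__main__":
    for k in (0, 1, 5, 10, 20):
        share_f, share_s = component_shares(k, [F, S])
        print(k, round(exp_mixture(k, [F, S]), 3),
              round(share_f, 3), round(share_s, 3))
```

With parameters of this shape the fast component dominates at small k and the slow component takes over in the tail, which is why a single exponential cannot describe both ends of the distribution.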

[Figure: log-log plot of n_k versus k, with the fitted curve labelled DEL]

The contribution of the “F” and “S” components

[Figure: n_k^F and n_k^S versus k, for k = 0 to 15]

[Figure: n_k versus k (k = 0 to 20) for Helicobacter pylori, Escherichia coli, Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana]

[Figure: n_k versus k (k = 0 to 20) for Saccharomyces cerevisiae, with fitted curves labelled PL, GPL+EC and DEL]

AICc = 94.6

AICc = 135.8

AICc = 112.4

Akaike's Information Criterion (AICc)

$AICc = z \ln(\sigma^2) + 2m + \frac{2m(m+1)}{z - m - 1}$

$\sigma^2$ - the average squared residual for a given model, m - the number of model parameters, z - the number of observations
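The criterion can be computed directly from a model's residuals. This is a minimal sketch assuming the least-squares form of the corrected criterion, z ln(sigma^2) + 2m + 2m(m+1)/(z - m - 1), with sigma^2 the average squared residual; the function name is illustrative.

```python
from math import log


def aicc(residuals, m):
    """Corrected Akaike criterion for a least-squares fit.

    residuals: observed minus fitted values, one per observation
    m:         number of model parameters
    """
    z = len(residuals)
    if z - m - 1 <= 0:
        raise ValueError("need more observations than m + 1")
    sigma2 = sum(r * r for r in residuals) / z  # average squared residual
    if sigma2 == 0:
        raise ValueError("a perfect fit has no finite AICc")
    return z * log(sigma2) + 2 * m + 2 * m * (m + 1) / (z - m - 1)


if __name__ == "__main__":
    # Made-up residuals, for illustration only.
    res = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2, 0.4, -0.1, 0.2]
    print("2 parameters:", aicc(res, 2))
    print("4 parameters:", aicc(res, 4))
```

For the same residuals, the value rises with the number of parameters, so a model with more parameters is preferred only if it reduces the residuals enough to offset the penalty. Lower values indicate the better trade-off.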

E. coli: 23%

H. pylori: 71%

A. thaliana: 61%

C. elegans: 37%

D. melanogaster: 68%

S. cerevisiae: 74%

Interacting proteins