341: Introduction to Bioinformatics

Dr. Nataša Pržulj
Department of Computing
Imperial College London
natasha@imperial.ac.uk

Winter 2011

Topics

• Introduction to biology (cell, DNA, RNA, genes, proteins)
• Sequencing and genomics (sequencing technology, sequence alignment algorithms)
• Functional genomics and microarray analysis (array technology, statistics, clustering and classification)
• Introduction to biological networks
• Introduction to graph theory
• Network properties
  – Network/node centralities
  – Network motifs
• Network models
• Network/node clustering
• Network comparison/alignment
• Software tools for network analysis
• Interplay between topology and biology

Network Comparisons: Properties of Large Networks

• Large network comparison is computationally hard due to NP-completeness of the underlying subgraph isomorphism problem:

• Given 2 graphs G and H as input, determine whether G contains a subgraph that is isomorphic to H.

• Thus, network comparisons rely on easily computable heuristics (approximate solutions), called “network properties”

• Network properties can roughly and historically be divided into two categories:

1. Global network properties: give an overall view of the network, but might not be detailed enough to capture complex topological characteristics of large networks.

2. Local network properties: more detailed network descriptors which usually encompass a larger number of constraints, thus reducing the degrees of freedom in which the networks being compared can vary.

1. Global Network Properties

Readings: Chapter 3 of “Analysis of Biological Networks” by Junker and Schreiber

• Global network properties:
1) Degree distribution
2) Average clustering coefficient
3) Clustering spectrum
4) Average diameter
5) Spectrum of shortest path lengths
6) Centralities

1) Degree Distribution

Definitions:
• The degree of a node is the number of edges incident to the node.
• The average degree of a network is the average of the degrees over all nodes in the network.

However, the average degree might not be representative, since the distribution of degrees might be skewed.

[Figure: a node x with five incident edges, deg(x) = 5]

• Degree distribution: Let P(k) be the percentage of nodes of degree k in the network. The degree distribution is the distribution of P(k) over all k.

P(k) can be understood as the probability that a node has degree k.
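The definitions of average degree and P(k) can be made concrete with a minimal sketch. The adjacency-dictionary representation, the function names, and the star-shaped toy network are assumptions made for illustration; they are not part of the lecture.

```python
from collections import Counter

def degree_distribution(adj):
    """Return {k: P(k)}: the fraction of nodes whose degree is k."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: counts[k] / n for k in sorted(counts)}

def average_degree(adj):
    """Average of the degrees over all nodes."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

# Hypothetical toy network: a hub "x" of degree 5 plus five leaves.
star = {"x": {"a", "b", "c", "d", "e"}}
star.update({leaf: {"x"} for leaf in "abcde"})

print(degree_distribution(star))  # P(1) = 5/6, P(5) = 1/6
print(average_degree(star))       # 10/6, a value that no node actually has
```

In this skewed toy example the average degree (about 1.67) describes neither the hub nor the leaves, which is why the full distribution is reported.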

• Example:

[Figure: degree distribution on a log-log plot]

Here $P(k) \sim k^{-\gamma}$, where often $2 \le \gamma < 3$. This is a power-law, heavy-tailed distribution. Networks with power-law degree distributions are called scale-free networks. In them, most of the nodes are of low degree, but there is a small number of highly linked nodes (nodes of high degree) called “hubs.”

• Another example:

[Figure: degree distribution plot, annotated “average degree is meaningful”]

Here P(k) is a Poisson distribution.

• However: degree distribution (and global properties in general) are weak predictors of network structure.

• Illustration:

G and H are of the same size (i.e., |G| = |H|: they have the same number of nodes and edges) and they have the same degree distribution, but G and H have very different topologies (i.e., graph structure).
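Since the original figure is not available here, the point can be illustrated with a small sketch. The two graphs below (a 6-cycle and two disjoint triangles) are hypothetical stand-ins, not the G and H drawn on the slide, and the helper names are assumptions.

```python
def from_edges(edges):
    """Build an undirected adjacency dictionary from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_sequence(adj):
    return sorted(len(nbrs) for nbrs in adj.values())

def count_components(adj):
    """Count connected components with an iterative depth-first search."""
    seen, components = set(), 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return components

G = from_edges([(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)])  # one 6-cycle
H = from_edges([(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])  # two triangles

print(degree_sequence(G) == degree_sequence(H))  # True: every node has degree 2
print(count_components(G), count_components(H))  # 1 2
```

Both graphs have 6 nodes, 6 edges, and P(2) = 1, yet one is connected and triangle-free while the other splits into two triangles.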


Research debates…

• Assortative vs. disassortative mixing of degrees:
  – Do high-degree nodes interact with high-degree nodes?
  – Done by:
    – Pearson corr. coefficient between degrees of adjacent vertices
    – Average neighbor degree; then average over all nodes of degree k
• Structural robustness and attack tolerance:
  – “Robust, yet fragile”
• Scale-free degree distribution:
  – “Party” vs. “date” hubs
    • J.D. Han et al., Nature, 430:88-93, 2004
  – Bias in the data collection – sampling?
    • M. Stumpf et al., PNAS, 102:4221-4224, 2005
    • J. Han et al., Nature Biotechnology, 23:839-844, 2005
• High degree nodes:
  – Essential genes
    • H. Jeong et al., Nature 411, 2001.
  – Disease/cancer genes
    • Jonsson and Bates, Bioinformatics, 22(18), 2006
    • Goh et al., PNAS, 104(21), 2007
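The two ways of measuring degree mixing listed above can be sketched as follows. This is an illustrative reading of those two bullets, not code from the lecture; the function names and the star-shaped toy network are assumptions.

```python
from math import sqrt

def degree_assortativity(adj):
    """Pearson correlation between the degrees at the two ends of each edge."""
    xs, ys = [], []
    for u, nbrs in adj.items():
        for v in nbrs:  # each undirected edge is visited in both directions
            xs.append(len(adj[u]))
            ys.append(len(adj[v]))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return None  # undefined when all degrees are equal
    return cov / sqrt(vx * vy)

def average_neighbor_degree_by_k(adj):
    """Average neighbor degree per node, then averaged over nodes of degree k."""
    per_k = {}
    for u, nbrs in adj.items():
        if nbrs:
            avg = sum(len(adj[v]) for v in nbrs) / len(nbrs)
            per_k.setdefault(len(nbrs), []).append(avg)
    return {k: sum(vals) / len(vals) for k, vals in sorted(per_k.items())}

# Hypothetical toy network: a hub "x" connected to five leaves.
star = {"x": {"a", "b", "c", "d", "e"}}
star.update({leaf: {"x"} for leaf in "abcde"})

print(degree_assortativity(star))          # -1.0: fully disassortative
print(average_neighbor_degree_by_k(star))  # {1: 5.0, 5: 1.0}
```

A negative coefficient, or an average neighbor degree that falls as k grows, indicates disassortative mixing: high-degree nodes tend to attach to low-degree ones.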

2) Average Clustering Coefficient

• Definition: the clustering coefficient Cv of a node v is
  Cv = |E(N(v))| / (max possible number of edges in N(v)),
  where N(v) is the neighborhood of v, i.e., all nodes adjacent to v, and |E(N(v))| is the number of edges between nodes in N(v).

Cv can be viewed as the probability that two neighbors of v are connected.

Thus 0 ≤ Cv ≤ 1.

By definition, Cv = 0 for a vertex v of degree 0 or 1.

• Example:

|N(v)| = 4, since there are 4 nodes in N(v), i.e., N(v) = {1, 2, 3, 4}.
|E(N(v))| = 3, since there are 3 edges between nodes in N(v).
The max possible number of edges between nodes in N(v) is choose(4,2) = 6.
Therefore Cv = 3/6 = 1/2.
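The same calculation can be written as a short sketch. The slide's figure is not reproduced here, so the three edges among the neighbors (1-2, 2-3, 3-4) are an assumption chosen to match the counts above; the function name is also illustrative.

```python
from itertools import combinations

def clustering_coefficient(adj, v):
    """Fraction of pairs of neighbors of v that are themselves adjacent."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # by definition for degree 0 or 1
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)

# Node "v" with neighbors 1..4; assumed edges among them: 1-2, 2-3, 3-4.
adj = {
    "v": {1, 2, 3, 4},
    1: {"v", 2},
    2: {"v", 1, 3},
    3: {"v", 2, 4},
    4: {"v", 3},
}

print(clustering_coefficient(adj, "v"))  # 0.5
```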


• Definition: the average clustering coefficient, C, of a network is the average of Cv over all the nodes v ∈ V.

3) Clustering Spectrum

• Definition: clustering spectrum, C(k), is the distribution of the average clustering coefficients of all nodes of degree k in the network, over all k.

Example: [Figure: clustering spectrum plot]

2) and 3) Clustering Coefficient and Spectrum

[Figure: example network G with nodes A, B, C, D, …]

• Cv – clustering coefficient of node v:
  CA = 1/1 = 1
  CB = 1/3 = 0.33
  CC = 0
  CD = 2/10 = 0.2
  …
• C = avg. clustering coefficient of the whole network = avg {Cv over all nodes v of G}
• C(k) – avg. clustering coefficient of all nodes of degree k
  E.g.: C(2) = (CA + CC)/2 = (1+0)/2 = 0.5
  => Clustering spectrum [Figure: example clustering spectrum, not for G]

Need to evaluate whether the value of C (or any other property) is statistically significant.
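The quantities C and C(k) defined on this slide can be computed with a minimal sketch such as the one below. The network G from the figure is not available, so a hypothetical toy graph (a triangle with one pendant node) is used instead, and all names are illustrative.

```python
from itertools import combinations

def clustering_coefficient(adj, v):
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)

def average_clustering(adj):
    """C: the average of Cv over all nodes."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)

def clustering_spectrum(adj):
    """C(k): the average Cv over all nodes of degree k, for each k present."""
    by_degree = {}
    for v in adj:
        by_degree.setdefault(len(adj[v]), []).append(clustering_coefficient(adj, v))
    return {k: sum(cs) / len(cs) for k, cs in sorted(by_degree.items())}

# Hypothetical toy graph: triangle a-b-c with a pendant node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}

print(average_clustering(toy))   # (1 + 1 + 1/3 + 0) / 4
print(clustering_spectrum(toy))  # C(1) = 0, C(2) = 1, C(3) = 1/3
```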

4) Average Diameter

• Definition: the distance between two nodes is the smallest number of links that have to be traversed to get from one node to the other.

• Definition: the shortest path is the path that achieves that distance.

• Definition: the average network diameter is the average of shortest path lengths over all pairs of nodes in a network.
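For an unweighted network, the three definitions above translate into a breadth-first search. The sketch below is illustrative: the function names and the 4-node path used as input are assumptions, and unreachable pairs are simply skipped, which is one possible convention.

```python
from collections import deque

def bfs_distances(adj, source):
    """Distance (number of links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def average_diameter(adj):
    """Average shortest path length over all connected pairs of distinct nodes."""
    total = pairs = 0
    for u in adj:
        for w, d in bfs_distances(adj, u).items():
            if w != u:
                total += d
                pairs += 1
    if pairs == 0:
        raise ValueError("network has no connected pair of nodes")
    return total / pairs  # ordered pairs give the same average as unordered ones

# Hypothetical toy network: the path a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

print(bfs_distances(path, "a"))  # {'a': 0, 'b': 1, 'c': 2, 'd': 3}
print(average_diameter(path))    # (1 + 2 + 3 + 1 + 2 + 1) / 6
```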

5) Spectrum of Shortest Path Lengths

• Definition: Let S(d) be the percentage of node pairs that are at distance d. The spectrum of shortest path lengths is the distribution of S(d) over d.

Example: [Figure: spectrum of shortest path lengths plot]

4) and 5) Average Diameter and Spectrum of Shortest Path Lengths

[Figure: example network G with nodes u and v marked]

• Distance between a pair of nodes u and v:
  Du,v = min {length of all paths between u and v} = min {3, 4, 3, 2} = 2 = dist(u,v)
• Average diameter of the whole network:
  D = avg {Du,v for all pairs of nodes {u,v} in G}
• Spectrum of the shortest path lengths [Figure: example spectrum, not for G]
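A minimal sketch of the spectrum S(d), under the assumption that only connected pairs are counted; the names and the 4-node path are hypothetical, not taken from the slide.

```python
from collections import Counter, deque

def bfs_distances(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def shortest_path_spectrum(adj):
    """Return {d: S(d)}: the fraction of connected node pairs at distance d."""
    counts = Counter()
    for u in adj:
        for w, d in bfs_distances(adj, u).items():
            if w != u:
                counts[d] += 1  # each unordered pair is counted twice
    total = sum(counts.values())
    return {d: counts[d] / total for d in sorted(counts)}

# Hypothetical toy network: the path a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

print(shortest_path_spectrum(path))  # S(1) = 1/2, S(2) = 1/3, S(3) = 1/6
```

Counting every pair twice does not change the fractions, because the numerator and the denominator are doubled together.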

6) Node Centralities

(Readings: Chapter 3 of “Analysis of Biological Networks” by Junker and Schreiber)

• Rank nodes according to their “topological importance”
• Definition: centrality quantifies the topological importance of a node (edge) in a network.
• There are many different types of centralities:
  – Degree centrality
  – Closeness centrality
  – Eccentricity centrality
  – Betweenness centrality
  – Subgraph centrality
  – Eigenvector centrality
• Software tools: Visone (social nets) and CentiBiN (biological nets)

• Definitions:

1. Degree centrality, Cd(v): nodes with a large number of neighbors (i.e., edges) have high centrality. Therefore, we have Cd(v)=deg(v).

Example of a use of degree centrality:

In PPI networks, nodes with high degree centrality are considered to be “biologically important.” We will learn later in the course what this means.

2. Closeness centrality, Cc(v): nodes with short paths to all other nodes in the network have high closeness centrality.

$$C_c(v) = \frac{1}{\sum_{u \in V} \mathrm{dist}(u,v)}$$
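Degree and closeness centrality, as defined on this slide, can be sketched as below. The code assumes a connected, unweighted network with at least two nodes; the function names and the toy path are illustrative only.

```python
from collections import deque

def bfs_distances(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def degree_centrality(adj, v):
    """Cd(v) = deg(v)."""
    return len(adj[v])

def closeness_centrality(adj, v):
    """Cc(v) = 1 / (sum of dist(u, v) over all nodes u)."""
    dist = bfs_distances(adj, v)
    if len(dist) != len(adj):
        raise ValueError("closeness as defined here needs a connected network")
    return 1 / sum(dist.values())

# Hypothetical toy network: the path a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

print(degree_centrality(path, "b"))     # 2
print(closeness_centrality(path, "b"))  # 1 / (1 + 1 + 2) = 0.25
print(closeness_centrality(path, "a"))  # 1 / (1 + 2 + 3)
```

The inner node b is closer to everything else than the end node a, so it receives the higher closeness value.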

• Definitions:

3. Betweenness centrality, Cb(v): nodes (or edges) which occur in many of the shortest paths have high betweenness centrality.

$$C_b(v) = \frac{\sum_{s \neq v \neq t} \sigma_{st}(v)}{\sum_{s \neq v \neq t} \sigma_{st}}$$

The above summation means that there is a sum on the top and on the bottom of the fraction, where:

σst(v) = the number of shortest paths from s to t that pass through v

σst = the number of all shortest paths from s to t (they may or may not pass through node v)
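A brute-force sketch of betweenness for small networks follows. It counts shortest paths with one BFS per node and uses the fact that v lies on a shortest s-t path exactly when dist(s,v) + dist(v,t) = dist(s,t). Because the slide's note describes a sum on both the top and the bottom of the fraction, the function returns that ratio of sums together with the more commonly used variant that sums the per-pair ratios σst(v)/σst. The function names and the 4-cycle are assumptions.

```python
from collections import deque
from itertools import combinations

def shortest_path_counts(adj, s):
    """BFS from s: return (dist, sigma), where sigma[t] counts shortest s-t paths."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:           # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:  # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(adj, v):
    """Return (ratio_of_sums, sum_of_ratios) over unordered pairs {s, t}, s != v != t."""
    info = {s: shortest_path_counts(adj, s) for s in adj}
    dist_v, sigma_v = info[v]
    through = total = 0
    sum_of_ratios = 0.0
    for s, t in combinations([x for x in adj if x != v], 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:             # s and t are not connected
            continue
        on_path = (v in dist_s and t in dist_v
                   and dist_s[v] + dist_v[t] == dist_s[t])
        sigma_st_v = sigma_s[v] * sigma_v[t] if on_path else 0
        through += sigma_st_v
        total += sigma_s[t]
        sum_of_ratios += sigma_st_v / sigma_s[t]
    ratio_of_sums = through / total if total else 0.0
    return ratio_of_sums, sum_of_ratios

# Hypothetical toy network: the 4-cycle a - b - c - d - a.
square = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}

print(betweenness(square, "b"))  # (0.25, 0.5)
```

In the 4-cycle, b lies on one of the two shortest paths between a and c and on no other shortest path, so the two conventions give 1/4 and 1/2 respectively. For large networks a dedicated algorithm or tool would be used instead of this quadratic pair loop.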

• Definitions:

4. Eccentricity centrality, Ce(v): nodes with short paths to any other node have high eccentricity centrality.

Eccentricity of a node v is defined as

$$\mathrm{ecc}(v) = \max_{u \in V} \mathrm{dist}(u,v)$$

So it is the maximum shortest path length from node v to all other nodes u in V.

Eccentricity centrality of a node v:

$$C_e(v) = \frac{1}{\mathrm{ecc}(v)}$$

Thus, central nodes have higher Ce since they have lower ecc.

There exist many other definitions of node centralities.
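A minimal sketch of eccentricity, with eccentricity centrality taken as the reciprocal 1/ecc(v). The reciprocal form is an assumption consistent with the statement that lower ecc means higher Ce; the names and the toy path are also illustrative. A connected network with at least two nodes is assumed.

```python
from collections import deque

def bfs_distances(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def eccentricity(adj, v):
    """ecc(v): the largest shortest-path distance from v to any other node."""
    dist = bfs_distances(adj, v)
    if len(dist) != len(adj):
        raise ValueError("eccentricity as defined here needs a connected network")
    return max(dist.values())

def eccentricity_centrality(adj, v):
    """Assumed form: Ce(v) = 1 / ecc(v)."""
    return 1 / eccentricity(adj, v)

# Hypothetical toy network: the path a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

print(eccentricity(path, "a"), eccentricity(path, "b"))  # 3 2
print(eccentricity_centrality(path, "b"))                # 0.5
```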

• Example:

[Figure: example network with nodes A to J]

| Rank        | Degree  | Closeness | Betweenness |
|-------------|---------|-----------|-------------|
| 1 (highest) | D       | F, G      | H           |
| 2           | F, G    | D, H      | F, G        |
| 3           | A, B    | A, B      | I           |
| 4           | C, E, H | C, E      | D           |
| 5           | I       | I         | A, B        |
| 6 (lowest)  | J       | J         | C, D, J     |

• You need to know how to compute these centralities (and all other network properties) by hand on small networks.

• For large real-world networks, you could use software, e.g., CentiBiN:
  – http://centibin.ipk-gatersleben.de/

Network Properties

2. Local Network Properties
(Chapter 5 of the course textbook “Analysis of Biological Networks” by Junker and Schreiber)

• They encompass a larger number of constraints, thus reducing degrees of freedom in which networks being compared can vary

• How do we show that two networks are different?
• How do we show that they are the same?
• How do we quantify the level of similarity?

1) Network motifs
2) Graphlets

Two network comparison measures based on graphlets:
2.1) Relative Graphlet Frequency Distance between two networks
2.2) Graphlet Degree Distribution Agreement between two networks

1) Network Motifs (Uri Alon’s group, 2002-2004)

• Definition: A network motif is a small over-represented partial subgraph of a real network.

Here, over-represented means that it is over-represented when compared to networks coming from a random graph model.

Problem: What is expected at random, i.e., which network “null model” to use to identify motifs?

Example of a random graph model:
• Erdős-Rényi (ER) random graphs. Definition:
  – A graph on n nodes (for some positive integer n)
  – Edges are added between pairs of nodes uniformly at random with the same probability p

ER graphs usually have a small number of dense (in terms of number of edges) subgraphs. There will be no regions in the network that have a large density of edges. Why?
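The ER definition can be sketched directly: every possible edge is included independently with the same probability p, so no group of nodes is favored and the expected edge density of any subset of nodes equals p. The function name and the `seed` parameter below are illustrative assumptions.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Sample an ER graph on nodes 0..n-1; each pair is joined with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def edge_count(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2

g = erdos_renyi(200, 0.05, seed=1)
# The expected number of edges is p * n * (n - 1) / 2 = 995 for these parameters.
print(edge_count(g))
```

A single sample will deviate from the expected edge count, so comparisons against a null model are usually made over many sampled graphs.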


Example:

If motifs are identified when comparing the data with ER model networks, every dense subgraph would come up as a motif because they do not exist in our ER model networks.


• Small subgraphs that are overrepresented in a network when compared to randomized networks
• Network motifs:
  – Reflect the underlying evolutionary processes that generated the network
  – Carry functional information
  – Define superfamilies of networks
    - Zi is the statistical significance of subgraph i; SPi is a vector of numbers in 0-1
• But:
  – Functionally important but not statistically significant patterns could be missed
  – The choice of the appropriate null model is crucial, especially across “families”
  – Random graphs with the same in- and out-degree distribution as data might not be the best network null model
  – Motifs are partial subgraphs, while we use induced ones to understand network structure
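The slide does not spell out how Zi and SPi are computed. A commonly used formulation, assumed here rather than taken from the lecture, is a Z-score of the subgraph count against counts in randomized networks, followed by normalizing the vector of Z-scores to unit length. All names and numbers in the sketch are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev

def z_score(real_count, random_counts):
    """Z_i = (N_real - mean(N_rand)) / std(N_rand) for one subgraph type."""
    sd = pstdev(random_counts)
    if sd == 0:
        raise ValueError("randomized counts have zero spread; Z-score undefined")
    return (real_count - mean(random_counts)) / sd

def significance_profile(z_scores):
    """SP_i = Z_i / sqrt(sum_j Z_j^2): the Z-score vector scaled to length 1."""
    norm = sqrt(sum(z * z for z in z_scores))
    return [z / norm for z in z_scores]

# Made-up counts of one subgraph in the data and in 8 randomized networks.
print(z_score(10, [2, 4, 4, 4, 5, 5, 7, 9]))  # 2.5
print(significance_profile([3.0, 4.0]))       # approximately [0.6, 0.8]
```

Under this formulation each entry of the profile has magnitude at most 1, and the result depends entirely on which null model produced the randomized counts.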

Example: Feed-forward loop

[Figure: feed-forward loop]

Shen-Orr, Milo, Mangan, and Alon, “Network motifs in the transcriptional regulation network of Escherichia coli,” Nature Genetics, 2002
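A feed-forward loop consists of three nodes x, y, z with directed edges x→y, y→z, and x→z. The sketch below counts such occurrences as partial subgraphs in a small directed network; the successor-dictionary representation, the function name, and the toy networks are assumptions for illustration.

```python
def count_feed_forward_loops(succ):
    """Count ordered triples (x, y, z) with edges x->y, y->z and x->z."""
    count = 0
    for x in succ:
        for y in succ[x]:
            if y == x:
                continue
            for z in succ.get(y, ()):
                if z != x and z != y and z in succ[x]:
                    count += 1
    return count

# Hypothetical regulatory toy network: X regulates Y and Z, Y regulates Z.
ffl = {"X": {"Y", "Z"}, "Y": {"Z"}, "Z": set()}
# A directed 3-cycle contains no feed-forward loop.
cycle = {"X": {"Y"}, "Y": {"Z"}, "Z": {"X"}}

print(count_feed_forward_loops(ffl))    # 1
print(count_feed_forward_loops(cycle))  # 0
```

To decide whether the loop is a motif, this count would be compared with the counts obtained in networks drawn from the chosen null model.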

Uri Alon’s group: http://www.weizmann.ac.il/mcb/UriAlon/

Also, see Pajek, MAVisto, and FANMOD

2) Graphlets (Pržulj group, ’04-’10)

Different from network motifs:
• Induced subgraphs
• Of any frequency (don’t need to be over-represented)

N. Pržulj, D. G. Corneil, and I. Jurisica, “Modeling Interactome: Scale Free or Geometric?,” Bioinformatics, vol. 20, num. 18, pg. 3508-3515, 2004.


2.1) Relative Graphlet Frequency (RGF) distance between networks G and H:
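The formula for this distance did not survive in the transcript. As a hedged reconstruction from the Pržulj, Corneil, and Jurisica (2004) paper cited in these slides, to be checked against the original, the definition can be written as follows, where $N_i(G)$ is the number of graphlets of type $i$ in $G$ and the 29 graphlets are those on 3 to 5 nodes:

```latex
\begin{gather*}
T(G) = \sum_{i=1}^{29} N_i(G), \qquad
F_i(G) = -\log\!\left( \frac{N_i(G)}{T(G)} \right) \\
D(G,H) = \sum_{i=1}^{29} \left| F_i(G) - F_i(H) \right|
\end{gather*}
```

The logarithm keeps the very frequent graphlet types from dominating the sum, so rare graphlets still contribute to the distance.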

2.2) Graphlet Degree Distributions

• Generalize node degree

N. Pržulj, “Biological Network Comparison Using Graphlet Degree Distribution,” ECCB, Bioinformatics, vol. 23, pg. e177-e183, 2007.

Network structure vs. biological function & disease

Graphlet Degree (GD) vectors, or “node signatures”

T. Milenković and N. Pržulj, “Uncovering Biological Network Function via Graphlet Degree Signatures,” Cancer Informatics, vol. 6, pg. 257-273, 2008.

Similarity measure between “node signature” vectors

Signature Similarity Measure between nodes u and v
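The formula itself is missing from the transcript. As a hedged reconstruction from the cited Milenković and Pržulj (2008) paper, to be verified against the original: $u_i$ denotes the $i$-th coordinate of the signature of node $u$ (one coordinate per orbit, $i = 0, \dots, 72$), $o_i$ is the number of orbits that orbit $i$ depends on, and $w_i$ is the resulting weight.

```latex
\begin{gather*}
w_i = 1 - \frac{\log(o_i)}{\log(73)}, \qquad
D_i(u,v) = w_i \times
  \frac{\left| \log(u_i + 1) - \log(v_i + 1) \right|}{\log\!\left( \max\{u_i, v_i\} + 2 \right)} \\
D(u,v) = \frac{\sum_{i=0}^{72} D_i(u,v)}{\sum_{i=0}^{72} w_i}, \qquad
S(u,v) = 1 - D(u,v)
\end{gather*}
```

With this scaling $S(u,v)$ lies between 0 and 1, and a value of 1 means the two nodes have identical signatures.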

[Figure: example with proteins SMD1, PMA1, and YBR095C; signature similarity 40%]

[Figure: example with proteins SMD1, SMB1, and RPO26; signature similarity 90%*]

*Statistically significant threshold at ~85%

Later we will see how to use this and other techniques to link network structure with biological function.

Generalize the degree distribution of a network

The degree distribution measures:
• the number of nodes “touching” k edges for each value of k

The distance between the graphlet degree distributions of the two networks is divided by sqrt(2) (to make it between 0 and 1).

The resulting measure is called Graphlet Degree Distribution (GDD) Agreement between networks G and H.
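The intermediate formulas were lost in the transcript. A hedged reconstruction from the cited Pržulj (2007) paper, to be checked against the original, is given below, where $d_G^j(k)$ is the number of nodes in $G$ that touch orbit $j$ exactly $k$ times:

```latex
\begin{gather*}
S_G^j(k) = \frac{d_G^j(k)}{k}, \qquad
T_G^j = \sum_{k=1}^{\infty} S_G^j(k), \qquad
N_G^j(k) = \frac{S_G^j(k)}{T_G^j} \\
D^j(G,H) = \frac{1}{\sqrt{2}}
  \left( \sum_{k=1}^{\infty} \left[ N_G^j(k) - N_H^j(k) \right]^2 \right)^{1/2}, \qquad
A^j(G,H) = 1 - D^j(G,H) \\
A(G,H) = \frac{1}{73} \sum_{j=0}^{72} A^j(G,H)
\end{gather*}
```

The last line uses the arithmetic mean over the 73 orbits; a geometric mean over the same $A^j$ values is a variant. For orbit $j = 0$ the distribution $d_G^0(k)$ is the ordinary degree distribution.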

Software that implements many of these network properties and compares networks with respect to them: GraphCrunch
• http://bio-nets.doc.ic.ac.uk/graphcrunch/
• http://bio-nets.doc.ic.ac.uk/graphcrunch2/
