2014-11-10 © ETH Zürich |
Modeling and Simulating Social Systems with MATLAB
Lecture 8 – Introduction to Graphs/Networks
Chair of Sociology, in particular of
Modeling and Simulation
Olivia Woolley, Tobias Kuhn, Dario Biasini, Dirk Helbing
2014-11-10 Modeling and Simulation of Social Systems with MATLAB 2
Schedule of the course
Dates: 22.09., 29.09., 06.10., 13.10., 20.10., 27.10., 03.11., 10.11., 17.11., 24.11., 01.12., 08.12., 15.12.
Introduction to MATLAB
Introduction to social-science modeling and simulations
Working on projects (seminar thesis)
Handing in seminar thesis and giving a presentation
Seven Bridges of Königsberg
§ Graph theory was born in 1736, when Euler posed the following problem: Is there a walk through the city of Königsberg that crosses each of the seven bridges exactly once?
Seven Bridges of Königsberg (II)
§ To approach the problem, Euler represented the relevant information as a graph, with the land masses as nodes and the bridges as links.
Source: wikipedia.org
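Euler's answer can be checked in a few lines of MATLAB. The sketch below is not from the slides. It assumes the usual reading of the Königsberg map (four land masses, seven bridges), a node numbering chosen here for illustration, and Euler's criterion that a connected graph has a walk using every link exactly once only if zero or two nodes have an odd number of links.

```matlab
% Assumed numbering: 1 = north bank, 2 = south bank, 3 = island, 4 = eastern land mass.
% A(i,j) = number of bridges between land masses i and j (7 bridges in total).
A = [0 0 2 1;
     0 0 2 1;
     2 2 0 1;
     1 1 1 0];

deg  = sum(A, 2);                  % number of bridges touching each land mass
nOdd = sum(mod(deg, 2) == 1);      % how many nodes have odd degree

% Euler's criterion: such a walk needs exactly 0 or 2 odd-degree nodes.
hasEulerWalk = (nOdd == 0) || (nOdd == 2);
```

Here all four nodes have odd degree, so `hasEulerWalk` is false: no such walk exists.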
Definition of Graph
A graph consists of two entities:
§ Nodes (vertices): N
§ Links: L
  § Edge: undirected link
  § Arc: directed link
The graph is defined as G = (N,L)
Source: Batagelj
Properties of Links and Nodes
§ A link can be encoded either as a:
  § boolean flag (connection vs. no connection), or
  § value or weight (distance, traveling time, etc.)
§ Links of different types can exist (multiplex networks)
§ A node can also contain information (attributes)
The social network
Graphs - Examples
Internet Map [opte project]
Food Web [Martinez ’91]
Protein Interactions [genomebiology.com]
Friendship Network [Moody ’01]
Graphs - Examples
NETWORK                            NODES         LINKS
Protein interaction                Proteins      Metabolic reactions
Internet                           Routers       Communication channels
Social networks                    Individuals   Social relations
WWW                                Web pages     Hyperlinks
Scientific coauthorship networks   Authors       Papers
Paths
§ Path of length n = ordered collection of
  § n+1 nodes, e.g. A, C, D, E in G = (N,L)
  § n links, e.g. (A,C), (C,D), (D,E) in G = (N,L)
§ Circuit = closed path (last node = first node)
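As a minimal sketch of this definition, a node sequence is a path if every consecutive pair of nodes is joined by a link. The adjacency matrix `A` and the sequences `p` and `q` are illustrative assumptions, not taken from the slide.

```matlab
% Small undirected example graph: links 1-2, 1-3, 2-3, 3-4.
A = [0 1 1 0;
     1 0 1 0;
     1 1 0 1;
     0 0 1 0];

p = [1 3 4];                                   % candidate sequence of nodes
idx = sub2ind(size(A), p(1:end-1), p(2:end));  % entries A(1,3) and A(3,4)
isPath = all(A(idx) ~= 0);                     % every consecutive pair is linked
pathLength = numel(p) - 1;                     % n links for n+1 nodes

q = [1 2 4];                                   % nodes 2 and 4 are not linked
isPathQ = all(A(sub2ind(size(A), q(1:end-1), q(2:end))) ~= 0);
```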
Paths and connectedness
§ A graph G = (N,L) is connected if and only if there exists a path connecting any two nodes in G
[Figure panels: Connected (Tree); Not Connected (Forest); Connected with loops]
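One way to test connectedness is a breadth-first search from one node: the graph is connected if the search reaches every node. This sketch assumes undirected graphs stored as adjacency matrices. The two example matrices are made up here to mirror the tree and forest cases.

```matlab
graphs = {[0 1 0 0; 1 0 1 1; 0 1 0 0; 0 1 0 0], ...  % tree: node 2 linked to 1, 3 and 4
          [0 1 0 0; 1 0 0 0; 0 0 0 1; 0 0 1 0]};     % forest: only links 1-2 and 3-4

isConnected = false(1, numel(graphs));
for g = 1:numel(graphs)
    A = graphs{g};
    visited = false(1, size(A, 1));
    visited(1) = true;                 % start the search at node 1
    queue = 1;
    while ~isempty(queue)
        v = queue(1);                  % take the oldest node in the queue
        queue(1) = [];
        nb = find((A(v, :) ~= 0) & ~visited);   % unvisited neighbours of v
        visited(nb) = true;
        queue = [queue, nb];
    end
    isConnected(g) = all(visited);     % connected if every node was reached
end
```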
Giant Component
§ The giant component is a connected component that contains the vast majority of the nodes of a graph.
Shortest paths
§ The shortest path between i and j is the path that traverses the minimum number of edges
§ Distance l(i,j) = length of the shortest path between i and j
§ Diameter D of the graph = max(l(i,j)) over all pairs of nodes i, j
[Figure: example graph with nodes I, J, A, X, H, B, D]
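For an unweighted graph, the distances l(i,j) can be obtained with a breadth-first search from every node. The sketch below assumes a connected, undirected graph given as an adjacency matrix. The example matrix is an illustration, not the graph in the figure.

```matlab
A = [0 1 1 0;
     1 0 1 0;
     1 1 0 1;
     0 0 1 0];
n = size(A, 1);

D = inf(n);                        % D(s,t) will hold the distance l(s,t)
for s = 1:n
    D(s, s) = 0;
    queue = s;
    while ~isempty(queue)
        v = queue(1);
        queue(1) = [];
        % neighbours of v that have not been reached from s yet
        nb = find((A(v, :) ~= 0) & isinf(D(s, :)));
        D(s, nb) = D(s, v) + 1;
        queue = [queue, nb];
    end
end

diameter = max(D(:));              % longest shortest path in the graph
```

If the graph is not connected, some entries of `D` stay `Inf` and so does `diameter`.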
Shortest paths: Average Path Length
§ Average path length is the average number of steps along the shortest paths for all possible pairs of network nodes.
§ It is a measure of the efficiency of transport through a network, e.g. how quickly an epidemic can spread.
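Given a matrix of shortest-path distances, the average path length is the mean over all distinct node pairs. The distance matrix `D` below is an assumed example for a connected, undirected four-node graph with links 1-2, 1-3, 2-3 and 3-4.

```matlab
% D(i,j) = shortest-path distance between nodes i and j (assumed example).
D = [0 1 1 2;
     1 0 1 2;
     1 1 0 1;
     2 2 1 0];
n = size(D, 1);

pairMask = triu(true(n), 1);          % select each unordered pair (i < j) once
avgPathLength = mean(D(pairMask));    % (1 + 1 + 1 + 2 + 2 + 1) / 6
```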
Centrality Measures
§ The importance of a node can be captured by:
  § Degree: number of connections
  § Flux or strength: sum of the strengths of all connections
  § Closeness: average distance (inverse of connection strength) from others
  § Eigenvector centrality (e.g. PageRank): the centrality score is higher the more high-scoring nodes a node is connected to
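The first two measures follow directly from the matrix representation, as sketched below. The matrices are assumptions for illustration: `A` is an unweighted adjacency matrix and `W` carries a weight for every link of the same graph.

```matlab
A = [0 1 1 0;
     1 0 1 0;
     1 1 0 1;
     0 0 1 0];
W = [ 0  54  57   0;
     54   0  55   0;
     57  55   0 101;
      0   0 101   0];

degree   = sum(A ~= 0, 2);   % number of connections of each node
strength = sum(W, 2);        % sum of the weights of each node's connections
```

In this example node 3 has both the highest degree and the highest strength.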
Centrality Measures: Betweenness Centrality
§ Idea: Controlling network flows
§ The number of shortest paths passing through a node v. Namely, the sum over all node pairs s ≠ v ≠ t of σ_st(v) / σ_st, where
  σ_st = number of shortest paths from s to t
  σ_st(v) = number of shortest paths from s to t passing through v
[Figure: example of a node v with high betweenness centrality]
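The two counts can be computed with a breadth-first search that also tallies the number of shortest paths. This is a plain sketch for small, connected, undirected, unweighted graphs, not an optimized algorithm. The example matrix and all variable names are assumptions. It relies on σ_st(v) = σ_sv · σ_vt whenever v lies on a shortest path from s to t.

```matlab
A = [0 1 1 0;
     1 0 1 0;
     1 1 0 1;
     0 0 1 0];
n = size(A, 1);

D = inf(n);       % D(s,t): length of the shortest path from s to t
S = zeros(n);     % S(s,t): number of shortest paths from s to t (sigma_st)
for s = 1:n
    D(s, s) = 0;
    S(s, s) = 1;
    queue = s;
    while ~isempty(queue)
        v = queue(1);
        queue(1) = [];
        for w = find(A(v, :) ~= 0)
            if isinf(D(s, w))              % w is reached for the first time
                D(s, w) = D(s, v) + 1;
                queue(end + 1) = w;
            end
            if D(s, w) == D(s, v) + 1      % v precedes w on a shortest path
                S(s, w) = S(s, w) + S(s, v);
            end
        end
    end
end

betweenness = zeros(n, 1);
for v = 1:n
    for s = 1:n
        for t = s+1:n                      % each unordered pair {s,t} once
            if s ~= v && t ~= v && D(s, v) + D(v, t) == D(s, t)
                betweenness(v) = betweenness(v) + S(s, v) * S(v, t) / S(s, t);
            end
        end
    end
end
```

Only node 3 lies on shortest paths between other nodes (1 to 4 and 2 to 4), so it is the only node with non-zero betweenness.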
Statistical description of network topology: Degree Distribution
§ Probability distribution function P(k) of the degree k of nodes
§ Random graph: P(k) = binomial distribution
§ Scale-free graph: P(k) ∝ k^(-γ) (power law)
Source: www.computerworld.com
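An empirical degree distribution can be read off an adjacency matrix by counting how many nodes have each degree. The sketch below uses a small made-up graph. For a real analysis the same lines would be applied to a much larger network.

```matlab
A = [0 1 1 0;
     1 0 1 0;
     1 1 0 1;
     0 0 1 0];

k = sum(A ~= 0, 2);                    % degree of every node
kValues = (0:max(k))';                 % degrees 0, 1, ..., max degree

% P(j) is the fraction of nodes with degree kValues(j), i.e. degree j-1.
P = accumarray(k + 1, 1, [max(k) + 1, 1]) / numel(k);
```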
Examples of different network topologies
Source: Wang (2003)
Local structure: Clustering Coefficient
§ Local clustering coefficient C(i): fraction of pairs of neighbors of node i that are also neighbors of each other.
§ Global clustering coefficient: network average of C(i)
§ It measures how “cliquish” a network is.
Source: Costa (2008)
Question: What is the local clustering coefficient for the node i?
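The definition translates into a short loop over the nodes. This sketch uses an assumed example graph, not the one in the slide's figure. It also assumes the common convention C(i) = 0 for nodes with fewer than two neighbors.

```matlab
A = [0 1 1 0;
     1 0 1 0;
     1 1 0 1;
     0 0 1 0];
n = size(A, 1);

C = zeros(n, 1);
for i = 1:n
    nb = find(A(i, :) ~= 0);           % neighbours of node i
    k = numel(nb);
    if k >= 2
        % links among the neighbours; each undirected link appears twice in A
        linksAmongNb = nnz(A(nb, nb)) / 2;
        C(i) = linksAmongNb / (k * (k - 1) / 2);
    end
end

Cglobal = mean(C);                     % network average of the local values
```

In this example nodes 1 and 2 have C = 1, node 3 has C = 1/3 (one link among three neighbor pairs), and node 4 has C = 0.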
Small Worlds: Clustering & small diameter
§ Graphs are useful for modeling social networks, disease spreading, transportation, and so on.
§ One of the most famous graph studies is the Small World Experiment (S. Milgram), which suggests that any two persons in the world are typically connected through a chain of only about 5 intermediate acquaintances.
Small World Example: Oracle of Bacon
§ The web page http://oracleofbacon.org/ finds the path from any actor, of any era, to the Hollywood actor Kevin Bacon.
§ It can also be used to find the shortest path between any two actors.
Small World Network Properties
§ Highly clustered, like regular lattices, yet with small path lengths, like random graphs.
§ A small-world network is defined as a network in which the typical distance L between two randomly chosen nodes grows proportionally to the logarithm of the total number of nodes N, i.e. L ∝ log N.
Small World model
Source: Watts, D. J., & Strogatz, S. H. (1998)
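A rough sketch of a rewiring model in the spirit of Watts and Strogatz: start from a ring lattice in which every node is linked to its K nearest neighbors, then rewire each link with some probability to a randomly chosen node. The parameter names (`N`, `K`, `pRewire`) and details such as how duplicate links are avoided are assumptions of this sketch, not taken from the slide.

```matlab
N = 20;          % number of nodes
K = 4;           % each node starts with K neighbours (K even)
pRewire = 0.1;   % probability of rewiring a link

% Step 1: ring lattice, node i linked to the K/2 next nodes on each side.
A = zeros(N);
for i = 1:N
    for d = 1:K/2
        j = mod(i - 1 + d, N) + 1;
        A(i, j) = 1;
        A(j, i) = 1;
    end
end

% Step 2: rewire the far end of each lattice link with probability pRewire.
for i = 1:N
    for d = 1:K/2
        j = mod(i - 1 + d, N) + 1;
        if A(i, j) == 1 && rand < pRewire
            candidates = find(A(i, :) == 0);      % nodes not yet linked to i
            candidates(candidates == i) = [];     % no self-links
            if ~isempty(candidates)
                newJ = candidates(randi(numel(candidates)));
                A(i, j) = 0;     A(j, i) = 0;     % remove the old link
                A(i, newJ) = 1;  A(newJ, i) = 1;  % add the new one
            end
        end
    end
end
```

With `pRewire = 0` the result is the regular lattice and with `pRewire = 1` it approaches a random graph. The number of links stays at N*K/2 throughout.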
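MATLAB Implementation
§ A graph can be implemented in MATLAB via its adjacency matrix, i.e. an N x N matrix that defines how each of the N nodes is connected to the other N-1 nodes:
N = 10;            % number of nodes
A = zeros(N, N);   % start without any links
A(1,2) = 1;        % link from node 1 to node 2
A(10,4) = 1;       % link from node 10 to node 4
…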
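Note that a single assignment such as `A(1,2) = 1` only sets one direction (an arc). For an undirected edge both entries must be set. The sketch below illustrates this. The variable name `Aund` and the use of `max` for symmetrizing are choices made here, not part of the slide.

```matlab
N = 10;
A = zeros(N, N);
A(1,2) = 1;                    % arc from node 1 to node 2 only
A(10,4) = 1;                   % arc from node 10 to node 4 only

isUndirected = isequal(A, A'); % false: the reverse entries are missing

% One way to turn every arc into an undirected edge:
Aund = max(A, A');             % A(i,j) or A(j,i) set -> both set
isUndirectedNow = isequal(Aund, Aund');
```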
Graphs
§ If the nodes are cities and the links define connections and travel times for the SBB network, it looks like this:
[Figure: graph with four nodes, numbered 1 to 4, for the cities Geneva, Basel, Bern and Zurich]

A = [0 1 1 0; 1 0 1 0; 1 1 0 1; 0 0 1 0];

A =
       1  2  3  4
   1   0  1  1  0
   2   1  0  1  0
   3   1  1  0  1
   4   0  0  1  0
§ With the travel times as link weights (0:54, 0:55, 0:57 and 1:41 in the figure), the matrix entries become travel times in minutes:

A =
        1    2    3    4
   1    0   54   57    0
   2   54    0   55    0
   3   57   55    0  101
   4    0    0  101    0
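A short sketch of how such a weighted matrix can be used: the unweighted adjacency matrix is recovered by testing for non-zero entries, and the travel time along a route is the sum of the weights of its links. The variable names and the example route are assumptions. Note that 0 means "no direct connection" here, not a travel time of zero.

```matlab
W = [ 0  54  57   0;
     54   0  55   0;
     57  55   0 101;
      0   0 101   0];

A = double(W > 0);             % unweighted adjacency matrix

route = [1 3 4];               % example route through nodes 1, 3 and 4
legs = W(sub2ind(size(W), route(1:end-1), route(2:end)));  % weights of each leg
totalMinutes = sum(legs);      % 57 + 101
```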
Alternative Ways to Store Network Data
§ Edge/arc lists can easily be stored in a file and loaded when needed
[Figure: the four-city graph]
1 2
1 3
2 1
2 3
3 1
3 2
3 4
4 3
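An arc list like the one above can be derived from an adjacency matrix with `find`, and turned back into a matrix with `sub2ind`. This is a minimal sketch. The variable names are illustrative.

```matlab
A = [0 1 1 0;
     1 0 1 0;
     1 1 0 1;
     0 0 1 0];

% find works column by column, so transpose A to get the list sorted by
% source node first; the outputs are then (target, source) of each arc.
[target, source] = find(A');
edges = [source, target];      % one row per arc: [from, to]

% Rebuild the adjacency matrix from the list.
n = size(A, 1);
B = zeros(n);
B(sub2ind([n n], edges(:, 1), edges(:, 2))) = 1;
```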
Alternative Ways to Store Network Data
§ Cell arrays can contain vectors of different sizes
[Figure: the four-city graph]
>> A = [2 3];
>> B = [1 3];
>> C = [1 2 4];
>> D = [3];
>> Net = {A;B;C;D};
>> Net{1}(1)
ans = 2
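With this neighbor-list representation, the degree of a node is the length of its vector, and the adjacency matrix can be rebuilt with a short loop. The following is a sketch under the assumption that `Net{i}` lists the neighbors of node i, as in the example above.

```matlab
Net = {[2 3]; [1 3]; [1 2 4]; [3]};   % Net{i} = neighbours of node i

degree = cellfun(@numel, Net);        % number of neighbours per node

n = numel(Net);
A = zeros(n);
for i = 1:n
    A(i, Net{i}) = 1;                 % mark every neighbour of node i
end
```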
Alternative Ways to Store Network Data
§ Cell arrays grant more freedom in representing data structures, at the cost of losing the simplicity and clarity of the matrix notation.

        1    2    3    4
   1    0   54   57    0
   2   54    0   55    0
   3   57   55    0  101
   4    0    0  101    0

>> A = [2,54; 3,57];
>> B = [1,54; 3,55];
>> C = [1,57; 2,55; 4,101];
>> D = [3,101];
>> Net = {A;B;C;D};

Warning: you must validate your own data structure! For example, the following line differs from the one above only by a comma in place of a semicolon after 55:
>> C = [1,57; 2,55, 4,101];
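What "validating" means depends on the application. As one possible sketch, the checks below assume that each cell holds a k-by-2 matrix of [neighbor, weight] rows, that neighbor indices must be valid and different from the node itself, and that the network is undirected, so the weights must be symmetric. These rules are assumptions made for illustration.

```matlab
Net = {[2,54; 3,57]; [1,54; 3,55]; [1,57; 2,55; 4,101]; [3,101]};
n = numel(Net);

W = zeros(n);
isValid = true;
for i = 1:n
    entry = Net{i};
    % each entry needs two columns and neighbour indices in 1..n, excluding i
    if size(entry, 2) ~= 2 || ...
            any(entry(:, 1) < 1 | entry(:, 1) > n | entry(:, 1) == i)
        isValid = false;
        break;
    end
    W(i, entry(:, 1)) = entry(:, 2)';   % copy the weights into matrix form
end

% for an undirected network every link must appear in both directions
isValid = isValid && isequal(W, W');
```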
Software Packages for Graph Visualization
§ The following programs are valuable tools for representing and visualizing networks:
  § Pajek (http://pajek.imfm.si/doku.php) -> Easy to use
  § NWB (http://nwb.cns.iu.edu/) -> Good for Analysis
  § Gephi (http://gephi.org/) -> New
  § Visone (http://visone.info/) -> made in Konstanz
  § JUNG (http://jung.sourceforge.net/) -> library
  § Net Draw (http://www.analytictech.com/netdraw/netdraw.htm)
  § Pegasus (http://www.cs.cmu.edu/~pegasus/) -> for huge data
§ Use them!!
Exporting and visualizing a graph in Gephi
§ csvwrite('filename', matrix) writes a matrix as a list of comma-separated values…
§ …but works only with adjacency matrices.
§ Often we need an edge list (cell array).
§ Download two files from the web site: cell2csv.m and export.m
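For a purely numeric edge list, a CSV file with one "source,target" pair per line can also be written with standard file functions. The sketch below is an independent illustration. It does not use `cell2csv.m` or `export.m`, whose contents are not shown here, and the temporary file name is generated only for the example.

```matlab
edges = [1 2; 1 3; 2 3; 3 4];        % one row per link: [source, target]

filename = [tempname '.csv'];        % temporary file, for illustration only
fid = fopen(filename, 'w');
fprintf(fid, '%d,%d\n', edges');     % transpose so each row is written in order
fclose(fid);

readBack = csvread(filename);        % read the file again to check the result
delete(filename);                    % clean up the temporary file
```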
Exporting and visualizing a graph in Gephi
§ Download Gephi
§ Open the .csv edge list that you just exported
§ Visualize the network
§ Compute the modularity score:
Live demo which should get this as a final result
References
§ Handbook of Graphs and Networks: From the Genome to the Internet, edited by S. Bornholdt, H. G. Schuster. John Wiley and Sons, 2003.
§ Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of ‘small-world’ networks. Nature, 393(6684), 440-442.
§ Newman, M. E. (2003). The structure and function of complex networks. SIAM Review, 45(2), 167-256.
§ Newman, M. E. (2009). Networks: an introduction. Oxford University Press.
§ Easley, D., & Kleinberg, J. (2010). Networks, crowds, and markets. Cambridge: Cambridge University Press.
§ Xiao Fan Wang and Guanrong Chen. Complex Networks: Small-World, Scale-Free and Beyond.
§ GEPHI:
  § https://gephi.org/users/supported-graph-formats/csv-format/
  § https://gephi.org/users/
  § https://wiki.gephi.org/index.php/Datasets