Date post: | 19-Jul-2015 |
Category: |
Data & Analytics |
Upload: | lokesh-ramaswamy |
Agenda
• What is a Social Network
• Why Social Network Analysis
• Basic Vocabulary
• Network Measures
• Centrality
• Cohesion
Social Network
• A Social Network is a social structure made up of individuals (or organizations) called ‘nodes’, which are tied (connected) by one or more specific types of interdependency, such as friendship, kinship, common interest, financial exchange, dislike, or relationships of beliefs, knowledge or prestige.
• Social Network Analysis views social relationships in terms of network theory consisting of nodes and ties (also called edges, links or connections). Nodes are the individual actors within the networks, and ties are relationships between the actors.
• A social network is a map of specified ties, such as friendship, between the nodes being studied.
Why Social Network Analysis
• Enables us to segment data based on user behaviour
• Understand natural groups that have formed
  • Topics
  • Personal characteristics
• Understand who are important people in these groups
Node
• Email – person’s mail address
• Web – URLs, http://...
• Wikipedia – articles, URLs
• Twitter – Twitter Name
• Facebook – Facebook Name
• Video – Video URLs, http://...
Link
• Email – FROM Address TO Address
• Web – FROM URL TO URL
• Wikipedia – FROM articles TO Articles
• Twitter – FROM (comm) TO (comm target)
• Facebook – FROM (name) TO (name)
• Video – FROM URLs to URLs
Degree
• The count of the number of ties to other actors in the network
• Indegree is the count of the number of ties directed to the node (popularity)
• Outdegree is the number of ties that the node directs to others (gregariousness)
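The three counts above can be sketched in a few lines. This is a minimal illustration only; the tie list and node names are hypothetical and do not come from the slides.

```python
from collections import Counter

# Hypothetical directed ties (sender, receiver); not taken from the slides.
ties = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "c")]

outdegree = Counter(src for src, _ in ties)  # ties the node directs to others
indegree = Counter(dst for _, dst in ties)   # ties directed to the node

nodes = sorted({n for tie in ties for n in tie})
# Total degree: every tie the node takes part in, in either direction.
degree = {n: indegree[n] + outdegree[n] for n in nodes}

for n in nodes:
    print(n, "degree:", degree[n], "in:", indegree[n], "out:", outdegree[n])
```

In this toy data, node c is the most "popular" (highest indegree) and node a the most "gregarious" (highest outdegree).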
Density
• Indicates the robustness of the network
• In mathematics, a dense graph is a graph in which the number of edges is close to the maximal number of edges
• A graph with only a few edges is a sparse graph
6 actual / 6 possible = 1
2 actual / 6 possible = 0.33
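The two ratios above can be reproduced with a short sketch. It assumes an undirected graph with no self-loops, so that n nodes allow n(n − 1)/2 possible edges; the "6 possible" in the examples is consistent with 4 nodes, which is an inference, not something the slides state. The function name `density` is illustrative.

```python
def density(n_nodes, n_edges):
    """Share of possible undirected edges that are actually present."""
    possible = n_nodes * (n_nodes - 1) // 2
    return n_edges / possible

print(density(4, 6))            # all 6 possible edges present
print(round(density(4, 2), 2))  # only 2 of 6 possible edges present
```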
Bridge
• An edge is said to be a bridge, if deleting it would cause its end points to lie in different components of the graph
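The definition translates directly into a check: delete the edge, then see whether its two end points can still reach each other. The sketch below does exactly that on a hypothetical undirected graph (a triangle with one extra edge hanging off it). It is a brute-force illustration, not an efficient bridge-finding algorithm.

```python
from collections import defaultdict, deque

def reachable(edges, start, goal):
    """Breadth-first search over undirected edges: can start reach goal?"""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            return True
        for w in adj[u] - seen:
            seen.add(w)
            queue.append(w)
    return False

def bridges(edges):
    """Edges whose removal leaves their end points in different components."""
    return [(u, v) for i, (u, v) in enumerate(edges)
            if not reachable(edges[:i] + edges[i + 1:], u, v)]

# Hypothetical graph: triangle a-b-c plus a pendant edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(bridges(edges))
```

Only the edge c-d is a bridge here, because every edge of the triangle has an alternative route around it.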
Distance
• Defined for a pair of vertices
• Number of connections in the path between 2 vertices
• $\mathrm{dist}(u, v) \ge 0$, and $\mathrm{dist}(u, v) = 0$ only if $u = v$
• $\mathrm{dist}(u, v) = \mathrm{dist}(v, u)$
• $\mathrm{dist}(u, v) + \mathrm{dist}(v, w) \ge \mathrm{dist}(u, w)$
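As a rough sketch, distance in an unweighted, undirected graph can be computed with a breadth-first search, taking the shortest path between the two vertices. The adjacency list below is hypothetical, and the final lines simply spot-check the three properties listed above on it.

```python
from collections import deque

def dist(adj, u, v):
    """Number of links on a shortest path from u to v (None if unreachable)."""
    seen = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return seen[x]
        for y in adj[x]:
            if y not in seen:
                seen[y] = seen[x] + 1
                queue.append(y)
    return None

# Hypothetical undirected graph given as an adjacency list.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}

print(dist(adj, "a", "d"))
print(dist(adj, "a", "a") == 0)                                       # identity
print(dist(adj, "a", "d") == dist(adj, "d", "a"))                     # symmetry
print(dist(adj, "a", "b") + dist(adj, "b", "d") >= dist(adj, "a", "d"))  # triangle inequality
```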
Eccentricity
• Defined for a single vertex
• Eccentricity of a vertex $v \in V(G)$ is $e(v) = \max_{u \in V(G)} d(u, v)$
• e(v) = 1 only if v is adjacent to all other vertices
[Figure: example graph on vertices v1 to v5]
e(v1) = e(v5) = 3
e(v2) = e(v3) = e(v4) = 3
[Figure: example graph on vertices a, b, c, d]
e(a) = 1
e(b) = e(c) = e(d) = 2
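The definition can be sketched as "run a breadth-first search from v and take the largest distance found". The example assumes a star with centre a and leaves b, c, d, which is one graph consistent with the values e(a) = 1 and e(b) = e(c) = e(d) = 2; the original figure is not recoverable, so treat the edge list as an assumption.

```python
from collections import deque

def distances_from(adj, source):
    """Breadth-first search: hop count from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def eccentricity(adj, v):
    """Largest shortest-path distance from v to any other vertex."""
    return max(distances_from(adj, v).values())

# Assumed star graph: a is adjacent to b, c and d.
star = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}
ecc = {v: eccentricity(star, v) for v in star}
print(ecc)  # {'a': 1, 'b': 2, 'c': 2, 'd': 2}
```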
Periphery and Centre
• If e(v) = diam(G), then v is a peripheral vertex
• The set of all such vertices makes up the periphery of G
• If e(v) = rad(G), then v is a central vertex
• The set of all such vertices makes up the centre of G
• $\mathrm{rad}(G) \le \mathrm{diam}(G) \le 2\,\mathrm{rad}(G)$
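A small sketch of these definitions, taking the radius as the smallest eccentricity and the diameter as the largest. The path graph a-b-c-d-e is a hypothetical example chosen because its centre and periphery are easy to verify by hand.

```python
from collections import deque

def eccentricity(adj, v):
    """Largest breadth-first-search distance from v (graph assumed connected)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())

# Hypothetical path graph a - b - c - d - e.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}

ecc = {v: eccentricity(path, v) for v in path}
rad, diam = min(ecc.values()), max(ecc.values())
centre = {v for v, e in ecc.items() if e == rad}
periphery = {v for v, e in ecc.items() if e == diam}

print("rad:", rad, "diam:", diam)
print("centre:", centre, "periphery:", periphery)
print(rad <= diam <= 2 * rad)  # the inequality from the slide
```

Here the middle vertex c is the only central vertex and the two end points a and e form the periphery.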
Measuring Node’s Importance
• Social Graph illustrates social relationships
• Nodes: people
• Links: relationships between nodes
  • Sometimes ambiguous
  • Assume a link means the nodes know each other
• Links can be:
  • Undirected: bidirectional
  • Directed: unidirectional
[Figure: example social graph with nodes Anna, Ben, Cara, Dara, Evan and Frank]
Centrality
• This measure gives a rough indication of the social power of a node based on how well they ‘connect’ with the network. “Betweenness” and “Degree” are measures of centrality
Degree Centrality
• Degree centrality: number of nearest neighbours
  $C_D(i) = k_i = \sum_j A_{ij}$
• Normalized degree centrality
  $C_D^*(i) = \frac{1}{n - 1} C_D(i)$
• High degree centrality: direct contact with many other actors
• Low degree centrality: not active, a peripheral actor
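As a sketch of the two formulas, degree centrality is the row sum of the adjacency matrix, and the normalized version divides by n − 1, the largest degree a node can have. The matrix below describes a hypothetical undirected 4-node graph; reading A as the adjacency matrix and n as the number of nodes is the usual convention and is assumed here.

```python
# Hypothetical symmetric adjacency matrix (1 = tie between nodes i and j).
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
n = len(A)

degree = [sum(row) for row in A]             # C_D(i) = sum_j A_ij
normalized = [d / (n - 1) for d in degree]   # C_D*(i) = C_D(i) / (n - 1)

for i in range(n):
    print(i, degree[i], round(normalized[i], 2))
```

Node 0 is tied to every other node, so its normalized degree centrality is 1.0; node 3 has a single tie and scores lowest.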
Degree Centrality
• Number of nodes connected to a node
• No need to distinguish between in and out
• Example..
• Ranking
  • Is this reasonable?
  • Consider who causes network partition
  • Many other options
[Figure: example social graph with nodes Anna, Ben, Cara, Dara, Evan and Frank]
Closeness Centrality
• Closeness centrality: how close an actor is to all the other actors in the network
  $C_C(i) = \frac{1}{\sum_j d(i, j)}$
• Normalized closeness centrality
  $C_C^*(i) = (n - 1)\, C_C(i)$
• An actor in the centre can quickly interact with all others: short communication paths to others, minimal number of steps to reach others
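A minimal sketch of closeness centrality: sum the shortest-path distances from a node to all others and take the reciprocal. The normalized form here multiplies by n − 1, which makes it the reciprocal of the average shortest path length and keeps it between 0 and 1. The path graph is hypothetical, and the code assumes a connected, undirected graph.

```python
from collections import deque

def distances_from(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness(adj, i):
    """1 / (sum of shortest-path distances from i); graph must be connected."""
    d = distances_from(adj, i)
    return 1 / sum(d[j] for j in adj if j != i)

def normalized_closeness(adj, i):
    """(n - 1) * closeness, i.e. 1 / (average shortest path length from i)."""
    return (len(adj) - 1) * closeness(adj, i)

# Hypothetical path graph a - b - c.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
for v in adj:
    print(v, round(closeness(adj, v), 3), round(normalized_closeness(adj, v), 3))
```

The middle node b reaches everyone in one step, so its normalized closeness is 1.0, the maximum.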
Closeness Centrality
• Degree does not factor in distance
  • Distance: the number of links on the path between two nodes
• Path: set of links between two nodes
• Shortest path: path between two nodes with the shortest distance
• Diameter: longest of the shortest paths, considering all node pairs
• Closeness centrality for a node
  • Find shortest path lengths to all other nodes
  • Take the average of those
  • $\text{closeness centrality} = \dfrac{1}{\text{average shortest path length}}$
[Figure: example social graph with nodes Anna, Ben, Cara, Dara, Evan and Frank]
Betweenness
• Number of shortest paths going through the actor
• Message passing examples
  • Anna to Frank
  • Ben to Anna
• Betweenness centrality of a node
  • Find the shortest paths for each pair of nodes
  • Award points to the other nodes for being on a shortest path
  • Points awarded to a node for a pair are the fraction of that pair's shortest paths that the node lies on
[Figure: example social graph with nodes Anna, Ben, Cara, Dara, Evan and Frank]
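The point-awarding procedure can be sketched as follows. The edges of the Anna-to-Frank figure are not recoverable, so the example uses a hypothetical undirected graph: a diamond a-b-d-c-a with a tail d-e. For each pair (s, t), a node v on a shortest path earns (number of shortest s-t paths through v) / (number of shortest s-t paths). This is a straightforward all-pairs illustration, not an optimised algorithm.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, s):
    """Distance from s and number of shortest paths from s, for each node."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(adj):
    info = {s: bfs_counts(adj, s) for s in adj}
    score = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):    # each unordered pair once
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:
            continue                     # s and t are not connected
        for v in adj:
            if v in (s, t) or v not in dist_s:
                continue
            if dist_s[v] + dist_t[v] == dist_s[t]:   # v is on a shortest path
                score[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return score

# Hypothetical graph: diamond a-b-d-c-a with a tail d-e.
adj = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"],
       "d": ["b", "c", "e"], "e": ["d"]}
print(betweenness(adj))
```

Node d scores highest because every message to or from e must pass through it, while b and c split the credit for paths between a and d.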
Eigenvector Centrality
• Importance of a node depends on the importance of its neighbors (recursive definition)
• How central you are depends on how central your neighbours are
• $C_i = \omega_{ij} C_j + \omega_{il} C_l + \omega_{ik} C_k$
[Figure: node i with neighbours j, k and l]
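One common way to solve this recursive definition is power iteration: start with equal scores, repeatedly replace each node's score with the sum of its neighbours' scores, and rescale. The sketch below assumes unweighted ties (every ω equal to 1) and a hypothetical connected graph containing a triangle, which is a case where the iteration settles down; the slides do not say which method was intended.

```python
def eigenvector_centrality(adj, iterations=100):
    """Power iteration with unit tie weights; scores are scaled so the max is 1."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Each node's new score is the sum of its neighbours' current scores.
        new = {v: sum(score[u] for u in adj[v]) for v in adj}
        top = max(new.values())
        score = {v: x / top for v, x in new.items()}
    return score

# Hypothetical graph: triangle a-b-c with a pendant node d attached to c.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
scores = eigenvector_centrality(adj)
for v, s in sorted(scores.items(), key=lambda item: -item[1]):
    print(v, round(s, 3))
```

Node c ranks first because it is tied to the two well-connected nodes a and b; node d ranks last even though its only neighbour is the top node, since it has just that one tie.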
Cohesion (Group Measure)
• Measures how hard a group is to pull apart: a cohesive group is a maximal set of nodes that cannot be disconnected except by removing a certain minimal number of other nodes
• Ease with which a network can connect
• At the network level, the aggregate of the shortest path lengths over all node pairs reflects the average distance
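The network-level average distance mentioned in the last bullet can be sketched as the mean shortest-path length over all node pairs. The example assumes a connected, undirected graph; the path graph used is hypothetical.

```python
from collections import deque
from itertools import combinations

def distances_from(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def average_distance(adj):
    """Mean shortest-path length over all unordered node pairs (connected graph)."""
    pairs = list(combinations(adj, 2))
    total = sum(distances_from(adj, u)[v] for u, v in pairs)
    return total / len(pairs)

# Hypothetical path graph a - b - c: distances are 1, 1 and 2.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(round(average_distance(adj), 3))
```

A lower average distance means the network connects more easily, which is the sense of cohesion described above.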
Centrality
• Measure giving rough indication of the social power of a node based on how well they "connect" the network.
• Measures of centrality
  • Betweenness – the extent to which a node lies between other nodes in the network.
  • Closeness – based on the mean geodesic distance (i.e., the shortest path length) between a vertex v and all other vertices reachable from it; the shorter that distance, the higher the closeness.
  • Degree – the number of links incident upon the node.
  • Eigenvector – a measure of the importance of a node in the network. Connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes.