By Hani Maher Ahmad
What is a Social Network?
A social network is a heterogeneous, multirelational data set represented by a graph.
Social networks need not be social in context.
Examples:
Electrical power grids
The web
Coauthorship networks
Why do we study Social Networks?
Small-world effect and universal behavior
100th monkey effect and tipping behavior
A social network is a complex dynamical system
More information in data mining:
Link information and the structure of the data are involved in the mining process
More realistic applications
New types of patterns (e.g. link prediction)
What do you think? Small-world experiment
People in city X are asked to direct a message to a stranger in city Y by forwarding it to a friend they think may know the stranger.
How many intermediate links does the message pass through before it is received?
Small World
It is a graph with a high degree of local clustering and short paths between most pairs of nodes
Six degrees of separation
E.g. the science coauthorship graph
What do you think? 100th monkey effect
What do you think?
Why are there sudden events in our lives?
How does a product, movie, or idea spread all at once?
Why do the smartest students seem to become smart suddenly?
Why do we change our minds suddenly?
Evolution of a Random Network
We have a large number n of vertices.
We start randomly adding edges, one at a time.
At what time t will the network:
have at least one “large” connected component?
have a single connected component?
have “small” diameter?
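The experiment on this slide can be tried directly. The following is a minimal sketch, assuming edges are drawn uniformly at random between distinct vertices (repeat edges are allowed for simplicity); the function name and the choice n = 2000 are illustrative, not from the slides. It tracks the size of the largest connected component with a union-find structure.

```python
import random

def largest_component_over_time(n, num_edges, seed=0):
    """Add random edges one at a time; record the largest component size after each."""
    rng = random.Random(seed)
    parent = list(range(n))   # union-find forest
    size = [1] * n            # component size, valid at each root

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    largest = 1
    history = []
    for _ in range(num_edges):
        u, v = rng.sample(range(n), 2)     # two distinct endpoints
        ru, rv = find(u), find(v)
        if ru != rv:                       # the edge merges two components
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
            largest = max(largest, size[ru])
        history.append(largest)
    return history

n = 2000
history = largest_component_over_time(n, 3 * n)
for t in (n // 4, n // 2, n, 2 * n, 3 * n):
    # fraction of all vertices inside the largest component after t edges
    print(t, history[t - 1] / n)
```

In the classical random graph model the large component appears rather abruptly once about n/2 edges have been added, which is the kind of tipping behavior the deck is pointing at.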
Formalizing Familiar Ideas
Explaining universal behavior through statistical models:
our models will always generate many networks
almost all of them will share certain properties (universals)
Explaining tipping through incremental growth:
we gradually add edges
many properties will emerge very suddenly during this process
[Figure: crime rate vs. size of police force; probability that the network is connected vs. number of edges]
How to study social networks (SN)
Random graph generation models
E.g. the Forest Fire model:
1. The new node v chooses an ambassador node w.
2. It selects x links incident to w at random. Let w1, w2, …, wx denote the nodes at the other end of the selected edges.
3. The new node v forms out-links to w1, w2, …, wx and then applies step 2 recursively to w1, w2, …, wx. The process continues until it dies out.
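The three steps can be sketched in a few lines. This is a simplified illustration under several assumptions that are not on the slide: only out-links of w are followed (the slide says links incident to w), x is drawn by repeatedly flipping a coin with a hypothetical forward probability p, the new node also links to its ambassador, and the graph starts from a single node 0.

```python
import random

def forest_fire_graph(n, p=0.35, seed=0):
    """Grow a directed graph node by node; returns {node: set of out-neighbours}."""
    rng = random.Random(seed)
    out_links = {0: set()}                   # assumed starting graph: one node
    for v in range(1, n):
        ambassador = rng.randrange(v)        # step 1: pick an existing node w
        visited = {ambassador}
        frontier = [ambassador]
        while frontier:                      # steps 2-3, applied recursively
            w = frontier.pop()
            candidates = [u for u in out_links[w] if u not in visited]
            # draw x: keep adding one more link with probability p (assumption)
            x = 0
            while rng.random() < p:
                x += 1
            chosen = rng.sample(candidates, min(x, len(candidates)))
            visited.update(chosen)
            frontier.extend(chosen)          # the "fire" spreads to the chosen nodes
        out_links[v] = visited               # v links to w and to every node reached
    return out_links

g = forest_fire_graph(200)
in_degree = {v: 0 for v in g}
for v, targets in g.items():
    for u in targets:
        in_degree[u] += 1
print(max(in_degree.values()), sum(in_degree.values()) / len(g))
```

The process dies out on its own because each node can be reached at most once per new arrival.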
How to study SN
The models are realistic and tell us what reality will look like
They show that the rich get richer
But the models are blind: they cannot tell exactly how things happen
Exact prediction is very hard, since most of the problems are NP-hard
Dynamical System
It is a state and a rule for changing that state
E.g. population size is a state and logistic growth is a rule
Dynamical Systems
Dynamical systems have the important property of attractors (points of stability)
Dynamical Systems
But sometimes chaotic behavior or divergence occurs, as when a traffic network becomes gridlocked
We study social networks to control their stability and prevent chaos
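Both behaviors, a stable attractor and chaos, show up in the same simple rule. The sketch below assumes the discrete-time logistic map x -> r·x·(1 - x) as the "logistic growth" rule, with x the population as a fraction of its maximum; the parameter values 2.5 and 3.9 are illustrative choices, not from the slides.

```python
def logistic_orbit(r, x0=0.2, steps=1000, keep=4):
    """Iterate the rule x -> r * x * (1 - x) and return the last `keep` states."""
    x = x0                      # the state
    tail = []
    for t in range(steps):
        x = r * x * (1 - x)     # the rule that changes the state
        if t >= steps - keep:
            tail.append(round(x, 4))
    return tail

stable = logistic_orbit(2.5)    # settles on the attractor 1 - 1/r = 0.6
chaotic = logistic_orbit(3.9)   # keeps wandering, never settles on one point
print(stable)
print(chaotic)
```

The rule is identical in both runs; only the growth parameter r differs, which is why small changes in a system's parameters can move it from stability to chaos.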
Social Network Characteristics
Densification power law
The number of edges grows superlinearly, as a power of the number of nodes
Shrinking diameter
The effective diameter of the network shrinks as the network grows
Heavy-tailed out-degree and in-degree
The out-degrees and in-degrees of the nodes follow heavy-tailed distributions
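The densification power law can be written compactly. The symbols here are notation assumed for illustration, not taken from the slides: e(t) is the number of edges and n(t) the number of nodes at time t, and a is the densification exponent.

```latex
e(t) \propto n(t)^{a}, \qquad 1 < a < 2
```

With a = 1 the average degree would stay constant as the network grows; a > 1 means the average degree increases over time, so the network becomes denser.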
What do you think?
What things can be mined from a social network?
What are the differences and similarities between data mining and network mining?
Does the graph need to be labeled or not?
Does the graph need to be directed or not?
Can we mine the graph for all and exact patterns?
Link Mining tasks
1. Link-based object classification
The category is predicted from links and attributes (generalizes data classification)
2. Object type prediction
3. Link type prediction
4. Predicting link existence
5. Link cardinality estimation
Link Mining tasks
6. Object reconciliation
• To detect whether two objects are the same
• E.g. whether two diseases are the same or two paper citations refer to the same paper
7. Group detection
8. Subgraph detection
What is the difference between 7 and 8?
Are you looking for her?
Who is the most perfect woman?
In other words, how can we find the most valuable object in the network, and how can we find the rank of an object?
How can we find her prestige?
Representing the Network in a suitable way for computation
We can represent the graph with matrices
Adjacency matrix
the rows and columns represent nodes, with entries equal to 1 if there is an edge and 0 otherwise
Incidence matrix
the rows represent nodes and the columns represent edges, with entries equal to 1 if the edge is incident to the node and 0 otherwise
Adjacency matrix of a network
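As a small illustration of both representations, here is a sketch in plain Python. The 4-node graph and the function names are hypothetical; the adjacency matrix is built for a directed graph (entry [u][v] is 1 for an edge from u to v), and the incidence matrix ignores edge direction.

```python
def adjacency_matrix(n, edges):
    """n x n matrix: entry [u][v] is 1 if there is an edge u -> v, else 0."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1
    return A

def incidence_matrix(n, edges):
    """n x m matrix: entry [node][k] is 1 if edge k touches that node, else 0."""
    B = [[0] * len(edges) for _ in range(n)]
    for k, (u, v) in enumerate(edges):
        B[u][k] = 1
        B[v][k] = 1
    return B

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]   # hypothetical example graph
A = adjacency_matrix(4, edges)
B = incidence_matrix(4, edges)
for row in A:
    print(row)
```

Each column of the incidence matrix contains exactly two 1s, one for each endpoint of that edge.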
Three algorithms
Prestige algorithm
PageRank
HITS (authorities and hubs)
Note: the first two compute the prestige vector of the network, which holds the prestige of each node
The third algorithm computes a hub score vector and an authority score vector
Prestige algorithm
The prestige of a node depends on the prestige of the nodes pointing to it.
That is, for node i: $P_i = (A^T P)_i = \sum_j A_{ji} P_j$, the sum of the prestige of the nodes pointing to it
For all nodes: $P = A^T P$
Start with every prestige value equal to 1
Apply the multiplication until convergence, i.e. $P_{t+1} = A^T P_t$
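The iteration can be sketched as follows. One step is added that the slide does not mention: after each multiplication the vector is rescaled so that its largest entry is 1, since otherwise the values usually grow or shrink without bound instead of converging. The function name, tolerance, and example graph are illustrative assumptions.

```python
def prestige(A, iterations=100, tol=1e-10):
    """Iterate P <- A^T P from all ones, rescaling so the largest entry is 1."""
    n = len(A)
    P = [1.0] * n
    for _ in range(iterations):
        # (A^T P)[i] = sum over j of A[j][i] * P[j]: prestige flowing into node i
        new = [sum(A[j][i] * P[j] for j in range(n)) for i in range(n)]
        scale = max(new)
        if scale == 0:
            return new                     # no node receives any prestige
        new = [x / scale for x in new]     # rescaling step (assumption, see above)
        if max(abs(a - b) for a, b in zip(new, P)) < tol:
            return new
        P = new
    return P

# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0, 3 -> 2
A = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]
scores = prestige(A)
print(scores)
```

Node 2 is pointed to by three nodes and ends up with the highest prestige; node 3 has no incoming links and gets prestige 0.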
PageRank Algorithm
A node's prestige depends not only on the prestige of the nodes pointing to it but also on randomly chosen nodes
Random surfing model: at any page,
with prob. α, randomly jump to a page
with prob. (1 - α), randomly pick a link to follow
PageRank = prestige + random walk
PageRank Algorithm
• Note that the adjacency matrix is normalized, so that the weights of each node's out-links sum to 1
• This is the main algorithm behind Google
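The random surfing model translates into a short iteration. This is a sketch under stated assumptions: the jump probability is called alpha here and set to 0.15 as an illustrative value, a node with no out-links spreads its rank evenly over all nodes, and a fixed number of iterations is used instead of a convergence test. None of these choices come from the slides.

```python
def pagerank(A, alpha=0.15, iterations=100):
    """Random-surfer PageRank on adjacency matrix A (A[j][i] = 1 for a link j -> i)."""
    n = len(A)
    out_deg = [sum(row) for row in A]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new = [alpha / n] * n                       # random jump to any page
        for j in range(n):
            if out_deg[j] == 0:
                # dangling node: spread its rank evenly (assumption)
                share = (1 - alpha) * rank[j] / n
                for i in range(n):
                    new[i] += share
            else:
                # normalized adjacency: each out-link carries an equal share
                share = (1 - alpha) * rank[j] / out_deg[j]
                for i in range(n):
                    if A[j][i]:
                        new[i] += share
        rank = new
    return rank

# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0, 3 -> 2
A = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]
ranks = pagerank(A)
print(ranks)
```

Unlike plain prestige, a node with no incoming links (node 3 here) still receives the small share alpha / n from random jumps.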
HITS Algorithm
This algorithm gives two ranks to each node:
an authority rank, high if the node is pointed to by many good hubs,
and a hub rank, high if the node points to many good authorities.
HITS
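The mutual definition of hubs and authorities can be computed by alternating two updates. A minimal sketch, assuming both score vectors start at 1 and are rescaled to unit length after every round; the iteration count and the example graph are illustrative.

```python
def hits(A, iterations=50):
    """Return (hub, authority) score lists for adjacency matrix A (A[i][j] = 1 for i -> j)."""
    n = len(A)
    hub = [1.0] * n
    auth = [1.0] * n
    for _ in range(iterations):
        # authority of i: sum of hub scores of the nodes pointing to i
        auth = [sum(A[j][i] * hub[j] for j in range(n)) for i in range(n)]
        # hub score of i: sum of authority scores of the nodes i points to
        hub = [sum(A[i][j] * auth[j] for j in range(n)) for i in range(n)]
        a_norm = sum(x * x for x in auth) ** 0.5 or 1.0
        h_norm = sum(x * x for x in hub) ** 0.5 or 1.0
        auth = [x / a_norm for x in auth]   # rescale so the scores stay bounded
        hub = [x / h_norm for x in hub]
    return hub, auth

# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0, 3 -> 2
A = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]
hub, auth = hits(A)
print("best hub:", hub.index(max(hub)))
print("best authority:", auth.index(max(auth)))
```

In this example node 2 is the best authority because three nodes point to it, and node 0 is the best hub because it points to node 2 and to one other node.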
Application: Viral Marketing
Marketing has many models
Direct marketing: based on customer attributes; a classification problem
Mass marketing: based on the population segment the person belongs to; a clustering problem; has the advantage that it captures indirect customers
Viral marketing: mass marketing + optimizing the word-of-mouth effect
Application: Viral Marketing
E.g. a person who buys a car motivates his friends to buy a car
The aim is to find the network value of a person
If the person is a good hub, he is a potential customer who can maximize the network profit, so spend more money marketing the product to him
If the person has a negative effect, don't market to him
Application: Viral Marketing
Viral marketing can be used in non-marketing tasks
E.g.
Fighting teenage smoking
Stopping the spread of a virus
Spreading an idea
Marketing for a politician, e.g. in an election
So what do you think?
She has the best authority score; all the “hubs” are pointing to her.
Is it a good idea to marry her?
Yes
or No
What do you think?
She can have the best authority score because of:
The rich get richer
Some tipping phenomenon
She is in fashion (“modda”)
She has more hubs
The butterfly effect and divergence
She can appear due to marketing effort
She can also genuinely be a good authority
So what do you think?
Google uses PageRank and HITS; do you think the results are perfect, or just popular?
Does that make sense when working with real people in the real world?
So for me it is a
Big NO
Social Networks out of control
If the social network is not controlled:
The rich will become richer and all the capital will accumulate with them
Most people will like the wrong things, due to the joy of adrenaline and self prodding
Many relationships will get stuck as bad ideas, drugs, and bad practices spread
Many silly people will appear as authorities because of their strange or bad ideas
Social Networks out of control
The number of links will become extremely large, making life harder and noisier and wasting a lot of time
The diameter will shrink, making spying and crime easier
Some tipping events will destroy society
More effort will go into marketing instead of industry
Civilization will stop, and we will focus only on communication
Social Networks out of control
Hidden persons can control the network and affect others by adjusting links and spreading ideas (“programming the SN”) to their own benefit
It is not proved, but I guess a sudden death of the network will occur.
“We are running into chaos”
What is the solution?
References
The textbook
The other slides
Dr. Mohammed Zaki's lectures (“one of the leading data mining researchers”): http://www.cs.rpi.edu/~zaki/www-new/pmwiki.php/Dmcourse/Main
Satnam Alag: Collective Intelligence in Action
Wikipedia: articles on small worlds and social networks
Kathleen T. Alligood: Chaos: An Introduction to Dynamical Systems
Thank you
Questions?