Why does Alice follow Bob?
Filippo Menczer
Center for Complex Networks and Systems Research
School of Informatics and Computing
Indiana University, Bloomington
The Role of Information Diffusion in the Evolution of
Social Networks
Marvin
Competition for attention
Dynamics of the network
Dynamics on the network
Dynamics of Network: Link Creation
Dynamics on Network: Information Flow
[Figure: nodes A and B, shown once for link creation and once for information flow]
meme            popularity (number of messages)    lifetime (longest consecutive number of days)
#bieberfact     139760                             145
#bieberthing    3                                  1
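The lifetime measure above (longest consecutive number of days) can be sketched as follows. This is a minimal illustration, assuming each meme comes with the list of calendar days on which it was posted; the function name `meme_lifetime` and the input format are hypothetical and not taken from the original analysis.

```python
from datetime import date, timedelta


def meme_lifetime(days_seen):
    """Longest run of consecutive calendar days on which a meme appeared.

    `days_seen` is an iterable of datetime.date values (assumed input
    format); repeated days are collapsed before counting.
    """
    days = sorted(set(days_seen))
    if not days:
        return 0
    longest = current = 1
    for prev, cur in zip(days, days[1:]):
        if cur - prev == timedelta(days=1):
            current += 1      # the run continues
        else:
            current = 1       # a gap starts a new run
        longest = max(longest, current)
    return longest


# Toy example: two consecutive days, a gap, then one more day.
example = [date(2010, 10, 1), date(2010, 10, 2), date(2010, 10, 2), date(2010, 10, 4)]
print(meme_lifetime(example))
```

Popularity, by contrast, is simply the number of messages carrying the hashtag, so the two measures can diverge sharply for short bursts.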
Two key ingredients
[Figure: model diagram with users A, B, C, D. Branches: post a new topic (Pn) or post existing topics (1 - Pn); sources labeled Screen (Pr) and Memory (1 - Pr), with (Pm) also marked. Before/After panels labeled Follower and Post. Example hashtags: #jobs, #justinbieber, #ladygaga, #apple, #jan25]
Weng et al. Nature Sci. Rep. 2012
Dataset: Twitter 10% sample
October 2010 – January 2011
~12.5M users, ~1.3M hashtags
Competition for attention
Dynamics of the network
Dynamics on the network
Dynamics of Network: Link Creation
Dynamics on Network: Information Flow
[Figure: nodes A and B; the new link is labeled "shortcut"]
Dataset: Yahoo! Meme
April 2009 – March 2010
~128k users, ~3.5M links, ~7M posts
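[Figure, panels (A) and (B): new links classified by target type: Grandparent, Origin, Triadic Node, Others. Legend: Target, User, Information Flow, Following. Region shares shown in the diagram: 4.13%, 14.89%, 61.46%, 17.24%, 0.03%, 0.15%, 2.00%, 0.10%]
Traffic shortcut 24%
Triadic closure 85%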
Could this happen by chance?
For each link type (e.g., links to grandparents), compare:
• the actual number of links of that type in the data
• the expected number of links of that type according to the null hypothesis (links created by chance)
Very large discrepancy ⇒ reject null hypothesis: links are not created randomly
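One way to compare the actual and expected counts is a z-score against repeated draws from a null model. The sketch below is an assumption for illustration, not necessarily the statistic used in the talk: the null model lets each link creator pick a target uniformly at random, and the helper names `simulate_null_count` and `z_score` are hypothetical.

```python
import random
import statistics


def simulate_null_count(grandparents_per_link, n_candidates, rng):
    """One null-model draw: every creator picks a target uniformly at random.

    grandparents_per_link[i] is the number of grandparent candidates
    available when link i was created (assumed to be known);
    n_candidates is the number of possible targets for each link.
    Returns how many random picks happen to land on a grandparent.
    """
    return sum(rng.random() < g / n_candidates for g in grandparents_per_link)


def z_score(observed, null_counts):
    """Standardized distance between the observed count and the null draws."""
    mean = statistics.mean(null_counts)
    std = statistics.stdev(null_counts)
    if std == 0:
        raise ValueError("null counts have zero variance; z-score undefined")
    return (observed - mean) / std


# Toy usage with made-up numbers (not data from the study).
rng = random.Random(0)
per_link = [3] * 500                      # 3 grandparent candidates per link
null_counts = [simulate_null_count(per_link, 1000, rng) for _ in range(200)]
print(z_score(observed=60, null_counts=null_counts))
```

A very large z-score means the observed count is far outside what random link creation produces, which is the condition for rejecting the null hypothesis.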
Preference for traffic-based shortcuts as users become more active
The more messages we see from someone, the more we are likely
to follow them
[Figure: P plotted against rank percentile (by traffic). The more posts we see from someone, the more we are likely to follow them]
Shortcuts are more efficient at carrying messages we see and report
[Figure: link efficiency (axis scale 1e-7); panel (B): reposted traffic]
Maximum Likelihood Estimation
∀ link ℓ, compute:
f(ℓ | Γ, Θ) = likelihood of the target being
followed by the creator according to a particular strategy Γ, given the network
configuration Θ at the time when ℓ is created.
Single strategy
Combined strategy
Individual strategy
MLE single strategies
• Random
• Triadic closure (∆)
• Grandparent (G)
• Origin (O)
• Traffic shortcut (G ∪ O)
[Diagram: for each single strategy, a link is created through one of two branches, taken with probabilities p and 1–p]
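The fitting step for a single strategy can be sketched as a one-parameter maximum likelihood problem. The mixture form below is an assumption made for illustration: each link ℓ is taken to follow the strategy with probability p and to be created at random with probability 1–p, so that f(ℓ) = p · f_strategy(ℓ) + (1–p) · f_random(ℓ). The function names and the grid search are hypothetical choices, not the original implementation.

```python
import math


def log_likelihood(p, strategy_lik, random_lik):
    """Sum of log f(l) over all links under the assumed two-branch mixture.

    strategy_lik[i]: likelihood of link i's target under the strategy
                     (0 if the strategy could not have produced it).
    random_lik[i]:   likelihood of link i's target under random choice
                     (must be > 0).
    """
    return sum(
        math.log(p * s + (1 - p) * r)
        for s, r in zip(strategy_lik, random_lik)
    )


def fit_mixing_probability(strategy_lik, random_lik, steps=1000):
    """Grid search for the p in [0, 1) that maximizes the log-likelihood.

    p = 1 is excluded so the logarithm stays defined for links the
    strategy cannot explain (strategy likelihood 0).
    """
    grid = [i / steps for i in range(steps)]
    return max(grid, key=lambda p: log_likelihood(p, strategy_lik, random_lik))


# Toy usage with made-up likelihoods: three of four links are
# explainable by the strategy, one is not.
strategy_lik = [0.5, 0.5, 0.0, 0.5]
random_lik = [0.01, 0.01, 0.01, 0.01]
p_hat = fit_mixing_probability(strategy_lik, random_lik)
print(p_hat)
```

Comparing the maximized log-likelihood across strategies then indicates which one explains the observed links best; a grid search is used here only because it is easy to verify, and any one-dimensional optimizer would serve.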
MLE combined strategies
• Grandparent or triadic closure (G + ∆)
• Origin or triadic closure (O + ∆)
• Traffic shortcut or triadic closure (G ∪ O + ∆)
[Diagram: for each combined strategy, a link is created through one of three branches, taken with probabilities p1, p2, and 1–p1–p2]
Maximum Likelihood (G ∪ O + ∆)
[Figure: per-user maximum likelihood estimates, with axes p(traffic shortcut) and p(∆). User groups labeled in the plot: Random Browsing, Mixture, Information-Oriented, Casual Friendship, Friendship. Group shares shown: 10.4%, 4.7%, 28%, 51%, 5.5%]
[Figure: panels (A), (B), (C); (D) Lifetime, (E) In-degree, (F) In-degree ratio; (G) Reposted, (H) Posts, (I) Post ratio. Annotations: longer lived, follow more, more followers, influential, active, spreaders]
As users become more active, more popular, and more influential, they make the network more “efficient” by shortening the distance between producers and consumers of information.
Papers: cnets.indiana.edu/groups/nan/truthy