Srijan Kumar, Georgia Tech, CSE6240 Spring 2020: Web Search and Text Mining 1
CSE 6240: Web Search and Text Mining. Spring 2020
Message Passing and
Node Classification
Prof. Srijan Kumar

Outline
• Main question today: Given a network with labels on some nodes, how do we label all the other nodes?
• Example: In a network, some nodes are fraudsters and some nodes are fully trusted. How do you find the other fraudsters and trustworthy nodes?

Intuition
• Collective classification: Idea of assigning labels to all nodes in a network together
  – Leverage the correlations in the network!
• We will look at three techniques today:
  – Relational classification
  – Iterative classification
  – Belief propagation

Today’s Lecture
• Overview of collective classification
• Relational classification
• Iterative classification
• Belief propagation
The lecture slides are borrowed from Prof. Jure Leskovec’s slides from CS224W

Correlations Exist in Networks
Example:
• Real social network
  – Nodes = people
  – Edges = friendship
  – Node color = race
• People are segregated by race due to homophily (Easley and Kleinberg, 2010)

Classification with Network Data
• How to leverage this correlation observed in networks to help predict user attributes or interests?
• How to predict the labels for the nodes in yellow?

Motivation
• Similar entities are typically close together or directly connected:
  – “Guilt-by-association”: If I am connected to a node with label X, then I am likely to have label X as well.
  – Example: Malicious/benign web page: Malicious web pages link to one another to increase visibility, look credible, and rank higher in search engines

Intuition
• Classification label of a node O in a network may depend on:
  – Features of O
  – Labels of the objects in O’s neighborhood
  – Features of objects in O’s neighborhood

Guilt-by-association
Given:
• a graph and
• a few labeled nodes
Find: class (red/green) for the rest of the nodes
Assuming: networks have homophily

Guilt-By-Association
• Let $W$ be an $n \times n$ (weighted) adjacency matrix over $n$ nodes
• Let $Y \in \{-1, 0, 1\}^n$ be a vector of labels:
  – 1: positive node, known to be involved in a gene function/biological process
  – -1: negative node
  – 0: unlabeled node
• Goal: Predict which unlabeled nodes are likely positive
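The slide does not prescribe a scoring rule, so the following is only a minimal sketch of one simple guilt-by-association heuristic: score each unlabeled node by the weighted sum of its neighbors' labels, so that a positive score suggests a likely positive node. The function name `neighbor_vote` and the toy matrix are assumptions made for illustration.

```python
def neighbor_vote(W, Y):
    """Score each unlabeled node (Y[i] == 0) by the weighted sum of its
    neighbors' labels. W is an n x n adjacency matrix (list of lists),
    Y is a list of labels in {-1, 0, 1}. Illustrative heuristic only."""
    n = len(Y)
    scores = {}
    for i in range(n):
        if Y[i] == 0:
            scores[i] = sum(W[i][j] * Y[j] for j in range(n))
    return scores


# Toy graph (hypothetical): node 0 is positive, node 1 is negative.
# Node 2 touches node 0 (weight 1); node 3 touches node 1 (weight 2).
W = [
    [0, 0, 1, 0],
    [0, 0, 0, 2],
    [1, 0, 0, 1],
    [0, 2, 1, 0],
]
Y = [1, -1, 0, 0]
scores = neighbor_vote(W, Y)
print(scores)  # node 2 leans positive, node 3 leans negative
```

A single pass like this only uses direct neighbors; the collective methods in the rest of the lecture propagate information further through the network.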

Collective Classification
• Intuition: simultaneous classification of interlinked objects using correlations
• Several applications
  – Document classification
  – Part of speech tagging
  – Link prediction
  – Optical character recognition
  – Image/3D data segmentation
  – Entity resolution in sensor networks
  – Spam and fraud detection

Collective Classification Overview
• Markov Assumption: the label $Y_i$ of one node $i$ depends on the labels of its neighbors $N_i$:
  $P(Y_i \mid i) = P(Y_i \mid N_i)$
• Collective classification involves 3 steps:
  1. Local Classifier: assign initial label
  2. Relational Classifier: capture correlations between nodes
  3. Collective Inference: propagate correlations through network

Collective Classification Overview
Local Classifier
• Assign initial label
• Predicts label based on node attributes/features
• Classical classification
• Does not employ network information
Relational Classifier
• Capture correlations between nodes
• Learn a classifier from the labels and/or attributes of its neighbors to label one node
• Network information is used
Collective Inference
• Propagate correlations through network
• Apply relational classifier to each node iteratively
• Iterate until the inconsistency between neighboring labels is minimized
• Network structure substantially affects the final prediction

Problem Setting
• How to predict the labels $Y_i$ for the nodes $i$ in yellow?
  – Each node $i$ has a feature vector $f_i$
  – Labels for some nodes are given (+ for green, - for blue)
• Task: find $P(Y_i)$ given the network and features

Probabilistic Relational Classifier
• Basic idea: Class probability of $Y_i$ is a weighted average of class probabilities of its neighbors.
• For labeled nodes, initialize with ground-truth Y labels
• For unlabeled nodes, initialize Y uniformly
• Update all nodes in a random order till convergence or till maximum number of iterations is reached

Probabilistic Relational Classifier
• Repeat for each node $i$ and label $c$:
  $P(Y_i = c) = \frac{1}{|N_i|} \sum_{(i,j) \in E} W(i,j)\, P(Y_j = c)$
  – $W(i,j)$ is the edge strength from $i$ to $j$
  – $|N_i|$ is the number of neighbors of $i$
• Challenges:
  – Convergence is not guaranteed
  – Model cannot use node feature information
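The update rule can be sketched in a few lines of Python. This is a minimal illustration for the binary case, not a reference implementation: the function name, the dictionary-based graph format, and the path graph are assumptions, and the sketch normalizes by the total edge weight around a node, which equals $|N_i|$ when every weight is 1.

```python
import random


def relational_classifier(neighbors, labels, max_iters=100, tol=1e-6, seed=0):
    """Estimate P(Y_i = 1) for every node.

    neighbors: dict node -> {neighbor: edge weight W(i, j)}
    labels:    dict node -> 0 or 1, for the labeled nodes only
    """
    rng = random.Random(seed)
    # Labeled nodes start at their ground truth, unlabeled nodes at 0.5.
    p = {i: float(labels[i]) if i in labels else 0.5 for i in neighbors}
    unlabeled = [i for i in neighbors if i not in labels]
    for _ in range(max_iters):
        rng.shuffle(unlabeled)  # update nodes in a random order
        largest_change = 0.0
        for i in unlabeled:
            total_weight = sum(neighbors[i].values())
            new_p = sum(w * p[j] for j, w in neighbors[i].items()) / total_weight
            largest_change = max(largest_change, abs(new_p - p[i]))
            p[i] = new_p  # later updates in this sweep see the new value
        if largest_change < tol:
            break  # scores have stabilized
    return p


# Hypothetical path graph 0 - 1 - 2 - 3 - 4 with unit weights;
# node 0 is labeled negative and node 4 positive.
path = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {2: 1.0, 4: 1.0},
    4: {3: 1.0},
}
probs = relational_classifier(path, labels={0: 0, 4: 1})
print({i: round(v, 2) for i, v in sorted(probs.items())})
```

On this path the scores settle at values that rise linearly from the negative end to the positive end, which matches the intuition that a node's class probability reflects how close it sits to each labeled node.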

Example
Initialization: All labeled nodes to their labels and all unlabeled nodes uniformly
[Figure: 9-node graph. Labeled nodes: P(Y=1)=0 for nodes 1 and 2, P(Y=1)=1 for nodes 6 and 7. Unlabeled nodes 3, 4, 5, 8, 9: P(Y=1)=0.5]

Example
• Update for the 1st Iteration:
  – For node 3, $N_3 = \{1, 2, 4\}$
  – $P(Y=1 \mid N_3) = \frac{1}{3}(0 + 0 + 0.5) = 0.17$

Example
• Update for the 1st Iteration:
  – For node 4, $N_4 = \{1, 3, 5, 6\}$
  – $P(Y=1 \mid N_4) = \frac{1}{4}(0 + 0.17 + 0.5 + 1) = 0.42$

Example
• Update for the 1st Iteration:
  – For node 5, $N_5 = \{4, 6, 7, 8\}$
  – $P(Y=1 \mid N_5) = \frac{1}{4}(0.42 + 1 + 1 + 0.5) = 0.73$
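The three first-iteration updates can be checked with a short script. It assumes unit edge weights and uses only the neighbor sets and starting values stated in the example (nodes 1 and 2 at 0, nodes 6 and 7 at 1, the rest at 0.5); node 9 is left out because its neighbor set is not given.

```python
# Starting values of P(Y=1) for the nodes that appear in the three updates.
p = {1: 0.0, 2: 0.0, 3: 0.5, 4: 0.5, 5: 0.5, 6: 1.0, 7: 1.0, 8: 0.5}

# Neighbor sets N3, N4, N5 as given in the example.
neighbors = {3: [1, 2, 4], 4: [1, 3, 5, 6], 5: [4, 6, 7, 8]}

# Update nodes 3, 4, 5 in that order; each update sees earlier results.
for i in (3, 4, 5):
    p[i] = sum(p[j] for j in neighbors[i]) / len(neighbors[i])
    print(f"node {i}: P(Y=1) = {p[i]:.2f}")
# node 3: P(Y=1) = 0.17
# node 4: P(Y=1) = 0.42
# node 5: P(Y=1) = 0.73
```

Because the updates are applied in place, node 4 already uses node 3's new value of 0.17 rather than its initial 0.5.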

Example
P(Y=1) values shown on the graph after each iteration, in the order they appear on the slide:
• After Iteration 1: 0, 0, 0.17, 0.42, 0.73, 0.91, 1.00
• After Iteration 2: 0, 0, 0.14, 0.47, 0.85, 0.95, 1.00
  – Figure note: All neighbors' values are fixed, so the value cannot change.
• After Iteration 3: 0, 0, 0.16, 0.50, 0.86, 0.95, 1.00
• After Iteration 4: 0, 0, 0.16, 0.51, 0.86, 0.95, 1.00

Example
• All scores stabilize after 5 iterations
• Final labeling
  – Nodes 5, 8, 9 are + ($P(Y_i = 1) > 0.5$)
  – Node 3 is – ($P(Y_i = 1) < 0.5$)
  – Node 4 is in between ($P(Y_i = 1) = 0.5$)

Iterative Classification
• Relational classifiers do not use node attributes
  – How can one leverage them?
• Main idea of iterative classification: classify node $i$ based on its attributes as well as labels of neighbor set $N_i$

Iterative Classification: Process
1. Create a feature vector $a_i$ for each node $i$
2. Train a classifier to classify using $a_i$
3. Nodes may have varying numbers of neighbors, so we can aggregate using: count, mode, proportion, mean, exists, etc.

Basic Architecture
• Bootstrap phase
  – Convert each node $i$ to a flat vector $a_i$
  – Use local classifier $f(a_i)$ (e.g., SVM, kNN, …) to compute best value for $Y_i$
• Iteration phase: Iterate till convergence
  – Repeat for each node $i$
    • Update node vector $a_i$
    • Update label $Y_i$ to $f(a_i)$. This is a hard assignment
  – Iterate until class labels stabilize or max number of iterations is reached
• Note: Convergence is not guaranteed
  – Run for max number of iterations
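The bootstrap and iteration phases can be sketched as follows. This is a minimal illustration under stated assumptions: `classify` is a hypothetical hand-written stand-in for a trained classifier $f(a_i)$, the node vector $a_i$ holds one node feature plus the proportion of neighbors currently labeled 1, and the star graph is made up.

```python
def classify(a):
    """Hypothetical stand-in for a trained classifier f(a_i)."""
    own_feature, positive_fraction = a
    return 1 if 0.4 * own_feature + 0.6 * positive_fraction >= 0.5 else 0


def positive_fraction(i, neighbors, labels):
    """Aggregate neighbor labels into one number (the 'proportion' aggregate)."""
    return sum(labels[j] for j in neighbors[i]) / len(neighbors[i])


def iterative_classification(neighbors, features, known_labels, max_iters=10):
    labels = dict(known_labels)
    unlabeled = [i for i in neighbors if i not in known_labels]
    # Bootstrap phase: no neighbor labels yet, so use a neutral 0.5.
    for i in unlabeled:
        labels[i] = classify((features[i], 0.5))
    # Iteration phase: rebuild a_i from current labels, then reassign Y_i.
    for _ in range(max_iters):
        changed = False
        for i in unlabeled:
            a_i = (features[i], positive_fraction(i, neighbors, labels))
            new_label = classify(a_i)  # hard assignment
            if new_label != labels[i]:
                labels[i] = new_label
                changed = True
        if not changed:
            break  # class labels have stabilized
    return labels


# Hypothetical star graph: "c" looks mildly positive on its own feature,
# but all three of its neighbors are known negatives.
neighbors = {"a": ["c"], "b": ["c"], "d": ["c"], "c": ["a", "b", "d"]}
features = {"a": 0.1, "b": 0.2, "c": 0.7, "d": 0.1}
labels = iterative_classification(neighbors, features, {"a": 0, "b": 0, "d": 0})
print(labels["c"])
```

In this toy run the bootstrap phase labels "c" as 1 from its feature alone, and the first iteration flips it to 0 once the neighbor labels enter $a_i$, which is the effect the iteration phase is meant to capture.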

Application of Iterative Classification Framework: Fake Reviewer/Review Detection
REV2: Fraudulent User Predictions in Rating Platforms
Kumar et al. ACM Web Search and Data Mining, 2018

Fake Review Spam
• Review sites are an attractive target for spam: a +1 star increase in rating increases revenue by 5-9%!
• Often hype/defame spam
• Paid spammers

Fake Review Spam Detection
• Behavioral analysis
  – individual features, geographic locations, login times, session history, etc.
• Language analysis
  – use of superlatives, frequent self-references, rate of misspellings, many agreement words, …
• Behavior and language are easy to fake!
• Graph structure is hard to fake
  – Graphs capture relationships between reviewers, reviews, stores

Problem Setup
• Input: bipartite rating graph as a weighted signed network:
  – Nodes: users, products
  – Edges: rating scores between -1 and +1
• Output: set of users that give fake ratings
[Figure: red edges = -1 rating; green edges = +1 rating]

REV2 Solution Formulation
• Basic idea: Users, products, and ratings have intrinsic quality scores:
  – Users have fairness scores: each user has a ‘fairness’ score $F(u) \in [0, 1]$
  – Products have goodness scores: each product has a ‘goodness’ score $G(p) \in [-1, 1]$
  – Ratings have reliability scores: each rating has a ‘reliability’ score $R(u, p) \in [0, 1]$
• All values are unknown

REV2 Solution Formulation
• How can one calculate the values for all nodes and edges simultaneously?
• Solution: Collective classification
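
Fairness of Users
• Fixing goodness and reliability, fairness is updated as:
  $F(u) = \frac{\sum_{(u,p) \in \mathrm{Out}(u)} R(u,p)}{|\mathrm{Out}(u)|}$
Goodness of Products
• Fixing fairness and reliability, goodness is updated as:
  $G(p) = \frac{\sum_{(u,p) \in \mathrm{In}(p)} R(u,p) \cdot \mathrm{score}(u,p)}{|\mathrm{In}(p)|}$
Reliability of Ratings
• Fixing fairness and goodness, reliability is updated as:
  $R(u,p) = \frac{1}{\gamma_1 + \gamma_2} \left( \gamma_1 F(u) + \gamma_2 \left( 1 - \frac{|\mathrm{score}(u,p) - G(p)|}{2} \right) \right)$
• Here $\mathrm{Out}(u)$ is the set of ratings given by user $u$, $\mathrm{In}(p)$ is the set of ratings received by product $p$, $\mathrm{score}(u,p) \in [-1, 1]$ is the rating score, and $\gamma_1, \gamma_2 \geq 0$ are weights (basic form, following Kumar et al., 2018)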
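A compact sketch of the three mutually dependent updates is given below. It assumes the basic form of the REV2 updates from Kumar et al. (2018): fairness is the mean reliability of a user's ratings, goodness is the reliability-weighted mean rating of a product, and reliability averages the user's fairness with how closely the rating agrees with the product's goodness, weighted by gamma1 and gamma2. The function name `rev2`, the update order (goodness, reliability, fairness), and the toy ratings are illustrative assumptions.

```python
def rev2(ratings, gamma1=1.0, gamma2=1.0, n_iters=50):
    """ratings: dict (user, product) -> score in [-1, 1].
    Returns fairness F, goodness G, reliability R after n_iters rounds."""
    users = {u for u, _ in ratings}
    products = {p for _, p in ratings}
    # Initialization: start with the best scores.
    F = {u: 1.0 for u in users}
    G = {p: 1.0 for p in products}
    R = {edge: 1.0 for edge in ratings}
    for _ in range(n_iters):
        # Goodness: reliability-weighted mean of the ratings a product got.
        for p in products:
            incoming = [(e, s) for e, s in ratings.items() if e[1] == p]
            G[p] = sum(R[e] * s for e, s in incoming) / len(incoming)
        # Reliability: mix of user fairness and agreement with goodness.
        for (u, p), s in ratings.items():
            agreement = 1.0 - abs(s - G[p]) / 2.0
            R[(u, p)] = (gamma1 * F[u] + gamma2 * agreement) / (gamma1 + gamma2)
        # Fairness: mean reliability of the ratings a user gave.
        for u in users:
            given = [R[e] for e in ratings if e[0] == u]
            F[u] = sum(given) / len(given)
    return F, G, R


# Hypothetical data: five users rate product "p" +1, one user rates it -1.
ratings = {(f"u{k}", "p"): 1.0 for k in range(5)}
ratings[("u5", "p")] = -1.0

F1, G1, R1 = rev2(ratings, n_iters=1)
print(round(G1["p"], 2), round(R1[("u0", "p")], 2), round(R1[("u5", "p")], 2))

F, G, R = rev2(ratings, n_iters=50)
print(round(F["u0"], 2), round(F["u5"], 2))
```

With both gamma values set to 1, the first round on this toy input gives a goodness of about 0.67 and reliabilities of about 0.92 and 0.58, and repeated rounds drive the fairness of the five agreeing users toward 0.83 and that of the dissenting user toward 0.17.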

Initialization: Start with Best Scores
[Figure: bipartite user-product graph with all F(u) = 1, all G(p) = 1, all R(u,p) = 1]
Updating Goodness, Iteration 1
[Figure: F(u) = 1 and R(u,p) = 1 everywhere; G(p) = 0.67, 0.67, -0.67]
Updating Reliability, Iteration 1
[Figure: F(u) = 1 for all users; G(p) = 0.67, 0.67, -0.67; R(u,p) = 0.92 or 0.58. Both gamma values are set to 1]
Update Fairness, Iteration 1
[Figure: F(u) = 0.92 for five users and 0.58 for one user; R(u,p) = 0.92 or 0.58; G(p) = 0.67, 0.67, -0.67]
After Convergence
[Figure: F(u) = 0.83 for five users and 0.17 for one user; R(u,p) = 0.83 or 0.17; G(p) = 0.67, 0.67, -0.67]

Properties of REV2 Solution
• Guaranteed to converge
• Number of iterations till convergence is upper-bounded
• Time complexity: linear

Performance
• Low fairness users = Fraudsters
• 127 of 150 lowest fairness users in Flipkart were real fraudsters
• REV2 is being used in production at Flipkart

Linear Scalability
• Multiple iterations, but linear scalability

Loopy belief propagation
• Intuition: Use neighbors’ beliefs about a node to predict the node’s label
  – Used to estimate marginals (beliefs) or the most likely states of all variables (nodes)
• Iterative process in which neighbor variables “talk” to each other, passing messages
• When consensus is reached, calculate final belief

Message Passing Basics
Task: Count the number of nodes in a graph*
Condition: Each node can only interact (pass message) with its neighbors
Example: straight line graph
*Graph cannot have loops. Explanation later.
(adapted from MacKay (2003) textbook)

Message Passing Basics
Task: Count the number of nodes in a graph
Condition: Each node can pass message to its neighbors
Solution: Each node listens to the message from its neighbor, updates it, and passes it forward
[Figure: straight line graph. Messages in one direction read "1 before you", "2 before you", ..., "5 before you"; messages in the other direction read "1 after you", "2 after you", ...; each node contributes "there's 1 of me"]

Message Passing Basics
Each node only sees its incoming messages
[Figure: a node receives "2 before you" and "3 behind you" and adds "there's 1 of me". Belief: must be 2 + 1 + 3 = 6 of us]
Message Passing Basics
Each node only sees its incoming messages
[Figure: the next node receives "1 before you" and "4 behind you" and adds "there's 1 of me". Belief: must be 1 + 1 + 4 = 6 of us]
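The counting scheme on a straight line graph can be sketched directly: one sweep passes "k before you" messages in one direction, another passes "k behind you" messages in the other, and every node adds its two incoming messages plus one for itself. The function name and list-based representation are illustrative choices.

```python
def count_on_line(n):
    """Return each node's belief about the total number of nodes on a
    straight line graph of n nodes, using only incoming messages."""
    before = [0] * n  # before[i]: message arriving from one side
    behind = [0] * n  # behind[i]: message arriving from the other side
    # Forward sweep: each node increments the count and passes it on.
    for i in range(1, n):
        before[i] = before[i - 1] + 1
    # Backward sweep: same idea in the opposite direction.
    for i in range(n - 2, -1, -1):
        behind[i] = behind[i + 1] + 1
    # Belief: incoming from one side + "1 of me" + incoming from the other.
    return [before[i] + 1 + behind[i] for i in range(n)]


beliefs = count_on_line(6)
print(beliefs)  # every node arrives at the same total
```

For six nodes, the third node sees 2 before and 3 behind, and the second node sees 1 before and 4 behind; both conclude there are 6 nodes, as in the slides.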

Message Passing in a Tree
Each node receives reports from all branches of the tree
[Figure sequence on a 14-node tree:
 – a node that receives "7 here" and "3 here" adds "1 of me" and sends "11 here (= 7 + 3 + 1)"
 – a node that receives "3 here" and "3 here" sends "7 here (= 3 + 3 + 1)"
 – a node that receives "7 here", "3 here", and "3 here" concludes "Belief: must be 14 of us"]
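The same idea on a tree can be sketched with a recursive message function: the message from i to j counts node i plus everything reported to i by its other neighbors. The function name, the adjacency-list format, and the 7-node tree are assumptions for illustration, and the recursion terminates only because the graph has no cycles.

```python
def count_via_messages(tree):
    """tree: dict node -> list of neighbors (undirected, no cycles).
    Returns each node's belief about the total number of nodes."""
    messages = {}  # (sender, receiver) -> number of nodes on sender's side

    def message(i, j):
        # i tells j: "1 of me" plus the reports from all my other branches.
        if (i, j) not in messages:
            messages[(i, j)] = 1 + sum(message(k, i) for k in tree[i] if k != j)
        return messages[(i, j)]

    # Belief of node i: itself plus the reports from every branch.
    return {i: 1 + sum(message(k, i) for k in tree[i]) for i in tree}


# Hypothetical 7-node binary tree rooted at node 0.
tree = {
    0: [1, 2],
    1: [0, 3, 4],
    2: [0, 5, 6],
    3: [1], 4: [1], 5: [2], 6: [2],
}
beliefs = count_via_messages(tree)
print(beliefs)
```

Here node 1 reports "3 here" to the root (itself plus two leaves), and every node ends up with the same belief of 7, because each message summarizes a disjoint branch of the tree.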

Loopy BP algorithm
What message will i send to j?
- It depends on what i hears from its neighbors k
- Each neighbor k passes a message to i: k’s belief about the state of i

Notations
• Label-label potential matrix $\psi$: Dependency between a node and its neighbor. $\psi(Y_i, Y_j)$ equals the probability of a node $i$ being in state $Y_i$ given that it has a neighbor $j$ in state $Y_j$
• Prior belief $\phi$: $\phi_i(Y_i)$ is the probability of node $i$ being in state $Y_i$
• $m_{i \to j}(Y_j)$ is $i$’s estimate of $j$ being in state $Y_j$
• $\mathcal{L}$ is the set of all states
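
Loopy BP algorithm
1. Initialize all messages to 1
2. Repeat for each node:
   $m_{i \to j}(Y_j) = \alpha \sum_{Y_i \in \mathcal{L}} \psi(Y_i, Y_j)\, \phi_i(Y_i) \prod_{k \in N_i \setminus j} m_{k \to i}(Y_i)$
   – $\sum_{Y_i \in \mathcal{L}}$: sum over all states
   – $\psi(Y_i, Y_j)$: label-label potential
   – $\phi_i(Y_i)$: prior
   – $\prod_{k \in N_i \setminus j} m_{k \to i}(Y_i)$: all messages from neighbors
   – $\alpha$: normalizing constant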
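
Loopy BP algorithm
After convergence: $b_i(Y_i)$ = $i$’s belief of being in state $Y_i$
   $b_i(Y_i) = \alpha\, \phi_i(Y_i) \prod_{j \in N_i} m_{j \to i}(Y_i), \quad \forall\, Y_i \in \mathcal{L}$
   – $\phi_i(Y_i)$: prior
   – $\prod_{j \in N_i} m_{j \to i}(Y_i)$: all messages from neighbors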
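The message and belief computations can be sketched as below. This is a minimal illustration, assuming a pairwise model in which a message from i to j sums, over i's states, the label-label potential times i's prior times the product of messages i received from its other neighbors; messages are normalized to sum to 1, and updates are synchronous. The names `loopy_bp`, `psi`, and `prior`, and the three-node chain, are illustrative assumptions.

```python
from math import prod


def loopy_bp(neighbors, prior, psi, states, max_iters=50, tol=1e-9):
    """neighbors: dict node -> list of neighbors
    prior:     dict node -> {state: prior belief}
    psi:       dict state_i -> {state_j: label-label potential}
    Returns dict node -> {state: normalized belief}."""
    # messages[(i, j)][s]: i's estimate of j being in state s; start at 1.
    messages = {(i, j): {s: 1.0 for s in states}
                for i in neighbors for j in neighbors[i]}
    for _ in range(max_iters):
        new_messages = {}
        for (i, j) in messages:
            msg = {}
            for y_j in states:
                msg[y_j] = sum(
                    psi[y_i][y_j] * prior[i][y_i]
                    * prod(messages[(k, i)][y_i] for k in neighbors[i] if k != j)
                    for y_i in states
                )
            z = sum(msg.values())  # normalizing constant
            new_messages[(i, j)] = {s: v / z for s, v in msg.items()}
        delta = max(abs(new_messages[e][s] - messages[e][s])
                    for e in messages for s in states)
        messages = new_messages
        if delta < tol:
            break  # messages stopped changing
    beliefs = {}
    for i in neighbors:
        b = {s: prior[i][s] * prod(messages[(j, i)][s] for j in neighbors[i])
             for s in states}
        z = sum(b.values())
        beliefs[i] = {s: v / z for s, v in b.items()}
    return beliefs


# Hypothetical chain a - b - c with homophily: neighbors tend to agree.
states = ["T", "F"]
psi = {"T": {"T": 0.9, "F": 0.1}, "F": {"T": 0.1, "F": 0.9}}
prior = {"a": {"T": 0.9, "F": 0.1},
         "b": {"T": 0.5, "F": 0.5},
         "c": {"T": 0.5, "F": 0.5}}
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
beliefs = loopy_bp(chain, prior, psi, states)
print({i: round(b["T"], 3) for i, b in beliefs.items()})
```

The chain has no cycles, so the messages settle after a few rounds and the evidence at "a" fades with distance: "b" is more confident of state T than "c". On graphs with cycles the same code still runs, but the resulting beliefs are only approximations.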

Loopy belief propagation
• What if our graph has cycles?
  – Messages from different subgraphs are no longer independent!
  – BP will give wrong results

BP and Loops
[Figure: variables connected in a cycle, each message carrying T 2, F 1; after passing around the loop a message reads T 4, F 1]
• Messages loop around and around: 2, 4, 8, 16, 32, ... More and more convinced that these variables are T!
• BP incorrectly treats this message as separate evidence that the variable is T.
• Multiplies these two messages as if they were independent.
• But they don’t actually come from independent parts of the graph.
• One influenced the other (via a cycle).

Advantages of Belief Propagation
• Advantages:
  – Easy to program & parallelize
  – General: can apply to any graphical model w/ any form of potentials (higher order than pairwise)
• Challenges:
  – Convergence is not guaranteed (when to stop), especially if many closed loops
• Potential functions (parameters)
  – require training to estimate
  – learning by gradient-based optimization: convergence issues during training

Application of belief propagation: Online auction fraud
Netprobe: A Fast and Scalable System for Fraud Detection in Online Auction Networks
Pandit et al., World Wide Web conference 2007

Online Auction Fraud
• Auction sites: attractive target for fraud
• 63% of complaints to the Federal Internet Crime Complaint Center in the U.S. in 2006
• Average loss per incident: $385

Online Auction Fraud Detection
• Insufficient solution to look at individual features: user attributes, geographic locations, login times, session history, etc.
• Hard to fake: graph structure
• Capture relationships between users
• Main question: how do fraudsters interact with other users and among each other?
  – In addition to buy/sell relations, are there more complex relations?

Feedback Mechanism
• Each user has a reputation score
• Users rate each other via feedback
• Question: How do fraudsters game the feedback system?

Auction “Roles” of Users
• Do they boost each other’s reputation?
  – No, because if one is caught, all will be caught
• They form near-bipartite cores (2 roles)
  – Accomplice: trades with honest, looks legit
  – Fraudster: trades with accomplice, fraud with honest

Detecting auction fraud
• How to find near-bipartite cores? How to find roles (honest, accomplice, fraudster)?
  – Use belief propagation!
• How to set BP parameters (potentials)?
  – prior beliefs: prior knowledge, unbiased if none
  – compatibility potentials: by insight

Belief propagation in action
• Initialize all nodes as unbiased
• At each iteration, for each node, compute messages to its neighbors
• Continue till convergence

Final belief scores = final roles
[Figure: nodes colored by final belief: P(fraudster), P(accomplice), P(honest)]

Today’s Lecture
• Overview of collective classification
• Relational classification
  – Weighted average of neighborhood properties
  – Cannot take node attributes into account while labeling
• Iterative classification
  – Takes node features into account while labeling
• Belief propagation
  – Message passing to update each node’s belief of itself based on neighbors’ beliefs