Active Learning for Networked Data Based on Non-progressive Diffusion Model
Zhilin Yang, Jie Tang, Bin Xu, Chunxiao Xing
Dept. of Computer Science and Technology, Tsinghua University, China
An Example
[Figure sequence: a network of instances connected by correlation edges. Each instance must be classified into {+1, -1}; a few instances are labeled (+1, -1), the rest are unknown (?). We query an unknown instance for its label, and the new label helps classify its neighbors.]
Classify each instance into {+1, -1}
Query for label
Problem: Active Learning for Networked Data
[Figure: a partially labeled network of instances with correlation edges.]
Challenge: It is expensive to query for labels!
Questions:
- Which instances should we select to query?
- How many instances do we need to query for an accurate classifier?
Challenges
Active Learning for Networked Data
How to leverage network correlation among instances?
How to query in a batch mode?
Batch Mode Active Learning for Networked Data
Given a graph $G = (V_L, V_U, E, Y_L, \mathbf{X})$:
- $V_L$: labeled instances
- $V_U$: unlabeled instances
- $E$: edges
- $Y_L$: labels of the labeled instances
- $\mathbf{X}$: feature matrix

Our objective is
$$\max_{V_S \subseteq V_U} Q(V_S) \quad \text{s.t.} \quad |V_S| \le k$$
where $V_S$ is a subset of unlabeled instances, $Q$ is the utility function, and $k$ is the labeling budget.
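The objective above is a budgeted subset-selection template. As a hedged illustration only (the coverage sets, names, and toy utility below are hypothetical, not the paper's actual $Q$), a greedy loop that repeatedly adds the instance with the largest marginal gain until the budget $k$ is spent:

```python
# Hypothetical sketch: greedily maximize a set utility Q under budget k.
# The utility here is toy coverage counting, NOT the paper's Q.

def greedy_select(candidates, Q, k):
    """Pick up to k candidates, each time adding the one whose
    inclusion maximizes Q(selected)."""
    selected = set()
    for _ in range(k):
        best = max((v for v in candidates if v not in selected),
                   key=lambda v: Q(selected | {v}),
                   default=None)
        if best is None:
            break
        selected.add(best)
    return selected

# Toy utility: number of distinct items covered by the chosen instances.
coverage = {"a": {1, 2, 3}, "b": {4, 5}, "c": {4}}
Q = lambda S: len(set().union(*(coverage[v] for v in S))) if S else 0
```

With budget k = 2, this sketch first picks "a" (covers three items), then "b" (adds two more); the greedy pattern is the same whatever utility is plugged in.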
Factor Graph Model
[Figure: a factor graph in which each instance is a variable node, connected through factor nodes.]
Factor Graph Model
The joint probability:
$$P(\mathbf{y} \mid \mathbf{X}; \boldsymbol{\theta}) = \frac{1}{Z} \exp\Big\{ \sum_{v_i \in V} \boldsymbol{\lambda}^{T} f(y_i, \mathbf{x}_i) + \sum_{(v_i, v_j) \in E} \boldsymbol{\beta}^{T} g(y_i, y_j) \Big\}$$
where $f$ is the local factor function, $g$ is the edge factor function, and $\boldsymbol{\theta} = (\boldsymbol{\lambda}, \boldsymbol{\beta})$.

Log-likelihood of the labeled instances:
$$\mathcal{O}(\boldsymbol{\theta}) = \log \sum_{\mathbf{y} \mid \mathbf{y}_L} \exp\{\boldsymbol{\theta}^{T} \mathbf{S}\} - \log \sum_{\mathbf{y}} \exp\{\boldsymbol{\theta}^{T} \mathbf{S}\}$$
where $\mathbf{S}$ aggregates all factor functions.
Factor Graph Model
Learning: gradient descent
$$\frac{\partial \mathcal{O}}{\partial \boldsymbol{\theta}} = \mathbb{E}_{P(\mathbf{y} \mid \mathbf{y}_L; \boldsymbol{\theta})}[\mathbf{S}] - \mathbb{E}_{P(\mathbf{y}; \boldsymbol{\theta})}[\mathbf{S}]$$
Calculate the expectations with Loopy Belief Propagation (LBP):
$$\mu_{v_i \to f}(y_i) = \prod_{f' \in N(v_i) \setminus \{f\}} \mu_{f' \to v_i}(y_i) \qquad \text{(message from variable to factor)}$$
$$\mu_{f \to v_i}(y_i) = \sum_{\mathbf{y}_f \setminus y_i} f(\mathbf{y}_f) \prod_{v_j \in N(f) \setminus \{v_i\}} \mu_{v_j \to f}(y_j) \qquad \text{(message from factor to variable)}$$
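The two message equations can be exercised on a toy example. A minimal sketch (the chain structure and all potential values are made-up illustrations, not the paper's model): sum-product messages on a three-variable chain, checked against brute-force enumeration — on a tree-structured graph like this chain, (loopy) BP is exact.

```python
import itertools
import numpy as np

# Chain factor graph: y0 -- g -- y1 -- g -- y2, each y_i in {0, 1}.
# Each variable also has a local (unary) factor f_i(y_i).
f = [np.array([1.0, 2.0]),
     np.array([1.5, 0.5]),
     np.array([0.7, 1.3])]
g = np.array([[2.0, 0.5],   # edge factor g(y_i, y_j): favors agreement
              [0.5, 2.0]])

def marginals_bp():
    """Sum-product message passing along the chain (exact on a tree)."""
    # Forward message into y_i: sum over y_{i-1} of
    # g(y_{i-1}, y_i) * f_{i-1}(y_{i-1}) * incoming forward message
    fwd = [np.ones(2) for _ in range(3)]
    for i in range(1, 3):
        fwd[i] = g.T @ (f[i - 1] * fwd[i - 1])
    # Backward messages, symmetric direction
    bwd = [np.ones(2) for _ in range(3)]
    for i in range(1, -1, -1):
        bwd[i] = g @ (f[i + 1] * bwd[i + 1])
    # Belief at each variable = local factor * both incoming messages
    margs = []
    for i in range(3):
        b = f[i] * fwd[i] * bwd[i]
        margs.append(b / b.sum())
    return margs

def marginals_bruteforce():
    """Normalize the full joint by enumerating all 2^3 assignments."""
    margs = [np.zeros(2) for _ in range(3)]
    Z = 0.0
    for y in itertools.product([0, 1], repeat=3):
        p = (f[0][y[0]] * f[1][y[1]] * f[2][y[2]]
             * g[y[0], y[1]] * g[y[1], y[2]])
        Z += p
        for i in range(3):
            margs[i][y[i]] += p
    return [m / Z for m in margs]
```

On graphs with cycles the same message updates are simply iterated until (approximate) convergence, which is the "loopy" part of LBP.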
Question: How to select instances from the factor graph for active learning?
Basic principle: Maximize the Ripple Effects
[Figure: an unlabeled network; every node's label is unknown.]
Maximize the Ripple Effects
[Figure sequence: one instance is labeled +1; its labeling information spreads step by step to neighboring unlabeled instances.]
Labeling information is propagated
Statistical bias is propagated
How to model the propagation process in an unlabeled network?
Diffusion Model
Linear Threshold Model:
- Each instance $v$ has a threshold $t(v)$
- Each instance at time $T$ has one of two statuses: $f_T(v) = 0$ (inactive) or $f_T(v) = 1$ (active)
- Each instance has a set of neighbors $N(v)$

Progressive Diffusion Model:
$$f_{T+1}(v) = 1 \ \text{ iff } \ f_T(v) = 1 \ \text{ or } \ \sum_{u \in N(v)} f_T(u) \ge t(v)$$
(once active, always active)

Non-Progressive Diffusion Model:
$$f_{T+1}(v) = 1 \ \text{ iff } \ \sum_{u \in N(v)} f_T(u) \ge t(v)$$
(the Linear Threshold condition is re-evaluated at every step, so instances can also deactivate)
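The contrast between the two dynamics can be simulated in a few lines. A minimal sketch (the two-node graph, unit thresholds, and synchronous updates are illustrative assumptions): under the progressive rule an activated instance stays active forever, while under the non-progressive rule the threshold condition is re-checked every step, so a lone seed can oscillate.

```python
# Two diffusion dynamics on an undirected graph (adjacency dict).
# Graph and thresholds below are illustrative assumptions.

def step(graph, t, active, progressive):
    """One synchronous update: v is active at T+1 iff its count of
    active neighbors at T reaches t(v); under the progressive rule,
    an already-active instance additionally stays active."""
    return {v for v in graph
            if sum(u in active for u in graph[v]) >= t[v]
            or (progressive and v in active)}

def run(graph, t, seeds, progressive, steps):
    """Run the chosen dynamics for a fixed number of steps."""
    active = set(seeds)
    for _ in range(steps):
        active = step(graph, t, active, progressive)
    return active
```

On the two-node graph {a - b} with t = 1 everywhere and seed {a}, the progressive model converges to {a, b}, while the non-progressive model flips between {a} and {b} forever, since each node's only supporting neighbor keeps deactivating.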
Maximize the Ripple Effects
Labeling information is propagated; statistical bias is propagated too.
Will an instance be dominated by labeling information (active) or by statistical bias (inactive)?
Based on the non-progressive diffusion model:
- Maximize the number of activated instances in the end
- Each instance has an uncertainty measure
- We aim to activate the most uncertain instances!
Instantiate the Problem
Active Learning Based on the Non-Progressive Diffusion Model:
$$\max_{V_S \subseteq V_U} \max_{V_T \subseteq V_U} |V_T| \quad \text{s.t.} \quad |V_S| \le k$$
i.e., maximize the number of activated instances, with constraints:
- $f_0(v) = 1, \ \forall v \in V_S$ — initially activate all queried instances
- $\forall v \in V_T, \ f_M(v) = 1$ — all instances in $V_T$ should be active after convergence
- $\forall v \in V_U \setminus V_T, \ \forall u \in V_T$, $u$ is at least as uncertain as $v$ — we activate the most uncertain instances
- $f_{T+1}(v) = 1 \ \text{iff} \ \sum_{u \in N(v)} f_T(u) \ge t(v)$ — based on the non-progressive diffusion
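The feasibility side of these constraints can be checked directly by simulating the dynamics. A hedged sketch (the adjacency-dict representation and the assumption that queried instances stay permanently active, since their labels do not decay, are mine, not the slide's): test whether seeding $V_S$ eventually activates every instance in $V_T$.

```python
# Sketch: does seeding V_S eventually activate all of V_T under
# f_{T+1}(v) = 1 iff sum over u in N(v) of f_T(u) >= t(v)?
# Assumption (mine): instances in V_S stay permanently active.

def activates(graph, t, V_S, V_T, max_steps=100):
    seeds = frozenset(V_S)
    active = seeds
    seen = {active}
    for _ in range(max_steps):
        active = seeds | frozenset(
            v for v in graph
            if sum(u in active for u in graph[v]) >= t[v])
        if set(V_T) <= active:
            return True
        if active in seen:  # state repeats: cycle/fixed point misses V_T
            return False
        seen.add(active)
    return False
```

Because the state space is finite, the synchronous dynamics must eventually revisit a state; detecting that repeat gives a clean termination test for infeasible target sets.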
Reduce the Problem
The original problem: fix the budget $|V_S| \le k$, maximize $|V_T|$.
The reduced problem: fix $V_T$, minimize $|V_S|$. Constraints are inherited.
Reduction procedure: enumerate $|V_T|$ by bisection; solve the reduced problem for each candidate size.
Algorithm
The reduced problem: fix $V_T$, minimize $|V_S|$.
The key idea: find a superset $V' \supseteq V_T$ such that there exists a subset $V_S \subseteq V'$; if we initially activate $V_S$, we can activate $V_T$ in the end.

Input: the graph and the threshold $t(v)$ of each instance. Output: $V_S$.
- Initialize $V'$ to be the top $|V_T|$ most uncertain instances
- For each iteration:
  - Greedily select a set with minimum thresholds from the remaining instances, while satisfying the constraint that each instance has at least $t(v)$ neighbors in $V'$; add the selected set to $V'$
  - If nothing new is selected, the algorithm converges
- Greedily select a set $V_S$ with minimum degrees from $V'$, while satisfying the constraint that each instance has at least $t(v)$ neighbors in $V'$
- Return $V_S$
Theoretical Analysis
Convergence
Lemma 1: The algorithm converges within $O(|V_U|\,|V_T|)$ time.
Correctness
Theorem 1: If the algorithm converges, $V_S$ is a feasible solution, i.e., if we initially label $V_S$, we will activate $V_T$ in the end.
Approximation Ratio
Theorem 2: Let $V_{s,g}$ be the solution given by the algorithm and $V_{s,opt}$ the optimal solution, and let $d(v)$ be the maximum degree of instances. Then the approximation ratio $|V_{s,g}| / |V_{s,opt}|$ is bounded by a constant factor determined by the average threshold $\mathrm{Avg}(t(v))$ and the maximum degree $d(v)$.
Experiments
Datasets

Dataset     #Variable nodes   #Factor nodes
Coauthor    6,096             24,468
Slashdot    370               1,686
Mobile      314               513
Enron       100               236
Comparison Methods
- Batch Mode Active Learning (BMAL), proposed by Shi et al.
- Influence Maximization Selection (IMS), proposed by Zhuang et al.
- Maximum Uncertainty (MU)
- Random (RAN)
- Max Coverage (MaxCo), our method
Experiments
Performance
Related Work
Active Learning for Networked Data
- Actively learning to infer social ties. H. Zhuang, J. Tang, W. Tang, T. Lou, A. Chin and X. Wang
- Batch mode active learning for networked data. L. Shi, Y. Zhao and J. Tang
- Towards active learning on graphs: an error bound minimization approach. Q. Gu and J. Han
- Integration of active learning in a collaborative CRF. O. Martinez and G. Tsechpenakis
Diffusion Model
- On the non-progressive spread of influence through social networks. M. Fazli, M. Ghodsi, J. Habibi, P. J. Khalilabadi, V. Mirrokni and S. S. Sadeghabad
- Maximizing the spread of influence through a social network. D. Kempe, J. Kleinberg and É. Tardos
Conclusion
Connect active learning for networked data to the non-progressive diffusion model, and precisely formulate the problem
Propose an algorithm to solve the problem
Theoretically guarantee the convergence, correctness and approximation ratio of the algorithm
Empirically evaluate the performance of the algorithm on four datasets of different genres
Future work
Consider active learning for networked data in a streaming setting, where data distribution and network structure are changing over time
About Me
Zhilin Yangkimiyoung@yeah.net
3rd year undergraduate at Tsinghua Univ.
Applying for PhD programs this year
Data Mining & Machine Learning
Thanks!
kimiyoung@yeah.net