Date posted: 14-Apr-2017
Arizona State University
Inside the Atoms: Mining a Network of Networks and Beyond
Hanghang Tong [email protected]
http://tonghanghang.org
@KDD BigMine 16: the 5th International Workshop on Big Data, Streams and Heterogeneous Source Mining
Hospital Networks
US Power Grid
Biological Networks
Collaboration Networks
Observation: Graphs are everywhere!
Traffic Network
Brain Networks
Graph Mining: An Overview
Levels: graph, subgraph, node/link
Observation: Mining stops at the node/link (atom) level.
Q: Is there a level x (x = 4, 5, …)? What is it?
A Motivating Example: Cross-Network Association (e.g., candidate gene prioritization problem)
§ Problem Definition
– Given: (1) two networks P (phenotype) and G (PPI), and (2) their partial association A;
– Find: missing associations in A.
§ Solutions: Graph Ranking
– Given: a green node (disease);
– Find: the most relevant blue nodes (genes).
A powerful primitive in (A1) drug discovery; (A2) social recommendation; (A3) QA post-tagging, etc.
§ Limitations: Each green node (disease) might have its own PPI network!
O. Magger, Y. Y. Waldman, E. Ruppin, and R. Sharan. Enhancing the prioritization of disease-causing genes through tissue specific protein interaction networks. PLoS Computational Biology, 8(9), 2012.
• A Disease Network P
• A PPI Network G
[Figure: disease network P (nodes 1–7) linked via A to a single PPI network G (nodes a–d)]
• A Disease Network P
• A set of tissue-specific PPI Networks G1, …, G7
[Figure: disease network P (nodes 1–7) linked via A to tissue-specific PPI networks G1, G2, …, G7 (nodes a–d each)]
A Set of Networks: More Applications
Collaborations
System of Systems
Brain Networks
Cyber-Physics Systems
Roadmap
§ Motivations
§ NoN: A Network of Networks
– NoN Modeling
– NoN Mining
§ Beyond NoN
§ Some of Our Other Recent Work
Modeling NoN
§ Q: How to represent a set of inter-connected networks (e.g., Tissue-Specific PPI Networks)?
Introducing the NoN Model
§ A: each green node (disease) itself is a network
NoN (A Network of Networks) := a triplet R = <G, A, θ>
• G: Main Network (the green, disease-to-disease network)
• A: Domain Networks (the blue, tissue-specific PPI networks)
• θ: Mapping function (each green main node → a blue domain network)
J. Ni, H. Tong, W. Fan, X. Zhang: Inside the atoms: ranking on a network of networks. KDD 2014
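The triplet R = <G, A, θ> can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the class name `NoN`, the dict-of-sets adjacency encoding, and the toy disease and gene names are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Set

# Adjacency as node -> set of neighbours (an assumed, unweighted encoding).
Graph = Dict[Hashable, Set[Hashable]]


@dataclass
class NoN:
    """A Network of Networks R = <G, A, theta> (illustrative sketch)."""
    main: Graph                  # G: main network (e.g. disease-disease)
    domains: Dict[str, Graph]    # A: domain networks, keyed by a hypothetical id
    theta: Dict[Hashable, str]   # theta: main node -> id of its domain network

    def domain_of(self, main_node: Hashable) -> Graph:
        """Return the domain network that a main node maps to."""
        return self.domains[self.theta[main_node]]

    def validate(self) -> bool:
        """Check that every main node maps to an existing domain network."""
        return all(self.theta.get(v) in self.domains for v in self.main)


# Toy instance: two diseases, each with its own (hypothetical) PPI network.
non = NoN(
    main={"d1": {"d2"}, "d2": {"d1"}},
    domains={
        "G1": {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}},
        "G2": {"a": {"c"}, "c": {"a"}},
    },
    theta={"d1": "G1", "d2": "G2"},
)
```

Keeping θ as an explicit map, rather than storing a domain network inside each main node, makes the 1-to-many generalization a matter of changing the value type of `theta`.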
NoN Models: Examples
Applications      | The Main Network (G)   | Domain Networks (A)
Gene-Pheno Assoc. | Disease Sim Network    | Tissue-specific PPI Nets
LBSN              | Geo-proximity Network  | Social Networks
Brain Initiative  | Person-Person Network  | Brain Networks
Team of Teams     | Project Dependence Net | Team Networks
Scholarly Data    | Res. Area Sim Network  | Collaboration Networks
NoN - Generalizations
§ G1: Multi-layered NoN
– Candidate Gene Prioritization: Disease-tissue-protein
– Geo-social networks: City-district-person
§ G2: Soft Mapping function θ
– 1-to-many, or many-to-many
• C. Chen, J. He, N. Bliss and H. Tong: “On the Connectivity of Multi-layered Networks: Models, Measures and Optimal Control” ICDM 2015.
NoN vs. Some Popular Multi-Network Models
§ They are all special cases of our NoN model!
– Tensor: a special NoN with
1) A full clique main network (G);
2) All domain networks (A) sharing the same node set
– Hypergraph: a special NoN with
1) All domain networks (A) being empty
– Multiplex: a special NoN with
1) Two layers
2) All domain networks (A) sharing the same node set
Roadmap
§ Motivations
§ NoN: A Network of Networks
– NoN Modeling
– NoN Mining: Ranking and Clustering
§ Beyond NoN
§ Some of Our Other Recent Work
NoN Mining - Ranking
A1: Given a disease (e.g. P1), what are the most relevant genes (blue nodes)?
A2: Who is most influential, considering both the within- and cross-area influence?
Ranking on a Single Network
Background
[Figure: a 12-node example graph, query node 4, nodes coloured by ranking score]
Ranking vector r4 (query: Node 4):
Node  | 1    | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11   | 12
Score | 0.13 | 0.10 | 0.13 | 0.22 | 0.13 | 0.05 | 0.05 | 0.08 | 0.04 | 0.03 | 0.04 | 0.02
More red, more relevant. Nearby nodes, higher scores.
H. Tong, C. Faloutsos, J.-Y. Pan: Fast Random Walk with Restart and Its Applications. ICDM 2006. (Best Paper Award, ICDM 2006; ICDM 2015 10-Year Highest-Impact Paper Award)
Footnote: “Maxwell Equation” for Web [Soumen Chakrabarti]
$r_i = c\,A\,r_i + (1-c)\,e_i$
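The fixed-point equation can be solved by simple iteration. The following is a minimal sketch, not the fast algorithm of the cited paper; the function names `column_normalize` and `rwr`, the column normalization of A, the restart setting c = 0.85, and the toy path graph are assumptions made for illustration.

```python
import numpy as np


def column_normalize(W: np.ndarray) -> np.ndarray:
    """Scale each column of a non-negative adjacency matrix to sum to 1."""
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # leave all-zero columns untouched
    return W / col_sums


def rwr(W: np.ndarray, query: int, c: float = 0.85,
        tol: float = 1e-10, max_iter: int = 1000) -> np.ndarray:
    """Iterate r <- c * A * r + (1 - c) * e until r stops changing."""
    A = column_normalize(W.astype(float))
    e = np.zeros(W.shape[0])
    e[query] = 1.0                 # e_i: indicator vector of the query node
    r = e.copy()
    for _ in range(max_iter):
        r_next = c * (A @ r) + (1 - c) * e
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r


# Toy path graph 0 - 1 - 2 - 3, queried at node 0 (hypothetical data).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
r = rwr(W, query=0)
```

Because A is column-stochastic here and e sums to 1, every iterate sums to 1, so the result can be read as a relevance distribution over nodes.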
An Optimization Viewpoint of the “Maxwell Equation” for Web (Symmetric A)
$r_i = c\,A\,r_i + (1-c)\,e_i = \arg\min_{r_i}\; \underbrace{c\,r_i^{\top}(I - A)\,r_i}_{\text{Network Smoothness}} + \underbrace{(1-c)\,\lVert r_i - e_i\rVert^2}_{\text{Query Preference}}$
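Why the minimiser of this objective satisfies the fixed-point equation can be checked in one step. The derivation below assumes A is symmetric, as the slide states; the name J for the objective is introduced here for convenience.

```latex
\begin{aligned}
J(r_i) &= c\, r_i^{\top}(I - A)\, r_i + (1-c)\,\lVert r_i - e_i \rVert^2 \\
\nabla J(r_i) &= 2c\,(I - A)\, r_i + 2(1-c)\,(r_i - e_i) = 0 \\
\Longrightarrow\quad (I - cA)\, r_i &= (1-c)\, e_i \\
\Longrightarrow\quad r_i &= c\,A\,r_i + (1-c)\,e_i
\end{aligned}
```

When I - cA is invertible, for example when the eigenvalues of A lie in [-1, 1] and 0 < c < 1, this also gives the closed form r_i = (1-c)(I - cA)^{-1} e_i.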
Ranking on NoN
§ Optimization Formulation (three terms, as labelled on the slide):
– #1: within-network smoothness
– #2: query preference
– #3: cross-network consistency
§ Intuition:
– Similar ranking scores for a node shared by domain networks i and j, if G(i,j) is high.
– A set of g correlated random walks
J. Ni, H. Tong, W. Fan, X. Zhang: Inside the atoms: ranking on a network of networks. KDD 2014
§ Equivalence: J(r) = J(r1,…,rg)
– Intuition: a single random walk on the integrated graph Ã
– Property: J(r) is positive-definite!
§ Algorithms:
– #1: A linear algorithm → the optimal solution
– #2: Any existing fast solution on a single network
– #3: Further Speedup: O(T(m+ng)) → O(T(g log(g) + z))
• g << n; and z << m (key idea: using the main network to do pruning)
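The three-term view (within-network smoothness, query preference, cross-network consistency) can be illustrated numerically. The sketch below is not the formulation or the algorithm of the KDD 2014 paper: it assumes a simplified objective in which all domain networks share one node set, a cross-network weight `a`, symmetric normalization, and a dense linear solve in place of the linear-time and pruned algorithms listed above. The names `sym_normalize` and `non_rank` and the toy graphs are hypothetical.

```python
import numpy as np


def sym_normalize(W: np.ndarray) -> np.ndarray:
    """Return D^{-1/2} W D^{-1/2} for a symmetric non-negative W."""
    d = W.sum(axis=1).astype(float)
    scale = np.zeros_like(d)
    scale[d > 0] = d[d > 0] ** -0.5
    return scale[:, None] * W * scale[None, :]


def non_rank(domain_adjs, G, queries, c=0.85, a=0.1):
    """Minimise an assumed, simplified NoN objective
        J = c * sum_i r_i'(I - A_i) r_i          (#1 within-network smoothness)
          + (1 - c) * sum_i ||r_i - e_i||^2      (#2 query preference)
          + a * sum_{i,j} G(i,j) ||r_i - r_j||^2 (#3 cross-network consistency)
    by solving its stationarity condition as one stacked linear system.
    G must be symmetric; all domain networks must have the same n nodes.
    """
    g, n = len(domain_adjs), domain_adjs[0].shape[0]
    L = np.diag(G.sum(axis=1)) - G           # Laplacian of the main network
    M = 2.0 * a * np.kron(L, np.eye(n))      # coupling between domain networks
    for i, W in enumerate(domain_adjs):
        block = slice(i * n, (i + 1) * n)
        M[block, block] += np.eye(n) - c * sym_normalize(W)
    e = np.concatenate(queries).astype(float)
    r = np.linalg.solve(M, (1 - c) * e)
    return r.reshape(g, n)                   # row i is r_i


# Two toy domain networks on the same four nodes, linked in the main network.
path = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]])
star = np.array([[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]])
G = np.array([[0, 1], [1, 0]])
e = np.array([1.0, 0.0, 0.0, 0.0])

r_indep = non_rank([path, star], G, [e, e], a=0.0)    # no coupling
r_coupled = non_rank([path, star], G, [e, e], a=1.0)  # coupled rankings
```

With `a = 0` each row reduces to an independent single-network ranking; raising `a` pulls the rankings of linked domain networks toward each other, which is the cross-network consistency idea in this simplified setting.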
NoN Ranking - Results
A1: Candidate Gene Prioritization
• Which genes are most relevant w.r.t. disease a?
[Figure: ROC Curve Comparison]
A2: Co-authorship Prediction
• Which DM authors are most likely to collaborate with a given Med author?
[Figure: AUC and Accuracy]
NoN Mining - Clustering
§ Obj. Function: Similar Intuition!
§ Results: P-value vs. (biologically meaningful) clusters
J. Ni, H. Tong, W. Fan, X. Zhang: Flexible and Robust Multi-Network Clustering. KDD 2015
Roadmap
§ Motivations
§ NoN: A Network of Networks
– NoN Modeling
– NoN Mining
§ Beyond NoN: From NoN to NoX
§ Some of Our Other Recent Work
NoT: A Network of Time Series
§ Problem Definition
[Figure: excerpt from a motion-capture marker placement guide (Vicon 512 marker set plus RARM, LARM, RLEG, LLEG)]
§ Models & Algorithms
§ Results
[Figure: coordinate vs. frame #, comparing the original series with DCMF, DMF and dynaMMo]
• Y. Cai, H. Tong, W. Fan and P. Ji: Fast Mining of a Network of Coevolving Time Series. SDM 2015.
• Y. Cai, H. Tong, W. Fan, P. Ji, Q. He: Facets: Fast Comprehensive Mining of Coevolving High-order Time Series. KDD 2015
iBall: A Network of Regression Models
§ Problem Definition
§ Models & Algorithms
§ Results
• Y. Yao, H. Tong, F. Xu, J. Lu: Predicting long-term impact of CQA posts: a comprehensive viewpoint. KDD 2014
• L. Li, H. Tong: The Child is Father of the Man: Foresee the Success at the Early Stage. KDD 2015.
• “Data Mining Reveals the Secret to Getting Good Answers”, MIT Technology Review, 2013
Fascinate: Cross-Layer Dependence Inference on Multi-Layered Networks
§ Problem Definition: Infer Unobserved Cross-Layer Links
§ Methods: Cross-Layer Inference = Collective CF
§ Results: Effectiveness, Efficiency
• C. Chen, J. He, N. Bliss and H. Tong: “On the Connectivity of Multi-layered Networks: Models, Measures and Optimal Control” ICDM15.
• C. Chen, H. Tong, L. Xie, L. Ying and Q. He: “FASCINATE: Fast Cross-Layer Dependence Inference on Multi-layered Networks”, KDD16, 3:15pm, Monday, Plaza Room A/B
Conclusion: a Network of X
§ Summary
– NoN: Network + Networks
– NoT: Network + Time Series
– iBall: Network + Regression
– Fascinate: Network + Inference
§ Take Home Messages
– Modeling: “NoX” (i.e., a Network of X) as the answer
• Networks as data → as context
– Algorithms: Networks as the contextual regularizer
Roadmap
§ Motivations
§ NoN: A Network of Networks
§ Beyond NoN
§ Some of Our Other Recent Work
– Team Replacement
– TravelModeLogger
– BrainQuest
– Network Alignment
– Optimal Networks
– Visual Influence Sum
Replacing the Irreplaceable: Team Replacement Recommendation
§ Problem Definition
§ Sol.
§ System
§ Results
• L. Li, H. Tong, N. Cao, K. Ehrlich, Y.-R. Lin and N. Buchler: Replacing the Irreplaceable: Fast Algorithms for Team Member Recommendation, WWW 2015
• N. Cao, Y.-R. Lin, L. Li, H. Tong: g-Miner: Interactive Visual Group Mining on Multivariate Graphs, ACM CHI 2015
• System prototype & video demo: http://team-net-work.org
Travel Mode Identification w/ Smartphones
§ Prob. Dfn
§ Method
§ Results
§ Open Challenges
– Battery Consumption (sampling rates, sensor selection)
– On-line algorithms
– Adaptive (summer vs. winter; highway vs. local)
• X. Su, H. Tong and P. Ji: Accelerometer-based Activity Recognition on Smartphone. CIKM 2014
• X. Su, H. Caceres, H. Tong and Q. He: Travel Mode Identification with Smartphones. TRB 2015
BrainQuest: Visual Brain Comparison
Quest brains to spot picture diff.
• L. Shi, H. Tong, X. Mu: BrainQuest: Perception-Guided Visual Brain Comparison, ICDM 2015
• L. Shi, H. Tong, M. Daianu, X. Mu and P. Thompson: Block-wise Human Brain Network Visual Comparison Using NodeTrix Representation. VIS'16
Quest computers to spot brain diff.
AD group (n1) Control group (n2)
§ Problem Dfn.: Spot structural diff. between two groups of brain networks
§ Model & Algorithm
§ VA Framework
§ Results
Query-Specific Optimal Networks
§ Goal: Optimal Networks
– Query-Specific
– Optimal Topology + Weights
– On-line Learning
§ Methods: VERY efficient way to estimate
$\frac{\partial Q(x, s)}{\partial A_s(i, j)} \propto Q(j, s) \times Q(x, i)$
[Figure labels: s (query node), x (positive node), edge (i, j), neighbor of s, neighbor of x]
§ + Error Estimation
§ Results: Accuracy (MAP), Scalability
L. Li, Y. Yao, J. Tang, W. Fan, H. Tong: QUINT: On Query-Specific Optimal Networks. KDD 2016. 10:00am, Monday, Plaza Room A/B
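The proportionality between the derivative of Q(x, s) with respect to an edge weight A(i, j) and the product Q(j, s) × Q(x, i) can be checked numerically. This is a minimal sketch under assumed conventions, not QUINT itself: Q is taken to be the random-walk-with-restart matrix (1 - c)(I - cA)^{-1}, for which the constant of proportionality works out to c / (1 - c); the function name `rwr_matrix`, the toy matrix, and the chosen indices are hypothetical.

```python
import numpy as np


def rwr_matrix(A: np.ndarray, c: float) -> np.ndarray:
    """Q = (1 - c) (I - c A)^{-1}; column s is the ranking vector for query s."""
    n = A.shape[0]
    return (1 - c) * np.linalg.inv(np.eye(n) - c * A)


c = 0.5
# Toy column-stochastic transition matrix (assumed data).
A = np.array([[0.0, 0.5, 0.3],
              [0.6, 0.0, 0.7],
              [0.4, 0.5, 0.0]])
Q = rwr_matrix(A, c)

s, x, i, j = 0, 2, 1, 0      # arbitrary toy choice of query, target and edge
eps = 1e-6

# Finite-difference estimate of dQ(x, s) / dA(i, j).
A_pert = A.copy()
A_pert[i, j] += eps
numeric = (rwr_matrix(A_pert, c)[x, s] - Q[x, s]) / eps

# Closed form under this convention: (c / (1 - c)) * Q(x, i) * Q(j, s).
analytic = c / (1 - c) * Q[x, i] * Q[j, s]
```

The closed form needs only entries of Q that are already available, which is consistent with the slide's point that the gradient can be estimated very efficiently.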
Attributed Network Alignment
§ Problem Dfn.
§ Formulation
§ Algorithms
• Iterative Alg.
• Global Optimal
• Same Complexity as ISORANK
• Further Speed-up
– Low-Rank Approximation
– On-Query Alignment (Linear)
§ Results: Accuracy vs. Time; Accuracy vs. Noise
• D. Koutra, H. Tong, D. Lubensky: BIG-ALIGN: Fast Bipartite Graph Alignment. ICDM 2013.
• S. Zhang and H. Tong: Final: Fast Attributed Network Alignment. KDD 2016, 3:15pm, Monday, Plaza Room A/B
Vegas: Influence Graph Visual Summarization
§ Prob. Dfn.
Who/What, How/Why
§ Solution
§ Results
[Figure example: “Stochastic High-Level Petri Net and Applications”]
• L. Shi, H. Tong, J. Tang and C. Lin: Flow-based Influence Graph Visual Summarization, ICDM 2014
• L. Shi, H. Tong, J. Tang, C. Lin: VEGAS: Visual influEnce GrAph Summarization on Citation Networks. TKDE 2015
Q&A
Inside the atom is a whole new world!
• “A whole new world
• Every turn a surprise
• With new horizons to pursue
• Every moment red-letter ……”
Acknowledgement
§ Collaborators:
– Norbou Buchler, Nan Cao, Madelaine Daianu, Kate Ehrlich, Wei Fan, Qing He, Ping Ji, Yu-ru Lin, Lei Shi, Chuang Lin, Jie Tang, Paul M. Thompson, Lei Xie, Yuan Yao, Lei Ying, Xiang Zhang
§ Students:
– Liangyue Li
– Chen Chen
– Yongjie Cai (now at Google)
– Xing Su
– Si Zhang