Reduce and Aggregate: Similarity Ranking in Multi-Categorical
Bipartite Graphs
J. Feldman*, S. Lattanzi*, S. Leonardi°, V. Mirrokni*. *Google Research °Sapienza U. Rome
Alessandro Epasto
Motivation
● Recommendation systems:
  ● Bipartite graphs with users and items.
  ● Identify similar users and suggest relevant items.
  ● Concrete example: the AdWords case.
● Two key observations:
  ● Items belong to different categories.
  ● Graphs are often lopsided.
Modeling the Data as a Bipartite Graph
[Figure: a bipartite graph with millions of advertisers (e.g., "Nike Store New York") on one side and billions of queries (e.g., "Soccer Shoes", "Soccer Ball") on the other; edges are weighted by bid amounts ($1–$5), and queries carry one of hundreds of labels (Retailers, Apparel, Sport Equipment).]
Personalized PageRank
For a seed node v and a restart probability alpha, run a random walk that, at each step, jumps back to v with probability alpha and otherwise follows a random outgoing edge. The stationary distribution of this walk assigns a similarity score to each node of the graph w.r.t. v.
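A minimal power-iteration sketch of this definition (the toy graph, seed, and alpha value are illustrative, not from the poster):

```python
import numpy as np

def personalized_pagerank(P, seed, alpha=0.15, iters=200):
    """Power iteration for the fixed point pi = alpha * e_seed + (1 - alpha) * pi @ P.

    P is a row-stochastic transition matrix; alpha is the probability of
    jumping back to the seed node at each step."""
    n = P.shape[0]
    restart = np.zeros(n)
    restart[seed] = 1.0
    pi = restart.copy()
    for _ in range(iters):
        pi = alpha * restart + (1.0 - alpha) * pi @ P
    return pi

# Toy 3-node path graph 0 -- 1 -- 2, seen as a random walk.
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
scores = personalized_pagerank(P, seed=0)
```

Because the update is a contraction with factor (1 - alpha), the iteration converges to the unique stationary distribution; nodes nearer the seed receive higher scores.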
The Problem
[Figure: the same Advertisers–Queries–Labels graph as above.]
Other Applications
● General approach applicable to several contexts:
  ● Users, Movies, Genres: find similar users and suggest movies.
  ● Authors, Papers, Conferences: find related authors and suggest papers to read.
Semi-Formal Problem Definition
Given a bipartite graph of Advertisers and Queries, where each query carries a label, and a seed advertiser A:
Goal: find the nodes most "similar" to A, possibly restricted to a subset of the labels.
How to Define Similarity?
● We address the computation of several node similarity measures:
  ● Neighborhood based: common neighbors, Jaccard coefficient, Adamic-Adar.
  ● Path based: Katz.
  ● Random walk based: Personalized PageRank.
● Experimental question: which measure is useful?
● Algorithmic questions:
  ● Can it scale to huge graphs?
  ● Can we compute it in real-time?
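The neighborhood-based measures listed above can be sketched as follows (the toy adjacency sets are illustrative):

```python
import math

# Toy bipartite adjacency: advertiser -> set of neighboring queries.
neighbors = {
    "a": {"q1", "q2", "q3"},
    "b": {"q2", "q3", "q4"},
    "c": {"q4"},
}

def common_neighbors(u, v):
    """Number of queries both advertisers connect to."""
    return len(neighbors[u] & neighbors[v])

def jaccard(u, v):
    """Overlap normalized by the size of the combined neighborhood."""
    inter = neighbors[u] & neighbors[v]
    union = neighbors[u] | neighbors[v]
    return len(inter) / len(union) if union else 0.0

def adamic_adar(u, v):
    """Shared neighbors weighted inversely by the log of their degree,
    so rare shared queries count more than popular ones."""
    degree = {}
    for qs in neighbors.values():
        for q in qs:
            degree[q] = degree.get(q, 0) + 1
    return sum(1.0 / math.log(degree[q])
               for q in neighbors[u] & neighbors[v])
```

Note that any neighbor shared by two nodes has degree at least 2, so the logarithm in Adamic-Adar is always positive.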
Our Contribution
● Reduce and Aggregate: a general approach to induce real-time similarity rankings in multi-categorical bipartite graphs, which we apply to several similarity measures.
● Theoretical guarantees for the precision of the algorithms.
● Experimental evaluation with real world data.
Challenges
● Our graphs are too big (billions of nodes) even for very large-scale MapReduce systems.
● MapReduce is not real-time.
● We cannot pre-compute the rankings for each subset of labels.
Reduce and Aggregate
Reduce: given the bipartite graph and a category, construct a graph over only the A-side nodes that preserves the ranking induced by the entire graph.
Aggregate: given a node v in A and the reduced graphs of the categories of interest, determine the ranking for v.
[Figure: 1) the original bipartite graph over A-side nodes a, b, c; 2) one reduced graph per category; 3) the rankings aggregated at run time.]
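For a measure that decomposes over categories, such as common neighbors, the two phases can be sketched like this (category, query, and advertiser names are illustrative; the paper's actual reductions are measure-specific):

```python
from collections import Counter
from itertools import combinations

# Per-category data: category -> {query: set of advertisers bidding on it}.
queries_by_category = {
    "red":    {"q1": {"a", "b"}, "q2": {"a", "c"}},
    "yellow": {"q3": {"a", "b"}, "q4": {"b", "c"}},
}

def reduce_category(queries):
    """Reduce (precomputation): co-occurrence counts over A-side pairs,
    i.e., how many queries of this category each pair shares."""
    counts = Counter()
    for advertisers in queries.values():
        for u, v in combinations(sorted(advertisers), 2):
            counts[(u, v)] += 1
    return counts

reduced = {cat: reduce_category(qs) for cat, qs in queries_by_category.items()}

def aggregate(seed, categories):
    """Aggregate (run time): common-neighbor ranking for `seed`, restricted
    to the selected categories, using only the reduced graphs."""
    totals = Counter()
    for cat in categories:
        for (u, v), c in reduced[cat].items():
            if seed == u:
                totals[v] += c
            elif seed == v:
                totals[u] += c
    return totals.most_common()
```

Because common-neighbor counts simply add across disjoint categories, the run-time step never touches the query side of the graph.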
Reduce (Precomputation)
[Figure: for each label category, the Advertisers–Queries graph is reduced to a precomputed ranking structure over advertisers, one per category.]
Aggregate (Run Time)
[Figure: at run time, the precomputed rankings of the selected categories (e.g., red + yellow) are combined into a single ranking for the seed node A.]
Reduce for Personalized PageRank
● Markov chain state aggregation theory (Simon and Ando '61; Meyer '89, etc.).
● 750x reduction in the number of nodes while correctly preserving the PPR distribution of the entire graph.
[Figure: the bipartite graph (Side A, Side B) with subsets X and Y is collapsed to a reduced graph over Side A nodes only.]
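On a strictly bipartite walk the reduction can be made exact by stochastic complementation: observing the chain only when it visits Side A gives a two-step transition matrix P_AB·P_BA with a restart probability adjusted to 1 - (1 - alpha)^2. A numpy sketch with illustrative matrices (a simplification of the paper's construction, not its production algorithm):

```python
import numpy as np

def ppr(P, restart, alpha, iters=500):
    # Fixed point of pi = alpha * restart + (1 - alpha) * pi @ P.
    pi = restart.copy()
    for _ in range(iters):
        pi = alpha * restart + (1.0 - alpha) * pi @ P
    return pi

alpha = 0.15
# Toy bipartite walk: 2 Side-A nodes, 3 Side-B nodes (row-stochastic blocks).
P_AB = np.array([[0.5, 0.5, 0.0],
                 [0.0, 0.5, 0.5]])
P_BA = np.array([[1.0, 0.0],
                 [0.5, 0.5],
                 [0.0, 1.0]])

# Full chain over all 5 states, restarting at A-node 0.
n_a, n_b = 2, 3
P_full = np.zeros((n_a + n_b, n_a + n_b))
P_full[:n_a, n_a:] = P_AB
P_full[n_a:, :n_a] = P_BA
e_full = np.zeros(n_a + n_b)
e_full[0] = 1.0
pi_full = ppr(P_full, e_full, alpha)
pi_a = pi_full[:n_a] / pi_full[:n_a].sum()  # PPR restricted to Side A

# Reduced chain over Side A only: two-step transition, adjusted restart.
P_red = P_AB @ P_BA
e_a = np.zeros(n_a)
e_a[0] = 1.0
pi_red = ppr(P_red, e_a, 1.0 - (1.0 - alpha) ** 2)
```

The reduced chain's stationary distribution matches the full PPR vector restricted to Side A (after renormalization), while never materializing the huge query side.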
Run-time Aggregation
Koury et al. Aggregation-Disaggregation Algorithm
Step 1: Partition the Markov chain into DISJOINT subsets A and B.
Step 2: Approximate the stationary distribution on each subset independently (π_A, π_B).
Step 3: Consider the transitions between subsets (P_AA, P_AB, P_BA, P_BB).
Step 4: Aggregate the distributions into π′_A, π′_B. Repeat until convergence.
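The four steps admit a compact sketch (a simplified variant on a toy chain; the partition, matrix, and iteration counts are illustrative, not the paper's production scheme):

```python
import numpy as np

def iad(P, blocks, outer_iters=50):
    """Simplified iterative aggregation/disaggregation for a stationary
    distribution. P: row-stochastic matrix; blocks: disjoint index lists."""
    n = P.shape[0]
    pi = np.full(n, 1.0 / n)
    for _ in range(outer_iters):
        # Step 2: per-block conditional distributions.
        phis = [pi[blk] / pi[blk].sum() for blk in blocks]
        # Step 3: coupling matrix of transition mass between blocks.
        k = len(blocks)
        C = np.zeros((k, k))
        for i, bi in enumerate(blocks):
            for j, bj in enumerate(blocks):
                C[i, j] = phis[i] @ P[np.ix_(bi, bj)].sum(axis=1)
        # Stationary distribution of the small aggregated chain.
        xi = np.full(k, 1.0 / k)
        for _ in range(200):
            xi = xi @ C
        # Step 4: disaggregate the block masses, then one smoothing
        # power step on the full chain; repeat until convergence.
        for i, blk in enumerate(blocks):
            pi[blk] = xi[i] * phis[i]
        pi = pi @ P
        pi /= pi.sum()
    return pi

# Toy 4-state chain partitioned into blocks A = {0, 1} and B = {2, 3}.
P = np.array([[0.5, 0.3, 0.1, 0.1],
              [0.2, 0.5, 0.2, 0.1],
              [0.1, 0.2, 0.5, 0.2],
              [0.1, 0.1, 0.3, 0.5]])
pi = iad(P, blocks=[[0, 1], [2, 3]])
```

The expensive work (per-block distributions) is decoupled from the cheap work (the k×k coupling chain), which is what makes the scheme attractive for precompute-then-aggregate pipelines.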
Aggregation in PPR
Precompute the stationary distributions π_A and π_B individually for categories X and Y.
Problem: the two subsets are not disjoint, so the classical aggregation-disaggregation scheme does not apply directly.
Our Approach
● The algorithm is based only on the reduced graphs (π_A, π_B) over Advertiser-side nodes.
● The aggregation algorithm is scalable and converges to the correct distribution.
Experimental Evaluation
● We experimented with publicly available and proprietary datasets:
● Query-Ads graph from Google AdWords: > 1.5 billion nodes, > 5 billion edges.
● DBLP Author-Papers and Patent Inventor-Inventions graphs.
● Ground-Truth clusters of competitors in Google AdWords.
Patent Graph
[Figure: Precision vs Recall curves on the Patent graph for Intersection, Jaccard, Adamic-Adar, Katz, and PPR; both axes range 0–1.]
Google AdWords
[Figure: Precision vs Recall curves on the Google AdWords graph.]
Conclusions and Future Work
● It is possible to compute several similarity scores on very large bipartite graphs in real-time with good accuracy.
● Future work could focus on the case where categories are not disjoint.
Thank you for your attention
Reduction to the Query Side
This is the larger side of the graph.
[Figure: the reduction applied instead to the query side (subsets X, Y; distributions π_A, π_B).]
Convergence after One Iteration
[Figure: Kendall-Tau correlation vs position k (10, 20, 30, 40, 50, All) for DBLP, Patent, and Query-Ads (cost); y-axis 0–1.]
Convergence
[Figure: approximation error (1 − cosine similarity, log scale 1e-06 to 0.001) vs number of iterations (0–20) for DBLP and Patent.]