
Sparse Data Structures for Weighted Bipartite Matching

Date post: 11-May-2015
Upload: jason-riedy
Description:
A stab at optimizing the inner loop for auction-based, sparse bipartite matching.
Sparse Data Structures for Weighted Bipartite Matching E. Jason Riedy Dr. James Demmel (... and thanks to the BeBOP group) SIAM Workshop on Combinatorial Scientific Computing 2004
Transcript
Page 1: Sparse Data Structures for Weighted Bipartite Matching

Sparse Data Structures for Weighted Bipartite Matching

E. Jason Riedy
Dr. James Demmel

(. . . and thanks to the BeBOP group)

SIAM Workshop on Combinatorial Scientific Computing 2004

Page 2: Sparse Data Structures for Weighted Bipartite Matching

Use Sparse Matrix Optimizations...

- Take a fixed, simple algorithm: the auction algorithm for matchings.
  - Repeated iterations over a sparse graph.
- What’s expensive, and is there anything we can do about it?
  - Take an idea from optimizing sparse matrix-vector products.
- A little speed-up in some cases, but there are more ideas available...

Page 3: Sparse Data Structures for Weighted Bipartite Matching

Where’s the Time Going?

Auction algorithm: an iterative, greedy algorithm for bipartite matching:

1. An unmatched row i finds a “most profitable” column j:
   - $\pi(i) = \max_j \left[ b(i,j) - p(j) \right]$
2. Row i places a bid for column j.
   - The bid price is raised until j is no longer the best choice. (Min. increment µ)
3. The highest bid gets the matching (i, j).
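The three steps can be sketched in a few lines of Python. This is a minimal illustration, not the implementation measured in the talk. The function name `auction_match`, the list-of-pairs row layout, and the default `mu` are assumptions made for the example. It processes one bidder at a time, so the single bid in each round is trivially the highest, and it assumes a perfect matching exists (otherwise the loop does not terminate).

```python
def auction_match(rows, n_cols, mu=0.25):
    """Sequential auction sketch.

    rows[i] is a list of (column, benefit) pairs for row i (hypothetical layout).
    Assumes every row can be matched; otherwise the loop never ends.
    """
    price = [0.0] * n_cols
    owner = [None] * n_cols        # owner[j]: row currently holding column j
    match = [None] * len(rows)     # match[i]: column currently held by row i
    unmatched = list(range(len(rows)))
    while unmatched:
        i = unmatched.pop()
        # Step 1: find the most profitable column and the runner-up profit.
        best_j, best, second = None, float("-inf"), float("-inf")
        for j, b in rows[i]:
            value = b - price[j]
            if value > best:
                best_j, best, second = j, value, best
            elif value > second:
                second = value
        if best_j is None:
            continue               # empty row: nothing to bid on
        # Step 2: raise the price until best_j is no longer the best choice,
        # by at least the minimum increment mu.
        if second == float("-inf"):
            bump = mu              # only one candidate column
        else:
            bump = best - second + mu
        price[best_j] += bump
        # Step 3: the bidder takes the column and displaces any previous owner.
        prev = owner[best_j]
        if prev is not None:
            match[prev] = None
            unmatched.append(prev)
        owner[best_j] = i
        match[i] = best_j
    return match, price
```

Calling `auction_match([[(0, 10.0), (1, 5.0)], [(0, 8.0), (1, 1.0)]], 2)` matches row 0 to column 1 and row 1 to column 0. Step 1 touches every entry of the bidding row, which is the loop the rest of the talk examines.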

Page 4: Sparse Data Structures for Weighted Bipartite Matching

Time linear in entries examined...

Number of entries examined is problem-dependent.

[Plot: time (s), 0 to 4.5, against number of entries examined, 0 to 3e+07.]

Page 5: Sparse Data Structures for Weighted Bipartite Matching

Expensive Inner Loop!

1.3 GHz Itanium 2

[Plot: horizontal axis is cycles per entry, 100 to 450; vertical axis runs 0 to 10.]

Page 6: Sparse Data Structures for Weighted Bipartite Matching

Verifying...

Using kcachegrind (N. Nethercote and J. Weidendorfer) and valgrind (J. Seward).

Page 7: Sparse Data Structures for Weighted Bipartite Matching

And Locating...

No obvious culprits in the instructions...

Page 8: Sparse Data Structures for Weighted Bipartite Matching

And Locating...

But considering cache effects!

Page 9: Sparse Data Structures for Weighted Bipartite Matching

Auction’s Inner Loop

Same accesses as sparse matrix-vector multiplication!

[Diagram: the Entry, Index, and Price arrays. For each entry in a row: value = entry - price, save largest...]
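The diagram's access pattern might look like the following sketch, assuming the graph is stored in compressed sparse row (CSR) form. The names `row_ptr`, `index`, `entry`, `price`, and `best_bid` are illustrative and not taken from the talk. The entries and indices of a row are read in sequence, while the prices are gathered through the column index, which is where cache misses can arise.

```python
def best_bid(row_ptr, index, entry, price, i):
    """Scan row i of a CSR matrix; return (best column, best value, second-best value)."""
    best_j, best, second = -1, float("-inf"), float("-inf")
    for k in range(row_ptr[i], row_ptr[i + 1]):
        j = index[k]                  # streamed: column index of the k-th entry
        value = entry[k] - price[j]   # streamed entry, gathered (indirect) price
        if value > best:
            best_j, best, second = j, value, best   # save largest, keep runner-up
        elif value > second:
            second = value
    return best_j, best, second
```

The second-best value is tracked because the bid in step 2 of the auction depends on the gap between the best and the runner-up.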

Page 10: Sparse Data Structures for Weighted Bipartite Matching

Auction’s Inner Loop

Same accesses as sparse matrix-vector multiplication!

[Diagram: the same Entry, Index, and Price arrays, labeled with the matrix-vector update y += a(i,j) * x(j).]
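For comparison, here is a sketch of a CSR sparse matrix-vector product under the same assumed layout (`row_ptr`, `index`, `entry` are illustrative names). The loop structure and memory accesses match the auction's row scan: only the per-entry operation changes, from a subtract-and-compare to a multiply-and-add, with `x` playing the role of the price vector.

```python
def spmv_csr(row_ptr, index, entry, x):
    """Compute y = A x for a CSR matrix A."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += entry[k] * x[index[k]]   # y += a(i,j) * x(j), x gathered by index
        y[i] = acc
    return y
```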

Page 11: Sparse Data Structures for Weighted Bipartite Matching

Performance Through Blocking?

(Images swiped from Berkeley’s BeBOP group.)

Page 12: Sparse Data Structures for Weighted Bipartite Matching

Performance Through Blocking?

(Images swiped from Berkeley’s BeBOP group.)

Page 13: Sparse Data Structures for Weighted Bipartite Matching

Performance Through Blocking?

More entries, but 1.5× performance on Pentium 3!

(Images swiped from Berkeley’s BeBOP group.)
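One way to picture the trade-off is a sketch that regroups each CSR row into fixed-width column blocks, storing explicit fill where a block has gaps. Everything here is an assumption for illustration: the 1×c block shape, the names `to_row_blocks` and `best_bid_blocked`, and the choice of negative infinity as fill so that padded slots can never win a bid (a matrix-vector product would pad with zeros instead). The price vector is assumed to be padded to a multiple of `c`.

```python
NEG_INF = float("-inf")

def to_row_blocks(row_ptr, index, entry, c, fill=NEG_INF):
    """Regroup CSR rows into 1-by-c column blocks, padding gaps with `fill`."""
    b_ptr, b_col, b_val = [0], [], []
    for i in range(len(row_ptr) - 1):
        blocks = {}
        for k in range(row_ptr[i], row_ptr[i + 1]):
            start = (index[k] // c) * c          # first column of the aligned block
            blocks.setdefault(start, [fill] * c)[index[k] - start] = entry[k]
        for start in sorted(blocks):
            b_col.append(start)                  # one stored index per block
            b_val.extend(blocks[start])          # c stored values per block
        b_ptr.append(len(b_col))
    return b_ptr, b_col, b_val

def best_bid_blocked(b_ptr, b_col, b_val, price, i, c):
    """Row scan over blocks: one index load, then c contiguous price reads."""
    best_j, best = -1, NEG_INF
    for b in range(b_ptr[i], b_ptr[i + 1]):
        start = b_col[b]
        for t in range(c):                       # fixed trip count, unrollable
            value = b_val[b * c + t] - price[start + t]
            if value > best:
                best_j, best = start + t, value
    return best_j, best
```

A row with entries in columns 0, 1, and 3 and `c = 2` stores four values instead of three, but reads only two column indices and touches prices in contiguous pairs. Whether that pays off depends on the matrix and the machine.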

Page 14: Sparse Data Structures for Weighted Bipartite Matching

Blocking Speeds Some Matches

Finite element matrix from Vavasis (in UF collection):

Page 15: Sparse Data Structures for Weighted Bipartite Matching

Blocking Speeds Some Matches

Page 16: Sparse Data Structures for Weighted Bipartite Matching

Blocking Speeds Some Matches

Page 17: Sparse Data Structures for Weighted Bipartite Matching

Observations

A blocked graph data structure may provide additional performance if:

- you iterate over whole rows,
- the graph / matrix has runs of columns, and
- you’re willing to use an automated tuning system.

Maximizing the runs: linear arrangement. Hard, but there may be cheap heuristics. Only worthwhile if you’re performing many iterations. (For mat-vec, often > 50 computations of Ax.)
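A crude way to check the "runs of columns" condition is to estimate how much fill a given block width would introduce. The sketch below is a hypothetical helper (`fill_ratio` is not from the talk, and it is far simpler than an automated tuning system): it counts the values a 1×c blocking of a CSR matrix would store and divides by the original entry count. A ratio near 1 suggests long runs; a large ratio suggests blocking would mostly store padding.

```python
def fill_ratio(row_ptr, index, c):
    """Stored values under aligned 1-by-c column blocking, divided by true entries."""
    stored = 0
    for i in range(len(row_ptr) - 1):
        # Distinct aligned blocks touched by row i.
        starts = {index[k] // c for k in range(row_ptr[i], row_ptr[i + 1])}
        stored += c * len(starts)
    nnz = row_ptr[-1]
    return stored / nnz if nnz else 1.0
```

For a matrix with rows `{0, 1, 2, 3}` and `{0, 5}`, the ratio is 1.0 at width 1 and 2.0 at width 4, since the second row's two entries land in separate, mostly empty blocks. A real tuner would also weigh measured per-block speed on the target machine against this fill.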

