Date post: 05-Jan-2016

The Bloom Paradox

Ori Rottenstreich

Joint work with Yossi Kanizo and Isaac Keslassy

Technion, Israel

Problem Definition

• Requirement: A data structure at the user with a fast answer to the query: is $x \in S$?
• Solutions:
  o O(n) – Searching in a list
  o O(log(n)) – Searching in a sorted list
  o O(1) – But with false positives / negatives

[Figure: a user, a local cache S reachable at cost 1, and a central memory M with all elements reachable at cost 10; example elements x and y.]

Two Possible Errors

• False Positive: $x \notin S$, but the data structure answers $x \in S$.
  o Results in a redundant access to the local cache. Additional cost of 1.
• False Negative: $y \in S$, but the data structure answers $y \notin S$.
  o Results in an expensive access to the central memory instead of the local cache. Additional cost of 10 - 1 = 9.
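
The two additional costs can be checked with a small sketch. It uses the access costs from the slides (1 for the local cache, 10 for the central memory) and assumes, which the transcript does not state explicitly, that a user who misses in the local cache then falls back to the central memory. The function names are illustrative.

```python
CACHE_COST = 1    # cost of accessing the local cache S (from the slides)
MEMORY_COST = 10  # cost of accessing the central memory M (from the slides)


def access_cost(in_cache: bool, answer_in_cache: bool) -> int:
    """Cost of serving one query when the user follows the data structure's answer."""
    if answer_in_cache:
        # Try the cache first; on a miss, fall back to the central memory (assumption).
        return CACHE_COST if in_cache else CACHE_COST + MEMORY_COST
    # The answer was negative, so the user goes straight to the central memory.
    return MEMORY_COST


def additional_cost(in_cache: bool, answer_in_cache: bool) -> int:
    """Extra cost compared with a data structure that always answers correctly."""
    ideal = CACHE_COST if in_cache else MEMORY_COST
    return access_cost(in_cache, answer_in_cache) - ideal
```

Under these assumptions a false positive adds 1 and a false negative adds 9, matching the figures on the slide.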

Bloom Filters (Bloom, 1970)

• Initialization: Array of $m$ zero bits.
• Insertion: Each of the $n$ elements is hashed $k$ times, the corresponding bits are set.
• Query: Hashing the element, checking that all $k$ bits are set.
• False positive rate (probability) of approximately $(1 - e^{-kn/m})^k$
• No false negatives

[Figure: a bit array, initially all zeros, with the hashed bit positions of example elements x, y, z and w.]
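
A minimal sketch of the three operations follows. The slides do not specify the hash functions; deriving the k positions from salted SHA-256 digests is an assumption made here for illustration, as are the class and method names.

```python
import hashlib


class BloomFilter:
    """Bit array of size m with k hash functions (illustrative sketch)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = [0] * m  # initialization: array of zero bits

    def _positions(self, item: str):
        # Assumption: the i-th hash function is SHA-256 salted with i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item: str) -> None:
        # Set every bit the element hashes to.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def query(self, item: str) -> bool:
        # True if all hashed bits are set; may be a false positive.
        return all(self.bits[pos] for pos in self._positions(item))
```

An inserted element always queries as present, which is why there are no false negatives; an element that was never inserted queries as present only if other insertions happened to set all of its bits.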

Bloom Filters are Widely Used

• Cache/Memory Framework
• Packet Classification
• Intrusion Detection
• Routing
• Accounting
• Beyond networking: Spell Checking, DNA Classification

• Can be found in
  o Google's web browser Chrome
  o Google's database system BigTable
  o Facebook's distributed storage system Cassandra
  o Mellanox's IB Switch System

Outline

• Introduction to Bloom Filters
• The Bloom Paradox
• The Variable-Increment Counting Bloom Filter

The Bloom Paradox

Sometimes, it is better to disregard the Bloom filter results, and in fact not to even query it, thus making the Bloom filter useless.

Example

• Parameters: […]
• Extreme case without locality: All elements have an equal probability of belonging to the cache.
  o Toy example

The Bloom Paradox

• Parameters: […]
• Let $B$ be the set of elements that the Bloom filter indicates are in $S$.
  o In particular, no false negatives → $S \subseteq B$
• Intuition: […]
• Surprise: The Bloom filter indicates the membership of […] elements. Only […] of them are indeed in $S$.

[Figure: the user and the Bloom filter, which represents the set B; the local cache S is reachable at cost 1 and the central memory M, with all elements, at cost 10.]

The Bloom Paradox

• When the Bloom filter states that $x \in S$, it is wrong with probability […]
• Average cost if we listen to the Bloom filter: […]
• Average cost if we don't: […]

The Bloom filter is useless! Don't listen to the Bloom filter.
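
The comparison can be sketched numerically. The slide's own parameter values are not preserved in this transcript, so the sizes and false positive rate mentioned below are purely illustrative assumptions, as are the function names. The sketch assumes queries are uniform over the central memory M, in line with the "no locality" case, and reuses the access costs 1 and 10 from the slides.

```python
CACHE_COST = 1    # cost of accessing the local cache S (from the slides)
MEMORY_COST = 10  # cost of accessing the central memory M (from the slides)


def prob_positive_is_correct(n: int, N: int, fpr: float) -> float:
    """P(x in S | filter says x in S) for x drawn uniformly from M.

    n = |S|, N = |M|, fpr = false positive rate (all hypothetical inputs).
    """
    expected_false_positives = (N - n) * fpr
    return n / (n + expected_false_positives)


def cost_if_listening(p_correct: float) -> float:
    """Expected cost after a positive answer: try the cache, then memory on a miss."""
    return CACHE_COST + (1.0 - p_correct) * MEMORY_COST


def cost_if_ignoring() -> float:
    """Cost of skipping the filter and going straight to the central memory."""
    return float(MEMORY_COST)
```

For example, with the assumed values n = 10**4, N = 10**9 and fpr = 1e-3, a positive answer is correct only about 1% of the time, so listening costs roughly 10.9 on average against 10 for ignoring the filter. With these costs, listening only pays off when a positive answer is correct with probability above 0.1.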

Outline

• Introduction to Bloom Filters
• The Bloom Paradox
• The Variable-Increment Counting Bloom Filter

Counting Bloom Filters (CBFs)

• Bloom filters do not support deletions of elements. Simply resetting bits might cause false negatives.
• The solution: Counting Bloom filters - Storing an array of counters instead of bits.
  o Insertion: Incrementing counters by one.
  o Deletion: Decrementing counters by one.
  o Query: Checking that counters are positive.
• The same false positive probability.
• Require too much memory, e.g. 57 bits per element for […].

[Figure: a counter array in which inserting elements x and y increments (+1) the counters at their hashed positions.]
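
The counter-based operations can be sketched as follows. As in any such sketch, the hash functions are an assumption (salted SHA-256 here), the names are illustrative, and deletion is only meaningful for elements that were actually inserted.

```python
import hashlib


class CountingBloomFilter:
    """Array of m counters with k hash functions (illustrative sketch)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.counters = [0] * m

    def _positions(self, item: str):
        # Assumption: the i-th hash function is SHA-256 salted with i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item: str) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1

    def delete(self, item: str) -> None:
        # Only valid for an element that was previously inserted.
        for pos in self._positions(item):
            self.counters[pos] -= 1

    def query(self, item: str) -> bool:
        # Positive counters everywhere means "possibly present".
        return all(self.counters[pos] > 0 for pos in self._positions(item))
```

Because counters shared with other elements stay positive after a deletion, removing one element does not create false negatives for the others.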

Intuition for Variable Increments

• Upon query, we should consider the exact values of the counters and not just their positiveness.
• Can we design a deterministic scheme that exploits the exact values of the counters?
• Idea: Use variable increments to encode the element identity.

[Figure: a counter array with example elements z and y.]

Architecture

• Each hash entry contains a pair of counters:
  o c1, fixed increments → number of elements in entry (as in CBF)
  o c2, variable increments → weighted sum of elements
  o weights from a pre-determined set
• We use two sets of hash functions:
  o The first set of hash functions points to the set of entries.
  o The second set of hash functions points to the pre-determined set of weights.

[Figure: an array of entries, each holding a pair of counters c1 and c2.]

Insertion

• Insertion: At each entry the element hashes to, the two counters are updated as follows.
  o c1 is incremented by one.
  o c2 is incremented by a variable increment from the set, chosen by the second set of hash functions.
• Example 1: […]

[Figure: counter arrays c1 and c2; element x is inserted with variable increments +4 and +8, element z with +4 and +13.]
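
The insertion rule can be sketched as below. The weight set {1, 4, 8, 13} is an assumption: these increments appear in the figures, but the transcript does not preserve the full set. The salted SHA-256 hashing and all names are likewise illustrative.

```python
import hashlib

# Hypothetical weight set; the transcript does not preserve the actual one.
WEIGHTS = (1, 4, 8, 13)


def _hash(item: str, salt: str, modulus: int) -> int:
    """Salted SHA-256 reduced to the range [0, modulus) (assumed hash family)."""
    digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus


class VariableIncrementCBF:
    """Entries hold a pair (c1, c2): element count and weighted sum."""

    def __init__(self, m: int, k: int, weights=WEIGHTS) -> None:
        self.m = m
        self.k = k
        self.weights = tuple(weights)
        self.c1 = [0] * m  # fixed increments
        self.c2 = [0] * m  # variable increments

    def targets(self, item: str):
        """Yield one (entry index, variable increment) pair per hash function."""
        for i in range(self.k):
            entry = _hash(item, f"entry{i}", self.m)                   # first hash set
            index = _hash(item, f"weight{i}", len(self.weights))       # second hash set
            yield entry, self.weights[index]

    def insert(self, item: str) -> None:
        for entry, weight in self.targets(item):
            self.c1[entry] += 1
            self.c2[entry] += weight
```

After a single insertion, the c1 counters sum to k and the c2 counters sum to the element's chosen increments.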

Query

• Query ($y$ with […])
• We ask whether
  o 17 can be a sum of 2 elements from the set including 4
  o 30 can be a sum of 3 elements from the set including 8
• No: […]
• How should we pick the set of variable increments? We should use Bh sequences!

[Figure: counter arrays c1 and c2, with the query for y checking the variable increments 8 and 4.]
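
The per-entry question ("can c2 be a sum of c1 elements from the set, including the queried increment?") can be sketched by brute force. The weight set {1, 4, 8, 13} is an assumption based on the increments visible in the figures, and the exhaustive search is only for illustration, not the lookup method of the scheme.

```python
from itertools import combinations_with_replacement

# Hypothetical weight set; the transcript does not preserve the actual one.
WEIGHTS = (1, 4, 8, 13)


def entry_may_contain(c1: int, c2: int, increment: int, weights=WEIGHTS) -> bool:
    """Can c2 be a sum of c1 weights, one of which is `increment`?"""
    if c1 == 0:
        return False
    remaining = c2 - increment
    # The other c1 - 1 elements must account for the rest of the weighted sum.
    return any(
        sum(combo) == remaining
        for combo in combinations_with_replacement(weights, c1 - 1)
    )


def element_may_be_present(checks, weights=WEIGHTS) -> bool:
    """`checks` is a list of (c1, c2, increment) triples, one per hashed entry."""
    return all(entry_may_contain(c1, c2, inc, weights) for c1, c2, inc in checks)
```

With the assumed set, 17 can be a sum of 2 elements including 4 (4 + 13), but 30 cannot be a sum of 3 elements including 8, so the element is reported as absent even though every counter it hashes to is positive.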

Bh Sequences

• Definition 1: Let $D = \{v_1, v_2, \dots, v_\ell\}$ be a sequence of positive integers. Then, $D$ is a Bh sequence iff all the sums $v_{i_1} + v_{i_2} + \dots + v_{i_h}$ with $1 \le i_1 \le i_2 \le \dots \le i_h \le \ell$ are distinct.
• Example 2: […]
  o All the sums of […] elements of […] are distinct: […]
  o Therefore, […] is a […] sequence.
• Bh sequences are widely used in error-correcting codes.
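
Definition 1 can be checked directly for small sets with a brute-force sketch. The function name and the sample sets used afterwards are illustrative assumptions and are not taken from the slides.

```python
from itertools import combinations_with_replacement


def is_bh_sequence(values, h: int) -> bool:
    """True if all sums of h elements (repetition allowed) are distinct."""
    seen = set()
    for combo in combinations_with_replacement(sorted(values), h):
        total = sum(combo)
        if total in seen:
            return False  # two different multisets share the same sum
        seen.add(total)
    return True
```

For instance, {1, 2, 3} is not a B2 sequence because 1 + 3 = 2 + 2, whereas {1, 4, 8, 13} passes the check for both h = 2 and h = 3.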

The Bh-CBF Scheme Query

• Example 3: […] is a […] sequence.
  o Since […], the Bh-CBF can determine that […]
  o Here, […] and then necessarily […]. Since […], the Bh-CBF can determine that […]
  o Since […], the Bh-CBF cannot exclude that […]

[Figure: counter arrays c1 and c2 queried for three elements: x (variable increments 1 and 4), y (8 and 4) and z (4 and 13).]

Experimental Results

• Internet trace (equinix-chicago) with real hash functions.
• For the Bh-CBF, […] (with […]).

Concluding Remarks

• The Bloom Paradox
  o Discovery of the Bloom paradox
  o Importance of the a priori membership probability
• The Variable-Increment Counting Bloom Filter
  o Can extend many variants of the counting Bloom filter
  o First time Bh sequences are presented in networking applications

Thank You

