
Alternative Data Structures in Ruby

Alternative Data Structures in Ruby Tyler McMullen Friday, February 19, 2010
Transcript
Page 1: Alternative Data Structures in Ruby

Alternative Data Structures in Ruby

Tyler McMullen

Friday, February 19, 2010

Page 2: Alternative Data Structures in Ruby

Why?


Page 3: Alternative Data Structures in Ruby

Why?

• Speed

• Memory

• Clarity


Page 4: Alternative Data Structures in Ruby

What’s wrong with my favorite data structure, X?


Page 5: Alternative Data Structures in Ruby

Nothing. (Maybe.)


Page 6: Alternative Data Structures in Ruby

• Bloom Filter

• BK-tree

• Splay Tree

• Trie

Page 7: Alternative Data Structures in Ruby

Bloom Filters

• Tests for existence in a set

• Probabilistic

• Minimal memory use


Page 8: Alternative Data Structures in Ruby

100 million strings in a Set

• Traditional Set: minimum 10 GB

• Bloom Filter (false-positive rate 0.00001): 280 MB

• Bloom Filter (false-positive rate 0.001): 170 MB
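The Bloom filter figures above can be roughly reproduced from the standard sizing formula, m = -n ln(p) / (ln 2)^2 bits for n items at false-positive rate p. The sketch below is only an estimate under that formula; the helper names bloom_bits, bloom_hashes and mebibytes are made up for illustration and do not come from the talk.

```ruby
# Optimal Bloom filter size in bits for n items at false-positive rate p:
#   m = -n * ln(p) / (ln 2)^2
def bloom_bits(n, p)
  (-n * Math.log(p) / (Math.log(2)**2)).ceil
end

# Optimal number of hash functions for m bits and n items: k = (m / n) * ln 2
def bloom_hashes(n, m)
  [(m.to_f / n * Math.log(2)).round, 1].max
end

# Convert a bit count to mebibytes for display.
def mebibytes(bits)
  bits / 8.0 / (1024 * 1024)
end

n = 100_000_000
[0.00001, 0.001].each do |p|
  m = bloom_bits(n, p)
  puts format('p=%g: %.0f MiB, %d hash functions', p, mebibytes(m), bloom_hashes(n, m))
end
```

For 100 million items this works out to roughly 286 MiB and 171 MiB, in line with the 280 MB and 170 MB quoted on the slide.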

Page 12: Alternative Data Structures in Ruby

[Figure: a bit array with slots numbered 0 to 7, repeated on pages 13 to 18]

Page 14: Alternative Data Structures in Ruby

add: “to be or not to be”

Page 15: Alternative Data Structures in Ruby

add: “that is the question”

Page 16: Alternative Data Structures in Ruby

query: “whether ’tis nobler”

NO MATCH

Page 17: Alternative Data Structures in Ruby

query: “to be or not to be”

MATCH

Page 18: Alternative Data Structures in Ruby

query: “in the mind to suffer”

FALSE MATCH
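The add and query steps above can be sketched in a few lines of Ruby. This is a minimal illustration, not the speaker's implementation: the class name ToyBloomFilter, the sizes, and the choice of salted MD5 digests as hash functions are all assumptions made here.

```ruby
require 'digest/md5'

# Minimal Bloom filter sketch. ToyBloomFilter and its parameters are
# hypothetical names, not taken from the talk.
class ToyBloomFilter
  def initialize(size_in_bits, hash_count)
    @size = size_in_bits
    @hash_count = hash_count
    @bits = Array.new(size_in_bits, false)
  end

  # Set every slot that the item hashes to.
  def add(item)
    positions(item).each { |i| @bits[i] = true }
    self
  end

  # false means "definitely never added"; true means "probably added".
  def include?(item)
    positions(item).all? { |i| @bits[i] }
  end

  private

  # Derive hash_count slot indexes by salting one digest with a counter.
  def positions(item)
    (0...@hash_count).map do |seed|
      Digest::MD5.hexdigest("#{seed}:#{item}").to_i(16) % @size
    end
  end
end

filter = ToyBloomFilter.new(1024, 3)
filter.add('to be or not to be')
filter.add('that is the question')

puts filter.include?('to be or not to be')   # true: added items always match
puts filter.include?("whether 'tis nobler")  # usually false; a false match is possible
```

A false match happens when every slot for an item that was never added was already set by other items, which is why a smaller bit array gives a higher false-positive rate.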

Page 19: Alternative Data Structures in Ruby

File Server

Page 20: Alternative Data Structures in Ruby

File Server

[Figure: flowchart. Request → exists? → Y: 200, N: 404]

Page 21: Alternative Data Structures in Ruby

File Server

[Figure: the same flowchart with a Bloom Filter added]
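One plausible reading of the file server diagram is that the Bloom filter sits in front of the slow existence check, so that definite misses are answered with a 404 without touching the disk. The sketch below illustrates that idea only; status_for and slow_exists are hypothetical names, and a plain Set stands in for the filter because any object with a no-false-negatives include? would do.

```ruby
require 'set'

# Answer a file request, consulting a cheap in-memory filter before the
# slow existence check. `filter` is anything that responds to include?
# and never returns false for a stored path (for example a Bloom filter).
# `slow_exists` is a callable standing in for the disk lookup.
def status_for(path, filter, slow_exists)
  return 404 unless filter.include?(path)  # definite miss: skip the slow check
  slow_exists.call(path) ? 200 : 404       # possible hit: confirm it
end

stored = Set.new(['/a.txt', '/b.txt'])
disk_checks = 0
slow_exists = lambda do |path|
  disk_checks += 1
  stored.include?(path)
end

puts status_for('/a.txt', stored, slow_exists)       # 200
puts status_for('/missing.txt', stored, slow_exists) # 404
puts disk_checks                                     # 1
```

Only the request that passes the filter reaches the slow check. With a real Bloom filter, an occasional false match would cost one extra disk lookup but never a wrong answer.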

Page 22: Alternative Data Structures in Ruby

Bloom Filter

• Test for existence in set

• Tiny Memory Footprint

• Excellent Speed


Page 23: Alternative Data Structures in Ruby

BK-tree


Page 24: Alternative Data Structures in Ruby

BK-tree

• find items within a distance of a target

• reduces search space

• works inside a metric space


Page 25: Alternative Data Structures in Ruby

Triangle Inequality

| d(x, y) - d(x, z) | ≤ d(y, z)

Page 28: Alternative Data Structures in Ruby

Triangle Inequality

| d(x, y) - d(x, z) | ≤ d(y, z)

[Figure: triangle with corners x, y, z; two sides are labelled 1 and 4, the side from y to z is labelled ?]

Page 29: Alternative Data Structures in Ruby

Triangle Inequality

| 4 - 1 | ≤ d(y, z)

Page 30: Alternative Data Structures in Ruby

Triangle Inequality

3 ≤ d(y, z)

[Figure: the same triangle with the side from y to z labelled ≥3]
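The same bound can be checked with a concrete metric. The sketch below assumes Levenshtein (edit) distance, which is the usual choice for spelling correction but is not named on these slides; the levenshtein helper and the three sample words are chosen here for illustration.

```ruby
# Levenshtein (edit) distance, computed one row at a time.
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution
      ].min
    end
    previous = current
  end
  previous.last
end

x = 'paste'
y = 'pastor'
z = 'pasta'
d_xy = levenshtein(x, y)
d_xz = levenshtein(x, z)
d_yz = levenshtein(y, z)
lower_bound = (d_xy - d_xz).abs

puts "d(x,y)=#{d_xy} d(x,z)=#{d_xz} lower bound on d(y,z)=#{lower_bound} actual=#{d_yz}"
```

Knowing only the two distances from x already rules out any y and z that are closer together than the lower bound, which is the pruning rule a BK-tree relies on.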

Page 31: Alternative Data Structures in Ruby

BK-tree

[Figure: the words paste, pasta, taser, pastor, shave, light]

Page 33: Alternative Data Structures in Ruby

BK-tree

[Figure: tree with root “paste”; its children hang off edges labelled by distance: pasta (1), pastor (2), taser (3), shave (4), light (5)]

Page 34: Alternative Data Structures in Ruby

BK-tree

[Figure: the same tree, with the query “pastu” shown at the root]

Page 35: Alternative Data Structures in Ruby

BK-tree

[Figure: the same tree; the query “pastu” is at distance 1 from the root]

Page 37: Alternative Data Structures in Ruby

BK-tree

[Figure: the tree reduced to the root “paste” and the children pasta (edge 1) and pastor (edge 2); query “pastu” at distance 1]
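The walk-through above can be sketched as a small Ruby class. This is an illustrative sketch, not the speaker's code: the names BKTree, add and query are hypothetical, and Levenshtein distance is assumed as the metric.

```ruby
# Levenshtein distance, used here as the metric (an assumption).
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost].min
    end
    previous = current
  end
  previous.last
end

# BK-tree sketch: each child hangs off its parent on an edge labelled with
# the distance between the two words.
class BKTree
  Node = Struct.new(:word, :children)

  def initialize(&distance)
    @distance = distance
    @root = nil
  end

  def add(word)
    return @root = Node.new(word, {}) if @root.nil?

    node = @root
    loop do
      d = @distance.call(word, node.word)
      return if d.zero? # already stored

      child = node.children[d]
      return node.children[d] = Node.new(word, {}) if child.nil?

      node = child
    end
  end

  # All stored words within `radius` of `target`.
  def query(target, radius)
    matches = []
    pending = @root ? [@root] : []
    until pending.empty?
      node = pending.pop
      d = @distance.call(target, node.word)
      matches << node.word if d <= radius
      # Triangle inequality: only edges in [d - radius, d + radius] can match.
      node.children.each do |edge, child|
        pending << child if (edge - d).abs <= radius
      end
    end
    matches
  end
end

tree = BKTree.new { |a, b| levenshtein(a, b) }
%w[paste pasta taser pastor shave light].each { |word| tree.add(word) }
p tree.query('pastu', 1).sort # ["pasta", "paste"]
```

With a search radius of 1 and a distance of 1 from the query to the root, only the edges labelled 0 to 2 are followed, so taser, shave and light are never compared against the query.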

Page 44: Alternative Data Structures in Ruby

BK-tree

• Most often used for spelling correctors

• Work in any metric space

• Reduce the search space


Page 45: Alternative Data Structures in Ruby

Splay Tree


Page 46: Alternative Data Structures in Ruby

Tangent: Access Patterns


Page 47: Alternative Data Structures in Ruby

Access Patterns

Usually assumed to be random or even.


Page 48: Alternative Data Structures in Ruby

Access Patterns

Rarely the case.


Page 49: Alternative Data Structures in Ruby

Splay Tree

• Self-balancing binary tree

• Brings most accessed items toward root

• The more uneven the access pattern, the better


Page 50: Alternative Data Structures in Ruby

Splay Tree

[Figure: binary search tree rooted at 7, with subtrees rooted at 4 and 11 and keys up to 14]

Page 52: Alternative Data Structures in Ruby

Splay Tree

[Figure: the same tree with the nodes 9, 13, 8 and 10 in the right subtree repositioned]

Page 54: Alternative Data Structures in Ruby

Splay Tree

• Made for very uneven access patterns

• Caches, Garbage collectors, etc...
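The idea of pulling accessed items toward the root can be sketched with a compact recursive splay routine. This is one common textbook variant, written here for illustration; the class and method names are assumptions, not the speaker's code.

```ruby
# Splay tree sketch: every lookup rotates the accessed key (or the last
# node on its search path) up to the root.
class SplayTree
  Node = Struct.new(:key, :left, :right)

  attr_reader :root

  def insert(key)
    @root = splay(bst_insert(@root, key), key)
    self
  end

  def include?(key)
    @root = splay(@root, key)
    !@root.nil? && @root.key == key
  end

  private

  # Plain binary-search-tree insert; duplicates are ignored.
  def bst_insert(node, key)
    return Node.new(key) if node.nil?

    if key < node.key
      node.left = bst_insert(node.left, key)
    elsif key > node.key
      node.right = bst_insert(node.right, key)
    end
    node
  end

  def rotate_right(node)
    pivot = node.left
    node.left = pivot.right
    pivot.right = node
    pivot
  end

  def rotate_left(node)
    pivot = node.right
    node.right = pivot.left
    pivot.left = node
    pivot
  end

  def splay(node, key)
    return node if node.nil? || node.key == key

    if key < node.key
      return node if node.left.nil?

      if key < node.left.key # zig-zig
        node.left.left = splay(node.left.left, key)
        node = rotate_right(node)
      elsif key > node.left.key # zig-zag
        node.left.right = splay(node.left.right, key)
        node.left = rotate_left(node.left) unless node.left.right.nil?
      end
      node.left.nil? ? node : rotate_right(node)
    else
      return node if node.right.nil?

      if key > node.right.key # zig-zig
        node.right.right = splay(node.right.right, key)
        node = rotate_left(node)
      elsif key < node.right.key # zig-zag
        node.right.left = splay(node.right.left, key)
        node.right = rotate_right(node.right) unless node.right.left.nil?
      end
      node.right.nil? ? node : rotate_left(node)
    end
  end
end

tree = SplayTree.new
[7, 4, 11, 2, 6, 9, 13].each { |key| tree.insert(key) }
tree.include?(9)
puts tree.root.key # 9
```

After a lookup the requested key sits at the root, so a second lookup for the same key is answered immediately. That is why skewed access patterns suit this structure.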


Page 55: Alternative Data Structures in Ruby

Trie


Page 56: Alternative Data Structures in Ruby

Trie

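• Lookup, add, and removal take time proportional to the key length, independent of the number of stored keys (effectively O(1) for bounded key lengths)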

• Ordered traversals

• Prefix matching

• Excellent memory usage (depending on implementation)


Page 58: Alternative Data Structures in Ruby

Trie

add: “thin”

[Figure: trie with the path T → H → I → N]

Page 59: Alternative Data Structures in Ruby

Trie

add: “trap”

[Figure: trie with the paths T → H → I → N and T → R → A → P]

Page 60: Alternative Data Structures in Ruby

Trie

add: “bar”

[Figure: trie with the paths T → H → I → N, T → R → A → P, and B → A → R]

Page 61: Alternative Data Structures in Ruby

Trie

add: “burp”

[Figure: trie with the paths T → H → I → N, T → R → A → P, B → A → R, and B → U → R → P]

Page 62: Alternative Data Structures in Ruby

Trie

query: “trap”

[Figure: the same trie; pages 62 to 66 step through the lookup one node at a time]

Page 66: Alternative Data Structures in Ruby

Trie

query: “trap”

Success!

Page 67: Alternative Data Structures in Ruby

Trie

query: “bumpkin”

Page 68: Alternative Data Structures in Ruby

Trie

query: “bupkis”

[Figure: the same trie; pages 68 to 70 step through the lookup]

Page 70: Alternative Data Structures in Ruby

Trie

query: “bupkis”

Fail!
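The add and query steps above can be sketched with nested hashes. ToyTrie is a hypothetical stand-in written for illustration; its add and children methods are named after the calls on the Autocompleter slides, but the behaviour of children (all stored words with a given prefix) is an assumption.

```ruby
# Hash-based trie sketch. ToyTrie is a hypothetical stand-in, not the
# Trie class used in the talk.
class ToyTrie
  Node = Struct.new(:children, :terminal)

  def initialize
    @root = Node.new({}, false)
  end

  # Walk or create one node per character, then mark the end of the word.
  def add(word)
    node = word.each_char.reduce(@root) do |current, char|
      current.children[char] ||= Node.new({}, false)
    end
    node.terminal = true
    self
  end

  def include?(word)
    node = find(word)
    !node.nil? && node.terminal
  end

  # Every stored word that starts with prefix, in sorted order.
  def children(prefix)
    node = find(prefix)
    node ? collect(node, prefix, []) : []
  end

  private

  # Follow the characters of string from the root; nil if the path breaks.
  def find(string)
    node = @root
    string.each_char do |char|
      node = node.children[char]
      return nil if node.nil?
    end
    node
  end

  def collect(node, prefix, words)
    words << prefix if node.terminal
    node.children.keys.sort.each do |char|
      collect(node.children[char], prefix + char, words)
    end
    words
  end
end

trie = ToyTrie.new
%w[thin trap bar burp].each { |word| trie.add(word) }
p trie.include?('trap')   # true
p trie.include?('bupkis') # false
p trie.children('b')      # ["bar", "burp"]
```

The lookup for “bupkis” stops at the third character because the node for “bu” has no child “p”, so the cost depends on the length of the key rather than on how many words are stored.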

Page 71: Alternative Data Structures in Ruby

Trie

Example: Autocompleter


Page 72: Alternative Data Structures in Ruby

Trie

    class Autocompleter
      def initialize(words)
        @trie = Trie.new
        words.each { |word| @trie.add(word) }
      end

      def query(word)
        @trie.children(word)
      end
    end

Page 73: Alternative Data Structures in Ruby

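Trie

    require 'json'
    require 'rack'

    class Autocompleter
      def initialize(words)
        @trie = Trie.new
        words.each { |word| @trie.add(word) }
      end

      # Rack entry point: returns [status, headers, body].
      def call(env)
        request = Rack::Request.new(env)
        # Assumption: the prefix arrives in a "word" query parameter;
        # the slide leaves `word` undefined.
        word = request.params['word'].to_s
        [200,
         { 'content-type' => 'application/json' },
         [@trie.children(word).to_json]]
      end
    end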

Page 74: Alternative Data Structures in Ruby

Conclusion: Data structures are cool.


Page 75: Alternative Data Structures in Ruby

Questions?
