Lara: A Language of Linear and Relational Algebra for Polystores

Post on 31-Dec-2016


Dylan Hutchison

advised by Bill Howe, Dan Suciu

- Work in Progress -

Polystores

Table Store

Graph Engine

Array Store

Key-Value Store

Matlab SQL Spark Streaming DataFrames

Polystores connect backend systems with frontend languages through a unifying "narrow API," using each system where it performs best.

How to choose an algebra?

Goal: Implement algorithms!

Algorithms: Data Cube, Matrix Inverse, Max Flow, PageRank

Algebras

Objects: Relations, Matrices, Graphs, Files

Ops: Relational Algebra, BLAS/Linear Algebra, Node/Edge Updates, File Access

Many candidate algebras…

Algebra := Objects + (closed) Operations on Objects

Execution Engines

Relational Algebra: PostgreSQL
BLAS/Linear Algebra: ScaLAPACK
Node/Edge Updates: Neo4J, Allegro
File Access: CSV, HDF5

Many algebras have optimized execution engines

LARA

Objects: Associative Tables

Ops: ⋈⊗ ⊕ mapf promoteV

Answer: No choice necessary. Use Lara!

1. Write algorithm in any/all algebras
2. Translate to/from Lara common algebra
3. Use any/all execution engines

Operations of Lara

• ⋈⊗ – Join: horizontally merge columns, select equal colliding keys, multiply colliding values

• ⊕ – Union: vertically merge columns, group by colliding keys, sum colliding values

• mapf – Map keys and old values to new values

• promoteV – Promote values to keys
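Three of the operations above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Lara implementation: a table is modeled as a pair `(key_names, rows)` where `rows` maps key tuples to numbers and absent rows take the default value 0, and union is assumed to keep only the key columns that both inputs share. The names `join`, `union`, and `map_values` are hypothetical. promoteV is left out because the slides do not say which value remains after promotion.

```python
def join(a, b, mul=lambda x, y: x * y):
    """Join: output keys are the union of both key sets; rows pair up where
    the shared key columns agree, and their values are multiplied."""
    a_keys, a_rows = a
    b_keys, b_rows = b
    shared = [k for k in a_keys if k in b_keys]
    out_keys = tuple(a_keys) + tuple(k for k in b_keys if k not in a_keys)
    out = {}
    # Absent rows hold the default 0, so their products are 0 and are skipped.
    for ka, va in a_rows.items():
        ra = dict(zip(a_keys, ka))
        for kb, vb in b_rows.items():
            rb = dict(zip(b_keys, kb))
            if all(ra[k] == rb[k] for k in shared):
                merged = {**ra, **rb}
                out[tuple(merged[k] for k in out_keys)] = mul(va, vb)
    return out_keys, out


def union(a, b, add=lambda x, y: x + y):
    """Union: keep the key columns common to both tables (an assumption of
    this sketch), group rows by those keys, and sum colliding values."""
    a_keys, a_rows = a
    b_keys, b_rows = b
    out_keys = tuple(k for k in a_keys if k in b_keys)
    out = {}
    for keys, rows in ((a_keys, a_rows), (b_keys, b_rows)):
        for key, value in rows.items():
            record = dict(zip(keys, key))
            k = tuple(record[c] for c in out_keys)
            out[k] = add(out[k], value) if k in out else value
    return out_keys, out


def map_values(f, table):
    """Map: compute a new value from each stored row's keys and old value."""
    keys, rows = table
    return keys, {k: f(dict(zip(keys, k)), v) for k, v in rows.items()}


A = (("i", "j"), {(0, 0): 2, (0, 1): 3, (1, 1): 4})
B = (("j",), {(0,): 10, (1,): 100})

AB = join(A, B)                       # keyed on (i, j)
row_sums = union(AB, (("i",), {}))    # union with an empty table keyed on i
doubled = map_values(lambda key, v: 2 * v, row_sums)
```

Here the union with an empty table keyed on `i` acts as an aggregation: it drops `j` and sums the joined values per `i`.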

Example: Ranking a Search

Suppose a user enters the search term "green delicious", as in input Q. Database D scores sites by search-term relevance. Table W weighs words by importance.

Goal: Compute ranks of sites in D for search query Q, weighing by W.

Q

word         score
delicious    1
green        1
(others)     0

D

site              word         score
pizzanow.com      pizza        6
pizzanow.com      delicious    5
allrecipes.com    delicious    2
allrecipes.com    green        2
allrecipes.com    potatoes     5
recycle.org       green        2
(others)                       0

W

word         score
delicious    1
pizza        1
potatoes     3
green        2
(others)     0

Desired Output

site              score
pizzanow.com      1*5*1 = 5
allrecipes.com    1*2*1 + 1*2*2 = 6
recycle.org       1*2*2 = 4
(others)          0

RA: $\gamma_{\text{site},\,+(\text{score})}\left(\pi_{\text{site},\,\text{word},\,(\text{score} * \text{score}')\ \text{as score}}\left(\pi_{\text{word}}(Q) \bowtie D \bowtie \rho_{\text{score} \to \text{score}'}(W)\right)\right)$

LA: diag(Q) +.* D +.* W (Matlab)

Hybrid: $\pi_{\text{word}}(Q) \bowtie D \mathbin{+.*} W$

LARA: $(Q \bowtie_{*} D \bowtie_{*} W) \oplus_{+} E_{\text{site}}$

Executes on both RDBMS and BLAS, depending on cost model

Many ways to express algorithms. Lara presents an economical algebra preserving

• LA's familiar math, numerical prowess

• RA's flexibility, scale-out optimization
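To check the desired output, the ranking can be computed twice in plain Python: once in the relational style (filter by the query words, multiply scores, sum per site) and once in the linear-algebra style (a dense site-by-word matrix scaled by Q and W). This is an illustrative sketch; the function names `rank_relational` and `rank_linear` are hypothetical, and the tables are stored as dictionaries whose missing entries stand for the default score 0.

```python
from collections import defaultdict

# Input tables from the slides; entries not listed default to 0.
Q = {"delicious": 1, "green": 1}
D = {
    ("pizzanow.com", "pizza"): 6,
    ("pizzanow.com", "delicious"): 5,
    ("allrecipes.com", "delicious"): 2,
    ("allrecipes.com", "green"): 2,
    ("allrecipes.com", "potatoes"): 5,
    ("recycle.org", "green"): 2,
}
W = {"delicious": 1, "pizza": 1, "potatoes": 3, "green": 2}


def rank_relational(Q, D, W):
    """RA style: keep D rows whose word appears in Q, multiply the row's
    score by the word's weight in W, then sum per site."""
    ranks = defaultdict(int)
    for (site, word), score in D.items():
        if word in Q and word in W:
            ranks[site] += score * W[word]
    return dict(ranks)


def rank_linear(Q, D, W):
    """LA style: build a dense site-by-word matrix from D, scale each column
    by Q (the diagonal of diag(Q)) and by W, then sum along each row."""
    sites = sorted({site for site, _ in D})
    words = sorted(W)
    q = [Q.get(w, 0) for w in words]
    w_vec = [W[w] for w in words]
    matrix = [[D.get((s, w), 0) for w in words] for s in sites]
    return {
        s: sum(d * qi * wi for d, qi, wi in zip(row, q, w_vec))
        for s, row in zip(sites, matrix)
    }


ranks_ra = rank_relational(Q, D, W)
ranks_la = rank_linear(Q, D, W)
expected = {"pizzanow.com": 5, "allrecipes.com": 6, "recycle.org": 4}
assert ranks_ra == ranks_la == expected
```

Both styles produce the scores in the Desired Output table, which is the point of the slide: the same algorithm can be written in either algebra.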

LARA: A Unifying Algebra

Do you have an application more easily expressed in several algebras?

Do you seek multi-system optimizations?

Let's discuss!

Vision for Polystore Systems

[Figure labels]

Script: SQL, SQL, SQL, Matlab, Matlab, Matlab, SQL, SQL, …

RA: ∪ × πC σf ρ ∖ γ

LA: ⊕ ⊗ f ⊕.⊗ T

LARA: ⋈⊗ mapf promoteV

Optimize & Schedule

Engines: RDBMS, BLAS

APIs of RA and LA

Relational Algebra

Object: Relation

• ∪ – Union

• × – Cartesian Product

• πC – (Extended) Projection

• σf – Select

• ρ – Rename

• ∖ – Difference

• γ – Aggregate

Linear Algebra

Object: N-D Matrix

• ⊕ – Element-wise add

• ⊗ – Element-wise multiply

• ⊕.⊗ – Matrix multiply

• Reduce – Sum along a dimension

• Apply – Apply a function to each element

• T – Transpose

• (Construction & De-construction)

Objects of Lara

Associative Tables. Several interpretations:

• Relational table with key columns & value columns with default values

• Total function from key-space to value-space

• Sparse tensor
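The three interpretations can be seen on one small data structure. The class below is a hypothetical sketch, not Lara's API: the name `AssociativeTable`, its methods, and the default value 0 are assumptions made for illustration, and the sample rows are taken from table D in the ranking example.

```python
class AssociativeTable:
    """Sparse map from key tuples to values, with a default for absent keys."""

    def __init__(self, key_names, entries, default=0):
        self.key_names = tuple(key_names)
        self.default = default
        # Store only the non-default entries, as a sparse tensor would.
        self.entries = {k: v for k, v in entries.items() if v != default}

    def __call__(self, *key):
        """Total-function view: every key has a value, possibly the default."""
        return self.entries.get(key, self.default)

    def rows(self):
        """Relational view: one (key..., value) tuple per stored entry."""
        return sorted(key + (value,) for key, value in self.entries.items())

    def dense(self, row_labels, col_labels):
        """Matrix view of a two-key table over the given labels."""
        return [[self(r, c) for c in col_labels] for r in row_labels]


D = AssociativeTable(
    ("site", "word"),
    {
        ("pizzanow.com", "pizza"): 6,
        ("pizzanow.com", "delicious"): 5,
        ("recycle.org", "green"): 2,
    },
)

stored = D("pizzanow.com", "pizza")     # an explicitly stored score
missing = D("recycle.org", "pizza")     # falls back to the default
relation = D.rows()
matrix = D.dense(["pizzanow.com", "recycle.org"], ["delicious", "green", "pizza"])
```

The same object answers a lookup for any key, lists its stored rows like a relation, and fills in defaults when laid out as a matrix.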

Lara -> RA & LA

Lara        RA              LA
⋈⊗          ⋈, π⊗, ρ        Tensor product
⊕           γ⊕, ∪           Reduce, e-wise sum
mapf        πf              Apply
promoteV    Re-index        Re-key
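One way to read the first two rows of this table: matrix multiply (⊕.⊗) decomposes into a join that multiplies values on the shared index, followed by a sum that removes that index. The sketch below checks this on 2×2 matrices. The helper names `to_table`, `join_multiply`, and `sum_out_middle` are hypothetical, and matrices are stored as dictionaries keyed by index pairs with absent entries meaning 0.

```python
def to_table(matrix):
    """Store a dense matrix as {(row, col): value}, omitting zeros."""
    return {
        (i, j): v
        for i, row in enumerate(matrix)
        for j, v in enumerate(row)
        if v != 0
    }


def join_multiply(a, b):
    """Join a (keyed i, j) with b (keyed j, k) on the shared key j,
    multiplying values; the result is keyed (i, j, k)."""
    return {
        (i, j, k): va * vb
        for (i, j), va in a.items()
        for (j2, k), vb in b.items()
        if j == j2
    }


def sum_out_middle(table):
    """Group by (i, k) and sum, which removes the shared key j."""
    out = {}
    for (i, _j, k), v in table.items():
        out[(i, k)] = out.get((i, k), 0) + v
    return out


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

product = sum_out_middle(join_multiply(to_table(A), to_table(B)))

# Reference result from the textbook definition of matrix multiplication.
reference = {
    (i, k): sum(A[i][j] * B[j][k] for j in range(2))
    for i in range(2)
    for k in range(2)
}
assert product == reference
```

Swapping the multiply and sum for other operators gives other semiring products with the same two-step structure.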

Example derived operation: Outer Join

Inner Join: P ⋈ S

Outer Join: P ⟗ S

(formulas out of date)