Provenance for Natural Language Queries

Daniel Deutch Nave Frost Amir Gilad

Tel Aviv University

August 2017

Presented by Amir Gilad

Daniel Deutch, Nave Frost, Amir Gilad Provenance for Natural Language Queries August 2017 1 / 23

Outline

1 Introduction

2 Mappings and Answer Tree - Single Assignment

3 Factorization

4 Summarization

5 Experiments

6 Related Work and Conclusions

Motivation

NL Query: Return the organization of authors who published papers in database conferences after 2005

Formal Query:
query(oname) :- org(oid, oname), conf(cid, cname),
    pub(wid, cid, ptitle, pyear), author(aid, aname, oid),
    domainConf(cid, did), domain(did, dname),
    writes(aid, wid), dname = 'Databases', pyear > 2005

Result: Tel Aviv University (TAU) (why?)

What We Have - Provenance:
(oname,TAU)·(aname,Tova M.)·(ptitle,OASSIS...)·(cname,SIGMOD)·(pyear,14’)
+ (oname,TAU)·(aname,Tova M.)·(ptitle,Querying...)·(cname,VLDB)·(pyear,06’)
+ (oname,TAU)·(aname,Tova M.)·(ptitle,Monitoring...)·(cname,VLDB)·(pyear,07’)
+ (oname,TAU)·(aname,Slava N.)·(ptitle,OASSIS...)·(cname,SIGMOD)·(pyear,14’)
+ (oname,TAU)·(aname,Tova M.)·(ptitle,A sample...)·(cname,SIGMOD)·(pyear,14’)
+ ...

What We Want - Explanations:
TAU is the organization of 43 authors who published 170 papers in 31 conferences in 2006 - 2015
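
To make the provenance expression on this slide concrete, the following is a minimal sketch that evaluates the formal query naively over a toy database and records one monomial per satisfying assignment. The rows, the numeric ids, and the function name query_with_provenance are assumptions made for illustration (the pre-2006 paper in particular is invented); this is not the SelP engine.

```python
from itertools import product

# Hypothetical toy relations; ids and the pre-2006 row are invented for illustration.
org = [(1, "TAU")]                                   # org(oid, oname)
author = [(10, "Tova M.", 1), (11, "Slava N.", 1)]   # author(aid, aname, oid)
conf = [(100, "SIGMOD"), (101, "VLDB")]              # conf(cid, cname)
domain = [(7, "Databases")]                          # domain(did, dname)
domain_conf = [(100, 7), (101, 7)]                   # domainConf(cid, did)
pub = [                                              # pub(wid, cid, ptitle, pyear)
    (1000, 100, "OASSIS...", 2014),
    (1001, 101, "Querying...", 2006),
    (1002, 101, "(hypothetical pre-2006 paper)", 2004),
]
writes = [(10, 1000), (11, 1000), (10, 1001), (10, 1002)]  # writes(aid, wid)


def query_with_provenance():
    """Evaluate the conjunctive query by brute force and keep, for each
    output value, the list of satisfying assignments (one monomial each)."""
    provenance = {}
    for o, a, w, p, c, dc, d in product(org, author, writes, pub,
                                        conf, domain_conf, domain):
        oid, oname = o
        aid, aname, a_oid = a
        w_aid, w_wid = w
        wid, cid, ptitle, pyear = p
        c_cid, cname = c
        dc_cid, dc_did = dc
        did, dname = d
        # Join conditions implied by the shared variables of the query body.
        joins_ok = (a_oid == oid and w_aid == aid and w_wid == wid
                    and c_cid == cid and dc_cid == cid and dc_did == did)
        if joins_ok and dname == "Databases" and pyear > 2005:
            monomial = (("oname", oname), ("aname", aname), ("ptitle", ptitle),
                        ("cname", cname), ("pyear", pyear))
            provenance.setdefault(oname, []).append(monomial)
    return provenance


PROVENANCE = query_with_provenance()
for oname, monomials in PROVENANCE.items():
    print(oname)
    for m in monomials:
        print("  " + "·".join(f"({attr},{value})" for attr, value in m))
```

Each printed line corresponds to one summand of the provenance expression; the sum over all lines is what the explanation has to compress.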

Solution Overview

Solution

Use the input question to formulate a detailed NL answer by replacing words with values
    - This is a general idea: showing provenance in a way that corresponds to the standard user interaction

When a long answer is needed, compact it using algebraic factorization and summarization
    - Here, again, we leverage the structure of the user question

Current Limitations

Only conjunctive queries are supported

Some aspects of the solution are limited to a specific NLIDB
    - But the general idea is not

Framework

[Figure: framework pipeline. Box and arrow labels: NL Query; (Augmented) NaLIR Parser; Dep. Tree; Query Builder; Query + Mapping; SelP; DB; Results + Provenance + Mapping; Factorization; Fact. + Mapping; Sentence Generation; Fact. + Sentence; Sentence; Summarization; Summarized Sentence]

Augment NaLIR [Fei Li, Jagadish, 15’]

Use a provenance-aware engine - SelP [Deutch et al., 15’]

Store the provenance and mappings

Translate results and provenance to NL using factorization and summarization

Mappings

(oname, TAU)
(aname, Tova M.)
(ptitle, 'OASSIS...')
(cname, SIGMOD)
(pyear, 2014)

[Figure: dependency tree of the NL query. Nodes (word: POS, REL):]
Return
the
organization: POS=NN, REL=dobj
of: POS=IN, REL=prep
authors: POS=NNS, REL=pobj
who
published: POS=VBD, REL=rcmod
papers
in
conferences: POS=NNS, REL=pobj
database: POS=NN, REL=nn
after: POS=IN, REL=prep
2005: POS=CD, REL=pobj

NL Query: Return the organization of authors who published papers in database conferences after 2005

Formal Query:
query(oname) :- org(oid, oname), conf(cid, cname),
    pub(wid, cid, ptitle, pyear), author(aid, aname, oid),
    domainConf(cid, did), domain(did, dname),
    writes(aid, wid), dname = 'Databases', pyear > 2005

From Mappings to an Answer

(oname, TAU)
(aname, Tova M.)
(ptitle, 'OASSIS...')
(cname, SIGMOD)
(pyear, 2014)

[Figure: dependency tree of the NL query, with nodes Return, the, organization, of, authors, who, published, papers, in, conferences, database, after, 2005]

[Figure: answer tree, with nodes TAU (is the), organization, of, Tova M., who, published, 'OASSIS...', in, SIGMOD, in, 2014]

Answer: TAU is the organization of Tova M. who published 'OASSIS...' in SIGMOD in 2014
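
The step from a single assignment to an answer sentence can be sketched as a word-by-word substitution. The sketch below is a simplification under stated assumptions: it works on a flat token list rather than on the dependency tree, and the TEMPLATE structure, the word-to-attribute pairing, and the rewriting of "Return the" and "after 2005" are hypothetical choices made to reproduce the answer on this slide.

```python
# Each entry is (word or phrase from the question, mapped attribute or None).
# The pairing is an assumption read off the slide, not the system's data model.
TEMPLATE = [
    ("organization", "oname"),
    ("of", None),
    ("authors", "aname"),
    ("who", None),
    ("published", None),
    ("papers", "ptitle"),
    ("in", None),
    ("conferences", "cname"),
    ("after 2005", "pyear"),
]


def answer_sentence(assignment):
    """Replace every mapped word by the value it maps to in one assignment."""
    parts = []
    for word, attr in TEMPLATE:
        if attr is None:
            parts.append(word)                              # unmapped word: keep as is
        elif attr == "oname":
            parts.append(f"{assignment[attr]} is the {word}")  # output value leads the sentence
        elif attr == "pyear":
            parts.append(f"in {assignment[attr]}")          # the condition becomes the actual year
        else:
            parts.append(str(assignment[attr]))
    return " ".join(parts)


ASSIGNMENT = {
    "oname": "TAU",
    "aname": "Tova M.",
    "ptitle": "'OASSIS...'",
    "cname": "SIGMOD",
    "pyear": 2014,
}
SENTENCE = answer_sentence(ASSIGNMENT)
print(SENTENCE)
```

Working on the tree instead of a flat list is what later allows values shared by several assignments to be stated once.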

Provenance Factorization

Idea

Use algebraic factorization of the provenance to take out common values that appear in multiple assignments

Provenance

[TAU]·[Tova M.]·[OASSIS...]·[SIGMOD]·[2014]
+ [TAU]·[Tova M.]·[Querying...]·[VLDB]·[2006]
+ [TAU]·[Tova M.]·[Monitoring...]·[VLDB]·[2007]
+ [TAU]·[Slava N.]·[OASSIS...]·[SIGMOD]·[2014]
+ [TAU]·[Tova M.]·[A sample...]·[SIGMOD]·[2014]

Two Different Factorizations

[TAU] · ( [SIGMOD] · [2014] · ( [OASSIS...] · ([Tova M.] + [Slava N.])
                              + [Tova M.] · [A sample...] )
        + [VLDB] · [Tova M.] · ([2006] · [Querying...] + [2007] · [Monitoring...]) )

[TAU] · ( [Tova M.] · ( [VLDB] · ([2006] · [Querying...] + [2007] · [Monitoring...])
                      + [SIGMOD] · [2014] · ([OASSIS...] + [A sample...]) )
        + [Slava N.] · [OASSIS...] · [SIGMOD] · [2014] )

Shorter means better?
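
As a sanity check on the two factorizations, the sketch below expands each of them back into a sum of monomials and counts the symbols it contains. The nested-tuple encoding and the helper names mul, add, expand, and size are assumptions for illustration; monomials are modeled as sets, so a value multiplied by itself is treated as a single occurrence.

```python
from collections import Counter
from itertools import product


def mul(*xs):
    return ("*",) + xs


def add(*xs):
    return ("+",) + xs


def expand(e):
    """Expand an expression into a Counter of monomials (frozensets of symbols)."""
    if isinstance(e, str):
        return Counter([frozenset([e])])
    op, *args = e
    parts = [expand(a) for a in args]
    total = Counter()
    if op == "+":
        for p in parts:
            total += p
        return total
    # Product: pick one monomial from every factor and merge them.
    for combo in product(*(p.items() for p in parts)):
        mono = frozenset().union(*(m for m, _ in combo))
        count = 1
        for _, c in combo:
            count *= c
        total[mono] += count
    return total


def size(e):
    """Number of symbol occurrences in the expression."""
    return 1 if isinstance(e, str) else sum(size(a) for a in e[1:])


FLAT = add(
    mul("TAU", "Tova M.", "OASSIS", "SIGMOD", "2014"),
    mul("TAU", "Tova M.", "Querying", "VLDB", "2006"),
    mul("TAU", "Tova M.", "Monitoring", "VLDB", "2007"),
    mul("TAU", "Slava N.", "OASSIS", "SIGMOD", "2014"),
    mul("TAU", "Tova M.", "A sample", "SIGMOD", "2014"),
)
F_SHORT = mul("TAU", add(
    mul("SIGMOD", "2014", add(mul("OASSIS", add("Tova M.", "Slava N.")),
                              mul("Tova M.", "A sample"))),
    mul("VLDB", "Tova M.", add(mul("2006", "Querying"), mul("2007", "Monitoring"))),
))
F_LONG = mul("TAU", add(
    mul("Tova M.", add(
        mul("VLDB", add(mul("2006", "Querying"), mul("2007", "Monitoring"))),
        mul("SIGMOD", "2014", add("OASSIS", "A sample")),
    )),
    mul("Slava N.", "OASSIS", "SIGMOD", "2014"),
))

print(expand(F_SHORT) == expand(FLAT) == expand(F_LONG))
print(size(FLAT), size(F_SHORT), size(F_LONG))
```

Both factorizations denote the same provenance; they differ only in how many symbols are written and in which values end up nested under which.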

T-Compatibility

NL Query: Return the organization of authors who published papers in database conferences after 2005

Shortest Factorization

[TAU] · ( [SIGMOD] · [2014] · ( [OASSIS...] · ([Tova M.] + [Slava N.])
                              + [Tova M.] · [A sample...] )
        + [VLDB] · [Tova M.] · ([2006] · [Querying...] + [2007] · [Monitoring...]) )

As a Sentence

TAU is the organization of authors who published in SIGMOD 2014
    'OASSIS...' which was published by Tova M. and Slava N.
    and Tova M. published 'A sample...'
  and Tova M. published in VLDB
    'Querying...' in 2006
    and 'Monitoring...' in 2007.

[Figure: dependency tree T of the NL query]

conferences ≤_T authors, but conferences ≰_{f_bad} authors, where f_bad is the shortest factorization above

T-Compatibility

NL Query: Return the organization of authors who published papers in database conferences after 2005

Longer, T-Compatible Factorization

[TAU] · ( [Tova M.] · ( [VLDB] · ([2006] · [Querying...] + [2007] · [Monitoring...])
                      + [SIGMOD] · [2014] · ([OASSIS...] + [A sample...]) )
        + [Slava N.] · [OASSIS...] · [SIGMOD] · [2014] )

As a Sentence

TAU is the organization of
  Tova M. who published
    in VLDB
      'Querying...' in 2006 and
      'Monitoring...' in 2007
    and in SIGMOD in 2014
      'OASSIS...' and 'A sample...'
  and Slava N. who published
    'OASSIS...' in SIGMOD in 2014.

Factorization Algorithm

Proposition

Obtaining a minimal T-compatible factorization is coNP-hard

Algorithm

Factorize greedily: traverse the dependency tree level by level

For every level with mapped words, factorize their corresponding values in the provenance

Prioritize which values to take out in each level by frequency

Complexity

O(n^2 · log n): recursively traverse the dependency tree and sort the variables at each layer by their frequency in O(n · log n)
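
The greedy procedure can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the LEVELS list stands in for the levels of the dependency tree (treating papers, conferences, and years as one level is an assumption), ties are broken by first occurrence, and the output is a plain string.

```python
from collections import Counter

ROWS = [
    {"oname": "TAU", "aname": "Tova M.", "ptitle": "OASSIS...", "cname": "SIGMOD", "pyear": 2014},
    {"oname": "TAU", "aname": "Tova M.", "ptitle": "Querying...", "cname": "VLDB", "pyear": 2006},
    {"oname": "TAU", "aname": "Tova M.", "ptitle": "Monitoring...", "cname": "VLDB", "pyear": 2007},
    {"oname": "TAU", "aname": "Slava N.", "ptitle": "OASSIS...", "cname": "SIGMOD", "pyear": 2014},
    {"oname": "TAU", "aname": "Tova M.", "ptitle": "A sample...", "cname": "SIGMOD", "pyear": 2014},
]

# Assumed grouping of mapped attributes by tree level, top of the tree first.
LEVELS = [["oname"], ["aname"], ["ptitle", "cname", "pyear"]]


def factorize(rows, levels):
    """Greedily take out the most frequent value of the current level,
    recurse on the assignments that contain it, then handle the rest."""
    levels = [level for level in levels if level]   # drop exhausted levels
    if not levels:
        return None                                 # nothing left to multiply
    terms, remaining = [], list(rows)
    while remaining:
        counts = Counter((attr, row[attr]) for row in remaining for attr in levels[0])
        # max() returns the first maximal item, so ties go to the earliest value seen.
        (attr, value), _ = max(counts.items(), key=lambda item: item[1])
        group = [row for row in remaining if row[attr] == value]
        remaining = [row for row in remaining if row[attr] != value]
        deeper = [[a for a in levels[0] if a != attr]] + levels[1:]
        inner = factorize(group, deeper)
        terms.append(f"[{value}]" if inner is None else f"[{value}]·{inner}")
    return terms[0] if len(terms) == 1 else "(" + " + ".join(terms) + ")"


FACTORIZED = factorize(ROWS, LEVELS)
print(FACTORIZED)
```

On the running example this yields the T-compatible factorization from the slides, up to the order of summands and factors: the organization is taken out first, then the authors, and only then the conference, year, and title values.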

Factorization Example

[Figure: dependency tree of the NL query, with nodes organization (POS=NN, REL=dobj), of (POS=IN, REL=prep), authors (POS=NNS, REL=pobj), published (POS=VBD, REL=rcmod), in, conferences (POS=NNS, REL=pobj), database (POS=NN, REL=nn), after (POS=IN, REL=prep), 2005 (POS=CD, REL=pobj), papers, who, the]

Provenance before factorization:

[TAU]·[Tova M.]·[OASSIS...]·[SIGMOD]·[2014]
+ [TAU]·[Tova M.]·[Querying...]·[VLDB]·[2006]
+ [TAU]·[Tova M.]·[Monitoring...]·[VLDB]·[2007]
+ [TAU]·[Slava N.]·[OASSIS...]·[SIGMOD]·[2014]
+ [TAU]·[Tova M.]·[A sample...]·[SIGMOD]·[2014]

After taking out [TAU]:

[TAU] · ( [Tova M.]·[OASSIS...]·[SIGMOD]·[2014]
        + [Tova M.]·[Querying...]·[VLDB]·[2006]
        + [Tova M.]·[Monitoring...]·[VLDB]·[2007]
        + [Slava N.]·[OASSIS...]·[SIGMOD]·[2014]
        + [Tova M.]·[A sample...]·[SIGMOD]·[2014] )

After taking out [Tova M.]:

[TAU] · ( [Tova M.] · ( [OASSIS...]·[SIGMOD]·[2014]
                      + [Querying...]·[VLDB]·[2006]
                      + [Monitoring...]·[VLDB]·[2007]
                      + [A sample...]·[SIGMOD]·[2014] )
        + [Slava N.] · [OASSIS...] · [SIGMOD] · [2014] )

Final factorization:

[TAU] · ( [Tova M.] · ( [VLDB] · ([2006] · [Querying...] + [2007] · [Monitoring...])
                      + [SIGMOD] · [2014] · ([OASSIS...] + [A sample...]) )
        + [Slava N.] · [OASSIS...] · [SIGMOD] · [2014] )

As a sentence:

TAU is the organization of
  Tova M. who published
    in VLDB
      'Querying...' in 2006 and
      'Monitoring...' in 2007
    and in SIGMOD in 2014
      'OASSIS...' and 'A sample...'
  and Slava N. who published
    'OASSIS...' in SIGMOD in 2014.

Summarization

Two Levels of Summarization

[TAU] · ( [Tova M.] · ( [VLDB] · ([2006] · [Querying...] + [2007] · [Monitoring...])
                      + [SIGMOD] · [2014] · ([OASSIS...] + [A sample...]) )
        + [Slava N.] · [OASSIS...] · [SIGMOD] · [2014] )

A: the parenthesized expression that follows [TAU]
B: the parenthesized expression that follows [Tova M.]

Shorter Summarized Answer Based on A

TAU is the organization of 2 authors who published 4 papers in 2 conferences in 2006 - 2014

More Detailed Summarized Answer Based on B

TAU is the organization of Tova M. who published 4 papers in 2 conferences in 2006 - 2014 and Slava N. who published 'OASSIS...' in SIGMOD in 2014.
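
The two summarization levels can be illustrated by counting distinct values per attribute, either over all assignments (level A) or per author (level B). The sketch below assumes the five assignments of the running example and hard-codes the sentence wording of this slide; the function names and tuple layout are hypothetical, and singular and plural forms are not handled.

```python
# (oname, aname, ptitle, cname, pyear) for the five assignments of the example.
ROWS = [
    ("TAU", "Tova M.", "OASSIS...", "SIGMOD", 2014),
    ("TAU", "Tova M.", "Querying...", "VLDB", 2006),
    ("TAU", "Tova M.", "Monitoring...", "VLDB", 2007),
    ("TAU", "Slava N.", "OASSIS...", "SIGMOD", 2014),
    ("TAU", "Tova M.", "A sample...", "SIGMOD", 2014),
]


def span(years):
    """Render a set of years as a single year or a 'low - high' range."""
    lo, hi = min(years), max(years)
    return str(lo) if lo == hi else f"{lo} - {hi}"


def counts(rows):
    """Describe a group of assignments by its distinct papers, conferences, years."""
    papers = {r[2] for r in rows}
    confs = {r[3] for r in rows}
    return f"{len(papers)} papers in {len(confs)} conferences in {span([r[4] for r in rows])}"


def summarize_a(rows):
    """Level A: summarize everything below the organization."""
    authors = {r[1] for r in rows}
    return f"{rows[0][0]} is the organization of {len(authors)} authors who published {counts(rows)}"


def summarize_b(rows):
    """Level B: keep author names, summarize what each author published."""
    by_author = {}
    for r in rows:
        by_author.setdefault(r[1], []).append(r)
    parts = []
    for aname, group in by_author.items():
        if len(group) == 1:                      # a single assignment is spelled out in full
            _, _, ptitle, cname, pyear = group[0]
            parts.append(f"{aname} who published '{ptitle}' in {cname} in {pyear}")
        else:
            parts.append(f"{aname} who published {counts(group)}")
    return f"{rows[0][0]} is the organization of " + " and ".join(parts)


SUMMARY_A = summarize_a(ROWS)
SUMMARY_B = summarize_b(ROWS)
print(SUMMARY_A)
print(SUMMARY_B)
```

Summarizing higher in the factorized expression gives a shorter sentence with less detail; summarizing lower keeps more of the individual values.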

Sample Use-Cases

Q: Return the authors who published papers in VLDB before 2016 and after 2007

A: Tova M. published 16 papers in VLDB in 2008 - 2015

Q: Return the authors who published papers in database conferences

A: Tova M. published 134 papers in 18 conferences

Q: Return the organization of authors who published papers in database conferences after 2005

A: TAU is the organization of 43 authors who published 170 papers in 31 conferences in 2006 - 2015

Sample Scalability Results

Computation time as a function of the number of assignments. Overhead of only 16% w.r.t. evaluation time.

[Figure: time (sec), axis 0 to 4.5, against number of assignments, axis 0 to 5000; one series each for Query 4 through Query 12]

Breakdown of Computation Time

[Figure (a), Factorization time: time (sec), axis 0 to 0.6, against domain of unique values per answer, axis 0 to 5000; one series each for Query 4 through Query 12]

[Figure (b), Sentence gen. time: time (sec), axis 0 to 1.4, against domain of unique values per answer, axis 0 to 5000; one series each for Query 4 through Query 12]

Summarization time was negligible.

Related Work

NL Interfaces:

Formulate the NL query and present the answers, e.g., [Fei Li et al., 15’], [Song et al., 15’]

Present the answers in NL based on the schema [Franconi et al., 14’]

Explain the query in NL [Koutrika et al., 10’]

Provenance:

Showing the provenance in graph form, e.g., [Ailamaki et al., 98’], [Davidson et al., 08’]

Allowing user control over granularity [Cohen-Boulakia et al., 08’]

Provenance factorization and summarization, e.g., [Chapman et al., 08’], [Olteanu et al., 12’]

Summary

Main Contributions:

First to formulate the provenance of output tuples in NL

Employing both factorization and summarization to make provenance more understandable

Devising a criterion for provenance factorization that accounts for its presentation in NL

Future Work:

Extend the solution to UCQs, aggregation, nested queries, ...

Support more provenance models

Generalize the requirements from NL interfaces

Thank You

Questions?
