Post on 13-Aug-2020

Budapest University of Technology and Economics
Department of Measurement and Information Systems
Department of Telecommunications and Media Informatics

Budapest University of Technology and Economics
McGill University, Montréal

Incremental Graph Queries for Cypher

Gábor Szárnyas, József Marton

Live railway model

Proximity detection

Trailing the switch

[Figure: the live railway model as a graph, with vertices a, b, c, d, e, f, g and div, trains 1 and 2, and edges labelled NEXT, STRAIGHT, TOP and ON]

Proximity detection

Pattern: train t1 is ON seg1, train t2 is ON seg2, and seg2 is at most 2 segments from seg1 (NEXT: 1..2).

MATCH
  (t1:Train)-[:ON]->(seg1:Segment)
  -[:NEXT*1..2]->(seg2:Segment)
  <-[:ON]-(t2:Train)
RETURN t1, t2, seg1, seg2
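To make the 1..2 bound concrete, the following Python sketch evaluates the same pattern on a toy graph. The graph (a chain of four segments with three trains) and the helper names `paths` and `proximity` are illustrative assumptions; they are not taken from the slides or from any query engine.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical toy graph (not the topology from the slides):
# a chain of segments a -> b -> c -> d connected by NEXT edges.
NEXT: Dict[str, List[str]] = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
# ON edges: train -> the segment it currently occupies.
ON: Dict[str, str] = {"t1": "a", "t2": "c", "t3": "d"}


def paths(start: str, min_len: int, max_len: int) -> List[List[str]]:
    """Enumerate NEXT paths from `start` with min_len..max_len edges,
    never reusing an edge within a single path."""
    found: List[List[str]] = []

    def walk(path: List[str], used: FrozenSet[Tuple[str, str]]) -> None:
        hops = len(path) - 1
        if hops >= min_len:
            found.append(path)
        if hops == max_len:
            return
        for nxt in NEXT[path[-1]]:
            edge = (path[-1], nxt)
            if edge not in used:
                walk(path + [nxt], used | {edge})

    walk([start], frozenset())
    return found


def proximity() -> List[Tuple[str, str, str, str]]:
    """Rows (t1, t2, seg1, seg2), one row per matching path."""
    occupants: Dict[str, List[str]] = {}
    for train, seg in ON.items():
        occupants.setdefault(seg, []).append(train)
    rows = []
    for t1, seg1 in ON.items():
        for path in paths(seg1, 1, 2):
            seg2 = path[-1]
            for t2 in occupants.get(seg2, []):
                rows.append((t1, t2, seg1, seg2))
    return rows


if __name__ == "__main__":
    for row in proximity():
        print(row)
```

On this toy graph, t1 and t2 are two segments apart and t2 and t3 are adjacent, so both pairs are reported. t1 and t3 are three segments apart and are not.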

Trailing the switch

Pattern: train t is ON segment seg, which is the STRAIGHT segment of a switch that is in the diverging position (div).

MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw

Evaluate continuously

Incremental queries

Register a set of standing queries

Continuously evaluate queries on changes

The Rete algorithm (1974)

o Originally for rule-based expert systems

o Indexes the graph and caches interim query results

Trailing the switch: incremental evaluation with a Rete network

[Figure: Rete network for the trailing the switch query. The input nodes ON and STRAIGHT are joined, followed by the selection σ sw.position = 'diverging' and the projection π t.number, sw. Next to it, the railway graph with segments a to g, switch div and trains 1 and 2.]

Steps shown in the animation:

1. The ON edges (a, 1) and (e, 2) are stored in the ON input node.
2. The STRAIGHT edge (e, div) is stored in the STRAIGHT input node.
3. The join combines (e, 2) and (e, div) into the tuple (e, 2, div).
4. The tuple passes the selection, and the projection produces the result (2, div).
5. Train 2 moves from e to d: the ON input node now stores (d, 2), and the tuples derived from (e, 2) are removed from the join, the selection and the result.
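The propagation of insertions and deletions through such a network can be sketched in Python as follows. This is a much simplified illustration, not ingraph's implementation: the class `TrailingSwitchNetwork`, its method names and the use of +1/-1 signs for insert/delete are assumptions made for this example.

```python
from collections import Counter, defaultdict
from typing import Dict, Set


class TrailingSwitchNetwork:
    """Hypothetical, simplified Rete-style network for
    MATCH (t)-[:ON]->(seg)<-[:STRAIGHT]-(sw) WHERE sw.position = 'diverging'."""

    def __init__(self) -> None:
        self.on: Dict[str, Set[int]] = defaultdict(set)        # seg -> trains
        self.straight: Dict[str, Set[str]] = defaultdict(set)  # seg -> switches
        self.position: Dict[str, str] = {}                     # switch -> position
        self.results: Counter = Counter()                      # (train, switch) -> count

    def _emit(self, train: int, switch: str, sign: int) -> None:
        # Selection node: only switches in the diverging position pass.
        if self.position.get(switch) != "diverging":
            return
        # Projection / result node: keep a multiplicity per result tuple.
        key = (train, switch)
        self.results[key] += sign
        if self.results[key] == 0:
            del self.results[key]

    def update_on(self, train: int, seg: str, sign: int) -> None:
        """Insert (sign=+1) or delete (sign=-1) an ON edge."""
        if (train in self.on[seg]) == (sign > 0):
            return  # nothing changes
        (self.on[seg].add if sign > 0 else self.on[seg].remove)(train)
        # Join node: combine the changed tuple with the cached other side.
        for switch in self.straight[seg]:
            self._emit(train, switch, sign)

    def update_straight(self, switch: str, seg: str, sign: int) -> None:
        """Insert (sign=+1) or delete (sign=-1) a STRAIGHT edge."""
        if (switch in self.straight[seg]) == (sign > 0):
            return
        (self.straight[seg].add if sign > 0 else self.straight[seg].remove)(switch)
        for train in self.on[seg]:
            self._emit(train, switch, sign)

    def set_position(self, switch: str, position: str) -> None:
        """Property update: retract matches under the old value, then re-assert."""
        self._emit_for_switch(switch, -1)
        self.position[switch] = position
        self._emit_for_switch(switch, +1)

    def _emit_for_switch(self, switch: str, sign: int) -> None:
        for seg, switches in self.straight.items():
            if switch in switches:
                for train in self.on[seg]:
                    self._emit(train, switch, sign)


if __name__ == "__main__":
    net = TrailingSwitchNetwork()
    net.update_straight("div", "e", +1)
    net.set_position("div", "diverging")
    net.update_on(1, "a", +1)
    net.update_on(2, "e", +1)
    print(dict(net.results))   # train 2 trails the switch
    net.update_on(2, "e", -1)  # train 2 moves from e ...
    net.update_on(2, "d", +1)  # ... to d
    print(dict(net.results))   # the match has been retracted
```

Each update touches only the tuples that join with the changed edge, so the result is maintained without re-running the whole query.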

Batch vs. incremental queries

Batch queries (pull / request-driven):

1. Client selects a query
2. Results are calculated

Query results obtained on demand

Incremental queries (push / event-driven):

1. Client registers queries
2. Graph is changed
3. Results are maintained
4. Go to 2

Query results are always available
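The difference between the two interaction styles can be illustrated with a small Python sketch. The class `SwitchStore`, its methods and the "+" / "-" notification format are hypothetical names chosen for this example. The standing query here is deliberately trivial (all switches in the diverging position), so that the maintained result stays easy to follow.

```python
from typing import Callable, Dict, List, Set

# A listener receives ("+", switch) when a result appears
# and ("-", switch) when it disappears.
Listener = Callable[[str, str], None]


class SwitchStore:
    """Hypothetical store that answers one query in both styles."""

    def __init__(self) -> None:
        self.position: Dict[str, str] = {}
        self._diverging: Set[str] = set()   # maintained (incremental) result
        self._listeners: List[Listener] = []

    # Batch / pull: the result is computed from scratch on every request.
    def query_diverging(self) -> Set[str]:
        return {s for s, p in self.position.items() if p == "diverging"}

    # Incremental / push: step 1, the client registers interest.
    def register(self, listener: Listener) -> None:
        self._listeners.append(listener)

    # The maintained result is always available without recomputation.
    def results(self) -> Set[str]:
        return set(self._diverging)

    # Steps 2 and 3: the graph changes and the result is maintained.
    def set_position(self, switch: str, position: str) -> None:
        self.position[switch] = position
        was_match = switch in self._diverging
        is_match = position == "diverging"
        if is_match and not was_match:
            self._diverging.add(switch)
            self._notify("+", switch)
        elif was_match and not is_match:
            self._diverging.remove(switch)
            self._notify("-", switch)

    def _notify(self, kind: str, switch: str) -> None:
        for listener in self._listeners:
            listener(kind, switch)


if __name__ == "__main__":
    store = SwitchStore()
    store.register(lambda kind, sw: print(kind, sw))
    store.set_position("div", "diverging")  # notifies: + div
    store.set_position("div", "straight")   # notifies: - div
```

Both styles return the same answer at any point in time; they differ in when the work is done and in who initiates the exchange.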

Incremental query engines

Engine      Data model    Developed by
CLIPS       C structures  NASA
Drools      POJO          Red Hat
VIATRA      EMF           BME / IncQuery Labs.
INSTANS     RDF           Aalto University
i3QL        POJO          TU Darmstadt
IncQuery-D  RDF           BME

No implementations for property graphs yet

ingraph

An incremental, in-memory graph query engine

client → ingraph: register queries, update graph
ingraph → client: query results, change notifications

openCypher query:

MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw

openCypher query → Query parser → Query syntax tree → Relational algebra builder → Relational algebra model

Relational algebra model → Transformer and optimizer → Rete network model → Query deployer → Rete network

VIATRA

FORMALIZATION OF OPENCYPHER

Relational algebra

Standard relational algebra

o 𝜋, 𝜎
o ∪, ∩, ∖
o ×, ⋈

Common extensions

o 𝛿 – duplicate elimination
o 𝛾 – grouping
o 𝜏 – sorting

Graph-specific operators

Jürgen Hölsch, Michael Grossniklaus: An Algebra and Equivalences to Transform Graph Patterns in Neo4j, GraphQ 2016, EDBT, http://ceur-ws.org/Vol-1558/paper24.pdf

o GetVertices: returns a graph relation containing all vertices of the underlying graph G
o Expand: returns the neighbors of a given node

Additional extensions

o GetEdges: returns a graph relation containing all edges of the underlying graph G

Gábor Szárnyas, József Marton: openCypher specification, Technical report, http://docs.inf.mit.bme.hu/ingraph/pub/opencypher-report.pdf
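As a rough illustration of how the graph-specific operators compose with selection and projection, the Python sketch below evaluates the trailing the switch query in batch mode. It is loosely modelled on the operator descriptions above; the function signatures, the toy graph and the dictionary-based row format are assumptions for this example and do not reproduce the formal definitions in the cited papers.

```python
from typing import Dict, List, Tuple

# Hypothetical toy property graph.
LABELS = {"t1": "Train", "t2": "Train", "a": "Segment", "e": "Segment", "div": "Switch"}
PROPS = {"t1": {"number": 1}, "t2": {"number": 2}, "div": {"position": "diverging"}}
EDGES = [("t1", "ON", "a"), ("t2", "ON", "e"), ("div", "STRAIGHT", "e")]

Row = Dict[str, str]  # variable name -> vertex id


def get_vertices(var: str, label: str) -> List[Row]:
    """GetVertices: one row per vertex with the given label."""
    return [{var: v} for v, lab in LABELS.items() if lab == label]


def expand(rows: List[Row], src_var: str, edge_type: str, dst_var: str,
           dst_label: str, incoming: bool = False) -> List[Row]:
    """Expand: extend each row with the neighbours of row[src_var]."""
    out: List[Row] = []
    for row in rows:
        for source, etype, target in EDGES:
            if etype != edge_type:
                continue
            near, far = (target, source) if incoming else (source, target)
            if near == row[src_var] and LABELS[far] == dst_label:
                out.append({**row, dst_var: far})
    return out


def trailing_the_switch() -> List[Tuple[int, str]]:
    rows = get_vertices("t", "Train")
    rows = expand(rows, "t", "ON", "seg", "Segment")
    rows = expand(rows, "seg", "STRAIGHT", "sw", "Switch", incoming=True)
    # Selection: sw.position = 'diverging'
    rows = [r for r in rows if PROPS[r["sw"]].get("position") == "diverging"]
    # Projection: t.number, sw
    return [(PROPS[r["t"]]["number"], r["sw"]) for r in rows]


if __name__ == "__main__":
    print(trailing_the_switch())
```

Only train 2 is on a segment that is the STRAIGHT segment of a diverging switch, so the sketch yields a single row for it.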

Uniqueness of edges

While pattern matching, Neo4j makes sure to not include matches where the same graph relationship is found multiple times in a single pattern.

All-different operator
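One way to picture the all-different operator is as a filter over candidate matches that drops every row in which two edge variables are bound to the same edge. The Python sketch below assumes a toy graph with two edges and an undirected two-hop pattern (n1)-[r1]-(n2)-[r2]-(n3); the names `all_different` and `two_hop` are illustrative and not part of any engine.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

# Hypothetical toy graph: edge id -> (source, target).
EDGES: Dict[str, Tuple[str, str]] = {"e1": ("x", "y"), "e2": ("y", "z")}

Row = Dict[str, str]


def undirected_steps() -> Iterator[Tuple[str, str, str]]:
    """Each edge can be traversed in both directions."""
    for edge_id, (source, target) in EDGES.items():
        yield (source, edge_id, target)
        yield (target, edge_id, source)


def two_hop() -> List[Row]:
    """All candidate matches of (n1)-[r1]-(n2)-[r2]-(n3), before the
    uniqueness check: a plain join of the edge relation with itself."""
    steps = list(undirected_steps())
    return [
        {"n1": a, "r1": r1, "n2": b, "r2": r2, "n3": c}
        for (a, r1, b) in steps
        for (b2, r2, c) in steps
        if b == b2
    ]


def all_different(rows: List[Row], edge_vars: Sequence[str]) -> List[Row]:
    """Keep only rows whose edge variables are bound to distinct edges."""
    return [
        row for row in rows
        if len({row[var] for var in edge_vars}) == len(edge_vars)
    ]


if __name__ == "__main__":
    candidates = two_hop()
    matches = all_different(candidates, ["r1", "r2"])
    print(len(candidates), len(matches))
```

Without the filter, the join also produces rows that walk along an edge and straight back over it, such as x, e1, y, e1, x. With the filter, only the two traversals x to z and z to x remain.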

EXAMPLE QUERIES

Uniqueness of edges

Get neighbours

Filter out based on node prop name

Use multiple MATCH clauses to do a Cartesian product

Two subgraphs

SwitchMonitored

INCREMENTAL GRAPH QUERIES WITH OPENCYPHER

openCypher constructs

Standard constructs

o pattern matching
o filtering
o lists, maps
o data manipulation
o variable length paths

Legacy constructs

o indexing, constraints
o regular expressions
o some list functions, including reduce
o most predicate functions
o shortest path functions
o CASE expressions
o id()

Difficult to handle incrementally

Challenges for incremental openCypher

Lists

o ['a', 1, 2, true]

o ['a', [1, [2]], true]

o UNWIND

Efficient aggregation

o min(), max()

o collect()

Bag semantics, ORDER BY, SKIP and LIMIT

o Idea: collect(x ORDER BY x.name)
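The aggregation challenge can be seen with min(): when the current minimum is deleted, the next smallest value must be known, so storing only the current result is not enough. The Python sketch below shows one possible approach, a binary heap with lazy deletion plus a multiset of live values. The class name `IncrementalMin` and its methods are assumptions for this example; the slides do not say how an engine should implement this.

```python
import heapq
from collections import Counter
from typing import List, Optional


class IncrementalMin:
    """Maintains min() over a bag of values under insertions and deletions."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()  # live multiplicity of each value
        self._heap: List[int] = []         # may contain stale (deleted) entries

    def insert(self, value: int) -> None:
        self._counts[value] += 1
        heapq.heappush(self._heap, value)

    def delete(self, value: int) -> None:
        if self._counts[value] <= 0:
            raise KeyError(value)
        # Lazy deletion: only the multiplicity is decreased here;
        # the heap entry is discarded later, when it reaches the top.
        self._counts[value] -= 1

    def value(self) -> Optional[int]:
        # Drop stale entries until the top of the heap is a live value.
        while self._heap and self._counts[self._heap[0]] == 0:
            heapq.heappop(self._heap)
        return self._heap[0] if self._heap else None


if __name__ == "__main__":
    agg = IncrementalMin()
    for v in (5, 3, 8):
        agg.insert(v)
    print(agg.value())  # 3
    agg.delete(3)
    print(agg.value())  # 5
```

The same structure works for max() with negated values. The price is that all values in the group are kept in memory, which is part of what makes efficient aggregation a challenge.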

Incremental queries – use cases

Standing queries on large and quickly changing graphs

Runtime monitoring (train example)

Model validation: The Train Benchmark

Static analysis of JavaScript source code

Fraud detection

IT infrastructure monitoring

Future work

Path operator: Giacomo Bergami, Matteo Magnani, Danilo Montesi: A join operator for property graphs, GraphQ 2017, EDBT, http://jackbergus.alwaysdata.net/paper_graph_graphq2017.pdf

Unwind operator 𝜔: Elena Botoeva et al.: A Formal Presentation of MongoDB, https://arxiv.org/abs/1603.09291

Open-Source Projects

Incremental Graph Engine: https://github.com/ftsrg/ingraph

Train Benchmark: https://github.com/ftsrg/trainbenchmark

BME-MODES3: https://github.com/ftsrg/bme-modes3

Available under EPL v1.0.