Facebook Graph Search by Ole-Martin Mørk for jDays 2013, Gothenburg

Post on 09-Jul-2015

FACEBOOK GRAPH SEARCH, Presentation at jdays2013 www.jdays.se

FACEBOOK GRAPH SEARCH

How to create your own graph search using Neo4j

jDays 2013 Ole-Martin Mørk

26/11/13

ABOUT ME

Ole-Martin Mørk
Scientist
Bekk Consulting AS
Oslo, Norway
Twitter: olemartin

AGENDA

-  Introduction to search
-  Introduction to Neo4j
-  Introduction to parsing
-  Graph search

GRAPH

[Figure: example graph whose relationships are labeled LOVES, BETRAYS, and KNOWS]

NODE

PERSON
name: Thomas
age: 24

ADDRESS
street: Aker
number: 15

LIVED

RELATIONSHIP

LIVED
from:
to:

RELATIONAL DATABASES

“PATH EXISTS” RESPONSE TIME

-  One database containing 1000 persons

-  Max 50 friends

-  Detect if two random persons are connected via friends

Database        Number of persons   Response time
Relational db   1,000               2000 ms
Neo4j           1,000               2 ms
Neo4j           1,000,000           2 ms
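The "path exists" question above is a plain graph traversal. As a rough illustration of what is being measured, here is a minimal breadth-first search in plain Java over an in-memory friendship map. The class `FriendGraph` and its methods are hypothetical names for this sketch; this is not Neo4j's API and says nothing about how Neo4j implements the check.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory friendship graph (illustration only, not Neo4j).
final class FriendGraph {
    private final Map<String, Set<String>> friends = new HashMap<>();

    // Friendship is assumed to be mutual, so it is stored in both directions.
    void addFriendship(String a, String b) {
        friends.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        friends.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    // Breadth-first search: true if 'to' can be reached from 'from' via friends.
    boolean pathExists(String from, String to) {
        if (from.equals(to)) {
            return true;
        }
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(from);
        queue.add(from);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String next : friends.getOrDefault(current, Collections.<String>emptySet())) {
                if (next.equals(to)) {
                    return true;
                }
                // Set.add returns false for persons that were already seen.
                if (visited.add(next)) {
                    queue.add(next);
                }
            }
        }
        return false;
    }
}
```

Each person is visited at most once, so the work grows with the part of the graph that is actually reachable, not with the total number of rows stored.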

Neo4j

GRAPH

[Figure: example graph with nodes A through I]

Cypher

CYPHER

( ) --> ( )

(a) --> (b)

(a)-->(b)<--(c)

(a) -[:kjenner]-> (b)

(“kjenner” is Norwegian for “knows”)

CYPHER SEARCH

START person=node:person(name = "Ole-Martin")
START school=node:school("name:Norw*")
START student=node:student("year:(3 OR 4 OR 5)")

FACEBOOK GRAPH SEARCH WITH CYPHER

START me=node:person(name = "Ole-Martin"),
      location=node:location(location = "Göteborg"),
      cuisine=node:cuisine(cuisine = "Sushi")
MATCH (me)-[:IS_FRIEND_OF]->(friend)-[:LIKES]->(restaurant)-[:LOCATED_IN]->(location),
      (restaurant)-[:SERVES]->(cuisine)
RETURN restaurant

Grammar

GRAMMAR

A “language” can be formally defined as “any system of formalized symbols, signs, etc. used for communication”.

A “grammar” can be defined as “the set of structural rules that governs sentences, words, etc. in a natural language”.

TEXT PARSING

PEG (parsing expression grammar)

CFG (context-free grammar)

GRAMMAR

Alfred, who loved fishing, bought fish at the store downtown

(Alfred, (who loved (fishing)), (bought (fish (at (the store (downtown))))))

additionExp
    : multiplyExp ( '+' multiplyExp | '-' multiplyExp )*
    ;

multiplyExp
    : atomExp ( '*' atomExp | '/' atomExp )*
    ;

atomExp
    : Number
    | '(' additionExp ')'
    ;

Number
    : ('0'..'9')+
    ;

An additionExp is a multiplyExp followed by zero or more “+” or “-” and another multiplyExp

A multiplyExp is an atomExp followed by zero or more “*” or “/” and another atomExp

An atomExp is a Number or a parenthesized additionExp

A Number is one or more characters in the range 0-9
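To make the four rules above concrete, here is a minimal hand-written recursive-descent evaluator in plain Java with one method per rule. It is a sketch under stated assumptions, not the code from the talk: the class name `Calculator` is made up, whitespace is not accepted, and division is integer division.

```java
// Hypothetical hand-written parser: one method per grammar rule.
final class Calculator {
    private final String input;
    private int pos = 0;

    private Calculator(String input) {
        this.input = input;
    }

    // Parses and evaluates the whole input; rejects trailing characters.
    static int evaluate(String input) {
        Calculator c = new Calculator(input);
        int value = c.additionExp();
        if (c.pos != input.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + c.pos);
        }
        return value;
    }

    // additionExp : multiplyExp ( '+' multiplyExp | '-' multiplyExp )*
    private int additionExp() {
        int value = multiplyExp();
        while (peek() == '+' || peek() == '-') {
            char op = input.charAt(pos++);
            int right = multiplyExp();
            value = (op == '+') ? value + right : value - right;
        }
        return value;
    }

    // multiplyExp : atomExp ( '*' atomExp | '/' atomExp )*
    private int multiplyExp() {
        int value = atomExp();
        while (peek() == '*' || peek() == '/') {
            char op = input.charAt(pos++);
            int right = atomExp();
            value = (op == '*') ? value * right : value / right;
        }
        return value;
    }

    // atomExp : Number | '(' additionExp ')'
    private int atomExp() {
        if (peek() == '(') {
            pos++;
            int value = additionExp();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            pos++;
            return value;
        }
        return number();
    }

    // Number : ('0'..'9')+
    private int number() {
        int start = pos;
        while (peek() >= '0' && peek() <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a digit at position " + pos);
        }
        return Integer.parseInt(input.substring(start, pos));
    }

    // Returns the current character, or '\0' at the end of the input.
    private char peek() {
        return pos < input.length() ? input.charAt(pos) : '\0';
    }
}
```

Because multiplyExp sits below additionExp in the grammar, `Calculator.evaluate("1+2*3")` multiplies before it adds; the precedence comes from the rule nesting rather than from any extra logic.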

import org.parboiled.BaseParser;
import org.parboiled.Rule;

// Recognizes arithmetic expressions. No values are pushed, so Object is used as the value type.
class CalculatorParser extends BaseParser<Object> {

    Rule Expression() {
        return Sequence(
            Term(),
            ZeroOrMore(AnyOf("+-"), Term())
        );
    }

    Rule Term() {
        return Sequence(
            Factor(),
            ZeroOrMore(AnyOf("*/"), Factor())
        );
    }

    Rule Factor() {
        return FirstOf(
            Number(),
            Sequence('(', Expression(), ')')
        );
    }

    Rule Number() {
        return OneOrMore(CharRange('0', '9'));
    }
}

An Expression is a Term followed by zero or more sequences of “+” or “-” followed by a Term

A Term is a Factor followed by zero or more sequences of “*” or “/” followed by a Factor

A Factor is a Number or a parenthesized Expression

A Number is one or more characters in the range 0-9

PEG VS CFG

-  PEG's FirstOf (ordered choice) operator vs. CFG's | (unordered choice) operator

-  PEG does not have a separate tokenizing step

-  CFG might come across as more powerful, but is also more difficult to master

-  PEG does not allow ambiguity in the grammar
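The difference between ordered and unordered choice is easiest to see with alternatives that share a prefix. The following plain-Java sketch mimics ordered choice on literal strings; `OrderedChoice.firstOf` is a hypothetical helper written for this illustration and is not parboiled's FirstOf.

```java
// Hypothetical helper that mimics PEG ordered choice on literal strings.
final class OrderedChoice {
    // Returns the first alternative that matches the input at 'pos', or null.
    // Once an alternative matches, later alternatives are never tried.
    static String firstOf(String input, int pos, String... alternatives) {
        for (String alternative : alternatives) {
            if (input.startsWith(alternative, pos)) {
                return alternative;
            }
        }
        return null;
    }
}
```

With the input "to the city", `firstOf(input, 0, "to", "to the city")` returns "to", while listing the longer alternative first returns "to the city". The order of the alternatives decides the result, which is why a PEG never has two valid parses for the same input.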

PARBOILED

A parsing expression grammar (PEG) parser

Lightweight

Easy to use

Implementations in Scala and Java

Rules are written in the programming language

I went for a walk downtown

Sequence("I", "went", "for", "a", "walk", "downtown")

Sequence(
    String("I"),
    FirstOf("went", "wend"),
    Sequence("for", "a", "walk"),
    FirstOf("downtown", "to the city"));

[Diagram: “I”, then “went” or “wend”, then “for a walk”, then “downtown” or “to the city”]

Sequence(…, Optional(String("today")));

[Diagram: sentence alternatives (“I went for a walk downtown”, “walked”, “to the city”), followed by an optional “today”]

Sequence(Today(), …, Today());

Rule Today() {
    return Optional(String("today"));
}

[Diagram: the same sentence alternatives, with an optional “today” at both the start and the end]

Rule AnyOf(java.lang.String characters)
    Creates a new rule that matches any of the characters in the given string.

Rule Ch(char c)
    Explicitly creates a rule matching the given character.

Rule CharRange(char cLow, char cHigh)
    Creates a rule matching a range of characters from cLow to cHigh (both inclusively).

Rule FirstOf(java.lang.Object... rules)
    Creates a new rule that successively tries all of the given subrules and succeeds when the first one of its subrules matches.

Rule IgnoreCase(char... characters)
    Explicitly creates a rule matching the given string in a case-independent fashion.

Rule NoneOf(char... characters)
    Creates a new rule that matches all characters except the ones in the given char array and EOI.

Rule NTimes(int repetitions, java.lang.Object rule)
    Creates a new rule that repeatedly matches a given sub rule a certain fixed number of times.

Sequence(
    …,
    FirstOf(
        Sequence(push("downtown"), "downtown"),
        Sequence(push("city"), "to the city")));

BEKK CV-db

A search for Java yields 224 hits!

public Rule Expression() {
    return Sequence(
        Start(),
        FirstOf(People(), Projects(), Technologies()),
        OneOrMore(
            FirstOf(
                And(),
                Sequence(
                    Know(),
                    Subjects()
                ),
                Sequence(
                    WorkedAt(),
                    Customers()
                ),
                Sequence(
                    Know(),
                    Customers()
                ),
                Sequence(
                    Know(),
                    Technologies()
                )
            )
        )
    );
}

start
    fag=node:fag(navn = "Neo4J"),
    fag1=node:fag(navn = "Java"),
    prosjekt2=node:prosjekt(navn = "Modernisering")

match
    CONSULTANTS -[:KAN]-> fag,
    CONSULTANTS -[:KAN]-> fag1,
    CONSULTANTS -[:KONSULTERTE]-> prosjekt2

return
    distinct CONSULTANTS

(Norwegian identifiers: fag = subject, navn = name, prosjekt = project, KAN = knows, KONSULTERTE = consulted on)
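A query of the shape above can be assembled once the parser has collected the subjects and projects mentioned in the search phrase. The sketch below shows one way to do that in plain Java. The class `ConsultantQuery`, its `build` method, and the numbering of identifiers are assumptions made for illustration; the talk does not show how its query is generated. Both lists are assumed to contain at least one entry between them, and the names are inserted without escaping, which real code should replace with query parameters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical query builder (illustration only, not the code from the talk).
final class ConsultantQuery {
    // Builds a one-line Cypher query for consultants who know every subject
    // and have consulted on every project in the given lists.
    static String build(List<String> subjects, List<String> projects) {
        List<String> starts = new ArrayList<>();
        List<String> matches = new ArrayList<>();
        for (int i = 0; i < subjects.size(); i++) {
            String id = "fag" + i;
            // No escaping here: names are assumed not to contain quotes.
            starts.add(id + "=node:fag(navn = \"" + subjects.get(i) + "\")");
            matches.add("CONSULTANTS -[:KAN]-> " + id);
        }
        for (int i = 0; i < projects.size(); i++) {
            String id = "prosjekt" + i;
            starts.add(id + "=node:prosjekt(navn = \"" + projects.get(i) + "\")");
            matches.add("CONSULTANTS -[:KONSULTERTE]-> " + id);
        }
        return "start " + String.join(", ", starts)
            + " match " + String.join(", ", matches)
            + " return distinct CONSULTANTS";
    }
}
```

Each recognized term contributes one index lookup to the start clause and one pattern to the match clause, so adding a new kind of search term only needs another loop of the same form.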

Demo

LEARN MORE

graphdatabases.com
neo4j.org
bit.ly/neo-cyp
parboiled.org

?

Thank you!

@olemartin