A gremlin ate my graph
Barcelona, Spain, October 30th, 2015
Agenda
Discover the Graph
Steps with a gremlin
The state of the Gremlin
Speaker
Damien Seguy
CTO at Exakat
PHP code as Dataset
What is gremlin?
V : Vertices, or nodes or objects
E : Edges, or links or relations
G : graph, or the dataset
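To make the three letters concrete, here is a minimal sketch in plain Python (not Gremlin). The names `vertices`, `edges`, `graph` and `prop` are invented for this illustration, and the single edge is a hypothetical one; only the property values come from the slides.

```python
# Hypothetical in-memory property graph: plain dictionaries, not TinkerPop.

# V: vertices keyed by id, each with free-form properties (no schema).
vertices = {
    1: {"name": "apply_filter", "version": True},
    2: {"name": "wp_die", "leaf": True},
}

# E: edges keyed by id, each with a label, a start and an end vertex.
# This particular edge is made up for the example.
edges = {
    1: {"label": "CALLS", "start": 1, "end": 2},
}

# G: the graph is both collections together.
graph = {"V": vertices, "E": edges}


def prop(vertex_id, key):
    """Return a property value, or None when the property does not exist."""
    return graph["V"][vertex_id].get(key)


print(prop(1, "name"))                # apply_filter
print(prop(1, "isFedAfterMidnight"))  # None
```

Because properties live in a free-form dictionary, asking for a missing one returns None instead of failing, which mirrors the schemaless behaviour described in the deck.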
The Vertices
g.v represents all the vertices
g.v(1) is one of the nodes
Vertices always have an id
g.v(1) => v(1)
Properties
Graph is schemaless
g.v(1).id => 1
g.v(1).name => apply_filter
g.v(1).version => true
g.v(1).compat => [4.0, 4.1, 4.2, 4.3]
// non-existing properties
g.v(1).isFedAfterMidnight => null
Vertex discovery
Use map to discover the graph
g.v(2).map => {name=wp_die, leaf=true}
The Edges
Edges have an id and properties
They also have a start, an end and a label
g.E represents all the edges
g.e(1) is the edge with id 1
Edge discovery
g.e(1) => e(1)
g.e(1).map => { }
g.e(1).id => 1
g.e(1).label => CALLS
[Figure: wp_ajax_fetch_list --CALLS--> wp_die]
Running on the Edges
Edges link the vertices
g.e(5).inV => v(2)
g.e(6).inV => v(3)
g.e(7).outV => v(4)
[Figure: directed graph; e(5) goes from v(1) to v(2), e(6) from v(1) to v(3), e(7) from v(4) to v(1)]
Following Edges
Directed graph
g.v(1).out => v(2) v(3)
g.v(1).in => v(4)
g.v(1).both => v(2) v(3) v(4)
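The three navigation steps can be modelled in a few lines of plain Python. This is only a sketch: the edge layout is inferred from the query results in the slides and is an assumption, and `out`, `in_` and `both` are illustrative helpers, not the Gremlin API.

```python
# (edge id, start vertex, end vertex); layout assumed from the queries above.
EDGES = [(5, 1, 2), (6, 1, 3), (7, 4, 1)]


def out(v):
    """Vertices reached by following the outgoing edges of v."""
    return [end for _, start, end in EDGES if start == v]


def in_(v):
    """Vertices reached by following the incoming edges of v backwards."""
    return [start for _, start, end in EDGES if end == v]


def both(v):
    """Neighbours of v in either direction."""
    return out(v) + in_(v)


print(out(1), in_(1), both(1))  # [2, 3] [4] [2, 3, 4]
```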
Chaining
g.v(1).inE => e(7)
g.v(1).out.id => 2 3
g.v(2).in.in.id => 4
g.v(1).both.id => 2 3 4
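Chaining works because every step takes a collection of vertices and returns another one. A rough Python equivalent, with an assumed edge layout and made-up helper names:

```python
# (start, end) pairs; the layout is an assumption based on the results above.
EDGES = [(1, 2), (1, 3), (4, 1)]


def out(vertices):
    """One hop forward from every vertex in the input list."""
    return [end for v in vertices for start, end in EDGES if start == v]


def in_(vertices):
    """One hop backward from every vertex in the input list."""
    return [start for v in vertices for start, end in EDGES if end == v]


# Rough counterpart of g.v(2).in.in.id: two hops backward from vertex 2.
print(in_(in_([2])))  # [4]
```

Since the output of one helper is a valid input for the next, hops can be nested to any depth, which is what the dotted chain does in Gremlin.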
WordPress Calls Graph
The graph of all WordPress internal function calls
[Figure: function vertices with a name property, linked by CALLS edges]
Calling home
g.v(19).out('CALLS').name => wp_hash_password wp_cache_delete
g.v(19).in('CALLS').name => reset_password wp_check_password
[Figure: reset_password and wp_check_password call wp_set_password, which calls wp_hash_password and wp_cache_delete]
Is it Recursive
g.v(30).out('CALLS')
 .retain([g.v(30)])
 .name => get_category_parents
[Figure: get_permalink, get_category_link and get_category_parents (id: 30) linked by CALLS edges]
Is it Recursive
g.v(30).in('CALLS')
 .retain([g.v(30)])
 .name => get_category_parents
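The recursion test boils down to "is the start vertex among its own callees?". A small Python sketch of that idea follows; the self call comes from the slides, while the direction of the two other edges is an assumption made for the example.

```python
CALLS = [
    ("get_category_parents", "get_category_parents"),  # self call
    ("get_permalink", "get_category_parents"),         # direction assumed
    ("get_category_parents", "get_category_link"),     # direction assumed
]


def callees(name):
    """Functions directly called by `name`."""
    return {dst for src, dst in CALLS if src == name}


def is_recursive(name):
    # Same idea as out('CALLS').retain([start]): keep only the start vertex.
    return name in callees(name)


print(is_recursive("get_category_parents"))  # True
print(is_recursive("get_permalink"))         # False
```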
Ping-Pong Function
g.v(47).out('CALLS').except([g.v(47)])
 .out('CALLS').retain([g.v(47)])
 .name => wp_trash_comment
[Figure: wp_trash_comment (id: 47) and wp_delete_comment (id: 148), linked by CALLS edges]
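The ping-pong query goes out once while excluding the start, then goes out again and keeps only the start. A Python sketch of the same two-hop test, where `some_helper` is a hypothetical function added to give the filter something to reject:

```python
CALLS = [
    ("wp_trash_comment", "wp_delete_comment"),
    ("wp_delete_comment", "wp_trash_comment"),
    ("wp_delete_comment", "some_helper"),  # hypothetical extra edge
]


def callees(name):
    """Functions directly called by `name`."""
    return {dst for src, dst in CALLS if src == name}


def ping_pong_partners(name):
    # out('CALLS').except([start]): direct callees other than the start.
    others = callees(name) - {name}
    # .out('CALLS').retain([start]): keep those that call the start back.
    return {other for other in others if name in callees(other)}


print(ping_pong_partners("wp_trash_comment"))  # {'wp_delete_comment'}
```

Unlike the Gremlin traversal, which ends back on the start vertex, this helper returns the partner functions, which is usually the more useful answer.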
Up to now
vertices and edges : basic blocks
in and out (and both) : navigation
except(), retain(), in('label') : filtering
Starting at the vertices
Traversing the graph
Finding nodes in the graph that satisfy criteria
Traversing involves listing nodes, following links, applying filters, processing data until all conditions are met
Starting point : g.V and g.E
Counting
g.V.count() => 2691
g.E.count() => 9013
[Figure: function --CALLS--> function]
Sampling
g.V[0..3].name => do_activate_header do_action wpmu_activate_stylesheet comment_footer_die
g.V[0..3].id => 1 3 5 4
Filtering
g.V.has('name','wp_die') => v(25)
Dying Functions
g.V.out('CALLS')
 .has('name','wp_die')
 .count() => 84
[Figure: ???? --CALLS--> wp_die]
Dying Functions
g.V.out('CALLS')
 .has('name','wp_die')
 .name
=> wp_die wp_die wp_die wp_die wp_die wp_die wp_die wp_die wp_die
Dying Functions
g.V.has('name','wp_die')
 .in('CALLS')
 .name
=> wp_ajax_trash_post wp_ajax_delete_post wp_ajax_delete_meta wp_ajax_delete_link wp_ajax_delete_tag wp_ajax_delete_comment wp_ajax_oembed_cache wp_ajax_imgedit_preview wp_ajax_fetch_list
Dying Functions
g.V.as('start')
 .out('CALLS')
 .has('name','wp_die')
 .back('start')
 .name
=> wp_ajax_trash_post wp_ajax_delete_post wp_ajax_delete_meta wp_ajax_delete_link wp_ajax_delete_tag wp_ajax_delete_comment wp_ajax_oembed_cache wp_ajax_imgedit_preview wp_ajax_fetch_list
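Both ways of finding the callers of wp_die can be mimicked in plain Python. This is a sketch on a three-edge sample; the third edge is a hypothetical one added so that the filter has something to discard.

```python
CALLS = [
    ("wp_ajax_trash_post", "wp_die"),
    ("wp_ajax_delete_post", "wp_die"),
    ("wp_ajax_trash_post", "wp_trash_post"),  # hypothetical extra edge
]

# Way 1: start from wp_die and walk the incoming edges,
# like has('name','wp_die').in('CALLS').
callers_from_target = [src for src, dst in CALLS if dst == "wp_die"]

# Way 2: start from every function, walk out, filter on the target, then
# return the remembered start, like as('start') ... back('start').
names = sorted({name for edge in CALLS for name in edge})
callers_from_start = [
    start
    for start in names
    for src, dst in CALLS
    if src == start and dst == "wp_die"
]

print(sorted(callers_from_target) == sorted(callers_from_start))  # True
```

The first form touches only the edges that arrive on the target, while the second visits every vertex first, so the first is usually the cheaper one on a large graph.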
Relay Functions
g.V.filter{ it.out('CALLS').count() == 1 }
 .count() => 650
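A relay function is one with exactly one outgoing call. In plain Python, that filter can be sketched with a counter over the edge list; the function names `a`, `b` and `c` are placeholders.

```python
from collections import Counter

# Hypothetical call edges as (caller, callee).
CALLS = [("a", "b"), ("b", "c"), ("a", "c")]

# Number of outgoing CALLS edges per function.
out_degree = Counter(src for src, _ in CALLS)

# Keep the functions that call exactly one other function.
relays = sorted(name for name, n in out_degree.items() if n == 1)

print(relays)  # ['b']
```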
Closures
Steps often accept a closure
A closure is written between {}, uses 'it' as the current node, and is written in Groovy (or another supported language)
Closures often have a default behavior, so they are sometimes implicit
Applied to properties
Non standard functions
g.V.filter{ it.name != it.name.toLowerCase()} .count() => 73
Leaves and Roots
LEAF
g.V.filter{ it.out('CALLS').any() == false }
 .count() => 407
ROOT
g.V.filter{ it.in('CALLS').any() == false }
 .count() => 1304
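Leaves have no outgoing CALLS edge and roots have no incoming one. With an edge list at hand, both sets fall out of simple set differences. A sketch with hypothetical function names:

```python
# Hypothetical call edges as (caller, callee) and the full list of functions.
CALLS = [("main", "helper"), ("helper", "util"), ("main", "util")]
FUNCTIONS = {"main", "helper", "util", "orphan"}

callers = {src for src, _ in CALLS}  # functions with at least one out edge
callees = {dst for _, dst in CALLS}  # functions with at least one in edge

leaves = FUNCTIONS - callers  # call nothing
roots = FUNCTIONS - callees   # called by nothing

print(sorted(leaves))  # ['orphan', 'util']
print(sorted(roots))   # ['main', 'orphan']
```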
Get Linking Index
g.V.transform{ ['name':it.name, 'links':it.in('CALLS').count()] }
=> ...
{'name':wpmu_signup_stylesheet, 'links':0}
{'name':show_blog_form, 'links':7}
{'name':validate_blog_form, 'links':3}
{'name':show_user_form, 'links':4}
Get Called Index
g.V.transform{ ['name':it.name, 'links':it.in('CALLS').count()] }
 .order{ it.a.links <=> it.b.links }
=>
{'name':get_post, 'links':191}
{'name':get_option, 'links':218}
{'name':_deprecated_function, 'links':296}
{'name':__, 'links':442}
{'name':apply_filters, 'links':598}
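The called index is a transform followed by a sort. A plain Python sketch of the same two stages, on a made-up three-function graph:

```python
# Hypothetical call edges as (caller, callee).
CALLS = [("a", "log"), ("b", "log"), ("a", "b")]
FUNCTIONS = ["a", "b", "log"]

# Transform stage: one record per function with its number of callers.
index = [
    {"name": f, "links": sum(1 for _, dst in CALLS if dst == f)}
    for f in FUNCTIONS
]

# Order stage: ascending by number of callers, as in the slide.
index.sort(key=lambda row: row["links"])

for row in index:
    print(row)
```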
Most linked Function
groupCount(m)
m = [:];
g.V.groupCount(m);
m;
=> { v[1] = 1, v[2] = 1, ... v[47] = 1, }
Most linked Function
groupCount(m){key}{value}
m = [:];
g.V.groupCount(m){it.name}{it.in('CALLS').count()};
m;
=> { wp_restore_image = 1, press_this_media_buttons = 0, WP_Filesystem = 3, ... }
Most linked Function
groupCount(m){key}{value}
m = [:];
g.V.groupCount(m){it.name}{it.in('CALLS').count()};
m.sort{ -it.value }[0..2];
=> { apply_filters = 598, __ = 442, _deprecated_function = 296 }
Most linked Function
m = [:]; n = [:];
g.V.groupCount(m){it.name}{it.in('CALLS').count()}
 .groupCount(n){it.name}{it.out('CALLS').count()};
n.sort{ -it.value }[0..2];
=> { redirect_canonical = 60, export_wp = 47, edit_post = 36 }
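Outside Gremlin, the same "count per key, then keep the top entries" pattern is what Python's `collections.Counter` offers. A sketch on hypothetical edges:

```python
from collections import Counter

# Hypothetical call edges as (caller, callee).
CALLS = [
    ("a", "log"), ("b", "log"), ("c", "log"),
    ("a", "b"), ("c", "b"),
    ("a", "c"),
]

incoming = Counter(dst for _, dst in CALLS)    # how often each function is called
outgoing = Counter(src for src, _ in CALLS)    # how many calls each function makes

print(incoming.most_common(2))  # [('log', 3), ('b', 2)]
print(outgoing.most_common(1))  # [('a', 3)]
```

One difference with the maps shown in the slides: a Counter built this way has no entry for functions with zero links, so they would have to be added explicitly if they matter.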
SideEffect Steps
sideEffect : emits the incoming object but allows for side computation
m = [:]; n = [:];
g.V.groupCount(m){it.name}{it.in('CALLS').count()}
 .groupCount(n){it.name}{it.out('CALLS').count()}
 .count();
=> 2692
Nature of Steps
filter step : emits the input if a condition is satisfied : has(), filter(), retain(), except()
map step : transforms the input into another object : in(), out(), both(), transform()
sideEffect step : emits the input but allows for side computation : groupCount, each, sideEffect
Branch and flatMap
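The difference between filter, map and sideEffect steps is easy to see when each one is written as a Python generator. This is an analogy only; `filter_step`, `map_step` and `side_effect_step` are names made up for the sketch.

```python
def filter_step(items, predicate):
    """Filter step: emit the input only if the condition holds."""
    return (x for x in items if predicate(x))


def map_step(items, fn):
    """Map step: emit a new object computed from the input."""
    return (fn(x) for x in items)


def side_effect_step(items, fn):
    """SideEffect step: run fn for its effect, emit the input unchanged."""
    for x in items:
        fn(x)
        yield x


seen = []
pipeline = side_effect_step(
    map_step(filter_step(range(6), lambda n: n % 2 == 0), lambda n: n * 10),
    seen.append,
)
result = list(pipeline)

print(result)  # [0, 20, 40]
print(seen)    # [0, 20, 40]
```

Nothing runs until the final `list()` pulls items through the chain, which is also why a Gremlin traversal needs a terminating step such as count() for its side effects to happen.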
Lonely Functions
g.V.filter{ it.in('CALLS').any() == false }
 .filter{ it.out('CALLS').any() == false }
 .sideEffect{ it.lonely = true; }
 .count() => 184
SELECT, UPDATE AND COUNT
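The lonely-function query selects, updates and counts in a single pass. A plain Python sketch of that pattern, with hypothetical function names:

```python
# Hypothetical call edges and per-function property dictionaries.
CALLS = [("main", "helper")]
functions = {"main": {}, "helper": {}, "orphan": {}}

# Every function that appears on either end of a CALLS edge.
connected = {name for edge in CALLS for name in edge}

count = 0
for name, props in functions.items():
    if name not in connected:    # select: no incoming and no outgoing call
        props["lonely"] = True   # update: flag the vertex
        count += 1               # count: how many were flagged

print(count)                # 1
print(functions["orphan"])  # {'lonely': True}
```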
Updating a node
g.V.sideEffect{ incoming = it.in('CALLS').count(); }
 .each{
   it.setProperty('incoming', incoming);
   it.setProperty('outgoing', it.out('CALLS').count());
 }
Updating The Graph
// removing deprecated functions
g.V.filter{ it.out('CALLS')
              .has('name', '_deprecated_function')
              .any() }
 .each{
   it.bothE.each{ g.removeEdge(it); }
   g.removeVertex(it);
 }
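Removing a vertex means removing its edges as well, otherwise they would dangle. A plain Python sketch of the same clean-up, where `old_api`, `new_api` and `caller` are hypothetical function names:

```python
functions = {"old_api", "new_api", "_deprecated_function", "caller"}
calls = [
    ("old_api", "_deprecated_function"),
    ("caller", "old_api"),
    ("caller", "new_api"),
]

# Select first: functions that call _deprecated_function.
deprecated = {src for src, dst in calls if dst == "_deprecated_function"}

# Then drop every edge touching them, in either direction (the bothE part).
calls = [(s, d) for s, d in calls if s not in deprecated and d not in deprecated]

# Finally drop the vertices themselves.
functions -= deprecated

print(sorted(functions))  # ['_deprecated_function', 'caller', 'new_api']
print(calls)              # [('caller', 'new_api')]
```

Collecting the set of vertices before deleting anything avoids modifying the collections while they are being iterated.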
State of Gremlin
Apache TinkerPop
http://tinkerpop.incubator.apache.org/
Version : 3.0.2
TP2 and TP3
TP2 : groupCount{}{}, map
TP3 : group().by().by(), valueMap()
Vendors
Stardog
sqlg
Gremlin Variants
Gremlin For PHP
https://github.com/PommeVerte/gremlin-php
Get up and running with Tinkerpop 3 and PHP : https://dylanmillikin.wordpress.com/2015/07/20/get-up-and-running-with-tinkerpop-3-and-php/
Using with Neo4j : REST API
Older API : neo4jPHP, rexpro-php
Thanks
@exakat
http://www.slideshare.net/dseguy/
on the http://2015.phpconference.es//