Date post: 05-Apr-2017
Category: Technology
Upload: lauren-hayward-schaefer
Intro to Graph Databases A Devoxx France Hands-on Lab
http://ibm.biz/devoxxfr_workbook
Lauren Schaefer @Lauren_Schaefer
April 7, 2017 #DevoxxFR
#IntroToGraph
Table of Contents
What are graph databases and why should you care?
How do you create a schema diagram and convert it to code?
How do you create a node or edge in a graph database?
How do you read a node or edge in a graph database?
How do you update a node or edge in a graph database?
How do you delete a node or edge in a graph database?
How do you implement a feature that requires a new graph query?
How do you create a recommendation engine?
What are graph databases and why should you care?
Learn this!
When creating a graph, nodes (also known as vertexes) are typically nouns (people,
places, and things) and edges are typically verbs (actions).
Graphs allow you to model the data and relationships just as they exist in real life
without having to map or abstract them to something else.
You can easily traverse graphs, which basically means to follow the connections
between your nodes, to find patterns in your data.
Graphs are helpful with a variety of use cases including generating recommendations,
finding the shortest path, modeling the internet of things, and detecting fraud.
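To make the nouns-and-verbs idea concrete, here is a minimal in-memory sketch in Python. It is not how IBM Graph stores data; the Graph class, its method names, and the sample user and prints are invented for illustration only.

```python
from collections import defaultdict


class Graph:
    """A toy directed graph: nodes are nouns, edges are verbs."""

    def __init__(self):
        self.nodes = {}                     # node id -> properties
        self.out_edges = defaultdict(list)  # node id -> [(edge label, target id)]

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, label, target):
        self.out_edges[source].append((label, target))

    def out(self, node_id, label):
        """Follow outgoing edges with the given label (one traversal step)."""
        return [target for (edge_label, target) in self.out_edges[node_id]
                if edge_label == label]


graph = Graph()
graph.add_node("jason", type="user")
graph.add_node("alaska", type="print")
graph.add_node("paris", type="print")
graph.add_edge("jason", "buys", "alaska")   # noun - verb - noun
graph.add_edge("jason", "buys", "paris")

print(graph.out("jason", "buys"))
```

Traversing here means nothing more than following the stored connections from one node to the next, which is the operation a graph database is built to do quickly.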
Try this!
1. Sign up for IBM Bluemix at http://ibm.biz/devoxxfr. Be
sure to sign up using an email address you currently have access to as you’ll need to
verify your account. When verifying your account, you’ll be prompted for the region
you’d like to use. Select US South as your region.
2. Deploy the Lauren’s Lovely Landscapes app to Bluemix so you’ll have your own copy:
a. Navigate to http://ibm.biz/devoxxfr_deploy.
b. If you are not already authenticated at Bluemix, you might be prompted to do
so.
c. If you have not already selected an alias for your account, you might be
prompted to do so.
d. On the Deploy this application to Bluemix page, select IBM Bluemix US South
in the REGION dropdown.
e. Click Deploy when the button enables.
f. Wait for the deploy to finish. This can take a few minutes. During this time,
Bluemix creates a new project where you can track and plan, stores a copy of
the code in your new project, creates a delivery pipeline so you can configure
automatic deployments, creates an IBM Graph instance on Bluemix, and
deploys the app to Bluemix so you can see it running live.
g. Click VIEW YOUR APP to see the deployed version of the app.
h. The home page shows that no prints are currently available for sale. Click <for
developers> and then Insert the sample data. It might take a minute or two for
the data to insert.
i. When the app indicates the sample data has been created, click Lauren's Lovely
Landscapes in the top navigation bar.
j. Notice the prints listed on the home page. Your app is successfully deployed!
k. Take a few minutes to explore the app. Register as a new user and order a print.
Navigate to the <for developers> page to see the schema diagram as well as the
data stored in the graph.
Tweet this!
Just deployed an app that uses a #graphdatabase in @Lauren_Schaefer’s #IntroToGraph
lab at #DevoxxFr! [Include a screenshot of your app or, even better, a selfie of you with
your app!]
Get creative
Explore the app and consider how you could improve it by leveraging the power of a
graph database. Note your ideas here, on your blog, or on Twitter.
Additional resources
Video: Should I care about graph databases? https://youtu.be/oHTaCql9zbE
Video: How can I try out graph databases?
https://www.youtube.com/watch?v=tvo3X61FeRM&t
Video: Deploy an IBM Graph app to Bluemix
https://www.youtube.com/watch?v=x0LHRZiGL8A
From Relational to Neo4j https://neo4j.com/developer/graph-db-vs-rdbms/
Why Choose a Graph Database
http://radar.oreilly.com/2013/07/why-choose-a-graph-database.html
Guest View: Relational vs. Graph databases: Which to use and when?
http://sdtimes.com/guest-view-relational-vs-graph-databases-use/
Why graph databases are so effective in analytics projects
http://www.techrepublic.com/article/why-graph-databases-are-so-effective-in-analytics-projects/
What is a Graph Database? https://neo4j.com/developer/graph-database/
Graph DBMS increased their popularity by 500% within the last 2 years
https://neo4j.com/news/graph-dbms-increased-their-popularity-by-500percent
50 Shades of Graph: How Graph Databases are Transforming Online Dating
https://www.forbes.com/sites/danwoods/2014/02/14/50-shades-of-graph-how-graph-databases-are-transforming-online-dating/#194c26795081
Detecting complex fraud in real time with graph databases
https://developer.ibm.com/dwblog/2017/detecting-complex-fraud-real-time-graph-databases/
No more joins: An overview of Graph database query languages
https://developer.ibm.com/dwblog/2017/overview-graph-database-query-languages/
How do you create a schema diagram and convert it to code?
Learn this!
Create a schema diagram for your graph by modeling the nouns and verbs, defining the
vertices and edges, defining the multiplicity, adding properties, and creating indexes.
When using Titan, edges can have a multiplicity of MULTI, SIMPLE, MANY2ONE,
ONE2MANY, or ONE2ONE.
When using Titan, properties can have a cardinality of SINGLE, LIST (allows duplicates),
or SET (does not allow duplicates). The default cardinality is SINGLE.
When using Titan, indexes can be composite (meaning they can only be used for exact-
match queries) or mixed (meaning they can be used for a combination of property keys
or for querying for things more complex than exact matches). The supported mixed
index predicates for string data types are textContains, textContainsPrefix,
textContainsRegex, eq, neq, textPrefix, and textRegex. The supported mixed index
predicates for numbers are eq, neq, gt, gte, lt, and lte.
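As a rough illustration of the multiplicity and cardinality rules above, the following Python sketch checks a schema dictionary against the allowed values. The dictionary layout (propertyKeys, vertexLabels, edgeLabels) and the choice of MULTI for the buys edge are assumptions made for this example; the keys your schema.json actually uses may differ.

```python
# Allowed values taken from the Titan rules listed above.
MULTIPLICITIES = {"MULTI", "SIMPLE", "MANY2ONE", "ONE2MANY", "ONE2ONE"}
CARDINALITIES = {"SINGLE", "LIST", "SET"}


def check_schema(schema):
    """Return a list of problems found in a schema dictionary."""
    problems = []
    for key in schema.get("propertyKeys", []):
        # SINGLE is the default cardinality when none is given.
        cardinality = key.get("cardinality", "SINGLE")
        if cardinality not in CARDINALITIES:
            problems.append("property %s: bad cardinality %s"
                            % (key["name"], cardinality))
    for edge in schema.get("edgeLabels", []):
        multiplicity = edge.get("multiplicity")
        if multiplicity not in MULTIPLICITIES:
            problems.append("edge %s: bad multiplicity %s"
                            % (edge["name"], multiplicity))
    return problems


# Hypothetical schema for the sample app (layout is an assumption).
schema = {
    "propertyKeys": [
        {"name": "username", "dataType": "String", "cardinality": "SINGLE"},
        {"name": "email", "dataType": "String"},
    ],
    "vertexLabels": [{"name": "user"}, {"name": "print"}],
    "edgeLabels": [{"name": "buys", "multiplicity": "MULTI"}],
}

print(check_schema(schema))
```

An empty list means every property key and edge label uses one of the values Titan accepts.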
Try this!
1. Open the code for your app:
a. Navigate to http://ibm.biz/devoxxfr.
b. Log in (if you are not already authenticated).
c. On the dashboard for your apps, locate the row with the Lauren's Lovely
Landscapes app and click it.
Hint: Be sure to click the name of your app and not the route.
d. The app page opens in Bluemix.
e. Scroll down until you see the Continuous delivery tile.
f. Click the Edit code button. The web IDE opens with your project's code.
2. Add a new print that will be created as part of the app’s sample data.
a. Find a picture you’d like to be featured on your personal copy of Lauren’s Lovely
Landscapes. If you don’t have one handy, you can use one from Lauren’s
Twitter page (http://twitter.com/lauren_schaefer).
b. In the web IDE, expand the static directory and select the images directory.
c. Select File > Import > File or Zip Archive.
d. Import the image.
e. In the navigation pane, open graph.py.
f. Scroll to the insertSampleData function and locate the code around line 172
that creates prints.
g. Add a new line of code (be sure to use exactly the same spacing and tabs that
the code around it does) to create the print for the image you just imported.
For example, if you are creating a print for Paris, your code might look like the
following:
createPrint('Paris', 'I <3 Paris!', 150.00, 'paris.jpg')
3. Deploy the code
a. Click the Deploy button in the toolbar at the top of the page.
b. The deploy will take a minute or two to complete. You will know it’s finished
when a green status dot appears.
c. When the deploy has finished, click the Open the Deployed App button.
4. Delete and insert the sample data
a. In your deployed app, click <for developers>.
b. Scroll to the The Data section and click the link to delete the data.
c. Click <for developers>.
d. Scroll to the The Data section and click the link to insert the sample data.
e. Click Lauren’s Lovely Landscapes in the top navigation bar to go to the home
page.
f. Check out your new print on the home page! You successfully created a new
node in your app’s graph!
Tweet this!
Just updated my app that uses a #graphdatabase! Adding a node to my app was simple!
#IntroToGraph #DevoxxFr! [Include a screenshot of your app with the updated pic!]
Get creative
Think about how you could update the schema to make the app better. Perhaps you
want to add a new index on which to search. Or maybe you want to add some new
nodes or edges so you can store more data. Update schema.json to reflect your ideas.
To test out your changes, open constants.py and update the GRAPH_ID to be
something unique so that the next time you run your app, a new graph will be created
that uses your updated schema. Then deploy your app!
Make the app your own! Add more prints to the static/images directory and then
update the insertSampleData function to include your prints in the app’s sample data.
Additional resources
Video: Build a schema model for IBM Graph
https://www.youtube.com/watch?v=ZQbYSEaUrTo
Video: Convert a schema model to code for IBM Graph
https://www.youtube.com/watch?v=Km_uQi-iEEg
Video: Create and traverse an IBM Graph database
https://www.youtube.com/watch?v=cN8thqBV0HU
IBM Graph Documentation: https://ibm-graph-docs.ng.bluemix.net/
Titan Documentation: http://s3.thinkaurelius.com/docs/titan/1.0.0/index.html
Graph Data Modeling Guidelines: https://neo4j.com/developer/guide-data-modeling/
How do you create a node or edge in a graph database?
Learn this!
When using Gremlin to create a node or edge in a graph database, begin by creating a
new traversal. Then use the appropriate step (addV, addE, addInE, or addOutE),
including a list of properties as parameters for the step, in order to do the creation.
Try this!
Now it’s time for the fun stuff: the CRUD operations, starting with create. The following
instructions guide you through how the app creates a new vertex in the graph when a user
registers.
1. Observe the schema diagram below. To handle user registration, create a new user
vertex with the following properties: firstName, lastName, username, and email.
2. Next, open the Graph Query Editor for your graph instance in a new browser tab or
window:
a. Navigate to http://ibm.biz/devoxxfr.
b. On the dashboard, scroll down to the All Services section and click
LaurensLovelyLandscapesSample-Graph.
c. On the Manage tab (open by default), click Open. The Graph Query Editor opens
for your graph instance.
3. By default, the g graph is selected. Switch to the landscapes_graph where your data is
stored by clicking the down arrow beside g in the top navigation menu and clicking
landscapes_graph.
4. To create a new user, write a Gremlin query that includes sample data for each of the
properties shown in the schema diagram: label, firstName, lastName, username, and
email. Including a type property with the value user makes it possible to
search for vertexes with type user. In the Query Execution Box, type the following
Gremlin query:
def gt = graph.traversal();
gt.addV(label, 'user', 'firstName', 'Lauren', 'lastName',
'Schaefer', 'username', 'lauren', 'email',
'[email protected]', 'type', 'user');
5. Click the Submit Query button.
6. The results of the query open in a new box. Explore the JSON results. Note that a new
vertex has been created using the properties we indicated in the query.
7. Now that you have a working query that creates a new user vertex, explore the code. In
the file navigation pane of the web IDE you left open in another browser tab or window,
click graph.py to open it.
8. In graph.py, locate the createUser() function around line 258.
9. The function begins by calling the doesUserExist() function to check if a user with the
given username already exists. The function returns an error that is displayed to the
user if the username they requested is already taken.
10. If the username is available, the function continues and creates a new dictionary that
contains a Gremlin query. The query is based on the one we wrote above in step 4; the
only difference is that, instead of using sample data for the properties firstName,
lastName, username, and email, the code uses dynamic input based on the arguments
passed in to the function.
11. After the dictionary containing the query is created, the function is ready to call the
Graph API. Around line 278, the function makes a new POST request to /gremlin and
sends the dictionary containing the Gremlin query as part of the request.
12. Around line 280, the function checks to see if the request was successful (200 response
code) and the user vertex was created. If the request was not successful, the function
raises an error.
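Steps 10 through 12 can be sketched in Python as follows. This is not the code in graph.py: the function names, the post parameter, the FakeResponse test double, and the example URL and email address are all hypothetical, and the existence check from step 9 is left out. The quote helper is an addition the lab does not describe; it escapes backslashes and single quotes so a value cannot break out of its Groovy string literal.

```python
import json


def quote(value):
    """Return `value` as a single-quoted Groovy string literal."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def build_add_user_query(first_name, last_name, username, email):
    """Build the addV query from step 4 with caller-supplied values."""
    return (
        "def gt = graph.traversal();"
        "gt.addV(label, 'user', 'firstName', %s, 'lastName', %s, "
        "'username', %s, 'email', %s, 'type', 'user');"
        % (quote(first_name), quote(last_name), quote(username), quote(email))
    )


def create_user(post, url, first_name, last_name, username, email):
    """POST the query and raise if the service does not answer 200.

    `post` is any callable taking (url, body) and returning an object with
    a `status_code` attribute; it stands in for the app's HTTP helper.
    """
    query = build_add_user_query(first_name, last_name, username, email)
    response = post(url, json.dumps({"gremlin": query}))
    if response.status_code != 200:
        raise ValueError("Unable to create user %s" % username)
    return response


class FakeResponse:
    """Test double so the sketch runs without a graph service."""

    def __init__(self, status_code):
        self.status_code = status_code


sent = []


def fake_post(url, body):
    sent.append((url, json.loads(body)))
    return FakeResponse(200)


create_user(fake_post, "https://example.invalid/landscapes_graph/gremlin",
            "Lauren", "Schaefer", "lauren", "lauren@example.com")
print(sent[0][1]["gremlin"])
```

Passing the HTTP function in as an argument keeps the sketch runnable offline; in the app, the real request helper would take its place.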
Tweet this!
Just executed my first #Gremlin query in @Lauren_Schaefer’s #IntroToGraph lab at
#DevoxxFr! [Include a screenshot of your query with the results visualization!]
Get creative
Return to the Graph Query Editor and write a Gremlin query that creates a new buys
edge between a user and a print. If you need a hint, check out the buyPrint function in
graph.py.
Additional resources
Video: Create data elements in IBM Graph
https://www.youtube.com/watch?v=WHQCWk0lHW4
Gremlin Documentation
http://tinkerpop.apache.org/docs/3.0.1-incubating/#graph-traversal-steps
How do you read a node or edge in a graph database?
Learn this!
When using Gremlin to read a node or edge in a graph database, begin by creating a
new traversal. Then use the appropriate step (V or E) and the id of the node or edge
(for example, g.V(5)) or a has step (has, hasLabel, hasId, hasKey, hasValue, or hasNot)
to narrow the query to a particular node or set of nodes.
When using Gremlin, you can traverse in or out from a node or edge by using the vertex
steps (out, in, both, outE, inE, bothE, outV, inV, bothV, otherV).
Try this!
In this section, you'll explore the read operation. The following instructions guide you through
how the app reads a user vertex when displaying a user's profile information.
1. Observe the schema diagram below. To display a user's profile information, you need to
read a user vertex with its properties.
2. Open your browser tab or window that has the Graph Query Editor (see step 2 of
"How do you create a node or edge in a graph database?" above). Ensure the
landscapes_graph is selected (see step 3 of the same section).
3. Write a Gremlin query to read a user vertex with the label user and the username
"jason." To do this, in the Query Execution Box, type the following Gremlin query:
def gt = graph.traversal();
gt.V().hasLabel("user").has("username", "jason");
4. Click the Submit Query button.
5. The results of the query open in a new box below. Explore the JSON results. Note that
the results are returned as a set of length one. This is because only one vertex has the
username "jason."
6. Now that you have a working query that reads a user vertex, explore the code. In the file
navigation pane of the web IDE you left open in another browser tab or window, open
graph.py.
7. In graph.py, locate the getUser() function around line 204.
8. The function begins by creating a new dictionary that contains a Gremlin query. The
query is based on the one you wrote in step 3 above; the only difference is that instead
of querying for username "jason," the code uses dynamic input based on the username
passed in to the function.
9. After the dictionary containing the query is created, the function is ready to call the
Graph API. Around line 208, the function makes a new POST request to /gremlin and
sends the dictionary containing the Gremlin query as part of the request.
10. Around line 209, the function checks to see if the request was successful (200 response
code). Then the function starts processing the results. The JSON results shown in the
query editor are found by accessing
json.loads(response.content)['result']['data']. The results are a
set of vertexes, so the function checks that the length of the results is greater than 0.
Because only one vertex should be returned when a particular username is queried, the
function sets user to results[0]. If everything was successful, the user vertex with
its properties is returned. Otherwise, the function raises an error.
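The result handling in step 10 can be tried offline with a hand-written response body. The sketch below assumes a simplified response: real responses carry more fields, and the ids and values here are made up. The helper names first_vertex and property_value are hypothetical and do not appear in graph.py.

```python
import json

# A hand-written body shaped like the response described in step 10.
sample_body = json.dumps({
    "result": {
        "data": [
            {
                "id": 4136,
                "label": "user",
                "type": "vertex",
                "properties": {
                    "username": [{"id": "1", "value": "jason"}],
                    "firstName": [{"id": "2", "value": "Jason"}],
                },
            }
        ]
    }
})


def first_vertex(body, username):
    """Return the first vertex in the response, or raise if none was found."""
    results = json.loads(body)["result"]["data"]
    if len(results) > 0:
        return results[0]
    raise ValueError("Unable to find user with username %s" % username)


def property_value(vertex, key):
    """Vertex properties are lists of {id, value} entries; take the first."""
    return vertex["properties"][key][0]["value"]


user = first_vertex(sample_body, "jason")
print(property_value(user, "firstName"))
```

Each vertex property is a list because a property key can hold several values, which is why the value sits at index 0.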
Tweet this!
Writing queries for #graphdatabases is simple with #Gremlin! #IntroToGraph
#DevoxxFr! [Include a screenshot of your query with the results visualization!]
Get creative
Return to Graph Query Editor and write some new Gremlin traversals. For example, you
may want to write a traversal to find all of the orders for a given user. What can you
learn from traversing your graph?
Additional resources
Video: Read data elements in IBM Graph
https://www.youtube.com/watch?v=Dx2C_A5EICc
Gremlin Documentation
http://tinkerpop.apache.org/docs/3.0.1-incubating/#graph-traversal-steps
How do you update a node or edge in a graph database?
Learn this!
When using Gremlin to update a node or edge in a graph database, begin by creating a
new traversal. Then query for the vertex or edge you want to update and use
.property(propertyKey, propertyValue) to make the update.
Try this!
In this section, you'll explore the update operation. The following instructions walk you through
how the app updates a user vertex when a user makes changes to his profile information.
1. Observe the schema diagram below. To edit a user's profile information, update the
properties stored in a user vertex.
2. Open your browser tab or window that has the Graph Query Editor (see step 2 of
"How do you create a node or edge in a graph database?" above). Ensure the
landscapes_graph is selected (see step 3 of the same section).
3. Let's write a Gremlin query to update a user vertex. We'll query for the vertex with label
user and the username "jason" and update that vertex with new property values. In the
Query Execution Box, input the following Gremlin query:
def gt = graph.traversal();
gt.V().hasLabel("user").has("username", "jason")
.property('firstName', 'Jasonupdate')
.property('lastName', 'Schaeferupdate')
.property('email', '[email protected]');
4. Click the Submit Query button.
5. The results of the query open in a new box below. Explore the JSON results. Note that
the "Jason" vertex now has the property values we used in the query.
6. Now that you have a working query that updates a user vertex, explore the code. In the
file navigation pane of the web IDE you left open in another browser tab or window,
open graph.py.
7. In graph.py, locate the updateUser() function around line 219.
8. The function begins by creating a new dictionary that contains a Gremlin query. The
query is based on the one you wrote above in step 3 with a few differences: instead of
querying for username "jason" and updating the properties with sample data, the code
uses dynamic input based on the arguments passed in to the function.
9. After the dictionary containing the query is created, the function is ready to call the
Graph API. Around line 228, the function makes a new POST request to /gremlin and
sends the dictionary containing the Gremlin query as part of the request.
10. Around line 229, the function checks to see if the request was successful (200 response
code) and the user vertex was updated. If the request was not successful, the function
raises an error.
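The update pattern from the steps above (find the vertex, then chain one property step per changed value) can be sketched as a small Python helper. The function name build_update_query is hypothetical, and for brevity the sketch does not escape quotes in the values it is given.

```python
def build_update_query(username, changes):
    """Chain one .property(key, value) step per (key, value) pair."""
    query = 'def gt = graph.traversal();'
    query += 'gt.V().hasLabel("user").has("username", "%s")' % username
    for key, value in changes:
        query += ".property('%s', '%s')" % (key, value)
    return query + ';'


print(build_update_query("jason", [("firstName", "Jasonupdate"),
                                   ("lastName", "Schaeferupdate")]))
```

Because each property step returns the same traversal, any number of updates can be applied to the vertex in a single query.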
Tweet this!
I’m becoming a #Gremlin expert in @Lauren_Schaefer’s #IntroToGraph lab at
#DevoxxFr! #graphdatabase [Include a selfie of you looking like an expert]
Get creative
Consider what you would do if someone needed to update his mailing address on an
order he had already placed. Write a query to update the mailing address on an order.
Then create a new webpage that administrators could use to update the mailing address
of an order.
Occasionally, you’ll want to update the pricing and/or description of the prints. Create a
new webpage that would allow administrators to do just that.
Additional resources
Video: Update data elements in IBM Graph
https://www.youtube.com/watch?v=ryR6b6XpkVQ
Gremlin Documentation
http://tinkerpop.apache.org/docs/3.0.1-incubating/#graph-traversal-steps
How do you delete a node or edge in a graph database?
Learn this!
When using Gremlin to delete a node or edge in a graph database, begin by creating a
new traversal. Then query for the vertex(es) or edge(s) you want to delete and use the
drop step to do the deletion.
Try this!
Let's explore the delete operation. The following instructions guide you through how the app
deletes all of the edges and vertexes in the graph.
1. Let's begin by observing the schema diagram. To delete all of the vertexes and edges in
the graph, you'll need to delete all of the buys edges as well as the user and print
vertexes.
2. Open your browser tab or window that has the Graph Query Editor (see step 2 of
"How do you create a node or edge in a graph database?" above). Ensure the
landscapes_graph is selected (see step 3 of the same section).
3. Write a Gremlin query to delete all of the edges and vertexes. Query all of the edges of
type buys and drop them. Then query all of the vertexes of type print or user and drop
them. In the Query Execution Box, input the following Gremlin query:
def g = graph.traversal();
g.E().has('type', 'buys').drop();
g.V().has('type', within('print','user')).drop();
4. Click the Submit Query button.
5. The results of the query open in a new box below. Note that an empty list is displayed as
the query has dropped all of our vertexes and edges.
Hint: If you want to interact with your copy of the sample app, return to the <for
developers> page and insert the sample data.
6. Now that you have a working query that deletes all of the edges and vertexes, let's
explore the code. In the file navigation pane of the web IDE you left open in another
browser tab or window, click graph.py to open it.
7. In graph.py, locate the dropGraph() function around line 441.
8. The function begins by creating a new dictionary that contains a Gremlin query. The
query is based on the one we wrote above in step 3.
9. After the dictionary containing the query is created, the function is ready to call the
Graph API. Around line 447, the function makes a new POST request to /gremlin and
sends the dictionary containing the Gremlin query as part of the request.
10. Around line 448, the function checks to see if the request was successful (200 response
code) and the edges and vertexes were deleted. If the request was not successful, the
function raises an error.
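As a sketch of how the delete query from step 3 could be assembled for any set of types, the Python helper below builds the same two drop statements. The name build_drop_query is hypothetical; the dropGraph() function in graph.py may build its query differently.

```python
def build_drop_query(edge_types, vertex_types):
    """Build a query that drops edges first, then vertexes, by type."""

    def within(names):
        # Produces, for example, within('print','user')
        return "within(%s)" % ",".join("'%s'" % name for name in names)

    return (
        "def g = graph.traversal();"
        "g.E().has('type', %s).drop();"
        "g.V().has('type', %s).drop();"
        % (within(edge_types), within(vertex_types))
    )


print(build_drop_query(["buys"], ["print", "user"]))
```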
Tweet this!
Just learned how to drop nodes and edges using #Gremlin! #IntroToGraph #DevoxxFr
#graphdatabase http://giphy.com/gifs/dance-hot-rap-avkW4UabDdJFS
Get creative
Consider how you would implement a feature to allow users to delete their profiles.
What would the user interface look like? Make some sketches and implement it.
Over time, administrators may want to remove prints that are no longer selling well.
Implement a feature to allow administrators to remove prints.
Additional resources
Video: Delete data elements in IBM Graph https://youtu.be/sowbww8if_8
Gremlin Documentation
http://tinkerpop.apache.org/docs/3.0.1-incubating/#graph-traversal-steps
How do you implement a feature that requires a new graph query?
Learn this!
To implement a new feature that requires a graph query, begin by observing the schema
diagram and determining how you will need to interact with the graph. Then, write and
test your query. Finally, write the code that queries your graph and the code your users
will interact with.
Try this!
Now that you know the basics of how to perform the CRUD operations using Gremlin, it's time
to code. In this section, you'll update the user profile page to display the user's orders.
1. Start by observing the schema diagram. To view the orders for a user, you'll need to
start with the user vertex for the authenticated user and then traverse the graph to read
property values stored in the buys edge (datetime, address1, address2, city, state, zip,
and payment method) and the associated print vertex (name and imgPath).
2. Open your browser tab or window that has the Graph Query Editor (see step 2 of
"How do you create a node or edge in a graph database?" above). Ensure the
landscapes_graph is selected (see step 3 of the same section).
3. You'll need to write a Gremlin query to read all of the information about a given user's
orders. Try it on your own before continuing on.
Not sure where to start? Here are some hints:
Begin by querying for the vertex with label user and the username "jason." Name this
vertex "buyer." Then traverse out along the buys edge to find all of the print vertexes
that the user has bought. Next, follow the incoming buys edges from each print back to
the vertex that bought it, keeping only edges whose source is the buyer vertex (if you
skip this check, the query returns every buys edge into the print, regardless of who
bought it). Finally, use the path step so you
can view the history of the traversal that includes the user vertex, the buys edge, and
the print vertex (if you don't use the path step, the results will only contain the final
user vertex).
Below is an example query. Check to see if yours returned the same results:
def gt = graph.traversal();
gt.V().hasLabel("user").has("username", "jason").as('buyer')
.out("buys")
.inE("buys").outV().as('buyer2').where('buyer',eq('buyer2'))
.path();
From the diagram on the right, you can see the user has bought three prints. The JSON
results are displayed on the left. Each JSON object contains the information for one
order. The objects property contains the set of information we care about: the user
vertex, the print vertex, the buys edge, and a duplicate user vertex the traversal found
because it started and ended by searching for a user vertex.
Hint: If no results are returned, be sure you inserted the sample data on the <for
developers> page.
4. Now that you have confirmed the query works, let's code. In the file navigation pane of
the web IDE you left open in another browser tab or window, open graph.py.
5. Beneath the getUser() function around line 217, paste the following code:
def getUserOrders(username):
    gremlin = {
        "gremlin": "def gt = graph.traversal();" +
            "gt.V().hasLabel(\"user\").has(\"username\", \"" + username + "\").as(\"buyer\")" +
            ".out(\"buys\")" +
            ".inE(\"buys\")" +
            ".outV().as(\"buyer2\").where(\"buyer\",eq(\"buyer2\"))" +
            ".path()"
    }
    response = post(constants.API_URL + '/' + constants.GRAPH_ID + '/gremlin', json.dumps(gremlin))
    if (response.status_code == 200):
        results = json.loads(response.content)['result']['data']
        orders = []
        if len(results) > 0:
            print 'Found orders for username %s: %s.' % (username, results)
            for result in results:
                order = {}
                for object in result['objects']:
                    if object['label']=='user':
                        continue
                    if object['label']=='buys':
                        order['date'] = object['properties']['date']
                        order['firstName'] = object['properties']['firstName']
                        order['lastName'] = object['properties']['lastName']
                        order['address1'] = object['properties']['address1']
                        order['address2'] = object['properties']['address2']
                        order['city'] = object['properties']['city']
                        order['state'] = object['properties']['state']
                        order['zip'] = object['properties']['zip']
                        order['paymentMethod'] = object['properties']['paymentMethod']
                        continue
                    if object['label']=='print':
                        order['printName'] = object['properties']['name'][0]['value']
                        order['imgPath'] = object['properties']['imgPath'][0]['value']
                        continue
                orders.append(order)
        return orders
    raise ValueError('Unable to find orders for user with username %s' % username)
You can replace the query in the code above with the query you wrote in the previous
step.
Note: Spacing is very important when programming in Python. Be sure you use spaces
to appropriately indent the code.
Let's examine what this function does. First, the code creates a dictionary that holds the
Gremlin query. This query is similar to the one used in step 3 above, except instead of
querying for the username "jason," the code uses dynamic input based on the argument
passed in to the function. Second, the code includes the query in a POST request to the
Gremlin API. Third, the code processes the results. If the response is 200, the query was
successful. The JSON results just as we saw in the query editor can be found by
accessing json.loads(response.content)['result']['data'], so the
code stores them in results. The code creates a new list, named orders, where all of the
information that will be displayed to the end user about their orders is stored. Then the
code starts looping through each object in each result. The code checks the label of
each object (user, buys, or print) and then stores the appropriate properties in orders.
Finally, if everything goes well, the code returns orders. If not, the code raises a
ValueError.
6. Now that the back-end code is done, you need to get this order information to the front
end. In the left navigation pane in the web IDE, click wsgi.py to open it.
7. Locate the getProfile() function around line 189. This function is called whenever a user
accesses the profile page. The last line of the function returns the template for the
profile page. Update the arguments that are sent to bottle.template() so that the
orders are included:
return bottle.template('profile', username = username, userInfo = graph.getUser(username), orders = graph.getUserOrders(username))
Hint: Be sure the spacing before return remains the same.
8. Now that the profile template has access to the orders, you need to update it to display
the orders. In the left navigation pane in the web IDE, expand the views directory and
click profile.tpl to open it.
9. After the closing tag of the form but before the line to include the footer around line 54,
paste the following code:
<h2><em>Your Orders</em></h2>
% if len(orders) <= 0:
  You have not placed any orders.
% else:
  % for order in orders:
    <hr>
    <div class="container">
      <div class="row">
        <div class="span6">
          <h3>Order date:</h3>
          {{order['date']}}
          <h3>Shipping address:</h3>
          {{order['firstName']}} {{order['lastName']}}<br>
          {{order['address1']}}<br>
          % if len(order['address2']) > 0:
            {{order['address2']}}<br>
          % end
          {{order['city']}}, {{order['state']}} {{order['zip']}}
          <h3>Payment method:</h3>
          {{order['paymentMethod']}}
        </div>
        <div class="span4">
          <h3>Print:</h3>
          {{order['printName']}}
          <img src="/static/images/{{order['imgPath']}}">
        </div>
      </div>
    </div>
  % end
% end
Let's examine what this code does. The code begins by creating a second-level heading
titled Your Orders. The code then checks to see if the user has any orders. If the user
does not have any orders, the code displays an appropriate message to the user. If the
user has orders, the code loops through them. For each order, it creates a new
container and row that is split into two divs. The left div displays information about the
order and the right div displays information about the print.
10. You have finished implementing your code! Before you celebrate, let's test it. In the
toolbar at the top of the web IDE, click the Deploy the App from the Workspace button.
The deploy can take a minute or two. When the app has finished deploying, a
green status dot appears beside your app's name in the toolbar.
11. When the app finishes deploying, click the Open the Deployed App button in the
toolbar at the top of the web IDE. Your deployed version of Lauren's Lovely Landscapes
opens.
12. In your deployed app, sign in with username "jason" (if not already authenticated).
13. Click Edit Profile.
14. Observe the new Your Orders section you just implemented. Now it's time to celebrate!
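The effect of the where step in the example query from step 3 can be illustrated with a small in-memory Python sketch. It only mimics the traversal on a list of (buyer, print) pairs; the function order_paths and the sample buyers and prints are invented for illustration, and no graph service is involved.

```python
# Each buys edge is (buyer, print). Two users bought 'Alaska'.
buys = [
    ("jason", "Alaska"),
    ("jason", "Penguin"),
    ("deanna", "Alaska"),
]


def order_paths(username, filter_buyer=True):
    """Mimic V(user).out('buys').inE('buys').outV().where(...).path()."""
    paths = []
    for buyer, bought in buys:
        if buyer != username:
            continue                    # has("username", ...).as('buyer')
        for other_buyer, other_print in buys:
            if other_print != bought:
                continue                # inE("buys") on the print vertex
            if filter_buyer and other_buyer != username:
                continue                # where('buyer', eq('buyer2'))
            # path(): user vertex, print vertex, buys edge, user vertex
            paths.append((username, bought,
                          (other_buyer, other_print), other_buyer))
    return paths


print(len(order_paths("jason")))                      # with the where step
print(len(order_paths("jason", filter_buyer=False)))  # without it
```

With the filter, only the two orders placed by jason come back. Without it, the edge for the other buyer of the same print is returned as well, which is the mistake the hint in step 3 warns about.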
Tweet this!
Just implemented & deployed a new feature in an app that leverages a #graphdatabase!
#IntroToGraph #DevoxxFr [Include a link to your app!]
Get creative
Users may want to order more than one print at a time. Sketch diagrams of how the
ordering process would change if users could add prints to their cart before checking
out. Examine the schema diagram to see if you would need to change any of the data
being stored. Determine if queries would also need to be updated. Then implement
this new feature!
Additional resources
Video: Intro to graph databases: The CRUD operations
https://www.youtube.com/watch?v=sefAL0Czu4I
Gremlin Documentation
http://tinkerpop.apache.org/docs/3.0.1-incubating/#graph-traversal-steps
Bottle: Python Web Framework Documentation https://bottlepy.org/docs/stable/
How do you create a recommendation engine?
Learn this!
1. Recommendation engines (also called recommender systems) generate
personalized recommendations for users. They vary in complexity and accuracy.
2. Collaborative filtering generates recommendations by assuming that if users share
something in common with each other (for example, they’ve purchased the same item),
those users are likely to share something else in common with each other.
3. A major strength of graph databases is the ability to quickly generate real-time
recommendations through collaborative filtering.
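To make the collaborative filtering idea concrete, here is a minimal in-memory sketch in Python. The purchase data, the purchases dictionary, and the recommend function are made up for illustration and are not part of the lab app. A graph database performs the equivalent hops with a traversal rather than a loop over every user.

```python
from collections import Counter

# Hypothetical purchase history: user -> set of purchased items.
purchases = {
    "ana": {"tent", "stove", "lantern"},
    "ben": {"tent", "stove"},
    "cho": {"tent", "boots"},
    "dev": {"boots"},
}


def recommend(item, purchases, top_n=3):
    """Rank the items most often bought by users who also bought `item`."""
    counts = Counter()
    for bought in purchases.values():
        if item in bought:
            # This user shares a purchase; count everything else they bought.
            counts.update(bought - {item})
    # most_common() sorts by count, highest first; tie order is not specified here.
    return counts.most_common(top_n)


print(recommend("tent", purchases))
```

In this sample data, two of the three users who bought the tent also bought the stove, so the stove ranks first. The lantern and boots tie at one purchase each.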
Try this!
In this section, you'll update a product's page to display recommendations based on what other
users who bought that print also bought.
1. Build the query for the new recommendation engine
a. Open the browser tab or window that has the Graph Query Editor and ensure that
landscapes_graph is selected (instructions for both steps appear earlier in this
workbook).
b. Let's build the query for our recommendation incrementally so that we can
discuss each step. Start by creating a new traversal. Then you'll search for the
print 'Alaska' and name the resulting vertex currentPrint so you can refer to it
later. In the Query Execution Box, input the following Gremlin query:
def gt = graph.traversal();
gt.V().hasLabel("print").has("name", "Alaska").as("currentPrint");
c. Click the Submit Query button.
d. The query results open in a new box below. You can see one print vertex for
Alaska.
e. Add on to the existing query. From the print vertex Alaska, traverse in along
the buys edges to find all of the users who have bought that print. In the Query
Execution Box, type the following Gremlin query:
def gt = graph.traversal();
gt.V().hasLabel("print").has("name", "Alaska").as("currentPrint")
.in("buys");
f. Click the Submit Query button.
g. The query results open in a new box below. You can see that three users bought
the Alaska print: Jason, Deanna, and Joy.
h. Continue adding to the existing query. From the collection of users who
bought Alaska, traverse out along the buys edges to find all the prints these
users have bought excluding Alaska ('currentPrint'). In the Query Execution Box,
type the following Gremlin query:
def gt = graph.traversal();
gt.V().hasLabel("print").has("name", "Alaska").as("currentPrint")
.in("buys").out("buys").where(neq("currentPrint"));
i. Click the Submit Query button. The query results open in a new box below.
You can see the users have purchased four prints: Antarctica, Australia, Las
Vegas, and Japan. Note that the JSON results on the left list the prints multiple
times: each time the print is listed represents a purchase. The visual summary
on the right only displays each print once.
j. Add on to the existing query. Now that you know the recommended prints, you
need to group them together by name, sort them, and list the top three. In the
Query Execution Box, input the following Gremlin query:
def gt = graph.traversal();
gt.V().hasLabel("print").has("name", "Alaska").as("currentPrint")
.in("buys")
.out("buys").where(neq("currentPrint"))
.groupCount().by('name').order(local).by(valueDecr).limit(local, 3);
k. Click the Submit Query button.
l. The query results open in a new box below. Las Vegas was purchased three
times, Antarctica was purchased two times, and Japan was purchased one time.
Australia was also purchased one time, so it is just as valid a
recommendation as Japan. You could optionally update the query to indicate
what the sorting order should be when the prints have been purchased the
same number of times, but we will skip this for now.
m. At this point, you could be finished since you've generated an ordered list of
recommendations. However, for the app to display the image associated with
each recommendation, you'll need the imgPath property in addition to the
name property. Update the query to add a new function named
byNameImgPath that handles storing both the image name and image path in
the query results. Replace by('name') with by(byNameImgPath) to call this
new function. In the Query Execution Box, type the following Gremlin query:
def gt = graph.traversal();
java.util.function.Function byNameImgPath = { Vertex v ->
"" + v.value("name") + ":" + v.value("imgPath") };
gt.V().hasLabel("print").has("name", "Alaska").as("currentPrint")
.in("buys")
.out("buys").where(neq("currentPrint"))
.groupCount().by(byNameImgPath).order(local).by(valueDecr).limit(local, 3);
n. Click the Submit Query button.
o. The query results open in a new box below. The results now display the imgPath
in addition to the name. The query is ready!
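Step l notes that Japan and Australia tie at one purchase each, so which of them lands in the top three is not pinned down by the query. One way to make the order deterministic, sketched here in plain Python rather than Gremlin, is to sort by count and then by name. The counts are the ones reported in step l. The function name top_prints and the alphabetical tie-breaker are assumptions chosen for illustration, not part of the lab app.

```python
def top_prints(counts, limit=3):
    """Return the `limit` most purchased prints, breaking ties by name."""
    # Negating the count sorts high-to-low; the name orders equal counts A-Z.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Purchase counts reported by the query in step l.
counts = {"Las Vegas": 3, "Antarctica": 2, "Japan": 1, "Australia": 1}
print(top_prints(counts))
# [('Las Vegas', 3), ('Antarctica', 2), ('Australia', 1)]
```

With this tie-breaker, Australia is always chosen over Japan. A different secondary key, such as the most recent purchase date, would work the same way if that data were available.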
2. Write the code for the new recommendation engine
a. Now that you have confirmed the query successfully generates
recommendations, it's time to code! In the file navigation pane of the web IDE
you left open in another tab or window, click graph.py to open it.
Above the getRecommendedPrints() function around line 42, paste the following code:
def getCommonlyPurchasedPrints(printName):
    # Generate a list of commonly purchased prints by searching for what
    # the people who have bought this print also purchased
    gremlin = {
        # create a new traversal
        "gremlin": "def gt = graph.traversal();" +
        # create a function that handles storing both the image name and image path in the results
        "java.util.function.Function byNameImgPath = { Vertex v -> \"\" + v.value(\"name\") + \":\" + v.value(\"imgPath\") };" +
        # search for the node of the designated print and name it "currentPrint"
        "gt.V().hasLabel(\"print\").has(\"name\", \"" + printName + "\").as(\"currentPrint\")" +
        # go in to find all of the users who bought the designated print
        ".in(\"buys\")" +
        # go out to find all prints (excluding the designated print) that these users purchased
        ".out(\"buys\").where(neq(\"currentPrint\"))" +
        # group and sort to find the top 3 most commonly purchased prints
        ".groupCount().by(byNameImgPath).order(local).by(valueDecr).limit(local, 3);"
    }
    response = post(constants.API_URL + '/' + constants.GRAPH_ID + '/gremlin', json.dumps(gremlin))
    if (response.status_code == 200):
        results = json.loads(response.content)['result']['data']
        if len(results) > 0:
            results = results[0]
            # We lose the sorting from the query results when we do json.loads.
            # Sort the results in descending order by value.
            results = sorted(results.items(), key=itemgetter(1), reverse=True)
        prints = []
        for p in results:
            newPrint = {}
            newPrint['name'] = p[0].split(':', 1)[0]
            newPrint['imgPath'] = p[0].split(':', 1)[1]
            prints.append(newPrint)
            print 'Found print commonly purchased with %s: %s' % (printName, newPrint['name'])
        return prints
    raise ValueError('An error occurred while getting a list of commonly purchased prints for print %s: %s %s.' % (printName, response.status_code, response.content))
Note: Spacing is important when programming in Python. Be sure you are using
spaces to indent the code appropriately.
Let's examine what this function does. First, the code creates a dictionary that
holds the Gremlin query. This query is very similar to the one we generated in
the section above except instead of querying for the print called Alaska, the
code uses dynamic input based on the printName argument passed in to the
function. Second, the code includes the query in a POST request to the Gremlin
API. Third, the code processes the results. If the response status code is 200, the
query was successful. Because json.loads() loses the sorting from the query results,
the code sorts the results again. The code creates a new list named prints where the
information about the recommended prints is stored. Then the code begins
looping through each result, storing the name and imgPath for each
recommended print. Finally, if everything goes well, the code returns prints. If
not, the code raises a ValueError.
b. Now that the back-end code is complete, it's time to get the recommended
prints to the frontend. In the left navigation pane in the web IDE, click wsgi.py
to open it.
c. Locate the getPrint() function around line 73. This function is called whenever a user accesses a print's details page. The last line of the try statement in the function returns the template for the print page. Update the arguments that are sent to bottle.template() in the try statement so the recommended prints are included:
return bottle.template('print',
username = request.get_cookie("account", secret=constants.COOKIE_KEY),
printInfo = printInfo,
commonlyPurchasedPrints = graph.getCommonlyPurchasedPrints(printName))
Hint: Be sure the spacing before return remains the same.
d. Now that the print template has access to the recommended prints, you need to
update it to display them. In the left navigation pane in the web IDE, expand the
views directory and click print.tpl to open it.
e. After the closing tag of the form but before the line to include the footer around
line 28, paste the following code:
% if len(commonlyPurchasedPrints) > 0:
  <h3>Users who ordered this print also ordered...</h3>
  <div class='container'>
    <div class='row'>
      % for p in commonlyPurchasedPrints:
        <div class="preview span3">
          <a href="{{p['name']}}">
            {{p['name']}}<br>
            <img src="/static/images/{{p['imgPath']}}" class="thumb">
          </a>
        </div>
      % end
    </div>
  </div>
% end
Let's examine what this code does. The code begins by checking to see if there is
at least one commonly purchased print to recommend. If so, the code creates a
level-three heading with the text "Users who ordered this print also ordered..."
The code then loops through the commonly purchased prints, displaying the
name and image for each.
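The result-processing logic of getCommonlyPurchasedPrints() from step 2a can be tried on its own, without a running graph service. The sketch below is written for Python 3, and the function name parse_commonly_purchased and the sample payload (including its image file names) are hypothetical. Only the ['result']['data'] layout and the name:imgPath key format come from the lab code.

```python
import json
from operator import itemgetter


def parse_commonly_purchased(body):
    """Turn a Gremlin groupCount() response body into a list of print dicts."""
    data = json.loads(body)["result"]["data"]
    # groupCount() yields a single map of "name:imgPath" -> purchase count.
    counts = data[0] if data else {}
    # Re-sort explicitly rather than relying on the order of the JSON object.
    ranked = sorted(counts.items(), key=itemgetter(1), reverse=True)
    prints = []
    for key, _count in ranked:
        # Split on the first colon only, mirroring split(':', 1) in the lab code.
        name, img_path = key.split(":", 1)
        prints.append({"name": name, "imgPath": img_path})
    return prints


# Hypothetical response body; the file names are invented for this example.
sample = json.dumps({
    "result": {
        "data": [
            {"Antarctica:antarctica.jpg": 2,
             "Las Vegas:lasvegas.jpg": 3,
             "Japan:japan.jpg": 1}
        ]
    }
})

for p in parse_commonly_purchased(sample):
    print(p["name"], p["imgPath"])
```

Each returned dictionary has the name and imgPath keys that print.tpl reads when it renders the recommendations.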
3. Deploy the code for the new recommendation engine
a. Now that you've written the code for the new recommendation engine, let's
test it. In the toolbar at the top of the web IDE, click the Deploy the App from
the Workspace button. The deploy might take a minute or two. When the
app is deployed, a green status dot appears beside your app's name in the
toolbar.
b. When the app is deployed, click the Open the Deployed App button in the
toolbar at the top of the web IDE. The deployed version of Lauren's Lovely
Landscapes opens.
c. In your deployed app, click Alaska to open the Alaska print's details page.
d. Scroll down to see the recommendations feature you just implemented!
Congrats!
Tweet this!
Just implemented a recommendation engine. So easy with a #graphdatabase!
#IntroToGraph #DevoxxFr [Include a selfie of you with your app or a link to your app!]
Get creative
Consider how you could create better recommendations for users of the Lauren’s Lovely
Landscapes app. Perhaps you want to limit the recommendations to only recent
purchases. Or maybe you want to group users together based on more than just
commonly purchased prints — maybe you want to consider a user's demographics or
social connections. Or maybe you want to use Watson's Visual Recognition to uncover
similarities in the prints themselves and make recommendations based on those
similarities. Update the existing queries to generate better recommendations.
How can you test if your recommendation system is working? Consider how you could
track users’ actions to see if they’re using your recommendations or to see how
accurate your recommendations are.
Additional resources
Video: Intro to graph databases: Building a recommendation engine
https://youtu.be/TkjpA7i94aM
Video: Make recommendations using IBM Graph https://youtu.be/cAFRpWoN6ZQ
Apache TinkerPop recipe for recommendations
http://tinkerpop.apache.org/docs/current/recipes/#recommendation
Collaborative Filtering https://en.wikipedia.org/wiki/Collaborative_filtering
Recommender System https://en.wikipedia.org/wiki/Recommender_system
Need a recommendation engine? Graph databases boost customer service with real-time insight
https://developer.ibm.com/dwblog/2017/recommendation-engine-customer-insight-graph-database/