Intro to Graph Databases Workbook

Intro to Graph Databases A Devoxx France Hands-on Lab

http://ibm.biz/devoxxfr_workbook

Lauren Schaefer @Lauren_Schaefer

April 7, 2017 #DevoxxFR

#IntroToGraph

Table of Contents

What are graph databases and why should you care?
How do you create a schema diagram and convert it to code?
How do you create a node or edge in a graph database?
How do you read a node or edge in a graph database?
How do you update a node or edge in a graph database?
How do you delete a node or edge in a graph database?
How do you implement a feature that requires a new graph query?
How do you create a recommendation engine?

What are graph databases and why should you care?

Learn this!

When creating a graph, nodes (also known as vertexes) are typically nouns (people,

places, and things) and edges are typically verbs (actions).

Graphs allow you to model the data and relationships just as they exist in real life

without having to map or abstract them to something else.

You can easily traverse graphs, which basically means to follow the connections

between your nodes, to find patterns in your data.

Graphs are helpful with a variety of use cases including generating recommendations,

finding the shortest path, modeling the internet of things, and detecting fraud.
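To make "traversing" concrete, here is a minimal sketch in plain Python (not a graph database) that follows connections between nodes to find a shortest path. The place names and the shortest_path helper are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical data: each key is a node; each value lists the nodes it connects to.
graph = {
    "Paris": ["Lyon", "Lille"],
    "Lyon": ["Marseille"],
    "Lille": [],
    "Marseille": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first search: return the path with the fewest edges, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Follow every outgoing connection that has not been visited yet.
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(shortest_path(graph, "Paris", "Marseille"))
```

A graph database runs this kind of traversal for you; the sketch only shows the idea of following edges from node to node.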

Try this!

1. Sign up for IBM Bluemix by navigating to http://ibm.biz/devoxxfr and signing up. Be

sure to sign up using an email address you currently have access to as you’ll need to

verify your account. When verifying your account, you’ll be prompted for the region

you’d like to use. Select US South as your region.

2. Deploy the Lauren’s Lovely Landscapes app to Bluemix so you’ll have your own copy:

a. Navigate to http://ibm.biz/devoxxfr_deploy.

b. If you are not already authenticated at Bluemix, you might be prompted to do

so.

c. If you have not already selected an alias for your account, you might be

prompted to do so.

d. On the Deploy this application to Bluemix page, select IBM Bluemix US South

in the REGION dropdown.

e. Click Deploy when the button enables.

f. Wait for the deploy to finish. This can take a few minutes. During this time,

Bluemix creates a new project where you can track and plan, stores a copy of

the code in your new project, creates a delivery pipeline so you can configure

automatic deployments, creates an IBM Graph instance on Bluemix, and

deploys the app to Bluemix so you can see it running live.

g. Click VIEW YOUR APP to see the deployed version of the app.

h. The home page shows that no prints are currently available for sale. Click <for

developers> and then Insert the sample data. It might take a minute or two for

the data to insert.

i. When the app indicates the sample data has been created, click Lauren's Lovely

Landscapes in the top navigation bar.

j. Notice the prints listed on the home page. Your app is successfully deployed!

k. Take a few minutes to explore the app. Register as a new user and order a print.

Navigate to the <for developers> page to see the schema diagram as well as the

data stored in the graph.

Tweet this!

Just deployed an app that uses a #graphdatabase in @Lauren_Schaefer’s #IntroToGraph

lab at #DevoxxFr! [Include a screenshot of your app or, even better, a selfie of you with

your app!]

Get creative

Explore the app and consider how you could improve it by leveraging the power of a

graph database. Note your ideas here, on your blog, or on Twitter.

Additional resources

Video: Should I care about graph databases? https://youtu.be/oHTaCql9zbE

Video: How can I try out graph databases? https://www.youtube.com/watch?v=tvo3X61FeRM&t

Video: Deploy an IBM Graph app to Bluemix https://www.youtube.com/watch?v=x0LHRZiGL8A

From Relational to Neo4j https://neo4j.com/developer/graph-db-vs-rdbms/

Why Choose a Graph Database http://radar.oreilly.com/2013/07/why-choose-a-graph-database.html

Guest View: Relational vs. Graph databases: Which to use and when? http://sdtimes.com/guest-view-relational-vs-graph-databases-use/

Why graph databases are so effective in analytics projects http://www.techrepublic.com/article/why-graph-databases-are-so-effective-in-analytics-projects/

What is a Graph Database? https://neo4j.com/developer/graph-database/

Graph DBMS increased their popularity by 500% within the last 2 years https://neo4j.com/news/graph-dbms-increased-their-popularity-by-500percent

50 Shades of Graph: How Graph Databases are Transforming Online Dating https://www.forbes.com/sites/danwoods/2014/02/14/50-shades-of-graph-how-graph-databases-are-transforming-online-dating/#194c26795081

Detecting complex fraud in real time with graph databases https://developer.ibm.com/dwblog/2017/detecting-complex-fraud-real-time-graph-databases/

No more joins: An overview of Graph database query languages https://developer.ibm.com/dwblog/2017/overview-graph-database-query-languages/

How do you create a schema diagram and convert it to code?

Learn this!

Create a schema diagram for your graph by modeling the nouns and verbs, defining the

vertices and edges, defining the multiplicity, adding properties, and creating indexes.

When using Titan, edges can have a multiplicity of MULTI, SIMPLE, MANY2ONE, ONE2MANY, or ONE2ONE.

When using Titan, properties can have a cardinality of SINGLE, LIST (allows duplicates),

or SET (does not allow duplicates). The default cardinality is SINGLE.

When using Titan, indexes can be composite (meaning they can only be used for exact-

match queries) or mixed (meaning they can be used for a combination of property keys

or for querying for things more complex than exact matches). The supported mixed

index predicates for string data types are textContains, textContainsPrefix,

textContainsRegex, eq, neq, textPrefix, and textRegex. The supported mixed index

predicates for numbers are eq, neq, gt, gte, lt, and lte.
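As a rough illustration of how these options fit together, the sketch below describes a tiny schema as a Python dictionary and checks it against the allowed values listed above. The field names (propertyKeys, edgeLabels, indexes, type) and the validate_schema helper are assumptions made for this example; they are not taken from the app's schema.json or from the IBM Graph API.

```python
# Allowed values, as listed above for Titan.
MULTIPLICITIES = {"MULTI", "SIMPLE", "MANY2ONE", "ONE2MANY", "ONE2ONE"}
CARDINALITIES = {"SINGLE", "LIST", "SET"}
INDEX_TYPES = {"composite", "mixed"}

# Hypothetical schema sketch; these field names are illustrative assumptions.
schema = {
    "propertyKeys": [
        {"name": "username", "cardinality": "SINGLE"},
        {"name": "name"},  # no cardinality given, so the default (SINGLE) applies
    ],
    "edgeLabels": [{"name": "buys", "multiplicity": "MULTI"}],
    "indexes": [
        {"name": "byUsername", "propertyKeys": ["username"], "type": "composite"},
    ],
}


def validate_schema(schema):
    """Return a list of problems; an empty list means every value is allowed."""
    problems = []
    known_keys = set()
    for key in schema.get("propertyKeys", []):
        known_keys.add(key["name"])
        cardinality = key.get("cardinality", "SINGLE")
        if cardinality not in CARDINALITIES:
            problems.append("bad cardinality for %s: %s" % (key["name"], cardinality))
    for edge in schema.get("edgeLabels", []):
        if edge.get("multiplicity") not in MULTIPLICITIES:
            problems.append("bad multiplicity for %s" % edge["name"])
    for index in schema.get("indexes", []):
        if index.get("type") not in INDEX_TYPES:
            problems.append("bad index type for %s" % index["name"])
        for key_name in index.get("propertyKeys", []):
            if key_name not in known_keys:
                problems.append("index %s uses unknown key %s" % (index["name"], key_name))
    return problems


print(validate_schema(schema))
```

Writing the schema down as data like this makes it easy to spot a typo in a multiplicity or cardinality before you create the graph.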

Try this!

1. Open the code for your app:

a. Navigate to http://ibm.biz/devoxxfr.

b. Log in (if you are not already authenticated).

c. On the dashboard for your apps, locate the row with the Lauren's Lovely

Landscapes app and click it.

Hint: Be sure to click the name of your app and not the route.

d. The app page opens in Bluemix.

e. Scroll down until you see the Continuous delivery tile.

f. Click the Edit code button. The web IDE opens with your project's code.

2. Add a new print that will be created as part of the app’s sample data.

a. Find a picture you’d like to be featured on your personal copy of Lauren’s Lovely

Landscapes. If you don’t have one handy, you can use one from Lauren’s

Twitter page (http://twitter.com/lauren_schaefer).

b. In the web IDE, expand the static directory and select the images directory.

c. Select File > Import > File or Zip Archive.

d. Import the image.

e. In the navigation pane, open graph.py.

f. Scroll to the insertSampleData function and locate the code around line 172

that creates prints.

g. Add a new line of code (be sure to use exactly the same spacing and tabs that the code around it does) to create the print for the image you just imported. For example, if you are creating a print for Paris, your code might look like the following:

    createPrint('Paris', 'I <3 Paris!', 150.00, 'paris.jpg')

3. Deploy the code

a. Click the Deploy button in the toolbar at the top of the page.

b. The deploy will take a minute or two to complete. You will know it’s finished when a green status dot appears.

c. When the deploy has finished, click the Open the Deployed App button.

4. Delete and insert the sample data

a. In your deployed app, click <for developers>.

b. Scroll to the The Data section and click the link to delete the data.

c. Click <for developers>.

d. Scroll to the The Data section and click the link to insert the sample data.

e. Click Lauren’s Lovely Landscapes in the top navigation bar to go to the home page.

f. Check out your new print on the home page! You successfully created a new

node in your app’s graph!

Tweet this!

Just updated my app that uses a #graphdatabase! Adding a node to my app was simple!

#IntroToGraph #DevoxxFr! [Include a screenshot of your app with the updated pic!]

Get creative

Think about how you could update the schema to make the app better. Perhaps you

want to add a new index on which to search. Or maybe you want to add some new

nodes or edges so you can store more data. Update schema.json to reflect your ideas.

To test out your changes, open constants.py and update the GRAPH_ID to be

something unique so that the next time you run your app, a new graph will be created

that uses your updated schema. Then deploy your app!

Make the app your own! Add more prints to the static/images directory and then

update the insertSampleData function to include your prints in the app’s sample data.

Additional resources

Video: Build a schema model for IBM Graph

https://www.youtube.com/watch?v=ZQbYSEaUrTo

Video: Convert a schema model to code for IBM Graph

https://www.youtube.com/watch?v=Km_uQi-iEEg

Video: Create and traverse an IBM Graph database

https://www.youtube.com/watch?v=cN8thqBV0HU

IBM Graph Documentation: https://ibm-graph-docs.ng.bluemix.net/

Titan Documentation: http://s3.thinkaurelius.com/docs/titan/1.0.0/index.html

Graph Data Modeling Guidelines: https://neo4j.com/developer/guide-data-modeling/

How do you create a node or edge in a graph database?

Learn this!

When using Gremlin to create a node or edge in a graph database, begin by creating a

new traversal. Then use the appropriate step (addV, addE, addInE, or addOutE),

including a list of properties as parameters for the step, in order to do the creation.

Try this!

Now it’s time for the fun stuff: the CRUD operations, starting with create. The following

instructions guide you through how the app creates a new vertex in the graph when a user

registers.

1. Observe the schema diagram below. To handle user registration, create a new user

vertex with the following properties: firstName, lastName, username, and email.

2. Next, open the Graph Query Editor for your graph instance in a new browser tab or

window:

a. Navigate to http://ibm.biz/devoxxfr.

b. On the dashboard, scroll down to the All Services section and click

LaurensLovelyLandscapesSample-Graph.

c. On the Manage tab (open by default), click Open. The Graph Query Editor opens

for your graph instance.

3. By default, the g graph is selected. Switch to the landscapes_graph where your data is

stored by clicking the down arrow beside g in the top navigation menu and clicking

landscapes_graph.

4. To create a new user, write a Gremlin query that includes sample data for each of the properties shown in the schema diagram: label, firstName, lastName, username, and email. Also include a type property with the value user, which makes it possible to search for vertexes with type user. In the Query Execution Box, type the following Gremlin query:

    def gt = graph.traversal();
    gt.addV(label, 'user', 'firstName', 'Lauren', 'lastName',
        'Schaefer', 'username', 'lauren', 'email',
        '[email protected]', 'type', 'user');

5. Click the Submit Query button.

6. The results of the query open in a new box. Explore the JSON results. Note that a new

vertex has been created using the properties we indicated in the query.

7. Now that you have a working query that creates a new user vertex, explore the code. In

the file navigation pane of the web IDE you left open in another browser tab or window,

click graph.py to open it.

8. In graph.py, locate the createUser() function around line 258.

9. The function begins by calling the doesUserExist() function to check if a user with the

given username already exists. The function returns an error that is displayed to the

user if the username they requested is already taken.

10. If the username is available, the function continues and creates a new dictionary that

contains a Gremlin query. The query is based on the one we wrote above in step 4; the

only difference is that, instead of using sample data for the properties firstName,

lastName, username, and email, the code uses dynamic input based on the arguments

passed in to the function.

11. After the dictionary containing the query is created, the function is ready to call the

Graph API. Around line 278, the function makes a new POST request to /gremlin and

sends the dictionary containing the Gremlin query as part of the request.

12. Around line 280, the function checks to see if the request was successful (200 response

code) and the user vertex was created. If the request was not successful, the function

raises an error.
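The query-building part of steps 10 and 11 can be sketched as follows. This is a minimal illustration, not the app's actual createUser() code: the quote and build_create_user_payload helpers and the example email address are hypothetical, and the sketch stops short of sending the request.

```python
import json


def quote(value):
    """Quote a value as a single-quoted Groovy string, escaping backslashes and quotes."""
    return "'" + str(value).replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_create_user_payload(first_name, last_name, username, email):
    """Build the dictionary that would be sent in the POST request to /gremlin."""
    properties = [
        ("firstName", first_name),
        ("lastName", last_name),
        ("username", username),
        ("email", email),
        ("type", "user"),
    ]
    # addV takes alternating property keys and values after the label.
    args = ", ".join("%s, %s" % (quote(key), quote(value)) for key, value in properties)
    query = "def gt = graph.traversal(); gt.addV(label, 'user', %s);" % args
    return {"gremlin": query}


payload = build_create_user_payload("Lauren", "Schaefer", "lauren", "lauren@example.com")
print(json.dumps(payload))
```

Escaping the input before placing it in the query string is a design choice of this sketch: it keeps a value such as O'Brien from ending the string early and changing the query.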

Tweet this!

Just executed my first #Gremlin query in @Lauren_Schaefer’s #IntroToGraph lab at

#DevoxxFr! [Include a screenshot of your query with the results visualization!]

Get creative

Return to the Graph Query Editor and write a Gremlin query that creates a new buys edge between a user and a print. If you need a hint, check out the buyPrint function in graph.py.

Additional resources

Video: Create data elements in IBM Graph

https://www.youtube.com/watch?v=WHQCWk0lHW4

Gremlin Documentation http://tinkerpop.apache.org/docs/3.0.1-incubating/#graph-traversal-steps

How do you read a node or edge in a graph database?

Learn this!

When using Gremlin to read a node or edge in a graph database, begin by creating a

new traversal. Then use the appropriate step (V or E) and the id of the node or edge

(for example, g.V(5)) or a has step (has, hasLabel, hasId, hasKey, hasValue, or hasNot)

to narrow the query to a particular node or set of nodes.

When using Gremlin, you can traverse in or out from a node or edge by using the vertex

steps (out, in, both, outE, inE, bothE, outV, inV, bothV, otherV).
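To see what out, in, and both mean, here is a minimal plain-Python sketch over a handful of edges. It does not use Gremlin; the usernames, print names, and helper functions are hypothetical stand-ins (in_ has a trailing underscore because in is a reserved word in Python).

```python
# Hypothetical edges, each written as (out vertex, edge label, in vertex).
edges = [
    ("jason", "buys", "Alaska"),
    ("joy", "buys", "Alaska"),
    ("jason", "buys", "Paris"),
]


def out(vertex, label):
    """Vertices reached by following outgoing edges with this label."""
    return [dst for src, edge_label, dst in edges if src == vertex and edge_label == label]


def in_(vertex, label):
    """Vertices reached by following incoming edges with this label backwards."""
    return [src for src, edge_label, dst in edges if dst == vertex and edge_label == label]


def both(vertex, label):
    """Vertices reached in either direction."""
    return out(vertex, label) + in_(vertex, label)


print(out("jason", "buys"))   # prints bought by jason
print(in_("Alaska", "buys"))  # users who bought the Alaska print
```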

Try this!

In this section, you'll explore the read operation. The following instructions guide you through

how the app reads a user vertex when displaying a user's profile information.

1. Observe the schema diagram below. To display a user's profile information, you need to

read a user vertex with its properties.

2. Open your browser tab or window that has the Graph Query Editor and ensure the landscapes_graph is selected (instructions for both are in the Try this! steps of "How do you create a node or edge in a graph database?").

3. Write a Gremlin query to read a user vertex with the label user and the username "jason." To do this, in the Query Execution Box, type the following Gremlin query:

    def gt = graph.traversal();
    gt.V().hasLabel("user").has("username", "jason");

4. Click the Submit Query button.

5. The results of the query open in a new box below. Explore the JSON results. Note that

the results are returned as a set of length one. This is because only one vertex has the

username "jason."

6. Now that you have a working query that reads a user vertex, explore the code. In the file

navigation pane of the web IDE you left open in another browser tab or window, open

graph.py.

7. In graph.py, locate the getUser() function around line 204.

8. The function begins by creating a new dictionary that contains a Gremlin query. The

query is based on the one you wrote in step 3 above; the only difference is that instead

of querying for username "jason," the code uses dynamic input based on the username

passed in to the function.

9. After the dictionary containing the query is created, the function is ready to call the

Graph API. Around line 208, the function makes a new POST request to /gremlin and

sends the dictionary containing the Gremlin query as part of the request.

10. Around line 209, the function checks to see if the request was successful (200 response

code). Then the function starts processing the results. The JSON results shown in the

query editor are found by accessing

json.loads(response.content)['result']['data']. The results are a

set of vertexes, so the function checks that the length of the results is greater than 0.

Because only one vertex should be returned when a particular username is queried, the

function sets user to results[0]. If everything was successful, the user vertex with

its properties is returned. Otherwise, the function raises an error.
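The result handling described in step 10 can be sketched on its own, without calling the Graph API. The first_vertex helper and the sample response body below are hypothetical; the body only imitates the ['result']['data'] shape mentioned above.

```python
import json


def first_vertex(response_body):
    """Return the first vertex in a Gremlin response body, or raise if none was found."""
    results = json.loads(response_body)["result"]["data"]
    if len(results) > 0:
        return results[0]
    raise ValueError("No vertex found")


# Hypothetical response body with one user vertex; real responses carry more fields.
sample_body = json.dumps({
    "result": {
        "data": [
            {
                "label": "user",
                "type": "vertex",
                "properties": {"username": [{"value": "jason"}]},
            }
        ]
    }
})

user = first_vertex(sample_body)
print(user["properties"]["username"][0]["value"])
```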

Tweet this!

Writing queries for #graphdatabases is simple with #Gremlin! #IntroToGraph

#DevoxxFr! [Include a screenshot of your query with the results visualization!]

Get creative

Return to Graph Query Editor and write some new Gremlin traversals. For example, you

may want to write a traversal to find all of the orders for a given user. What can you

learn from traversing your graph?

Additional resources

Video: Read data elements in IBM Graph

https://www.youtube.com/watch?v=Dx2C_A5EICc

Gremlin Documentation http://tinkerpop.apache.org/docs/3.0.1-incubating/#graph-traversal-steps

How do you update a node or edge in a graph database?

Learn this!

When using Gremlin to update a node or edge in a graph database, begin by creating a

new traversal. Then query for the vertex or edge you want to update and use

.property(propertyKey, propertyValue) to make the update.

Try this!

In this section, you'll explore the update operation. The following instructions walk you through how the app updates a user vertex when a user makes changes to their profile information.

1. Observe the schema diagram below. To edit a user's profile information, update the

properties stored in a user vertex.

2. Open your browser tab or window that has the Graph Query Editor and ensure the landscapes_graph is selected (instructions for both are in the Try this! steps of "How do you create a node or edge in a graph database?").

3. Let's write a Gremlin query to update a user vertex. We'll query for the vertex with label

user and the username "jason" and update that vertex with new property values. In the

Query Execution Box, input the following Gremlin query:

def gt = graph.traversal();

gt.V().hasLabel("user").has("username", "jason")

.property('firstName', 'Jasonupdate')

.property('lastName', 'Schaeferupdate')

.property('email', '[email protected]');

4. Click the Submit Query button.

5. The results of the query open in a new box below. Explore the JSON results. Note that

the "Jason" vertex now has the property values we used in the query.

6. Now that you have a working query that updates a user vertex, explore the code. In the

file navigation pane of the web IDE you left open in another browser tab or window,

open graph.py.

7. In graph.py, locate the updateUser() function around line 219.

8. The function begins by creating a new dictionary that contains a Gremlin query. The

query is based on the one you wrote above in step 3 with a few differences: instead of

querying for username "jason" and updating the properties with sample data, the code

uses dynamic input based on the arguments passed in to the function.

9. After the dictionary containing the query is created, the function is ready to call the

Graph API. Around line 228, the function makes a new POST request to /gremlin and

sends the dictionary containing the Gremlin query as part of the request.

10. Around line 229, the function checks to see if the request was successful (200 response

code) and the user vertex was updated. If the request was not successful, the function

raises an error.
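As a sketch of how an update query could be assembled from dynamic input, the function below chains one .property step per changed value. It is an illustration under assumptions, not the app's updateUser() code; the quote and build_update_user_query helpers are hypothetical.

```python
def quote(value):
    """Quote a value as a single-quoted Groovy string, escaping backslashes and quotes."""
    return "'" + str(value).replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_update_user_query(username, changes):
    """Build a Gremlin query that sets each property in `changes` on one user vertex."""
    # Sort the keys so the generated query is the same on every run.
    steps = "".join(
        ".property(%s, %s)" % (quote(key), quote(changes[key]))
        for key in sorted(changes)
    )
    return (
        "def gt = graph.traversal(); "
        "gt.V().hasLabel('user').has('username', %s)%s;" % (quote(username), steps)
    )


query = build_update_user_query(
    "jason", {"firstName": "Jasonupdate", "lastName": "Schaeferupdate"}
)
print(query)
```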

Tweet this!

I’m becoming a #Gremlin expert in @Lauren_Schaefer’s #IntroToGraph lab at

#DevoxxFr! #graphdatabase [Include a selfie of you looking like an expert]

Get creative

Consider what you would do if someone needed to update the mailing address on an order they had already placed. Write a query to update the mailing address on an order.

Then create a new webpage that administrators could use to update the mailing address

of an order.

Occasionally, you’ll want to update the pricing and/or description of the prints. Create a

new webpage that would allow administrators to do just that.

Additional resources

Video: Update data elements in IBM Graph

https://www.youtube.com/watch?v=ryR6b6XpkVQ

Gremlin Documentation http://tinkerpop.apache.org/docs/3.0.1-incubating/#graph-traversal-steps

How do you delete a node or edge in a graph database?

Learn this!

When using Gremlin to delete a node or edge in a graph database, begin by creating a

new traversal. Then query for the vertex(es) or edge(s) you want to delete and use the

drop step to do the deletion.

Try this!

Let's explore the delete operation. The following instructions guide you through how the app

deletes all of the edges and vertexes in the graph.

1. Let's begin by observing the schema diagram. To delete all of the vertexes and edges in

the graph, you'll need to delete all of the buys edges as well as the user and print

vertexes.

2. Open your browser tab or window that has the Graph Query Editor and ensure the landscapes_graph is selected (instructions for both are in the Try this! steps of "How do you create a node or edge in a graph database?").

3. Write a Gremlin query to delete all of the edges and vertexes. Query all of the edges of

type buys and drop them. Then query all of the vertexes of type print or user and drop

them. In the Query Execution Box, input the following Gremlin query:

def g = graph.traversal();

g.E().has('type', 'buys').drop();

g.V().has('type', within('print','user')).drop();

4. Click the Submit Query button.

5. The results of the query open in a new box below. Note that an empty list is displayed as

the query has dropped all of our vertexes and edges.

Hint: If you want to interact with your copy of the sample app, return to the <for

developers> page and insert the sample data.

6. Now that you have a working query that deletes all of the edges and vertexes, let's

explore the code. In the file navigation pane of the web IDE you left open in another

browser tab or window, click graph.py to open it.

7. In graph.py, locate the dropGraph() function around line 441.

8. The function begins by creating a new dictionary that contains a Gremlin query. The

query is based on the one we wrote above in step 3.

9. After the dictionary containing the query is created, the function is ready to call the

Graph API. Around line 447, the function makes a new POST request to /gremlin and

sends the dictionary containing the Gremlin query as part of the request.

10. Around line 448, the function checks to see if the request was successful (200 response

code) and the edges and vertexes were deleted. If the request was not successful, the

function raises an error.
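The request-and-check pattern in steps 9 and 10 can be sketched with the HTTP call passed in as a parameter, so the example runs without a network connection. The run_gremlin helper, FakeResponse class, and fake_post function are hypothetical; only the query text and the 200 check come from the steps above.

```python
import json


class FakeResponse(object):
    """Stand-in for an HTTP response so the sketch runs without a network call."""

    def __init__(self, status_code, content):
        self.status_code = status_code
        self.content = content


def run_gremlin(query, post):
    """Send a Gremlin query through `post`; return the result data or raise on failure."""
    response = post("/gremlin", json.dumps({"gremlin": query}))
    if response.status_code == 200:
        return json.loads(response.content)["result"]["data"]
    raise ValueError("Gremlin query failed with status %s" % response.status_code)


DROP_QUERY = (
    "def g = graph.traversal(); "
    "g.E().has('type', 'buys').drop(); "
    "g.V().has('type', within('print','user')).drop();"
)


def fake_post(path, body):
    # Pretend the server accepted the query and returned an empty result list.
    return FakeResponse(200, json.dumps({"result": {"data": []}}))


print(run_gremlin(DROP_QUERY, fake_post))
```

In the real app the post argument would be a function that sends the request to your graph's /gremlin URL.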

Tweet this!

Just learned how to drop nodes and edges using #Gremlin! #IntroToGraph #DevoxxFr

#graphdatabase http://giphy.com/gifs/dance-hot-rap-avkW4UabDdJFS

Get creative

Consider how you would implement a feature to allow users to delete their profiles.

What would the user interface look like? Make some sketches and implement it.

Over time, administrators may want to remove prints that are no longer selling well. Implement a feature to allow administrators to remove prints.

Additional resources

Video: Delete data elements in IBM Graph https://youtu.be/sowbww8if_8

Gremlin Documentation http://tinkerpop.apache.org/docs/3.0.1-incubating/#graph-traversal-steps

How do you implement a feature that requires a new graph query?

Learn this!

To implement a new feature that requires a graph query, begin by observing the schema

diagram and determining how you will need to interact with the graph. Then, write and

test your query. Finally, write the code that queries your graph and the code your users

will interact with.

Try this!

Now that you know the basics of how to perform the CRUD operations using Gremlin, it's time

to code. In this section, you'll update the user profile page to display the user's orders.

1. Start by observing the schema diagram. To view the orders for a user, you'll need to

start with the user vertex for the authenticated user and then traverse the graph to read

property values stored in the buys edge (datetime, address1, address2, city, state, zip,

and payment method) and the associated print vertex (name and imgPath).

2. Open your browser tab or window that has the Graph Query Editor and ensure the landscapes_graph is selected (instructions for both are in the Try this! steps of "How do you create a node or edge in a graph database?").

3. You'll need to write a Gremlin query to read all of the information about a given user's

orders. Try it on your own before continuing on.

Not sure where to start? Here are some hints:

Begin by querying for the vertex with label user and the username "jason." Name this

vertex "buyer." Then traverse out along the buys edge to find all of the print vertexes

that the user has bought. Next, search for all of the in edges that connect to the buyer

vertex (if you fail to check that the vertex is the buyer vertex, the query will return all

connected edges regardless of who bought the print). Finally, use the path step so you

can view the history of the traversal that includes the user vertex, the buys edge, and

the print vertex (if you don't use the path step, the results will only contain the final

user vertex).

Below is an example query. Check to see if yours returned the same results:

    def gt = graph.traversal();
    gt.V().hasLabel("user").has("username", "jason").as('buyer')
        .out("buys")
        .inE("buys").outV().as('buyer2').where('buyer',eq('buyer2'))
        .path();

From the diagram on the right, you can see the user has bought three prints. The JSON

results are displayed on the left. Each JSON object contains the information for one

order. The objects property contains the set of information we care about: the user

vertex, the print vertex, the buys edge, and a duplicate user vertex the traversal found

because it started and ended by searching for a user vertex.

Hint: If no results are returned, be sure you inserted the sample data on the <for

developers> page.
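To see why the where step matters, the plain-Python sketch below imitates the traversal over a few hypothetical buys edges. It is a simplified model of the query, not Gremlin itself; the usernames, print names, and the order_paths helper are assumptions for illustration.

```python
# Hypothetical buys edges, each written as (buyer, edge label, print).
edges = [
    ("jason", "buys", "Alaska"),
    ("joy", "buys", "Alaska"),
    ("jason", "buys", "Paris"),
]


def order_paths(buyer, filter_to_buyer=True):
    """Imitate V(buyer).out('buys').inE('buys').outV().path() over the edges above."""
    paths = []
    for src, label, print_name in edges:
        if src != buyer or label != "buys":
            continue  # out('buys'): keep only prints this buyer bought
        for src2, label2, print_name2 in edges:
            if label2 != "buys" or print_name2 != print_name:
                continue  # inE('buys'): every buys edge that points at that print
            if filter_to_buyer and src2 != buyer:
                continue  # where('buyer', eq('buyer2')): drop other users' edges
            # The path records the user, the print, the edge, and the user again.
            paths.append([buyer, print_name, (src2, label2, print_name2), src2])
    return paths


print(len(order_paths("jason")))
print(len(order_paths("jason", filter_to_buyer=False)))
```

Without the filter, the edge from joy to the Alaska print is also returned, even though the order belongs to a different user.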

4. Now that you have confirmed the query works, let's code. In the file navigation pane of

the web IDE you left open in another browser tab or window, open graph.py.

5. Beneath the getUser() function around line 217, paste the following code:

def getUserOrders(username):
    # Build the Gremlin query; the username argument replaces the sample value "jason".
    gremlin = {
        "gremlin": "def gt = graph.traversal();" +
            "gt.V().hasLabel(\"user\").has(\"username\", \"" + username + "\").as(\"buyer\")" +
            ".out(\"buys\")" +
            ".inE(\"buys\")" +
            ".outV().as(\"buyer2\").where(\"buyer\",eq(\"buyer2\"))" +
            ".path()"
    }
    # Send the query to the Graph API.
    response = post(constants.API_URL + '/' + constants.GRAPH_ID + '/gremlin', json.dumps(gremlin))
    if (response.status_code == 200):
        results = json.loads(response.content)['result']['data']
        orders = []
        if len(results) > 0:
            print('Found orders for username %s: %s.' % (username, results))

            # Each result is one path; collect the fields to display for that order.
            for result in results:
                order = {}
                for object in result['objects']:
                    if object['label']=='user':
                        continue
                    if object['label']=='buys':
                        order['date'] = object['properties']['date']
                        order['firstName'] = object['properties']['firstName']
                        order['lastName'] = object['properties']['lastName']
                        order['address1'] = object['properties']['address1']
                        order['address2'] = object['properties']['address2']
                        order['city'] = object['properties']['city']
                        order['state'] = object['properties']['state']
                        order['zip'] = object['properties']['zip']
                        order['paymentMethod'] = object['properties']['paymentMethod']
                        continue
                    if object['label']=='print':
                        order['printName'] = object['properties']['name'][0]['value']
                        order['imgPath'] = object['properties']['imgPath'][0]['value']
                        continue

                orders.append(order)
        return orders

    raise ValueError('Unable to find orders for user with username %s' % username)

You can replace the query in the code above with the query you wrote in the previous

step.

Note: Spacing is very important when programming in Python. Be sure you use spaces

to appropriately indent the code.

Let's examine what this function does. First, the code creates a dictionary that holds the Gremlin query. This query is similar to the one used in step 3 above, except that instead of querying for the username "jason," the code uses dynamic input based on the argument passed in to the function. Second, the code includes the query in a POST request to /gremlin. Third, the code processes the results. If the response status code is 200, the query was successful. The JSON results, as shown in the query editor, can be found by accessing json.loads(response.content)['result']['data'], so the code stores them in results. The code creates a new list, named orders, to hold the order information that will be displayed to the end user. Then the code loops through each object in each result. The code checks the label of each object (user, buys, or print) and stores the appropriate properties in an order dictionary, which it appends to orders. Finally, if everything goes well, the code returns orders. If not, the code raises a ValueError.
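One way to try the result-processing logic without calling the Graph API is to move it into a function that takes the results list directly. The parse_orders helper and the sample data below are hypothetical and cover only a few of the properties; the shapes of the edge and vertex properties follow the code above.

```python
def parse_orders(results):
    """Turn Gremlin path results into a list of order dictionaries."""
    orders = []
    for result in results:
        order = {}
        for element in result["objects"]:
            properties = element.get("properties", {})
            if element["label"] == "buys":
                # Edge properties are read as plain key/value pairs.
                order["date"] = properties["date"]
                order["city"] = properties["city"]
            elif element["label"] == "print":
                # Vertex properties are read as lists of {"value": ...} entries.
                order["printName"] = properties["name"][0]["value"]
                order["imgPath"] = properties["imgPath"][0]["value"]
        orders.append(order)
    return orders


# Hypothetical path result: user vertex, print vertex, buys edge, user vertex.
sample_results = [
    {
        "objects": [
            {"label": "user", "properties": {"username": [{"value": "jason"}]}},
            {
                "label": "print",
                "properties": {
                    "name": [{"value": "Alaska"}],
                    "imgPath": [{"value": "alaska.jpg"}],
                },
            },
            {"label": "buys", "properties": {"date": "2017-04-07", "city": "Paris"}},
            {"label": "user", "properties": {"username": [{"value": "jason"}]}},
        ]
    }
]

orders = parse_orders(sample_results)
print(orders)
```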

6. Now that the back-end code is done, you need to get this order information to the front

end. In the left navigation pane in the web IDE, click wsgi.py to open it.

7. Locate the getProfile() function around line 189. This function is called whenever a user accesses the profile page. The last line of the function returns the template for the profile page. Update the arguments that are sent to bottle.template() so that the orders are included:

    return bottle.template('profile', username = username, userInfo = graph.getUser(username), orders = graph.getUserOrders(username))

Hint: Be sure the spacing before return remains the same.

8. Now that the profile template has access to the orders, you need to update it to display

the orders. In the left navigation pane in the web IDE, expand the views directory and

click profile.tpl to open it.

9. After the closing tag of the form but before the line to include the footer around line 54,

paste the following code:

<h2><em>Your Orders</em></h2>

% if len(orders) <= 0:
    You have not placed any orders.
% else:
    % for order in orders:
        <hr>
        <div class="container">
            <div class="row">
                <div class="span6">
                    <h3>Order date:</h3>
                    {{order['date']}}

                    <h3>Shipping address:</h3>
                    {{order['firstName']}} {{order['lastName']}}<br>
                    {{order['address1']}}<br>
                    % if len(order['address2']) > 0:
                        {{order['address2']}}<br>
                    % end
                    {{order['city']}}, {{order['state']}} {{order['zip']}}

                    <h3>Payment method:</h3>
                    {{order['paymentMethod']}}
                </div>

                <div class="span4">
                    <h3>Print:</h3>
                    {{order['printName']}}
                    <img src="/static/images/{{order['imgPath']}}">
                </div>

            </div>
        </div>

    % end
% end

Let's examine what this code does. The code begins by creating a second-level heading

titled Your Orders. The code then checks to see if the user has any orders. If the user

does not have any orders, the code displays an appropriate message to the user. If the

user has orders, the code loops through them. For each order, it creates a new

container and row that is split into two divs. The left div displays information about the

order and the right div displays information about the print.

10. You have finished implementing your code! Before you celebrate, let's test it. In the

toolbar at the top of the web IDE, click the Deploy the App from the Workspace button. The deploy can take a minute or two. When the app has finished deploying, a

green status dot appears beside your app's name in the toolbar.

11. When the app finishes deploying, click the Open the Deployed App button in the

toolbar at the top of the web IDE. Your deployed version of Lauren's Lovely Landscapes

opens.

12. In your deployed app, sign in with username "jason" (if not already authenticated).

13. Click Edit Profile.

14. Observe the new Your Orders section you just implemented. Now it's time to celebrate!

Tweet this!

Just implemented & deployed a new feature in an app that leverages a #graphdatabase!

#IntroToGraph #DevoxxFr [Include a link to your app!]

Get creative

Users may want to order more than one print at a time. Sketch diagrams of how the

ordering process would change if users could add prints to their cart before checking

out. Examine the schema diagram to see if you would need to change any of the data

being stored. Determine if queries would also need to be updated. Then implement

this new feature!

Additional resources

Video: Intro to graph databases: The CRUD operations

https://www.youtube.com/watch?v=sefAL0Czu4I

Gremlin Documentation http://tinkerpop.apache.org/docs/3.0.1-incubating/#graph-traversal-steps

Bottle: Python Web Framework Documentation https://bottlepy.org/docs/stable/

How do you create a recommendation engine?

Learn this!

1. Recommender systems (also called recommendation systems, platforms, or engines) generate personalized recommendations for users. They vary in complexity and accuracy.

2. Collaborative filtering generates recommendations by assuming that if users share

something in common with each other (for example, they’ve purchased the same item),

those users are likely to share something else in common with each other.

3. A major strength of graph databases is the ability to quickly generate real-time

recommendations through collaborative filtering.
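To make the idea of collaborative filtering concrete, here is a minimal in-memory sketch in Python. It is not part of the lab app and does not use a graph database. The function name recommend_for_user, the purchase data, and the tie-breaking rule (alphabetical by item name) are all assumptions made for illustration.

```python
from collections import Counter


def recommend_for_user(purchases, user, top_n=3):
    """Suggest items bought by users who share a purchase with `user`."""
    owned = purchases[user]
    counts = Counter()
    for other, items in purchases.items():
        if other == user or not (owned & items):
            continue  # skip the user themselves and users with nothing in common
        # Every item the similar user owns that `user` lacks gets one vote.
        counts.update(items - owned)
    # Sort by votes (descending), then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical purchase history: user -> set of purchased items.
purchases = {
    'ann': {'tent', 'stove'},
    'bob': {'tent', 'stove', 'lantern'},
    'cat': {'tent', 'lantern', 'kayak'},
    'dan': {'novel'},
}
print(recommend_for_user(purchases, 'ann'))  # ['lantern', 'kayak']
```

Here bob and cat each share a purchase with ann, so the items they own that ann does not (lantern twice, kayak once) become her recommendations. dan shares nothing with ann and contributes nothing.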

Try this!

In this section, you'll update a product's page to display recommendations based on what other

users who bought that print also bought.

1. Build the query for the new recommendation engine

a. Open your browser tab or window that has the Graph Query Editor and ensure the landscapes_graph is selected (instructions for both are given earlier in this workbook).

b. Let's build the query for our recommendation incrementally so that we can

discuss each step. Start by creating a new traversal. Then you'll search for the

print 'Alaska' and name the resulting vertex currentPrint so you can refer to it

later. In the Query Execution Box, input the following Gremlin query:

def gt = graph.traversal();

gt.V().hasLabel("print").has("name", "Alaska").as("currentPrint");

c. Click the Submit Query button.

d. The query results open in a new box below. You can see one print vertex for

Alaska.

e. Add on to the existing query. From the print vertex Alaska, traverse in along the buys edges to find all of the users who have bought that print. In the Query Execution Box, type the following Gremlin query:

def gt = graph.traversal();
gt.V().hasLabel("print").has("name", "Alaska").as("currentPrint")
  .in("buys");

f. Click the Submit Query button.

g. The query results open in a new box below. You can see that three users bought

the Alaska print: Jason, Deanna, and Joy.

h. Continue adding to the existing query. From the collection of users who bought Alaska, traverse out along the buys edges to find all the prints these users have bought, excluding Alaska ('currentPrint'). In the Query Execution Box, type the following Gremlin query:

def gt = graph.traversal();
gt.V().hasLabel("print").has("name", "Alaska").as("currentPrint")
  .in("buys").out("buys").where(neq("currentPrint"));

i. Click the Submit Query button. The query results open in a new box below.

You can see the users have purchased four prints: Antarctica, Australia, Las

Vegas, and Japan. Note that the JSON results on the left list the prints multiple

times: each time the print is listed represents a purchase. The visual summary

on the right only displays each print once.

j. Add on to the existing query. Now that you know the recommended prints, you

need to group them together by name, sort them, and list the top three. In the

Query Execution Box, input the following Gremlin query:

def gt = graph.traversal();
gt.V().hasLabel("print").has("name", "Alaska").as("currentPrint")
  .in("buys")
  .out("buys").where(neq("currentPrint"))
  .groupCount().by('name').order(local).by(valueDecr).limit(local, 3);

k. Click the Submit Query button.

l. The query results open in a new box below. Las Vegas was purchased three

times, Antarctica was purchased two times, and Japan was purchased one time.

Australia was also purchased one time, so it is just as valid a recommendation as Japan. You could optionally update the query to indicate

what the sorting order should be when the prints have been purchased the

same number of times, but we will skip this for now.

m. At this point, you could be finished since you've generated an ordered list of

recommendations. However, for the app to display the image associated with

each recommendation, you'll need the imgPath property in addition to the

name property. Update the query to add a new function named byNameImgPath that stores both the print name and the image path in the query results. Replace by('name') with by(byNameImgPath) to call this new function. In the Query Execution Box, type the following Gremlin query:

def gt = graph.traversal();
java.util.function.Function byNameImgPath = { Vertex v -> "" + v.value("name") + ":" + v.value("imgPath") };
gt.V().hasLabel("print").has("name", "Alaska").as("currentPrint")
  .in("buys")
  .out("buys").where(neq("currentPrint"))
  .groupCount().by(byNameImgPath).order(local).by(valueDecr).limit(local, 3);

n. Click the Submit Query button.

o. The query results open in a new box below. The results now display the imgPath

in addition to the name. The query is ready!
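If it helps to see what each Gremlin step contributes, the finished query can be mirrored step by step in plain Python. This is only a sketch over an in-memory edge list, not how the graph database executes the traversal. The function name commonly_purchased, the assignment of prints to users, the extra user pat, and the image file names are all hypothetical; only the purchase totals match the results you saw above. The sketch also adds an alphabetical tie-break, which the workbook's query does not specify.

```python
from collections import Counter

# Hypothetical "buys" edges as (user, print) pairs. Who bought what is
# invented; only the totals match the workbook's query results.
BUYS = [
    ('jason', 'Alaska'), ('jason', 'Las Vegas'), ('jason', 'Antarctica'),
    ('deanna', 'Alaska'), ('deanna', 'Las Vegas'), ('deanna', 'Antarctica'),
    ('deanna', 'Japan'),
    ('joy', 'Alaska'), ('joy', 'Las Vegas'), ('joy', 'Australia'),
    ('pat', 'Japan'),  # never bought Alaska, so pat is ignored below
]

# Hypothetical image paths, keyed by print name.
IMG_PATH = {
    'Alaska': 'alaska.jpg', 'Las Vegas': 'lasvegas.jpg',
    'Antarctica': 'antarctica.jpg', 'Japan': 'japan.jpg',
    'Australia': 'australia.jpg',
}


def commonly_purchased(edges, img_path, current_print, top_n=3):
    # .in("buys"): the users who bought the current print
    buyers = [user for user, prnt in edges if prnt == current_print]
    # .out("buys").where(neq("currentPrint")): everything else those users
    # bought; duplicates are kept, one entry per purchase
    others = [prnt for buyer in buyers
              for user, prnt in edges
              if user == buyer and prnt != current_print]
    # .groupCount().by(byNameImgPath): count per "name:imgPath" key
    counts = Counter('%s:%s' % (name, img_path[name]) for name in others)
    # .order(local).by(valueDecr).limit(local, 3), plus a tie-break on the
    # key so that equal counts come back in a predictable order
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


print(commonly_purchased(BUYS, IMG_PATH, 'Alaska'))
```

With the alphabetical tie-break, Australia rather than Japan takes the third slot. That is the kind of decision the optional secondary sort mentioned in step l would make explicit.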

2. Write the code for the new recommendation engine

a. Now that you have confirmed the query successfully generates

recommendations, it's time to code! In the file navigation pane of the web IDE

you left open in another tab or window, click graph.py to open it.

Above the getRecommendedPrints() function around line 42, paste the following code:

def getCommonlyPurchasedPrints(printName):
    # Generate a list of commonly purchased prints by searching for what
    # the people who have bought this print also purchased
    gremlin = {
        # create a new traversal
        "gremlin": "def gt = graph.traversal();" +
        # create a function that handles storing both the print name and image path in the results
        "java.util.function.Function byNameImgPath = { Vertex v -> \"\" + v.value(\"name\") + \":\" + v.value(\"imgPath\") };" +
        # search for the node of the designated print and name it "currentPrint"
        "gt.V().hasLabel(\"print\").has(\"name\", \"" + printName + "\").as(\"currentPrint\")" +
        # go in to find all of the users who bought the designated print
        ".in(\"buys\")" +
        # go out to find all prints (excluding the designated print) that these users purchased
        ".out(\"buys\").where(neq(\"currentPrint\"))" +
        # group and sort to find the top 3 most commonly purchased prints
        ".groupCount().by(byNameImgPath).order(local).by(valueDecr).limit(local, 3);"
    }

    response = post(constants.API_URL + '/' + constants.GRAPH_ID + '/gremlin', json.dumps(gremlin))

    if (response.status_code == 200):
        results = json.loads(response.content)['result']['data']
        if len(results) > 0:
            results = results[0]
            # We lose the sorting from the query results when we do json.loads.
            # Sort the results in descending order by value.
            # (itemgetter comes from Python's operator module.)
            results = sorted(results.items(), key=itemgetter(1), reverse=True)
        prints = []
        for p in results:
            newPrint = {}
            # Each key has the form "name:imgPath"; split on the first colon only.
            newPrint['name'] = p[0].split(':', 1)[0]
            newPrint['imgPath'] = p[0].split(':', 1)[1]
            prints.append(newPrint)
            print('Found print commonly purchased with %s: %s' % (printName, newPrint['name']))
        return prints

    raise ValueError('An error occurred while getting a list of commonly purchased prints for print %s: %s %s.' % (printName, response.status_code, response.content))

Note: Spacing is important when programming in Python. Be sure you are using

spaces to indent the code appropriately.

Let's examine what this function does. First, the code creates a dictionary that

holds the Gremlin query. This query is very similar to the one we generated in

the section above except instead of querying for the print called Alaska, the

code uses dynamic input based on the printName argument passed in to the

function. Second, the code includes the query in a POST request to the Gremlin

API. Third, the code processes the results. If the response status code is 200, the query was successful. Because json.loads() loses the sorting from the query results, the code re-sorts the results in descending order by purchase count. The code creates a new list named prints where the

information about the recommended prints is stored. Then the code begins

looping through each result, storing the name and imgPath for each

recommended print. Finally, if everything goes well, the code returns prints. If

not, the code raises a ValueError.
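The result-processing step described above can be tried on its own, without calling the Gremlin API. The sketch below is a standalone illustration rather than the function from graph.py: the helper name parse_print_counts and the sample dictionary (including the image file names) are assumptions standing in for the first element of the decoded query results.

```python
from operator import itemgetter


def parse_print_counts(counts):
    """Turn {'name:imgPath': count} into a list of dicts, most purchased first."""
    # Restore the descending order by count.
    ranked = sorted(counts.items(), key=itemgetter(1), reverse=True)
    prints = []
    for key, _count in ranked:
        # Split on the first colon only, so a colon inside imgPath survives.
        # A colon inside the print name would still be split incorrectly.
        name, img_path = key.split(':', 1)
        prints.append({'name': name, 'imgPath': img_path})
    return prints


# Hypothetical sample standing in for the decoded groupCount() result.
sample = {
    'Japan:japan.jpg': 1,
    'Las Vegas:lasvegas.jpg': 3,
    'Antarctica:antarctica.jpg': 2,
}
print(parse_print_counts(sample))
```

Packing the name and image path into one string keeps the Gremlin query simple, at the cost of having to split the key apart again in Python.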

b. Now that the back-end code is complete, it's time to get the recommended

prints to the front end. In the left navigation pane in the web IDE, click wsgi.py

to open it.

c. Locate the getPrint() function around line 73. This function is called whenever a user accesses a print's details page. The last line of the try statement in the function returns the template for the print page. Update the arguments that are sent to bottle.template() in the try statement so the recommended prints are included:

return bottle.template('print',
    username = request.get_cookie("account", secret=constants.COOKIE_KEY),
    printInfo = printInfo,
    commonlyPurchasedPrints = graph.getCommonlyPurchasedPrints(printName))

Hint: Be sure the spacing before return remains the same.

d. Now that the print template has access to the recommended prints, you need to

update it to display them. In the left navigation pane in the web IDE, expand the

views directory and click print.tpl to open it.

e. After the closing tag of the form but before the line to include the footer around

line 28, paste the following code:

% if len(commonlyPurchasedPrints) > 0:
    <h3>Users who ordered this print also ordered...</h3>
    <div class='container'>
        <div class='row'>
            % for p in commonlyPurchasedPrints:
                <div class="preview span3">
                    <a href="{{p['name']}}">
                        {{p['name']}}<br>
                        <img src="/static/images/{{p['imgPath']}}" class="thumb">
                    </a>
                </div>
            % end
        </div>
    </div>
% end

Let's examine what this code does. The code begins by checking to see if there is

at least one commonly purchased print to recommend. If so, the code creates a

level-three heading with the text "Users who ordered this print also ordered..." The code then loops through the commonly purchased prints, displaying the

name and image for each.

3. Deploy the code for the new recommendation engine

a. Now that you've written the code for the new recommendation engine, let's

test it. In the toolbar at the top of the web IDE, click the Deploy the App from

the Workspace button. The deploy might take a minute or two. When the

app is deployed, a green status dot appears beside your app's name in the

toolbar.

b. When the app is deployed, click the Open the Deployed App button in the

toolbar at the top of the web IDE. The deployed version of Lauren's Lovely

Landscapes opens.

c. In your deployed app, click Alaska to open the Alaska print's details page.

d. Scroll down to see the recommendations feature you just implemented!

Congrats!

Tweet this!

Just implemented a recommendation engine. So easy with a #graphdatabase!

#IntroToGraph #DevoxxFr [Include a selfie of you with your app or a link to your app!]

Get creative

Consider how you could create better recommendations for users of the Lauren’s Lovely

Landscapes app. Perhaps you want to limit the recommendations to only recent

purchases. Or maybe you want to group users together based on more than just

commonly purchased prints — maybe you want to consider a user's demographics or

social connections. Or maybe you want to use Watson's Visual Recognition to uncover

similarities in the prints themselves and make recommendations based on those

similarities. Update the existing queries to generate better recommendations.

How can you test if your recommendation system is working? Consider how you could

track users’ actions to see if they’re using your recommendations or to see how

accurate your recommendations are.

Additional resources

Video: Intro to graph databases: Building a recommendation engine

https://youtu.be/TkjpA7i94aM

Video: Make recommendations using IBM Graph https://youtu.be/cAFRpWoN6ZQ

Apache TinkerPop recipe for recommendations

http://tinkerpop.apache.org/docs/current/recipes/#recommendation

Collaborative Filtering https://en.wikipedia.org/wiki/Collaborative_filtering

Recommender System https://en.wikipedia.org/wiki/Recommender_system

Need a recommendation engine? Graph databases boost customer service with real-time insight https://developer.ibm.com/dwblog/2017/recommendation-engine-customer-insight-graph-database/

