Flexible recommender systems based on graphs

Posted on 23-Jan-2018

|

Kernix: digital factory + data lab

Flexible recommender systems based on graphs

|

KERNIX

• 45 co-workers
• 500 projects
• 2 co-founders
• 3.5 M€ revenue
• 15 years' experience
• 10 books published

Digital factory + Data lab

CO-FOUNDERS

Fabrice Métayer and François-Xavier Bois, two EPITA engineers, combined their complementary profiles to create Kernix in 2001.

ABOUT KERNIX

Kernix’s core business consists of a digital factory and a data lab.

This dual expertise allows us to support our clients from the upstream phases (consulting, study, POC) to the downstream phases (industrialization by production teams).

|

DATA LAB

EXPERTISE (with clients and collaborations)

• Data Pipelines: Cop21, TerraRush
• Predictive maintenance: ERDF
• Data Visualization: SolarImpulse
• Recommender systems: PriceMinister, WikiDistrict, Clickalto, HobbyStreet
• Marketing Automation: Performics, RadiumOne
• Open Data: Accessible.net

|

GRAPH-ORIENTED RECOMMENDER SYSTEM

• Graph database
– data stored as nodes
  • label: “type” of data stored in the node
  • properties: collection of information describing the node
– nodes are linked together by edges
  • type: describes the nature of the relation
– query language: used to perform graph traversals

• Why graph-oriented recommender systems?
– gather heterogeneous data in the same structure
– explicitly take advantage of relationships
– "meaningful" for humans
– easy implementation
– fast execution (no training)
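The graph-database vocabulary on this slide (labels, properties, typed edges, traversals) can be illustrated with a minimal in-memory sketch. It is not the database used in the talk; the `Graph` class, the node ids, and the edge-type names below are assumptions made only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                   # "type" of data stored in the node
    properties: dict = field(default_factory=dict)


class Graph:
    """Tiny in-memory property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}                          # node id -> Node
        self.out_edges = defaultdict(list)       # node id -> [(edge type, target id)]

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(label, properties)

    def add_edge(self, source, edge_type, target):
        self.out_edges[source].append((edge_type, target))

    def neighbors(self, node_id, edge_type):
        """One traversal step: follow outgoing edges of a given type."""
        return [t for (kind, t) in self.out_edges[node_id] if kind == edge_type]


# Hypothetical data: ids, names and edge types are made up for the example.
g = Graph()
g.add_node("u1", "User", name="Alice", city="Paris")
g.add_node("w1", "Workshop", name="Pottery basics")
g.add_node("c1", "Category", name="Ceramics")
g.add_edge("u1", "FOLLOWS", "w1")
g.add_edge("w1", "RELATED_TO", "c1")

# Two-step traversal: categories of the workshops a user follows.
categories = [c for w in g.neighbors("u1", "FOLLOWS")
              for c in g.neighbors(w, "RELATED_TO")]
print(categories)  # ['c1']
```

A real graph database expresses the same two-step traversal declaratively in its query language, which is what makes the recommendations "meaningful" to read.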

|

USE CASE 1: HOBBYSTREET

|

CONTEXT

Facilitate connections between craftsmen and private individuals

• Craftsmen: propose workshops (different categories, dates, prices)
• Individuals: follow workshops/categories, sign up for workshops
• Hobbystreet: handles registrations, schedules, and payments, and proposes customized suggestions

|

DATA STRUCTURE

[Figure: graph data model]

Nodes (label: properties)
• User: name, city
• Craftsman: name, activity
• Workshop: name, description, GPS coordinates
• Session: date, time, price, status, stock
• Category: name, activity

Edge types: follows, proposes, related to, instance of, participates

|

SUGGESTIONS: OVERALL STRATEGY

[Figure: three suggestion paths through the graph]

• Category: User → Workshop 1 → Category 1, Category 2 → Workshops 2, 3, 4
• Similar descriptions (from LSA): User → Workshops 1, 2, 3 → Workshops 4, 5, 6
• Similar users (Usim): User 1 → Workshops 1, 2, 3 → User 2, User 3 → Workshops 4, 5, 6
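The three suggestion paths of this slide (shared categories, similar descriptions, similar users) can be sketched as plain traversals over small dictionaries. Everything below is a hypothetical illustration: the toy data, the function names, the use of Jaccard overlap as the user similarity "Usim", and the way scores are accumulated are assumptions, since the slides do not give these details. The description similarities are assumed to be precomputed (for instance by LSA).

```python
from collections import Counter

# Hypothetical toy data; names and structure are assumptions for illustration.
user_workshops = {                 # user -> workshops followed or attended
    "user1": {"w1", "w2", "w3"},
    "user2": {"w2", "w3", "w4"},
    "user3": {"w3", "w5"},
}
workshop_categories = {            # workshop -> categories it is related to
    "w1": {"pottery"}, "w2": {"pottery"}, "w3": {"sewing"},
    "w4": {"sewing"}, "w5": {"woodwork"}, "w6": {"pottery"},
}
similar_workshops = {              # workshop -> {workshop: description similarity}
    "w1": {"w6": 0.9}, "w3": {"w4": 0.7},
}


def by_category(user):
    """Unseen workshops scored by how often their categories occur in the user's workshops."""
    seen = user_workshops[user]
    cats = Counter(c for w in seen for c in workshop_categories[w])
    scores = Counter()
    for w, w_cats in workshop_categories.items():
        score = sum(cats[c] for c in w_cats)
        if w not in seen and score:
            scores[w] = score
    return scores


def by_description(user):
    """Unseen workshops whose description is similar to one the user already knows."""
    seen = user_workshops[user]
    scores = Counter()
    for w in seen:
        for other, sim in similar_workshops.get(w, {}).items():
            if other not in seen:
                scores[other] += sim
    return scores


def by_similar_users(user):
    """Workshops of users sharing workshops with this user, weighted by Jaccard overlap."""
    seen = user_workshops[user]
    scores = Counter()
    for other, theirs in user_workshops.items():
        if other == user:
            continue
        usim = len(seen & theirs) / len(seen | theirs)   # assumed definition of "Usim"
        for w in theirs - seen:
            scores[w] += usim
    return scores


print(by_category("user1").most_common())  # [('w6', 2), ('w4', 1)]
```

Each function returns a score per candidate workshop, so the three strategies can be ranked separately or merged, depending on how the suggestions are displayed.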

|

USE CASE 2: KONBINI

|

Context

“... multi format media company producing its own mix of culture, art and news content. It promotes online journalism, advocating an emphasis on pop culture and a commitment to develop local emerging talents.”

“... became one of the first websites to put Social Media platforms at the heart of their strategy.”

Issue: ~90% bounce rate (users going back after viewing a single page)

Solution: recommending interesting articles on the visited pages will improve the user experience.

|

Entities

Multiple web sites [US, England, Mexico, France]

• US posts [364]
• English posts [417]
• Mexican posts [149]
• French posts [693]
• Authors [56]
• Categories [534]

Examples of node properties

blog_id: 9
post_id: 217628
post_date: 20151007
slug: rihanna-thinks-rachel...
boost: 0
viewed_count: 0
facebook_count: 148
twitter_count: 0

|

Recommendation principles

For each post, we recommend a list of other posts based on the relations they share with the initial post:

- semantic similarity of the contents [LSA]
- number of common categories
- number of common authors

And also on their own properties:

- freshness
- social counts
- manual boost

Once the graph is constructed, these recommendations can be obtained with a single Cypher query.
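The slides do not show the Cypher query or say how the six signals are combined. As a rough illustration only, the sketch below ranks candidate posts with a weighted sum in Python. The weights, the 30-day exponential freshness decay, the logarithmic damping of social counts, and the `score` and `recommend` helpers are all assumptions; the LSA similarities are taken as precomputed inputs. The property names `post_id`, `post_date`, `boost`, `facebook_count` and `twitter_count` follow the node properties listed earlier.

```python
import math
from datetime import date

# Hypothetical weights: the slides do not say how the signals are combined.
WEIGHTS = {"semantic": 3.0, "categories": 1.0, "authors": 1.0,
           "freshness": 1.0, "social": 0.5, "boost": 1.0}


def score(source, candidate, lsa_similarity, today):
    """Combine relation-based and property-based signals into one number."""
    common_categories = len(source["categories"] & candidate["categories"])
    common_authors = len(source["authors"] & candidate["authors"])
    age_days = (today - candidate["post_date"]).days
    freshness = math.exp(-age_days / 30)           # assumed 30-day decay scale
    social = math.log1p(candidate["facebook_count"] + candidate["twitter_count"])
    return (WEIGHTS["semantic"] * lsa_similarity
            + WEIGHTS["categories"] * common_categories
            + WEIGHTS["authors"] * common_authors
            + WEIGHTS["freshness"] * freshness
            + WEIGHTS["social"] * social
            + WEIGHTS["boost"] * candidate["boost"])


def recommend(source, candidates, similarities, today, k=3):
    """Return the ids of the k best-scoring candidate posts."""
    ranked = sorted(
        candidates,
        key=lambda c: score(source, c, similarities.get(c["post_id"], 0.0), today),
        reverse=True,
    )
    return [c["post_id"] for c in ranked[:k]]


# Made-up example data.
today = date(2015, 10, 8)
source = {"post_id": 1, "categories": {"music", "pop"}, "authors": {"a1"}}
candidates = [
    {"post_id": 3, "categories": {"sport"}, "authors": {"a2"},
     "post_date": date(2015, 6, 1), "facebook_count": 10, "twitter_count": 2, "boost": 0},
    {"post_id": 2, "categories": {"music"}, "authors": {"a1"},
     "post_date": date(2015, 10, 7), "facebook_count": 148, "twitter_count": 0, "boost": 0},
]
similarities = {2: 0.8, 3: 0.1}    # assumed LSA similarities to the source post
print(recommend(source, candidates, similarities, today))  # [2, 3]
```

Because no model is trained, changing the behaviour only means changing the weights, which fits the "fast execution (no training)" argument made for graph-oriented systems.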

|

Conclusion and outlook

|

Stacks and Workflows

[Figure: workflows for the Konbini and Hobbystreet web sites. Labels: POST content, GET recommendations, daily cached recommendations]

• Live recommendations for dynamic interactions
• Cached recommendations for high-availability needs
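The difference between the live and the cached mode can be sketched as follows. This is a minimal illustration, not the production stack: the `DailyCache` class, the lazy once-per-day refresh policy, and the stand-in `compute_recommendations` function are assumptions.

```python
from datetime import date


class DailyCache:
    """Serve recommendations computed at most once per day per key."""

    def __init__(self, compute):
        self.compute = compute          # function: key -> list of recommendations
        self.store = {}                 # key -> (day computed, recommendations)

    def get(self, key, today=None):
        today = today or date.today()
        cached = self.store.get(key)
        if cached is None or cached[0] != today:
            # Missing or stale entry: recompute and remember the day.
            self.store[key] = (today, self.compute(key))
        return self.store[key][1]


calls = []


def compute_recommendations(post_id):
    """Stand-in for the real graph query (hypothetical)."""
    calls.append(post_id)
    return [post_id + 1, post_id + 2]


cached = DailyCache(compute_recommendations)
day1, day2 = date(2018, 1, 22), date(2018, 1, 23)
cached.get(10, today=day1)            # computed
cached.get(10, today=day1)            # served from the cache
cached.get(10, today=day2)            # recomputed on the next day
live = compute_recommendations(10)    # live mode: query on every request
print(len(calls))  # 3
```

The cached mode trades freshness for availability: the site keeps serving the last stored list even under heavy traffic, while the live mode reflects each new interaction immediately.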

|

Outlook

Improve semantic analysis:

• exploit the similarity of short descriptions (tweets, comments, …). PhD thesis on the subject.

Assess recommendation quality:

• A/B testing, but it needs a production deployment.
• Offline testing? No real assessment of the impact of the recommendations made.
• Rating by a pool of testers?
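As an illustration of the A/B testing option, the sketch below assigns visitors to a control or treatment group deterministically and computes a bounce rate per group. The hashing scheme, the experiment name, and both helper functions are assumptions, not part of the presented work.

```python
import hashlib


def ab_bucket(user_id, experiment="reco-v1", treatment_share=0.5):
    """Deterministically assign a user to 'control' or 'treatment'."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    fraction = int(digest[:8], 16) / 0x100000000    # value in [0, 1)
    return "treatment" if fraction < treatment_share else "control"


def bounce_rate(pages_per_session):
    """Share of sessions that viewed a single page."""
    bounces = sum(1 for pages in pages_per_session if pages == 1)
    return bounces / len(pages_per_session)


# The same visitor always lands in the same group.
print(ab_bucket("visitor-42") == ab_bucket("visitor-42"))  # True
print(bounce_rate([1, 1, 3, 1]))  # 0.75
```

Comparing the bounce rate of the two groups would measure the impact of the recommendations directly, which is what offline testing cannot provide.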

THANK YOU!

Kernix Data Lab

+33 (0)1 53 98 73 43

lab@kernix.com