Recommendation Engines: Some Practical and Theoretical Considerations

Post on 11-May-2015

description

A talk given at Youku.com about the machine learning and scalability aspects of contemporary recommendation systems.

transcript

Introduction Collaborative Filtering Sparsity and Scalability Contextuality Conclusions

Recommendation Engines: Some Practical and Theoretical Considerations

Research Seminar at Youku.com

Peter Wittek

University of Boras & Tsinghua University

April 22, 2013

Peter Wittek Recommendation Engines

Outline

1 Introduction

2 Collaborative Filtering

3 Sparsity and Scalability

4 Contextuality

5 Conclusions

What It Is About

Objective: Predict user preferences
Content-based filtering versus collaborative filtering

Users versus content/metadata indexing
Hybrid systems

Roots in information retrieval, relevance feedback

An Example: Slope One

A simple baseline method
Easy to implement, efficient

Slope One

            Item I    Item J
User A      1         1.5
User B      2         ?

Difference observed for User A: 1.5 − 1 = 0.5
Prediction for User B on Item J: ? = 2 + (1.5 − 1) = 2.5
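The arithmetic on this slide can be sketched in a few lines of Python. This is a minimal illustration of the weighted Slope One scheme, not code from the talk; the function name `slope_one_predict` and the nested-dictionary layout of `ratings` are assumptions made for the example.

```python
def slope_one_predict(ratings, user, target):
    """Weighted Slope One prediction of `user`'s rating for `target`.

    `ratings` maps user -> {item: rating} (layout assumed for this sketch).
    Returns None when no other user has rated both `target` and an item
    that `user` has rated.
    """
    weighted_sum = 0.0
    support = 0
    for item, own_rating in ratings[user].items():
        if item == target:
            continue
        # Rating differences (target - item) over users who rated both items.
        diffs = [r[target] - r[item]
                 for r in ratings.values()
                 if target in r and item in r]
        if not diffs:
            continue
        mean_diff = sum(diffs) / len(diffs)
        # Each item pair votes with a weight equal to its number of co-raters.
        weighted_sum += (own_rating + mean_diff) * len(diffs)
        support += len(diffs)
    return weighted_sum / support if support else None


# The two-user, two-item example from the slide.
ratings = {
    "A": {"I": 1.0, "J": 1.5},
    "B": {"I": 2.0},
}
prediction = slope_one_predict(ratings, "B", "J")
print(prediction)  # 2.5
```

With only one co-rated pair the weighting has no effect; it matters once several items contribute differences backed by different numbers of users.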

Background Operations

Memory-based
Simple
Robust
Cosine dissimilarity, Euclidean distance, etc.

Model-based
Full array of machine learning algorithms

The kernel trick [figure, panels a) and b)]
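The two measures named for memory-based methods can be sketched on sparse rating vectors as follows. The dictionary representation and the choice to treat unrated items as zeros are assumptions of this sketch; real systems often restrict the comparison to co-rated items or mean-centre the ratings first.

```python
import math


def dot(u, v):
    """Inner product of two sparse vectors stored as {item: rating}."""
    return sum(value * v[key] for key, value in u.items() if key in v)


def cosine_dissimilarity(u, v):
    """1 - cos(u, v); unrated items count as zero (assumption)."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine is undefined for an all-zero vector")
    return 1.0 - dot(u, v) / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Euclidean distance over the union of rated items."""
    keys = set(u) | set(v)
    return math.sqrt(sum((u.get(k, 0.0) - v.get(k, 0.0)) ** 2 for k in keys))


# Hypothetical users: alice and bob agree, carol rated a different video.
alice = {"v1": 3.0, "v2": 4.0}
bob = {"v1": 3.0, "v2": 4.0}
carol = {"v3": 5.0}

print(cosine_dissimilarity(alice, bob))    # identical tastes
print(cosine_dissimilarity(alice, carol))  # no overlap at all
print(euclidean_distance(alice, carol))
```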

2 Collaborative Filtering

User Recommendations

Even a few ratings are more accurate than metadata
The cold start problem
Analysing users’ behaviour, preferences, ratings
Explicit versus implicit data
Scalability and sparsity

Learning Methods

Simple ones: k-NN, decision trees
More intricate ones: matrix factorization, support vectors, artificial neural networks

A feed-forward neural network

An Example Pipeline

3 Sparsity and Scalability

The Matrix We Are Facing

High-dimensional
Sparse, missing elements

0.01%-0.1% nonzero
Low-rank

User types

The problem as a sparse matrix

Users (≈ 10^7 − 10^8), videos (≈ 10^8 − 10^9)

? ? ? 4 ?
? 1 ? ? ?
? ? ? ? 5
1 ? ? ? 2

Dealing With Sparsity

Ratings from 1 to 5
That is three bits at best

For Netflix:
log2 |users| = 18.8
log2 |movies| = 14.1
The numbers date to the competition (pre-2009).

Each entry will barely take three bytes
Further tweaks can halve the storage requirement
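The storage estimate can be checked with a short calculation. The slide gives only logarithms, so the user and movie counts below are the commonly cited Netflix Prize figures (480,189 users and 17,770 movies) and are an assumption here. Reading "three bytes" as an entry stored in a per-user list, where the user index is not repeated for each rating, is likewise one interpretation of the slide, not something it states.

```python
def index_bits(n):
    """Bits needed to distinguish n different values (n >= 1)."""
    return max(1, (n - 1).bit_length())


def bytes_for(bits):
    """Whole bytes needed to hold the given number of bits."""
    return (bits + 7) // 8


# Assumed dataset sizes (commonly cited Netflix Prize figures).
N_USERS = 480_189
N_MOVIES = 17_770
N_RATING_LEVELS = 5  # ratings 1..5

rating_bits = index_bits(N_RATING_LEVELS)
movie_bits = index_bits(N_MOVIES)
user_bits = index_bits(N_USERS)

# Coordinate format: every entry carries (user, movie, rating).
coordinate_bits = user_bits + movie_bits + rating_bits
# Entries grouped per user: each entry carries only (movie, rating).
grouped_bits = movie_bits + rating_bits

print(rating_bits, movie_bits, user_bits)
print(coordinate_bits, bytes_for(coordinate_bits))
print(grouped_bits, bytes_for(grouped_bits))
```

Under these assumptions a grouped entry needs 18 bits, which fits in three bytes, while a full coordinate triple needs 37 bits.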

Low Rank Approximation

Goal: Estimate ratings for unknown elements

Singular Value Decomposition

A = U S V^{T}

Here S is a diagonal matrix containing the singular values in decreasing order.
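For reference, the low-rank estimate that follows from this decomposition can be written out as below. The truncation rank k and the hat notation for predicted ratings are introduced here for illustration and do not appear on the slide. Keeping the k largest singular values gives the best rank-k approximation of A in the Frobenius norm. A plain SVD needs a complete matrix, so in practice the unknown ratings must first be imputed or the factors fitted on the observed entries only.

```latex
A \approx A_k = U_k S_k V_k^{T} = \sum_{l=1}^{k} s_l \, u_l v_l^{T},
\qquad s_1 \ge s_2 \ge \dots \ge s_k \ge 0,
\qquad \hat{a}_{ij} = \sum_{l=1}^{k} s_l \, u_{il} \, v_{jl}.
```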

Conceptual Dynamics

The matrix is not static
Incremental biseration + Gaussian blurring + 3D visualization

[Figure: Snapshots on BBC videos, panels (a) and (b)]

Scalability

Learning algorithms are computationally demanding
Some parallelize well
Apache Mahout originally grew out of a scalable CF library

Based on Apache Hadoop
MapReduce: scaling out on a large number of nodes of commodity hardware
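To make the MapReduce idea concrete, the sketch below imitates the two phases in a single Python process for one common collaborative filtering building block, item co-occurrence counting. It does not use Hadoop or Mahout, and the function names `map_phase` and `reduce_phase` are hypothetical; on a cluster the map calls would run on different nodes and the framework would group the emitted keys before reduction.

```python
from collections import defaultdict
from itertools import chain, combinations


def map_phase(watched):
    """Emit ((item_a, item_b), 1) for every pair of items one user watched."""
    for pair in combinations(sorted(watched), 2):
        yield pair, 1


def reduce_phase(pairs):
    """Sum the emitted counts per item pair."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


# Hypothetical viewing histories, one set of video ids per user.
histories = [
    {"v1", "v2"},
    {"v1", "v2", "v3"},
    {"v2", "v3"},
]

# Each history is mapped independently, which is what makes the step parallel.
emitted = chain.from_iterable(map_phase(h) for h in histories)
cooccurrence = reduce_phase(emitted)
print(cooccurrence)
```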

Real-Time Systems

Update operations and queries
Parallel and distributed execution
Acceleration by graphics hardware

[Figure: Massively parallel architecture. Labels: GPU compute device, compute units, stream cores.]

4 Contextuality

What Is Contextuality

The users’ preferences are not static
The preferences are a function of the present context
Indirect clues: current browsing history, recent purchase history, etc.
Infer context-type/micro-profile
Small improvements over baseline methods have already been reported

Enter Quantum Mechanics

Foraging theory: how to maximize net energy intake in a patchy environment
Quantum-like contextual patterns emerge from classical decisions
Foraging decisions can translate to problems in recommendation systems

Forget Bayes’ rule: Consider Lüders’ rule

\| P_{b_1} |\psi_{a_1}\rangle \|^2 = \frac{\| P_{b_1} P_{a_1} |\psi\rangle \|^2}{\| P_{a_1} |\psi\rangle \|^2}    (1)

Two projection operators generally do not commute, leading to a sequential, context-sensitive model of decision making.
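A small numerical sketch shows the order dependence. The setting is an assumption chosen for simplicity: a real two-dimensional state space with rank-one projectors onto directions at 30° and 60° from the initial state. None of these choices come from the talk; they only illustrate equation (1) and the non-commutativity it relies on.

```python
import math


def projector(theta):
    """Rank-one projector onto the unit vector (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


def apply(P, v):
    """Matrix-vector product for a 2x2 matrix and a length-2 vector."""
    return [P[0][0] * v[0] + P[0][1] * v[1],
            P[1][0] * v[0] + P[1][1] * v[1]]


def norm_sq(v):
    return v[0] ** 2 + v[1] ** 2


def luders_conditional(P_b, P_a, psi):
    """Right-hand side of (1): ||P_b P_a psi||^2 / ||P_a psi||^2."""
    a_psi = apply(P_a, psi)
    return norm_sq(apply(P_b, a_psi)) / norm_sq(a_psi)


def sequential(P_first, P_second, psi):
    """Probability of answering 'yes' to P_first and then to P_second."""
    return norm_sq(apply(P_second, apply(P_first, psi)))


psi = [1.0, 0.0]               # initial state (assumed)
P_a = projector(math.pi / 6)   # question a: direction at 30 degrees
P_b = projector(math.pi / 3)   # question b: direction at 60 degrees

a_then_b = sequential(P_a, P_b, psi)
b_then_a = sequential(P_b, P_a, psi)
b_given_a = luders_conditional(P_b, P_a, psi)
print(a_then_b, b_then_a, b_given_a)
```

Asking a before b gives a joint probability of about 0.5625, while the reverse order gives about 0.1875, so the outcome depends on the order in which the two questions are posed.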

5 Conclusions

Summary

A large part of recommendation systems is an engineering problem

Hybridise collaborative and content-based filtering
Assemble and tune machine learning pipelines
Measure prediction quality and monetary gains
Continue tuning

Exciting theoretical considerations await further research
