Date post: | 26-Jan-2015 |
The Beauty of Computing
… with People
Xavier Amatriain
Telefonica Research
November 2010
10 Years of Computer Science @ Free University of Bolzano
Outline
1. Introduction (to the talk, me, and Telefonica)
2. Computing with people: Information overload and Recommender Systems
3. Some of our latest research
4. Conclusions
We all know this...
But I am here to talk about things in Computer Science
you may not know...
CS can be fun
And creative...
And does not require you to be an isolated geek
But first...
About me
Up until 2005
The CLAM Project
2005 to 2007
The Allosphere
2007 ...
About Telefonica and Telefonica R&D
● 1989
– Staff: about 71,000 professionals
– Services: basic telephone and data services
– Finances: Rev: 4,273 M€
– Clients: about 12 million subscribers
– Geographies: Spain
● 2000
– Staff: about 149,000 professionals
– Services: wireline and mobile voice, data and Internet services
– Finances: Rev: 28,485 M€
– Clients: about 68 million customers
– Geographies: operations in 16 countries
● 2009
– Staff: about 257,000 professionals
– Services: integrated ICT solutions for all customers
– Finances: Rev: 56.7 b€
– Clients: about 265 million customers
– Geographies: operations in 25 countries
(1) EPS: Earnings per share
Telefonica is a fast-growing Telecom
Telco sector worldwide ranking by market cap (US$ bn)
Currently among the largest in the world
Source: Bloomberg, 06/12/09
Telefónica ranks sixth among telecom operators worldwide in R&D investment and first among companies in Spain
TELCO OPERATOR                    R&D INVESTMENT 2008 (M€)
NTT                               2,151.28
BT                                1,157.49
France Telecom                    900.00
Telstra                           756.41
Telecom Italia                    704.00
Telefonica                        668.00
Deutsche Telekom                  614.00
AT&T                              598.57
Vodafone                          289.63
KT                                218.92
KDDI                              155.30
SK Telecom                        138.84
Telenor                           103.16
TeliaSonera                       102.53

COMPANY                           R&D INVESTMENT 2008 (M€)
Telefonica                        668.00
Indra Sistemas                    166.34
Almirall                          98.20
Repsol YPF                        83.00
Iberdrola                         73.10
Acciona                           71.30
Zeltia                            58.09
Fagor Electrodomesticos           56.00
Industria de Turbo Propulsores    50.00
Abengoa                           33.54
Gamesa                            32.06
Ebro Puleva                       11.58
Cie Automotive                    11.51
Amper                             11.11
Scientific Research
Multimedia Core
Mobile and Ubicomp
DATA MINING
User Modelling & Data Mining
HCIR
Content Distribution & P2P
Wireless Systems
Social Networks
Enough introductions already...
Part 2. Information Overload and Recommender Systems
Information Overload
More is Less
Fewer Decisions
Worse Decisions
Search engines don’t always hold the answer
What about curiosity?
What about discovery?
What about information to help make decisions?
The Age of Search has come to an end
● ... long live the Age of Recommendation!
● Chris Anderson in “The Long Tail”
– “We are leaving the age of information and entering the age of recommendation”
● CNN Money, “The race to create a 'smart' Google”:
– “The Web, they say, is leaving the era of search and entering one of discovery. What's the difference? Search is what you do when you're looking for something. Discovery is when something wonderful that you didn't know existed, or didn't know how to ask for, finds you.”
But, what are Recommender Systems?
Read This!
Ask Prof. Francesco Ricci
The value of recommendations
Netflix: 2/3 of the movies rented are recommended
Google News: recommendations generate 38% more clickthrough
Amazon: 35% sales from recommendations
Choicestream: 28% of the people would buy more music if they found what they liked.
The “Recommender problem”
● Estimate a utility function that is able to automatically predict how much a user will like an item that is unknown to her. Based on:
– Past behavior
– Relations to other users
– Item similarity
– Context
– ...
Data mining + all those other things
● User Interface
● System requirements (efficiency, scalability, privacy...)
● Business Logic
● Serendipity
● ...
The Netflix Prize
● 500K users x 17K movie titles = 100M ratings = $1M (if you “only” improve the existing system by 10%! From 0.95 to 0.85 RMSE)
● 49K contestants on 40K teams from 184 countries
● 41K valid submissions from 5K teams; 64 submissions per day
● Winning approach uses hundreds of predictors from several teams
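The prize target is stated in RMSE (root mean squared error) between predicted and actual ratings. A minimal sketch of the metric follows; the function name and the sample ratings are illustrative assumptions, not Netflix data.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up example: predicted scores vs. true star ratings.
predicted = [3.5, 4.0, 2.0, 5.0]
actual = [4, 4, 1, 5]
error = rmse(predicted, actual)

# Relative improvement implied by going from 0.95 to 0.85 RMSE.
improvement = (0.95 - 0.85) / 0.95
```

With the rounded figures on the slide, the relative improvement comes out slightly above 10%.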
Approaches to Recommendation
● Collaborative Filtering
– Recommend items based only on the users' past behavior
– User-based: find users similar to me and recommend what they liked
– Item-based: find items similar to those that I have previously liked
● Content-based
– Recommend based on features inherent to the items
● Social recommendations (trust-based)
What works
● It depends on the domain and particular problem
● As a general rule, it is usually a good idea to combine: Hybrid Recommender Systems
● However, in the general case it has been demonstrated that (currently) the best isolated approach is CF
– Item-based is in general more efficient and better, but mixing CF approaches can improve results
– Other approaches can be hybridized to improve results in specific cases (cold-start problem...)
The CF Ingredients
● List of m Users and a list of n Items
● Each user has a list of items with associated opinion
– Explicit opinion: a rating score (numerical scale)
– Implicit feedback: purchase records or listening history
● Active user for whom the prediction task is performed
● A metric for measuring similarity between users
● A method for selecting a subset of neighbors
● A method for predicting a rating for items not rated by the active user
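The ingredients above can be put together in a few lines. The sketch below is one possible user-based variant, not the method of any particular system: it assumes cosine similarity on co-rated items as the metric, top-k selection as the neighbor method, and a similarity-weighted average as the predictor. All names and ratings are made up.

```python
import math

# Toy explicit ratings: user -> {item: rating}. All values are made up.
ratings = {
    "ana":   {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 5, "D": 4},
    "carla": {"A": 1, "B": 5, "D": 2},
    "dave":  {"A": 5, "B": 3, "D": 5},
}

def cosine_similarity(u, v):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict(active, item, k=2):
    """Similarity-weighted average of the k nearest neighbors' ratings."""
    candidates = [
        (cosine_similarity(ratings[active], ratings[other]), ratings[other][item])
        for other in ratings
        if other != active and item in ratings[other]
    ]
    neighbors = sorted(candidates, reverse=True)[:k]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return None  # no usable neighbors for this item
    return sum(sim * r for sim, r in neighbors) / total

# Predict a rating for an item the active user has not rated.
prediction = predict("ana", "D")
```

Here the two neighbors closest to "ana" both rated item "D" highly, so the prediction lands between their ratings of 4 and 5. Other similarity metrics (for example Pearson correlation) slot into the same structure.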
But …
Part 3. Some of our latest Research
User Feedback is Noisy
DID YOU HEAR WHAT I LIKE??!!
...and limits Our Prediction Accuracy
Experimental Setup
● 100 movies selected from the Netflix dataset by stratified random sampling on popularity
● Ratings on a 1 to 5 star scale
– Special “not seen” symbol
● Trials 1 and 3 = random order; trial 2 = ordered by popularity
● 118 participants
Results
● Users are inconsistent
● Inconsistencies are not random and depend on many factors
– More inconsistencies for mild opinions
– More inconsistencies for negative opinions
– How the items are presented affects inconsistencies
● Inconsistencies produce natural noise
● Natural noise limits our prediction accuracy independently of the algorithm: Magic Barrier
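One way to put a number on this kind of inconsistency is to compare the ratings a participant gave to the same items in two trials. The sketch below is an assumption about how that could be done, not the study's actual analysis: it computes the RMSE between two trials over items rated in both, and uses `None` as a stand-in for the “not seen” symbol. The sample ratings are made up.

```python
import math

NOT_SEEN = None  # stand-in for the special "not seen" symbol

def intertrial_rmse(trial_a, trial_b):
    """RMSE between two trials, over items actually rated in both."""
    common = [
        item for item in trial_a
        if item in trial_b
        and trial_a[item] is not NOT_SEEN
        and trial_b[item] is not NOT_SEEN
    ]
    if not common:
        return None
    squared = sum((trial_a[i] - trial_b[i]) ** 2 for i in common)
    return math.sqrt(squared / len(common))

# Made-up ratings from one participant in two trials (1 to 5 stars).
trial_1 = {"m1": 5, "m2": 3, "m3": NOT_SEEN, "m4": 2}
trial_2 = {"m1": 5, "m2": 4, "m3": 4, "m4": 1}
noise = intertrial_rmse(trial_1, trial_2)
```

If users disagree with themselves by this much, an algorithm evaluated against those same ratings cannot be expected to do much better, which is the intuition behind the Magic Barrier.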
Rate it again
● By asking users to rate items again we can remove noise in the dataset
– Improvements of up to 14% in accuracy!
● Because we don't want all users to re-rate all items, we design ways to do partial denoising
– Data-dependent: only denoise extreme ratings
– User-dependent: detect “noisy” users
Who can we trust?
The Wisdom of the Few
X. Amatriain et al. "The wisdom of the few: a collaborative filtering approach based on expert opinions from the web", SIGIR '09
Expert-based CF
● Expert = individual that we can trust to have produced thoughtful, consistent and reliable evaluations (ratings) of items in a given domain
● Expert-based Collaborative Filtering
– Find neighbors from a reduced set of experts instead of regular users
1. Identify domain experts with reliable ratings
2. For each user, compute “expert neighbors”
3. Compute recommendations similar to standard kNN CF
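Steps 2 and 3 can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation from the SIGIR '09 paper: it assumes Pearson correlation as the similarity, a fixed similarity threshold for choosing expert neighbors, and a mean-centered weighted prediction. The expert pool, names, and ratings are made up.

```python
import math

# Made-up data: a small pool of experts and one regular user.
expert_ratings = {
    "critic_1": {"A": 5, "B": 2, "C": 4, "D": 5},
    "critic_2": {"A": 2, "B": 5, "C": 1, "D": 2},
    "critic_3": {"A": 4, "B": 3, "C": 5},
}
user_ratings = {"A": 5, "B": 1, "C": 4}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def similarity(u, v):
    """Pearson correlation on co-rated items (0.0 when undefined)."""
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return 0.0
    mu, mv = mean(u[i] for i in common), mean(v[i] for i in common)
    num = sum((u[i] - mu) * (v[i] - mv) for i in common)
    den = math.sqrt(sum((u[i] - mu) ** 2 for i in common)
                    * sum((v[i] - mv) ** 2 for i in common))
    return num / den if den else 0.0

def expert_neighbors(user, threshold=0.5):
    """Step 2: experts whose similarity to the user reaches the threshold."""
    sims = {name: similarity(user, r) for name, r in expert_ratings.items()}
    return {name: s for name, s in sims.items() if s >= threshold}

def predict(user, item, threshold=0.5):
    """Step 3: user mean plus similarity-weighted expert deviations."""
    neighbors = {n: s for n, s in expert_neighbors(user, threshold).items()
                 if item in expert_ratings[n]}
    if not neighbors:
        return None
    offset = sum(s * (expert_ratings[n][item] - mean(expert_ratings[n].values()))
                 for n, s in neighbors.items())
    return mean(user.values()) + offset / sum(neighbors.values())
```

The only structural difference from standard kNN CF is that neighbors are drawn from `expert_ratings` rather than from the full user base, so the neighbor search runs over a much smaller set.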
Working Prototypes
Music recommendations, mobile geo-located recommendations...
User Study
● 57 participants, only 14.5 ratings/participant
● 50% of the users consider Expert-based CF to be good or very good
● Expert-based CF: only algorithm with an average rating over 3 (on a 0-4 scale)
Context Overload
≠
Mobile phones are “personal”
Where is the nearest florist?
Where is that really cool cocktail bar I went to last month?
Interesting things close to me?
Events near me?
Context-aware Recommendations
● A clear area of research and interest for companies: recommend me something that I like and is relevant in my current context.
● Context = any variable that adds a new dimension to the 2D user-item problem (e.g. time, geolocation, weather...)
User micro-profiles
● Our proposal is to represent a user by a hierarchy of micro-profiles where each micro-profile represents a class in the context variable
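As a rough illustration of the idea, the sketch below splits one user's feedback into per-context profiles. It is a flat, single-level simplification of the hierarchy described above, and it assumes time of day as the context variable with three hypothetical classes; the class boundaries and the feedback events are made up.

```python
from collections import defaultdict

def time_class(hour):
    """Map an hour (0-23) to a hypothetical time-of-day context class."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 20:
        return "afternoon"
    return "night"

def build_micro_profiles(events):
    """Split one user's (item, rating, hour) events into per-class profiles."""
    profiles = defaultdict(dict)
    for item, rating, hour in events:
        profiles[time_class(hour)][item] = rating
    return dict(profiles)

# Made-up feedback for a single user.
events = [("jazz", 5, 8), ("news", 4, 9), ("rock", 5, 18), ("jazz", 2, 23)]
profiles = build_micro_profiles(events)
```

Each micro-profile can then be fed to a recommender on its own, so that the same user can like "jazz" in the morning and dislike it at night without the two opinions cancelling out.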
Multiverse Recommendation
● A different approach: represent the contextual recommendation problem by n-dimensional matrices (aka Tensors)
Master Planner
Automatic and personalized tourist route recommendations, a new approach to discovering the world
Tourism 2.0
● Tourism is not the same since the web appeared:
– People search online for information on where to go (reading blogs, in their social networks...)
– People buy tickets and hotel packages online
– People post pictures and discuss tips online
Tourism 3.0 – Going Mobile
● The mobile web and smartphones are introducing yet another revolution
– Tourists can now access information on the go:
● Looking for information on a sight
● Tips on where to go next
● Information about the weather
● ...
Master Planner
● I am in Bolzano, it's November and sunny, I have 6 hours to visit things and I am interested in music, art, literature, and sports
● I need: An automatic tourist route recommender system
Master Planner
● Completely automatic personalized/contextualized tourist recommender system
● Generates automatic city models using web resources
● Generates automatic user models from regular user profiles
● Personalizes/contextualizes generic city models
● Recommends optimized personalized routes taking into account constraints using AI techniques
Friending 3.0
Recommending contacts in Social Networks
The importance of finding contacts
● The ability to attract people to a social network is the key to its success
● The main reason people get hooked on a particular SN is that they find relevant “friends”
The concept of “friend”
● The idea of “friend” is different for each SN
– People do not connect on Facebook for the same reasons as on Twitter or LinkedIn
● Even in a particular SN, different people connect for different reasons:
– Social proximity (friend of friend)
– Geoproximity (person who lives nearby)
– Content (person that talks about interesting stuff)
– Popularity (to connect to influential people)
– ....
Friending 3.0
● Automatic Personalized Friend Recommending System
● Basic rationale
– Combines different factors and personalizes the combination for each user:
● Social proximity
● Geoproximity
● Popularity
● Content similarity
● ...
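A simple way to picture "personalizing the combination" is a weighted sum of factor scores with per-user weights. The slides do not say how the factors are actually combined, so the linear form below is an assumption, and every name, score, and weight is made up.

```python
# Factor scores in [0, 1] for candidate contacts; all values are made up.
candidates = {
    "maria": {"social": 0.9, "geo": 0.2, "popularity": 0.1, "content": 0.4},
    "celeb": {"social": 0.0, "geo": 0.0, "popularity": 1.0, "content": 0.3},
    "neighbor": {"social": 0.1, "geo": 1.0, "popularity": 0.0, "content": 0.2},
}

def rank_candidates(weights):
    """Rank candidates by a weighted sum of factors, using per-user weights."""
    total = sum(weights.values())
    scores = {
        name: sum(weights[f] * value for f, value in factors.items()) / total
        for name, factors in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Two hypothetical users who weigh the same factors differently.
friend_oriented = {"social": 0.6, "geo": 0.2, "popularity": 0.1, "content": 0.1}
follower_oriented = {"social": 0.1, "geo": 0.0, "popularity": 0.7, "content": 0.2}
```

The same candidate pool yields different rankings: `rank_candidates(friend_oriented)` puts the friend of a friend first, while `rank_candidates(follower_oriented)` puts the popular account first.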
But friends are not only for fun...
They can be very helpful sometimes!
Adriana
Catalan
Can we improve users' search and discovery experience by providing a readily available connection to their social network?
WHAT IS PORQPINE?
● Distributed social web search engine
● Locally caches the page & records user interactions (e.g., bookmarking)
● Searches by querying caches of friends
● Pages that friends have “interacted with” are ranked higher
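The ranking idea in the last bullet can be sketched as a re-ranking step over ordinary search results. This is a hypothetical illustration and not Porqpine's actual scoring: it assumes each friend's cache maps URLs to a set of recorded interactions, and it adds a fixed boost per friend who interacted with a page. URLs, names, and scores are made up.

```python
# Made-up local caches: friend -> {url: set of recorded interactions}.
friend_caches = {
    "adriana": {"example.org/paella": {"bookmark"}, "example.org/tapas": set()},
    "marc": {"example.org/paella": {"bookmark", "print"}},
}

def social_search(results, boost=0.5):
    """Re-rank (url, base_score) results, boosting pages friends interacted with."""
    rescored = []
    for url, base_score in results:
        # Count friends whose cache holds at least one interaction for the page.
        friends = sum(1 for cache in friend_caches.values() if cache.get(url))
        rescored.append((base_score + boost * friends, url))
    return [url for _, url in sorted(rescored, reverse=True)]

results = [("example.org/tapas", 1.0), ("example.org/paella", 0.8)]
ranking = social_search(results)
```

A page that was merely cached but never interacted with (an empty set) gets no boost, so the page two friends bookmarked moves ahead of one with a higher base score.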
Personalized
Distributed
Lazy collaboration
Socially aware
Context aware
SSB
iPhone optimized web-application + Facebook app
When launched, it centers on the user's current physical location
Displays all queries/questions posted by other users in that location
As users pan/zoom the set of queries is updated
Users can post new queries or interact with queries of others
Live Field Study in-the-wild
Apr 2009, 16 users, 1 week, Ireland
Sept 2009, 34 users, 1 month, Ireland
Part 4. Conclusions
Conclusions
● Computer Science is not only a good choice from a career perspective, it's also fun, creative, and engaging (Hope I have convinced you by now)
● One of the amazing things is that you can now apply CS research to any domain (I am meeting the world's best chef next week to brainstorm)
● An important current trend is to use CS to better understand people and improve their lives
● The goal of Recommender Systems is precisely that: understand you in your context and help you make better decisions
Thanks!
Questions?
Xavier [email protected]
http://xavier.amatriain.net
http://technocalifornia.blogspot.com
http://twitter.com/xamat