LARS: A Location-Aware Recommender System
ICDE ‘12
1. Introduction
• Traditional recommender systems
– triple (user, rating, item)
– (user id U) + (limit K) → return K recommended items to U
• Locations
– check-in destinations (Facebook, Foursquare)
– user zip code (MovieLens)
1.1 Motivation: A Study of Location-Based Ratings
• Preference locality: users from the same spatial region tend to prefer items that are popular within that region
• Travel locality: when rating spatial items, users tend to travel only a limited distance from their own location
1.2 LARS - A Location-Aware Recommender
• (user, ulocation, rating, item)
• (user, rating, item, ilocation)
• (user, ulocation, rating, item, ilocation)
• outline of the remaining sections:
– 2. an overview of LARS
– 3. spatial user ratings for non-spatial items
– 4. non-spatial user ratings for spatial items
– 5. spatial user ratings for spatial items
– 6. experimental analysis
2.1 LARS Query Model
• (user id U) + (limit K) + (location L) → return K recommended items to U
• query types:
– snapshot (one-time) queries
– continuous queries
2.2 Item-Based Collaborative Filtering
• Phase I: Model Building
– compute the similarity score sim between each pair of items
– for each item, the model stores only the n highest sim values
• n is the number of users
• Phase II: Recommendation Generation
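The two phases above can be sketched as follows. This is a minimal illustration, assuming a small in-memory `ratings` map (item → {user: rating}) and cosine similarity; the function names and data layout are illustrative, not the paper's exact implementation:

```python
from math import sqrt

def build_model(ratings, n):
    """Phase I: for every item, keep the top-n most cosine-similar items.
    `ratings` maps item -> {user: rating}; `n` bounds the stored list."""
    items = list(ratings)
    model = {}
    for a in items:
        sims = []
        for b in items:
            if a == b:
                continue
            common = set(ratings[a]) & set(ratings[b])
            if not common:
                continue
            dot = sum(ratings[a][u] * ratings[b][u] for u in common)
            na = sqrt(sum(r * r for r in ratings[a].values()))
            nb = sqrt(sum(r * r for r in ratings[b].values()))
            sims.append((dot / (na * nb), b))
        model[a] = sorted(sims, reverse=True)[:n]
    return model

def recommend(model, user_ratings, k):
    """Phase II: score unrated items by similarity-weighted ratings
    and return the k highest-scoring ones."""
    scores = {}
    for item, rating in user_ratings.items():
        for sim, other in model.get(item, []):
            if other in user_ratings:
                continue
            scores[other] = scores.get(other, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:k]
```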
3 Spatial User Ratings For Non-spatial Items
• (user, ulocation, rating, item)
• requirements
– Locality: recommendations must be aware of the user's location
– Scalability: must handle computation for a large number of users
– Influence: users can adjust the size of the spatial region that is taken into account
3.1 Data Structure
• partial pyramid structure
3.2 Query Processing
• query processing steps
– 1. start searching at the lowest pyramid level
– 2. if no maintained cell covers the query location there
• move up one level and search again
– 3. repeat until a cell with a model is found
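These steps amount to a bottom-up walk over the pyramid. A minimal sketch, assuming a 2^level × 2^level grid over the unit square and a `pyramid` list of per-level dicts (cell id → CF model); `cell_id` is an illustrative helper, not the paper's exact partitioning:

```python
def cell_id(level, loc):
    """Grid cell containing `loc` at `level`: a 2**level x 2**level grid
    over the unit square (illustrative partitioning)."""
    x, y = loc
    g = 2 ** level
    return (min(int(x * g), g - 1), min(int(y * g), g - 1))

def locate_cell(pyramid, loc, h):
    """Walk from the lowest level (h - 1) up toward the root (level 0)
    until a cell that still maintains a CF model covers `loc`."""
    for level in range(h - 1, -1, -1):
        cid = cell_id(level, loc)
        if cid in pyramid[level]:       # this cell was not merged away
            return level, cid
    raise LookupError("the root cell should always maintain a model")
```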
• Continuous queries (querying while moving)
– 1. if the user has not left the grid cell of the previous query
• the previous answer still holds
– 2. otherwise
• search upward level by level until an answer is found
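A minimal sketch of this reuse rule, where `snapshot_query` is a hypothetical callback that re-runs the snapshot procedure for a cell:

```python
def continuous_step(state, new_cell, snapshot_query):
    """Reuse the previous answer while the user stays inside the grid
    cell that produced it; recompute only on a cell-boundary crossing."""
    prev_cell, answer = state
    if new_cell != prev_cell:           # left the familiar cell
        answer = snapshot_query(new_cell)
    return (new_cell, answer)
```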
3.3 Data Structure Maintenance
• triggered when new users, ratings, or items arrive
• Trigger: maintenance only starts once N% of the data has changed
• The maintenance cost is thereby amortized
• Step I: Model Rebuild
• Step II: Merging/Splitting Maintenance
3.3.1 Cell Merging
• Improves scalability
– storage (the primary criterion)
• smaller total CF model size (models are kept only at the higher level; the merged lower-level cells store none)
– computational overhead (the secondary criterion)
• less maintenance computation
• less continuous query processing computation
• Hurts locality
• Two percentage values
– locality_loss
– scalability_gain
• A system parameter M ∈ [0, 1]
• Merges if: (1 − M) · scalability_gain > M · locality_loss
– the smaller M is, the more the system favors merging
• Calculating locality_loss
– sample ratings from the cells involved
– compare the recommendations produced by the child cells with those produced by the parent cell
• Calculating scalability_gain
– (storage of the child cells) / (storage of the child cells + storage of the parent cell)
• continuing the earlier example:
– scalability_gain
• 4 child cells == 2 GB
• parent cell == 2 GB
• scalability_gain = 50%
• locality_loss = 25%
• scalability_gain = 50%
• Assuming M = 0.7
• but (0.3 × 50%) < (0.7 × 25%)
• so the cells will not merge
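The decision in this example can be written out directly. The inequality below is reconstructed from the slide's numbers ((1 − M) weights scalability_gain, M weights locality_loss) and is a sketch, not the paper's verbatim formula:

```python
def scalability_gain(child_bytes, parent_bytes):
    """Storage freed by merging, as a fraction of the combined size:
    child / (child + parent)."""
    return child_bytes / (child_bytes + parent_bytes)

def should_merge(locality_loss, gain, M):
    """Merge when the weighted scalability gain outweighs the weighted
    locality loss; smaller M favors merging."""
    return (1 - M) * gain > M * locality_loss
```

With the slide's numbers, `scalability_gain(2, 2)` is 50%, and with M = 0.7 we get 0.3 × 50% < 0.7 × 25%, so `should_merge(0.25, 0.5, 0.7)` is False.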
3.3.2 Cell Splitting
• its effect is the opposite of Cell Merging
– Improves locality
– Hurts scalability
• the calculation is essentially the same as for Cell Merging
– locality_gain
– scalability_loss
Merging: locality_loss vs. scalability_gain
Splitting: locality_gain vs. scalability_loss
4 Non-spatial User Ratings For Spatial Items
• (user, rating, item, ilocation)
• travel locality → travel penalty
– computing the penalty for every candidate item is an expensive computational overhead
– so LARS employs "early termination"
4.1 Query Processing
• Algorithm
– 1. among all items, find the k items with the smallest TravelPenalty; sort them by RecScore in descending order to form the list R
– 2. let LowestRecScore be the smallest (i.e., the k-th) RecScore in R
– 3. among the remaining items, find the one with the smallest TravelPenalty
• 4. let MaxPossibleScore = MAX_RATING − TravelPenalty
• 5. IF MaxPossibleScore <= LowestRecScore
– 6. stop searching and return R directly
• 7. otherwise compute this item's RecScore
• 8. IF RecScore > LowestRecScore
– this item replaces the one holding LowestRecScore in R
– determine the new LowestRecScore
– go back to step 3
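The eight steps above can be sketched as follows. `predict(i)` stands for the CF-predicted rating (assumed never above `max_rating`), `penalty(i)` for the travel penalty, and RecScore = predicted rating − penalty; the function names are illustrative:

```python
def travel_penalty_query(items, predict, penalty, k, max_rating):
    """Top-k with early termination: visit items in increasing
    travel-penalty order and stop once no unseen item can still beat
    the current k-th best RecScore."""
    ordered = sorted(items, key=penalty)
    # Steps 1-2: seed R with the k lowest-penalty items, best score first.
    R = sorted(((predict(i) - penalty(i), i) for i in ordered[:k]),
               reverse=True)
    lowest = R[-1][0]
    # Steps 3-8: scan the remaining items in penalty order.
    for i in ordered[k:]:
        if max_rating - penalty(i) <= lowest:    # steps 4-6: terminate
            break
        score = predict(i) - penalty(i)          # step 7
        if score > lowest:                       # step 8
            R[-1] = (score, i)
            R.sort(reverse=True)
            lowest = R[-1][0]
    return [i for _, i in R]
```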
4.2 Incremental Travel Penalty Computation
• Incremental KNN
– online
– exact
– expensive
• Penalty Grid
– offline
– less exact
– efficient
5 Spatial User Ratings For Spatial Items
• (user, ulocation, rating, item, ilocation)
• user partitioning & travel penalty
– can be used together
– with very little change
6 Experiment
• test recommendation quality
– Foursquare: real dataset
– MovieLens: real dataset
• test scalability and query efficiency
– Synthetic: synthetically generated dataset
• CF: item-based collaborative filtering
• LARS-T: LARS with only travel penalty
• LARS-U: LARS with only user partitioning
• LARS: LARS with both techniques
• default parameters
– M = 0.3
– k = 10
– number of pyramid levels h = 8
6.1 Recommendation Quality for Varying Pyramid Levels
• 80% of the ratings are used for training, 20% for validation
• Measure (Quality)
– count how often a predicted recommendation falls into the user's actual top-k (default k = 10) rated items
• if the pyramid is partitioned too finely, each grid cell contains too few ratings
6.2 Recommendation Quality for Varying Values of k
6.3 Storage Vs. Locality
Note: the smaller M is, the more LARS favors merging; the larger M is, the more it favors splitting.
6.4 Scalability
Default: M = 0.3. LARS's overhead is acceptable.
(Figures report storage size and average maintenance time.)
6.5 Query Processing Performance
Snapshot queries: comparing LARS vs. LARS-U and LARS vs. LARS-T shows the response-time advantage each of the two techniques contributes.
(Figures report response time and average response time.)
Continuous queries: CF is the fastest (as expected); apart from CF, LARS is the fastest.