Beyond Classification: Latent User Interests Profiling from Visual Contents Analysis

Longqi Yang, Cheng-Kang (Andy) Hsieh, Deborah Estrin

Social Network Online Purchases Communications

Our interests are manifested online …

Posted/Shared Contents People Connected/Followed Items Purchased

Preferences learning using small data

Online Posts

Private Communication

Shared Images

Personal Image Gallery

Preference Profile

News Search Engine Dietary Entertainment

Text/label-centric approach is widely studied

restaurant

tourism

landscape

Topic Modeling Structure Prediction

Classification/Labeling/Image-to-text

Travel Animal Art

But preferences are sometimes not just about text...

Intra-categorical variance: Hard to capture with text/label!

User A

User B

Travel Images

Research question

Images’ predictive power for users’ preferences beyond labels

Task 1: Pairwise Comparison

Task 2: Prediction

Pairwise Comparison

User A

User B

Discriminative Power of images

IMG1 IMG2 IMGn

IMG1 IMG2 IMGm

…...

Same Label

Prediction: Consistency of Preferences

User 1

User N

Timeline

Predict/Retrieve

IMG1 IMG2 IMGn

IMG1 IMG2 IMGm

…... …...

Same Label

Dataset

Travel boards

Background corpus Analysis

1,800 3,990

5,790 Travel boards

❶ ≥ 100 pins

❷ ∃ pins after June 2014
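
The two selection criteria above can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name, the representation of a board as a list of pin dates, and the reading of "after June 2014" as "on or after 1 July 2014" are all hypothetical.

```python
from datetime import date

# Assumption: "after June 2014" is read as on or after 1 July 2014.
CUTOFF = date(2014, 7, 1)
MIN_PINS = 100


def keep_board(pin_dates):
    """Return True if a board passes both criteria.

    pin_dates: list of datetime.date, one entry per pin (hypothetical format).
    """
    has_enough_pins = len(pin_dates) >= MIN_PINS          # criterion 1
    has_recent_pin = any(d >= CUTOFF for d in pin_dates)  # criterion 2
    return has_enough_pins and has_recent_pin
```

A board with 150 pins that all predate the cutoff is rejected, as is a recently active board with only 20 pins.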

User Modeling and Image Representation

Pretrained Siamese Network

Pretrained Places CNN

410 dim 205 dim

Pretrained cluster centers (200) 200 dim

User Profile
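
One plausible way to turn per-image features and 200 pretrained cluster centers into a 200-dimensional user profile is a normalized histogram of nearest-center assignments. The slide does not spell out this step, so the hard-assignment scheme, the Euclidean distance, and the function name `user_profile` below are assumptions made only for illustration.

```python
import numpy as np


def user_profile(image_features, centers):
    """Normalized histogram of nearest-center assignments (assumed scheme).

    image_features: (n_images, d) array, one embedding per pinned image.
    centers: (n_clusters, d) array of pretrained cluster centers.
    Returns an (n_clusters,) vector that sums to 1.
    """
    image_features = np.asarray(image_features, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Squared Euclidean distance from every image to every center.
    diffs = image_features[:, None, :] - centers[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    # Hard-assign each image to its closest center and count assignments.
    nearest = np.argmin(sq_dists, axis=1)
    counts = np.bincount(nearest, minlength=centers.shape[0])
    return counts / counts.sum()
```

With 200 centers the returned vector has 200 entries, matching the "200 dim" profile on the slide.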

B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva. “Learning Deep Features for Scene Recognition using Places Database.” Advances in Neural Information Processing Systems 27 (NIPS), 2014

User Modeling and Image Representation

A (CNN)

B (CNN)

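D = \lVert A - B \rVert \approx 0 \quad (l = 1)

D = \lVert A - B \rVert > m \quad (l = 0)

\mathcal{L} = \frac{1}{2} l D^2 + \frac{1}{2} (1 - l) \max(0, m - D)^2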
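
The contrastive loss on this slide, one half of l times D squared plus one half of (1 − l) times max(0, m − D) squared, can be sketched in NumPy as follows. The sketch assumes D is the Euclidean distance between the two CNN outputs; the function name and the default margin are hypothetical, and this is not the authors' training code.

```python
import numpy as np


def contrastive_loss(emb_a, emb_b, label, margin=1.0):
    """Contrastive loss for one pair or a batch of pairs.

    emb_a, emb_b: embeddings of images A and B, shape (d,) or (n, d).
    label: l = 1 for a similar pair, l = 0 for a dissimilar pair.
    margin: m, the distance beyond which dissimilar pairs cost nothing.
    """
    a = np.asarray(emb_a, dtype=float)
    b = np.asarray(emb_b, dtype=float)
    l = np.asarray(label, dtype=float)
    # D: Euclidean distance between the two embeddings (assumed metric).
    d = np.linalg.norm(a - b, axis=-1)
    similar_term = 0.5 * l * d ** 2
    dissimilar_term = 0.5 * (1.0 - l) * np.maximum(0.0, margin - d) ** 2
    return similar_term + dissimilar_term
```

Similar pairs are pulled together (the loss grows with D), while dissimilar pairs are penalized only when they sit closer than the margin m.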

Pairwise Comparison

User A

User B

Effects of background distribution!

IMG1 IMG2 IMGn…...

Travel Images

IMG1 IMG2 IMGm…...

Pairwise Comparison

Document 1

Document 2

“and” 10%

“and” 11%

“fatuous” 0.001%

“fatuous” 1.001%

Background “and” 11% “fatuous” 0.001%

User A User B

Pairwise Comparison

Background corpus

Log-odds-ratio: \delta_k^{(A-B)}    Uncertainty: \sigma^2(\delta_k^{(A-B)})

z_k^{(A-B)} = \frac{\delta_k^{(A-B)}}{\sqrt{\sigma^2(\delta_k^{(A-B)})}}
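
A z-score of this kind, a log-odds-ratio divided by its uncertainty, can be sketched as below. The slides do not give the exact estimator, so this assumes one common formulation in which the background corpus counts act as Dirichlet prior pseudo-counts and the variance is approximated by the sum of inverse smoothed counts. The function name and argument layout are hypothetical.

```python
import numpy as np


def log_odds_z_scores(counts_a, counts_b, background_counts):
    """Per-cluster z-scores comparing user A against user B.

    counts_a, counts_b: how often each visual cluster k appears for A and B.
    background_counts: positive counts of each cluster in the background
        corpus, used here as prior pseudo-counts (assumed formulation).
    Requires at least two clusters.
    """
    y_a = np.asarray(counts_a, dtype=float)
    y_b = np.asarray(counts_b, dtype=float)
    alpha = np.asarray(background_counts, dtype=float)
    n_a, n_b, alpha_0 = y_a.sum(), y_b.sum(), alpha.sum()

    # Smoothed log-odds of cluster k for each user.
    log_odds_a = np.log((y_a + alpha) / (n_a + alpha_0 - y_a - alpha))
    log_odds_b = np.log((y_b + alpha) / (n_b + alpha_0 - y_b - alpha))
    delta = log_odds_a - log_odds_b

    # Approximate variance of delta, then standardize.
    variance = 1.0 / (y_a + alpha) + 1.0 / (y_b + alpha)
    return delta / np.sqrt(variance)
```

Under a two-sided normal test, |z| above 1.96 corresponds to the 95% confidence level and |z| above 2.576 to the 99% level. The prior keeps clusters that are frequent everywhere (the "and" case) from dominating, while rare but distinctive clusters (the "fatuous" case) can still stand out.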

Pairwise Comparison

Confidence Level: 95%

Confidence Level: 99%

\max z_k^{(A-B)} for all user pairs among 3,990 boards

Prediction

User 1

User N

Timeline

Sampled 100 pins

50 pins for test

10~50 pins for train

IMG1 IMG2…... …...IMG51 IMG100
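
The timeline split on this slide can be sketched as follows: order a user's sampled pins by time, hold out the latest 50 for testing, and train on 10 to 50 earlier pins. The slide does not say which of the earlier pins are used when fewer than 50 are taken, so choosing the ones immediately preceding the test block is an assumption, as are the function name and the `(timestamp, image_id)` tuple format.

```python
def timeline_split(pins, n_train, n_test=50):
    """Split one user's sampled pins into train and test sets by time.

    pins: list of (timestamp, image_id) tuples (hypothetical format).
    n_train: number of training pins, between 10 and 50.
    n_test: number of held-out pins taken from the end of the timeline.
    """
    if not 10 <= n_train <= 50:
        raise ValueError("n_train is expected to be between 10 and 50")
    ordered = sorted(pins, key=lambda p: p[0])
    if len(ordered) < n_train + n_test:
        raise ValueError("not enough pins for the requested split")
    test = ordered[-n_test:]
    # Assumption: train on the pins immediately preceding the test block.
    train = ordered[-(n_test + n_train):-n_test]
    return train, test
```

For 100 sampled pins and `n_train=10`, this yields IMG41 to IMG50 for training and IMG51 to IMG100 for testing.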

Prediction

MRR = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{rank_i}
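
Mean reciprocal rank averages 1 / rank_i over the N evaluated users, where rank_i is the 1-based position at which the correct item is retrieved for user i. A minimal sketch, with a hypothetical function name and the ranks assumed to be precomputed:

```python
def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all queries; ranks are 1-based positions."""
    ranks = list(ranks)
    if not ranks:
        raise ValueError("ranks must be non-empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)
```

An MRR of 1.0 means the correct item is always ranked first; ranks of 1, 2, and 4 give (1 + 0.5 + 0.25) / 3.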

Conclusion

Online Posts

Private Communication

Shared Images

Personal Image Gallery

Preference Profile

Small data fueled preferences learning – what can we do next?

• Utilities of images beyond text/labels

• Multi-modal data fusion

• End-to-end learning

http://www.cs.cornell.edu/~ylongqi

http://smalldata.io/

@ylongqi

ylongqi@cs.cornell.edu

For more information
