A Vector Space Model for Automatic Indexing G. Salton, A. Wong and C. S. Yang Enhanced Vector Space...

Date post: 14-Dec-2015
Upload: willie-poulton
A Vector Space Model for Automatic Indexing

G. Salton, A. Wong and C. S. Yang

Enhanced Vector Space Models for Content-based Recommender Systems

Cataldo Musto

Presenter: Sawood Alam <[email protected]>

A Vector Space Model for Automatic Indexing

G. Salton, A. Wong and C. S. Yang

Cornell University

Introduction

• In document retrieval, the best indexing space is one where each entity lies as far away from the others as possible

• The density of the object space becomes a measure of the value of an indexing system

• Retrieval performance correlates inversely with space density
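The density idea in the bullets above can be made concrete with a small sketch. It assumes that documents are term-weight vectors, that closeness is measured by cosine similarity, and that density is the average pairwise similarity; the function names `cosine` and `space_density` are illustrative and do not come from the slides.

```python
import itertools
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def space_density(docs):
    """Average pairwise cosine similarity; higher means a more crowded space."""
    pairs = list(itertools.combinations(docs, 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical collections: one well separated, one tightly packed.
spread_out = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
crowded = [[1, 1, 0], [1, 1, 0.1], [1, 0.9, 0]]

print(space_density(spread_out))
print(space_density(crowded))
```

Under these assumptions the well separated collection has the lower density, which is the configuration the bullets describe as favourable for retrieval.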

Document Space

• Di = (di1, di2, di3, …, dit), where dij is the weight of the j-th term in document Di and t is the number of index terms
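As a rough illustration of this representation, the sketch below turns short texts into vectors with one weight per index term. Raw term frequency is used as the weight and whitespace splitting as the tokenizer; both are simplifying assumptions, and the helper names `build_vocabulary` and `to_vector` are hypothetical.

```python
from collections import Counter


def build_vocabulary(texts):
    """Sorted list of distinct lower-cased tokens; its length is the vector size."""
    return sorted({token for text in texts for token in text.lower().split()})


def to_vector(text, vocabulary):
    """Document vector: the j-th entry is the frequency of the j-th vocabulary term."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


# Hypothetical three-document collection.
texts = [
    "vector space model",
    "space density and retrieval",
    "vector vector indexing",
]
vocabulary = build_vocabulary(texts)
vectors = [to_vector(text, vocabulary) for text in texts]

for text, vector in zip(texts, vectors):
    print(vector, text)
```

Any other term weighting can be substituted for the raw counts without changing the shape of the vectors.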

Document Space (cont.)

Document Space (cont.)

Indexing Performance vs. Space Density

Cluster Density vs. Indexing Performance

Discrimination Value Model

Discrimination Value Model (cont.)

Discrimination Value Model Summary

Average Recall vs. Precision

Summary Recall vs. Precision

Enhanced Vector Space Models for Content-based Recommender Systems

Cataldo Musto

Dept. of Computer Science

University of Bari, [email protected]

Introduction

• Using Vector Space Models (VSM) in Information Retrieval is an established practice

• Goal: investigate the impact of vector space models in Information Filtering

– Recommender systems

Problems of VSM

• High dimensionality

– Becoming more serious as emerging social apps and micro-blogging generate large amounts of web content and new vocabulary

• Inability to manage document semantics

– The order in which terms occur in the document is not taken into account

Components

• Context vector for each term

– Values in {-1, 0, 1}

• Vector Space representation of a term (t)

• Vector Space representation of a document (d)

• Vector Space representation of a user profile (pu)

Indexing Technique

• Random Indexing-based model

• Weighted Random Indexing-based model

• Semantic Vector-based model

• Weighted Semantic Vector-based model
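The slide does not define how the weighted variants differ from the unweighted ones. One plausible reading, given here purely as an assumption, is that an unweighted profile sums the vectors of liked documents, while a weighted profile scales each liked document by the user's rating before summing. The sketch illustrates only that difference and then ranks candidate items by cosine similarity to the profile; the threshold for "liked", the rating scale, and all names are hypothetical, and the semantic-vector variants are not modelled.

```python
import math


def build_profile(rated_docs, max_rating, threshold, weighted):
    """Sum the vectors of liked documents, optionally scaled by rating / max_rating."""
    dim = len(rated_docs[0][0])
    profile = [0.0] * dim
    for vec, rating in rated_docs:
        if rating < threshold:
            continue  # only documents the user liked contribute
        weight = rating / max_rating if weighted else 1.0
        profile = [p + weight * x for p, x in zip(profile, vec)]
    return profile


def cosine(u, v):
    """Cosine similarity (0.0 if either vector is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(profile, candidates):
    """Candidate names ordered from most to least similar to the profile."""
    return sorted(candidates, key=lambda name: cosine(profile, candidates[name]), reverse=True)


# Hypothetical (document vector, rating) pairs on a 1-5 scale.
rated = [([1, 0], 5), ([0, 1], 4), ([1, 1], 1)]
plain = build_profile(rated, max_rating=5, threshold=3, weighted=False)
weighted = build_profile(rated, max_rating=5, threshold=3, weighted=True)

candidates = {"item_a": [1, 0], "item_b": [0, 1]}
print(rank(weighted, candidates))
```

With the unweighted profile the two candidates tie; the rating weights break the tie in favour of the item resembling the higher-rated document.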

Experimental Evaluation

Conclusions

• The first prototype, with a naive weighting scheme, is comparable to other content-based filtering techniques such as a Bayesian classifier

• More complex weighting schemes are expected to perform better

• User profiles based on Linked Data may be studied as an alternative to keyword-based user profiles

