
Opinion-Based Entity Ranking

A brief description of the Opinion-Based Entity Ranking paper published in the Information Retrieval Journal, Volume 15, Number 2, 2012. Slides by Kavita Ganesan.
Transcript
Page 1: Opinion-Based Entity Ranking

Opinion-Based Entity Ranking
Ganesan & Zhai 2012, Information Retrieval, Vol 15, Number 2

Kavita Ganesan (www.kavita-ganesan.com)

University of Illinois @ Urbana-Champaign
Journal | Project Page

Page 2: Opinion-Based Entity Ranking

The problem

Currently: No easy or direct way of finding entities (e.g. products, people, businesses) based on online opinions

You need to read opinions about different entities to find entities that fulfill personal criteria, e.g. finding MP3 players with ‘good sound quality’

Page 3: Opinion-Based Entity Ranking

The problem

Currently: No easy or direct way of finding entities (e.g. products, people, businesses) based on online opinions

You need to read opinions about different entities to find entities that fulfill personal criteria (e.g. finding MP3 players with ‘good sound quality’)

A time-consuming process that impairs user productivity!

Page 4: Opinion-Based Entity Ranking

Proposed Idea

Use existing opinions to rank entities based on a set of unstructured user preferences

Examples of user preferences:
▪ Finding a hotel: “clean rooms, heated pools”
▪ Finding a restaurant: “authentic food, good ambience”

Page 5: Opinion-Based Entity Ranking

How to rank entities based on opinions?

Most obvious way: use the results of existing opinion mining methods
▪ Find sentiment ratings on various aspects. For example, for an MP3 player: find ratings for the screen, sound, and battery life aspects
▪ Then, rank entities based on these discovered aspect ratings

The problem is that this is not practical!
▪ Costly – it is costly to mine large amounts of textual content
▪ Prior knowledge – you need to know the set of queriable aspects in advance, so you may have to define aspects for each domain either manually or through text mining
▪ Supervision – most of the existing methods rely on some form of supervision, such as the presence of overall user ratings. Such information may not always be available.

Page 6: Opinion-Based Entity Ranking

Proposed Approach

Leverage existing text retrieval models. Why?
▪ Retrieval models can scale up to large amounts of textual content
▪ The models themselves can be tweaked or redefined
▪ This does not require costly information extraction or text mining

Page 7: Opinion-Based Entity Ranking

The Basic Setup

Leveraging robust text retrieval models

[Diagram: the reviews for each entity (Entity 1 Reviews, Entity 2 Reviews, Entity 3 Reviews) are indexed; a retrieval model (BM25, LM, PL2) performs keyword matching between the user preferences (the query) and the textual reviews, producing a rank for each entity.]
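To make the setup concrete, below is a minimal Python sketch of this pipeline. The toy review corpus, the reviews-per-entity layout, and the function names are illustrative assumptions; the paper simply plugs standard retrieval models (such as BM25) into this setup, and the small self-contained BM25 scorer here is one common variant of that model.

import math
from collections import Counter

# Toy corpus (hypothetical): all reviews of an entity are concatenated
# into one "document", mirroring the indexed setup in the diagram.
ENTITY_REVIEWS = {
    "entity1": "great sound quality but poor battery life",
    "entity2": "good sound and excellent battery life",
    "entity3": "terrible sound quality cheap build",
}

def bm25_rank(query, docs, k1=1.2, b=0.75):
    """Rank entities by the BM25 score of the query against their reviews."""
    tokenized = {e: d.lower().split() for e, d in docs.items()}
    n = len(tokenized)
    avgdl = sum(len(toks) for toks in tokenized.values()) / n
    df = Counter()  # document frequency of each term
    for toks in tokenized.values():
        df.update(set(toks))
    scores = {}
    for entity, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[entity] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(bm25_rank("good sound quality", ENTITY_REVIEWS))

Note that plain keyword matching also rewards entity3's negative review for mentioning "sound quality"; this is part of the motivation for emphasizing opinion words in Extension 2.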

Page 8: Opinion-Based Entity Ranking

The Basic Setup

Leveraging robust text retrieval models

[Same diagram as Page 7, with the entities re-ordered (Entity 3, Entity 2, Entity 1) to show the ranked output.]

Page 9: Opinion-Based Entity Ranking

Opinion Based Ranking vs. Document Retrieval

Based on the basic setup, this ranking problem seems similar to the regular document retrieval problem

However, there are important differences:
1. The query is meant to express a user's preferences in keywords
▪ The query is expected to be longer than regular keyword queries
▪ The query may contain sub-queries expressing preferences for different aspects
▪ It may actually be beneficial to model these semantic aspects
2. Ranking is meant to capture how well an entity satisfies a user's preferences
▪ Not the relevance of a document to a query (as in regular retrieval)
▪ The matching of opinion/sentiment words is important in this case

Page 10: Opinion-Based Entity Ranking

Focus of this work

Investigate use of text retrieval models for the task of Opinion-Based Entity Ranking

Explore some extensions over IR models

Propose evaluation method for the ranking task

User study:
▪ To determine if results make sense to users
▪ To validate the effectiveness of the evaluation method

Page 11: Opinion-Based Entity Ranking

Extension 1: Modeling Aspects in Query

In standard text retrieval we cannot distinguish the multiple preferences in a query. For example, “clean rooms, cheap, good service” would be treated as one long keyword query even though there are 3 preferences in it. The problem with this is that an entity may score highly because of matching one aspect extremely well.

To improve this: we score each preference separately and then combine the results (see the sketch after the diagram below).

Page 12: Opinion-Based Entity Ranking

Extension 1: Modeling Aspects in Query

[Diagram: the query “clean rooms, cheap, good service” is split into the aspect queries “clean rooms”, “cheap”, and “good service”; each aspect query is scored separately by the retrieval model, and the resulting result sets 1–3 are combined into the final results.]
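A minimal sketch of this aspect-scoring step, reusing the bm25_rank scorer and toy corpus from the Basic Setup sketch. The comma-based splitting and the averaging combination are illustrative assumptions, not necessarily the paper's exact combination function.

def qam_rank(query, docs):
    """Query aspect modeling: score each comma-separated preference
    separately, then combine the per-aspect scores (here: an average)."""
    aspects = [a.strip() for a in query.split(",")]
    combined = {entity: 0.0 for entity in docs}
    for aspect in aspects:
        for entity, score in bm25_rank(aspect, docs):
            combined[entity] += score / len(aspects)
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

print(qam_rank("good sound quality, good battery life", ENTITY_REVIEWS))

A plain average of raw scores is the simplest combination; normalizing each aspect's scores before combining would more directly keep a single well-matched aspect from dominating.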

Page 13: Opinion-Based Entity Ranking

Extension 2: Opinion Expansion

In standard retrieval models, matching an opinion word is not distinguished from matching a standard topic word

However, with Opinion-Based Entity Ranking, it is important to match opinion words in the query, and opinion words tend to have more variation than topic words

Solution: expand the query with similar opinion words to help emphasize the matching of opinions (a sketch follows the diagrams below)

Page 14: Opinion-Based Entity Ranking

Extension 2: Opinion Expansion

[Diagram: the query “fantastic battery life” should match review documents containing “good battery life”, “great battery life”, and “excellent battery life”, all similar in meaning to “fantastic battery life”.]

Page 15: Opinion-Based Entity Ranking

Extension 2: Opinion Expansion

[Diagram: synonyms of the word “fantastic” are added to the query “fantastic battery life”, giving the expanded query “fantastic, good, great, excellent… battery life”, which now directly matches review documents containing “good battery life”, “great battery life”, and “excellent battery life”.]
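A minimal sketch of the expansion step. The hand-written synonym map is a stand-in assumption; the expansion terms could equally come from a thesaurus resource, and the paper's exact source of similar opinion words is not shown here.

# Hand-written synonym map (an assumption for this sketch); the
# expansion terms could instead come from a thesaurus resource.
OPINION_SYNONYMS = {
    "fantastic": ["good", "great", "excellent"],
    "terrible": ["bad", "awful", "poor"],
}

def expand_opinions(query):
    """Append synonyms of any opinion word found in the query."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        expanded.extend(OPINION_SYNONYMS.get(term, []))
    return " ".join(expanded)

print(expand_opinions("fantastic battery life"))
# -> "fantastic battery life good great excellent"

The expanded query is then fed to the retrieval model exactly as before, so reviews saying “good”, “great”, or “excellent battery life” all match.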

Page 16: Opinion-Based Entity Ranking

Evaluation of Ranking Task

Document Collection

Gold Standard: Relevance Judgments

User Queries

Evaluation Measure

Page 17: Opinion-Based Entity Ranking

Evaluation of Ranking Task

Document collection:
▪ Reviews of hotels – TripAdvisor
▪ Reviews of cars – Edmunds

[Diagram: each review consists of free-text content plus numerical aspect ratings; the numerical ratings are used as the gold standard.]

Page 18: Opinion-Based Entity Ranking

Evaluation of Ranking Task

Gold standard: needed to assess the performance of the ranking task

For each entity and for each aspect (in the dataset): average the numerical ratings across reviews. This gives the judgment score for each aspect (a short sketch follows).

Assumption: since the numerical ratings were given by users, they are a good approximation to actual human judgment
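A minimal sketch of building these judgment scores, under an assumed flat layout of (entity, aspect, rating) records; the record names and the 1–5 scale are illustrative.

from collections import defaultdict

# Hypothetical flat records: (entity, aspect, rating on a 1-5 scale)
RATINGS = [
    ("car_a", "performance", 5), ("car_a", "performance", 4),
    ("car_b", "performance", 2), ("car_b", "fuel", 5),
]

def gold_standard(ratings):
    """Average the numerical ratings per (entity, aspect) pair."""
    totals = defaultdict(lambda: [0.0, 0])
    for entity, aspect, rating in ratings:
        acc = totals[(entity, aspect)]
        acc[0] += rating
        acc[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

print(gold_standard(RATINGS))
# {('car_a', 'performance'): 4.5, ('car_b', 'performance'): 2.0,
#  ('car_b', 'fuel'): 5.0}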

Page 19: Opinion-Based Entity Ranking

Evaluation of Ranking Task

Gold standard example: a user looking for cars with “good performance”
▪ Ideally, the system should return cars with high numerical ratings on the performance aspect
▪ Otherwise, we can say that the system is not doing well in ranking

[Diagram: top-ranked cars should have high ratings on performance.]

Page 20: Opinion-Based Entity Ranking

Evaluation of Ranking Task

User queries: semi-synthetic queries (we were not able to obtain a natural sample of queries)
▪ Ask users to specify preferences on different aspects of cars & hotels, based on the aspects available in the dataset → seed queries (e.g. Fuel: “good gas mileage”, “great mpg”)
▪ Randomly combine seed queries from different aspects to form synthetic queries (e.g. Query 1: “great mpg, reliable car”; Query 2: “comfortable, good performance”), as in the sketch below
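A small sketch of this query generation, with the slide's seed queries arranged into assumed aspect buckets (the bucket names are illustrative).

import random

# Seed queries grouped into assumed aspect buckets (bucket names are
# illustrative; the seed queries are the examples from this slide)
SEEDS = {
    "fuel": ["good gas mileage", "great mpg"],
    "reliability": ["reliable car"],
    "comfort": ["comfortable"],
    "performance": ["good performance"],
}

def synthetic_query(n_aspects=2, rng=random):
    """Randomly pick n_aspects aspects and one seed query from each."""
    aspects = rng.sample(list(SEEDS), n_aspects)
    return ", ".join(rng.choice(SEEDS[a]) for a in aspects)

print(synthetic_query())  # e.g. "great mpg, reliable car"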

Page 21: Opinion-Based Entity Ranking

Evaluation of Ranking Task

Evaluation measure: nDCG
▪ This measure is ideal because it supports multiple levels of relevance
▪ The numerical ratings used as judgment scores have a range of values, and nDCG actually supports this (a short sketch of nDCG follows)
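For reference, a minimal sketch of nDCG over these graded judgment scores. This is the standard formulation; the cutoff k and the gain/discount form are the usual ones, not taken from the paper.

import math

def dcg(gains):
    """Discounted cumulative gain of a ranked list of judgment scores."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(system_ranking, judgments, k=10):
    """nDCG@k: system_ranking is a ranked list of entities; judgments
    maps each entity to its graded score (e.g. its averaged rating)."""
    gains = [judgments.get(e, 0.0) for e in system_ranking[:k]]
    ideal = sorted(judgments.values(), reverse=True)[:k]
    return dcg(gains) / dcg(ideal) if dcg(ideal) > 0 else 0.0

print(ndcg(["car_b", "car_a"], {"car_a": 4.5, "car_b": 2.0}))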

Page 22: Opinion-Based Entity Ranking

User Study

Users were asked to manually determine the relevance of system-generated rankings to a set of queries

Two reasons for the user study:
▪ Validate that results made sense to real users: on average, users thought that the entities retrieved by the system were a reasonable match to the queries
▪ Validate the effectiveness of the gold standard rankings: the gold standard ranking has relatively strong agreement with user rankings, which means the gold standard based on numerical ratings is a good approximation to human judgment

Page 23: Opinion-Based Entity Ranking

Results: QAM & Opinion Expansion

[Bar charts: improvement in ranking (nDCG) using QAM and QAM + OpinExp over the PL2, LM, and BM25 baselines, for Hotels (y-axis 0–9%) and Cars (y-axis 0–2.5%). Both extensions are most effective on BM25.]

Page 24: Opinion-Based Entity Ranking

Summary

▪ Lightweight approach to ranking entities based on opinions: use existing text retrieval models
▪ Explored some enhancements over the retrieval models, namely opinion expansion & query aspect modeling; both showed some improvement in ranking
▪ Proposed an evaluation method using user ratings; the user study shows that the evaluation method is sound and can be used for future evaluation tasks

