
seminar topic

Date post: 12-Nov-2014
Upload: dipple
Personalization in Information Retrieval, Extraction and Access Workshop On Ontology, NLP, Personalization And IE/IR - IIT Bombay, Mumbai 15-17 July 2008 Vasudeva Varma www.iiit.ac.in/~vasu
Transcript
Page 1: seminar topic

Personalization in Information Retrieval, Extraction and Access

Workshop On Ontology, NLP, Personalization And IE/IR - IIT Bombay, Mumbai 15-17 July 2008

Vasudeva Varma

www.iiit.ac.in/~vasu

Page 2: seminar topic

Search Engine Heat is On!

IR and IE Technologies and Personalization (C) Vasudeva Varma IIIT-H

• Applications of Search Technologies
  • Web search
  • Product search
  • Service search
  • Domain search
• Already a BIG Market
• HUGE Opportunity

5/30/2008

Page 3: seminar topic

Agenda

• Evolution of Search Engines
• Information Retrieval Vs. Extraction Vs. Access
• Personalization in IR, IE and IA
• Applications in Personalized IA
• Conclusions

Page 4: seminar topic

Evolution of Search Engines

• Crawling and Indexing
• Topic directories
• Clustering and Classification
• Hyperlink analysis
• Resource discovery and vertical portals
• Semantic Web
• ???

Page 5: seminar topic

Current IR engines fail – why?

• Wide variation in retrieval results
  • User topic
  • Retrieval system
• Different approaches work for different systems.
• No way to determine which approach will work for a particular query.

Solution:

• Deeper analysis of the content and query

Page 6: seminar topic

Motivation for Deeper Analysis

• Texts are one of the major sources of information and knowledge.
  However, they are not transparent.
  They have to be systematically integrated with other sources such as databases, numerical data, etc.

NLP/IR/IE for better analysis

IA for better presentation

Page 7: seminar topic

Agenda

• Evolution of Search Engines
• Information Retrieval Vs. Extraction Vs. Access
• Personalization in IR, IE and IA
• Applications in Personalized IA
• Conclusions

Page 8: seminar topic

IR vs. IE vs. IA

• IR: to search and retrieve documents in response to queries for information

Vs.

• IE: to extract information that fits pre-defined database schemas or templates, specifying the output formats

Vs.

• IA: to make the required information accessible to the user in their choice of language, mode, level of detail and format

Page 9: seminar topic

[Diagram labels: Collection of Texts; IR System; Characterization of Texts; Queries]

Page 10: seminar topic

[Diagram labels: Collection of Texts; IR System; Characterization of Texts; Queries; Interpretation; Knowledge]

Page 11: seminar topic

[Diagram labels: Collection of Texts; Passage; IR System; Characterization of Texts; Queries; Interpretation; Knowledge]

Page 12: seminar topic

[Diagram labels: Collection of Texts; Passage; IR System; Characterization of Texts; Queries; Interpretation; Knowledge; IE System; Texts; Templates; Structures of Sentences; NLP]

Page 13: seminar topic

[Diagram labels: Passage; IR System; Interpretation; Knowledge; IE System; Information Access Technologies: Machine Translation, Summarization, Visualization Tools, Snippet Generation, NL Generation]

Page 14: seminar topic

Agenda

• Evolution of Search Engines
• Information Retrieval Vs. Extraction Vs. Access
• Personalization in IR, IE and IA
• Applications in Personalized IA
• Conclusions

Page 15: seminar topic

Limitations of Current IR Systems

• All users get the same results for a given query, independent of:
  • Previous search history
  • Current search context
• All users are treated the same
• Does one size fit all?

Page 16: seminar topic

Personalized Web Search

• Automatic adjustment of information content, structure, and presentation tailored to an individual user.
• Characteristics: Age, Gender, Special Interest Groups, Topic
• Personalize search results using:
  • Personal content
  • Past activities (long term and short term)
• Variations:
  • Explicit or implicit profile setup
  • Explicit or implicit relevance feedback
  • Client-side or server-side storage of information (privacy implications)
  • User control over amount of personalization

Page 17: seminar topic

Overview of Personalized Search

Typically a 3 step process:

1. Obtain results (n >> 10)

2. Compute similarity (results, user)

3. Re-rank the results
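
The three steps above can be sketched in a few lines. This is a minimal illustration, not the method used in the talk: it assumes the user is represented by a plain bag of words and that "similarity" is cosine similarity over term frequencies. The function names and the sample data are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector for a piece of text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def personalize(results, user_profile_text):
    """Steps 2 and 3: score each result against the user, then re-rank.

    `results` is the list obtained in step 1 (the engine's own top-n texts).
    sorted() is stable, so tied results keep the engine's original order.
    """
    profile = term_vector(user_profile_text)
    return sorted(results,
                  key=lambda r: cosine(term_vector(r), profile),
                  reverse=True)


# Hypothetical data: an ambiguous query ("jaguar") and a wildlife-oriented user.
results = ["jaguar car dealer prices",
           "jaguar big cat habitat",
           "jaguar os x release"]
ranked = personalize(results, "wildlife cat habitat conservation")
```

Results that share no terms with the profile score zero and simply keep the engine's order, which is why step 1 fetches many more than ten results: there has to be something relevant further down the list to promote.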

Page 18: seminar topic


Page 19: seminar topic


Page 20: seminar topic

Techniques

• Co-active Techniques
• Pro-active Techniques
• Collaborative Filtering
• User Profile based Result Pruning
• User Profile based Query Expansion

Page 21: seminar topic

Problem Description

• Personalized Search - Issues
  • What to use to personalize?
  • How to personalize?
  • When not to personalize?
  • How to know personalization helped?

Page 22: seminar topic

Problem Description

• We focus on the issue "How to personalize?"
• Problem Statement
  • How to learn to personalize for future searches using past search history
  • How to model and represent past search contexts
  • How to use it to improve search results

Page 23: seminar topic

Solution - Outline

• Model and represent past user feedback – learning user profile
  • Use implicit feedback
  • Long term learning
  • User contexts – triples {user, query, {relevant documents}}
• Improve search results – reranking
  • Get initial search results
  • Take top few and rescore using user profile and rearrange
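
One way to hold the user contexts described above is a list of (user, query, set of relevant documents) triples built from implicit feedback. The sketch below assumes the feedback arrives as a click log of (user, query, clicked document) records; that log format and the function name are illustrative assumptions, not taken from the talk.

```python
from collections import defaultdict


def build_user_contexts(click_log):
    """Group implicit feedback into (user, query, {relevant documents}) triples.

    `click_log` is an iterable of (user, query, clicked_doc) records.
    A click is treated here as a noisy signal that the document was relevant.
    """
    contexts = defaultdict(set)
    for user, query, doc in click_log:
        contexts[(user, query)].add(doc)
    return [(user, query, docs) for (user, query), docs in contexts.items()]


# Hypothetical log: two users issuing the same query but clicking different pages.
log = [("u1", "python", "d1"),
       ("u1", "python", "d2"),
       ("u2", "python", "d3")]
contexts = build_user_contexts(log)
```

Keeping the query in the triple, rather than pooling all clicked documents per user, preserves the link between what was asked and what was judged relevant, which the noisy-channel approach later in the talk relies on.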

Page 24: seminar topic

Contributions

• I Search: a suite of approaches for personalized web search
• Proposed personalized search approaches
• Baseline
• Basic retrieval methods
• Automatic evaluation
• Analysis of query log

Page 25: seminar topic

Review of Personalized Search

[Diagram: Personalized Search, branching into Query logs; Machine learning; Language modeling; Community based; Others]

Page 26: seminar topic

I Search: A Suite of Techniques for Personalized IR

• Suite of approaches?
• Statistical language modeling based approaches
  • Simple N-gram based methods
  • Noisy Channel Model based method
• Machine learning based approach
  • Ranking SVM based method
• Personalization without relevance feedback
  • Simple N-gram based method

Page 27: seminar topic

Statistical Language Modeling based Approaches: Overview

• From user contexts, capture statistical properties of texts
• Use the same to improve search results
• Different contexts
  • Unigrams and bigrams
    • Simple N-gram based approaches
  • Relationship between query and document words
    • Noisy Channel based approach

Page 28: seminar topic

Simple N-gram based approaches

• N-gram: general term for a sequence of N adjacent words
  • 1-gram: unigram, 2-gram: bigram
• Capture statistical properties of text
  • Single words (unigrams)
  • Two adjacent words (bigrams)
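
As a small illustration of the unigrams and bigrams named above, the sketch below extracts word N-grams from a string. It assumes whitespace tokenization and lowercasing, which is a simplification; the function name is hypothetical.

```python
def ngrams(text, n):
    """Return the list of word n-grams (as tuples) in `text`.

    Tokenization is a plain lowercase whitespace split, chosen only to keep
    the example short.
    """
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


unigrams = ngrams("personalized web search", 1)
bigrams = ngrams("personalized web search", 2)
```

A text with fewer than n words simply yields an empty list.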

Page 29: seminar topic

Learning user profile

Given past search history

Hu = {(q1, rf1), (q2, rf2), …, (qn, rfn)}

• rfall = concatenation of all rf
• For each unigram wi
• User profile
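
The per-unigram formula on this slide did not survive in the transcript. As a minimal sketch, assume the profile is the relative frequency of each unigram in rfall, i.e. a maximum-likelihood unigram model. That estimator is an assumption for illustration and may differ from the one used in the talk; the function name and sample history are hypothetical.

```python
from collections import Counter


def learn_user_profile(history):
    """Build a unigram profile from Hu = [(q1, rf1), ..., (qn, rfn)].

    Each rf is the text of the relevance feedback for query q.
    ASSUMPTION: P(w | user) = count(w in rf_all) / len(rf_all).
    """
    rf_all = " ".join(rf for _, rf in history)      # concatenation of all rf
    counts = Counter(rf_all.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {w: c / total for w, c in counts.items()}


# Hypothetical history of two queries and the text the user found relevant.
history = [("jaguar", "big cat habitat"),
           ("lion", "big cat pride")]
profile = learn_user_profile(history)
```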

Page 30: seminar topic


Sample user profile

Page 31: seminar topic

Reranking

• In general LM for IR
• Our Approach
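
The formulas for both bullets are missing from the transcript. In the standard query-likelihood language model for IR, a document D is ranked by P(Q | D), the product of P(q | D) over the query words q. The sketch below implements that and then adds a profile term for re-ranking. The profile term, the mixing weight `alpha` and the smoothing constant `eps` are assumptions made for illustration and are not claimed to be the approach presented in the talk.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc_text, eps=1e-6):
    """log P(Q | D) under a unigram document model; eps avoids log(0)."""
    tokens = doc_text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1
    return sum(math.log(counts[q] / total + eps)
               for q in query.lower().split())


def profile_log_likelihood(doc_text, user_profile, eps=1e-6):
    """Average log-probability of the document's words under the user profile."""
    tokens = doc_text.lower().split()
    if not tokens:
        return math.log(eps)
    return sum(math.log(user_profile.get(t, 0.0) + eps)
               for t in tokens) / len(tokens)


def rerank(query, docs, user_profile, alpha=0.5):
    """Re-rank docs by a weighted sum of query fit and profile fit (assumed form)."""
    def score(doc):
        return (alpha * query_log_likelihood(query, doc)
                + (1 - alpha) * profile_log_likelihood(doc, user_profile))
    return sorted(docs, key=score, reverse=True)


# Hypothetical example: both documents match the query equally well,
# so the user profile decides the order.
docs = ["jaguar car dealer", "jaguar cat habitat"]
reranked = rerank("jaguar", docs, {"cat": 0.5, "habitat": 0.5})
```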

Page 32: seminar topic

Noisy Channel based Approach

• Documents and queries are different information spaces
  • Queries – short, concise
  • Documents – more descriptive
• Most methods for retrieval or personalized web search do not model this
• Capture the relationship between query and document words
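
A later slide mentions IBM Model 1 as one of the training schemes, so the relationship between query and document words can be illustrated with a bare-bones IBM Model 1 trained by EM on (query, document) pairs. This is a simplified sketch (no NULL word, no smoothing, assumes non-empty queries and documents), and the training pairs are made up; it is not the configuration used in the experiments.

```python
from collections import defaultdict


def train_model1(pairs, iterations=10):
    """Estimate t(q | d): probability that document word d yields query word q.

    `pairs` is a list of (query_tokens, doc_tokens), both non-empty lists.
    """
    query_vocab = {q for qs, _ in pairs for q in qs}
    t = defaultdict(lambda: 1.0 / len(query_vocab))    # uniform start
    for _ in range(iterations):
        count = defaultdict(float)   # expected (q, d) alignment counts
        total = defaultdict(float)   # expected alignment counts per d
        for qs, ds in pairs:
            for q in qs:
                # E-step: spread one unit of q over the document words
                norm = sum(t[(q, d)] for d in ds)
                for d in ds:
                    frac = t[(q, d)] / norm
                    count[(q, d)] += frac
                    total[d] += frac
        # M-step: renormalize so that sum over q of t(q | d) is 1
        t = defaultdict(float,
                        {(q, d): c / total[d] for (q, d), c in count.items()})
    return t


# Hypothetical training pairs: query words and words of a relevant document.
pairs = [(["car"], ["automobile", "dealer"]),
         (["car"], ["automobile", "engine"]),
         (["price"], ["dealer", "cost"])]
t = train_model1(pairs)
```

In this toy data "automobile" only ever co-occurs with the query word "car", so t("car" | "automobile") is 1, while "dealer" is split between "car" and "price".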

Page 33: seminar topic

Machine Learning based Approaches: Introduction

• Most machine learning for IR treats ranking as a binary classification problem: "relevant" and "non-relevant"
• Clickthrough data
  • A click is not an absolute relevance judgment but a relative one
  • i.e., assuming clicked = relevant and unclicked = irrelevant is wrong
  • Clicks are biased
  • Partial relative relevance: clicked documents are more relevant than the unclicked documents
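
Relative relevance can be turned into training data for a pairwise ranker (such as the Ranking SVM named earlier) by emitting preference pairs. The sketch below uses a conservative reading that is an assumption here, narrower than the slide's wording: a clicked document is preferred only over unclicked documents ranked above it, since the user presumably saw and skipped those. The function name and sample rankings are hypothetical.

```python
def preference_pairs(ranking, clicked):
    """Return (preferred, less_preferred) pairs from one result list.

    `ranking` is the list of document ids in the order shown to the user;
    `clicked` is the collection of ids the user clicked.
    """
    clicked = set(clicked)
    pairs = []
    for i, doc in enumerate(ranking):
        if doc in clicked:
            # every unclicked document shown above this click was skipped
            pairs.extend((doc, skipped)
                         for skipped in ranking[:i]
                         if skipped not in clicked)
    return pairs


single_click = preference_pairs(["d1", "d2", "d3", "d4"], ["d3"])
two_clicks = preference_pairs(["d1", "d2", "d3", "d4"], ["d1", "d3"])
```

Note that a click on the top result produces no pairs at all, which reflects the position bias mentioned above: such a click says little about relevance.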

Page 34: seminar topic

Personalized Search without Relevance Feedback: Introduction

• Can personalization be done without relevance feedback about which documents are relevant?
• How informative are the queries posed by users?
• Is the information contained in the queries enough to personalize?

Page 35: seminar topic

Approach

• Past queries of the user are available
• Make effective use of past queries
• Simple N-gram based approach
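
A minimal sketch of the idea, assuming the profile is just a count of the unigrams and bigrams in the user's past queries and that a result is scored by summing the counts of the N-grams it contains. Both choices are illustrative assumptions, as are the function names and sample queries.

```python
from collections import Counter


def query_profile(past_queries):
    """Count unigrams (str keys) and bigrams (tuple keys) over past queries."""
    counts = Counter()
    for query in past_queries:
        tokens = query.lower().split()
        counts.update(tokens)                     # unigrams
        counts.update(zip(tokens, tokens[1:]))    # adjacent word pairs
    return counts


def profile_score(text, profile):
    """Sum of profile counts for every unigram and bigram found in `text`."""
    tokens = text.lower().split()
    grams = tokens + list(zip(tokens, tokens[1:]))
    return sum(profile[g] for g in grams)


# Hypothetical user whose past queries are about programming.
profile = query_profile(["python pandas tutorial", "python list sort"])
snake_score = profile_score("python snake habitat", profile)
code_score = profile_score("python pandas merge", profile)
```

No relevance feedback is needed here: the only input is the query log, so the profile can be built even for users who never click.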

Page 36: seminar topic

Experiment Results

• Language Modeling – best results!
  • Interesting framework for personalized search
  • Simple N-gram based approaches also worked well
  • Noisy Channel model worked best
    • Extracting synthetic queries helped
    • Different training schemes
      • IBM Model 1 vs. GIZA++
      • Snippet vs. document
• Machine Learning – competitive results
  • Different features and weights
• Without relevance feedback – very encouraging results
  • Simple approach worked well
  • Sparsity – query log was useful

Page 37: seminar topic

Agenda

• Evolution of Search Engines
• Information Retrieval Vs. Extraction Vs. Access
• Personalization in IR, IE and IA
• Applications in Personalized IA
  • Personalized Search Engine for Mobile Phones
  • Personalized Summarization (for Mobile Devices)
• Conclusions

Page 38: seminar topic

(C) Vasudeva Varma, IIIT Hyderabad, India

“Personalized” Search Engine for mobile devices

• To develop a “personalized” search engine for mobile devices that will produce more relevant results based on the query and the “context”
• What do we mean by “personalized” search?
  • The user will be able to configure the search interfaces (explicit feedback)
  • The system will observe user behavior and customize itself to suit the user’s needs (implicit feedback)
• What do we mean by “context”?
  • User, time, location, …

The goal is to make search accessible on Nokia mobile devices and to make use of the mobile aspects for personalization.

Page 39: seminar topic

Scope of the Application

[Diagram labels: Client Side; Server Side]

Page 40: seminar topic

Problem Re-Definition

• Dynamic user behavior tracking
  • An observer that keeps track of all “relevant” user actions
  • Client module
• Analysis of user actions
  • Interpret the user actions to derive user interests (categories of interests) so that more relevant results are displayed
• Construction of user profile implicitly
  • Implicit supervised learning
• Personalization
  • Based on query
  • Based on user profile
  • Based on other parameters such as time, location

Page 41: seminar topic


Solution Overview

Page 42: seminar topic

Personalized Summarization: Motivation

• The success that search engine providers have found on the PC has failed to translate to the mobile phone. Why?
  • Because trying to force a PC-based search experience into a mobile device falls short on a key area of usability
  • Search queries typically return hundreds of potential hits.
  • Making sense of such output is difficult.
  • The results may or may not be of interest to the user.
• We are looking for a faster and easier way to access precise information on our mobile devices.

Page 43: seminar topic

Challenges

• Can we offer users a simpler, friendlier and more intuitive experience?
• We aim to provide more information with less payload, in the form of a summary that takes into account:
  • context
  • history
  • preferences
  • device capabilities
  • social network

Page 44: seminar topic

System Model

[Diagram label: Search Engine]

Page 45: seminar topic

Summary

• Current search engines are inadequate, and current know-how is only the tip of the iceberg
• IR, IE and IA areas have enjoyed huge commercial success and have huge growth potential
• Personalization is perhaps the next big wave
• Various personalization techniques are available, yet this is a very fertile research field
• The two personalization applications shown are just examples of many possibilities.

Page 46: seminar topic

Vasudeva Varma, IIIT Hyderabad

[email protected] or www.iiit.ac.in/~vasu

Thank You – Questions?
