Page 1: Web-Search Ranking with Initialized Gradient Boosted ... · Ananth Mohan Zheng Chen Kilian Weinberger mohana@wustl.edu zheng.chen@wustl.edu kilian@wustl.edu Department of Computer

Web-Search Ranking with Initialized Gradient Boosted Regression Trees

Ananth Mohan Zheng Chen Kilian Weinberger

mohana@wustl.edu zheng.chen@wustl.edu kilian@wustl.edu

Department of Computer Science & Engineering

Washington University in St. Louis

St. Louis, MO 63130, USA


Agenda

• Introduction

• Past Work

• Proposed Approach

• Introduce RF

• Introduce GBRT

• iGBRT

• Result for iGBRT

• Classification vs. Regression

• Statistics of the data sets

• Final Results

• Conclusion


Introduction

• Learning to Rank Challenge

• Given a query, documents have to be ranked according to their relevance to the query.

• Point-wise, lightweight approach.

• A machine learning algorithm is trained to predict the relevance from the feature vector, and at test time the documents are ranked according to these predictions.

• We investigate Random Forests (RF) as a low-cost alternative algorithm to Gradient Boosted Regression Trees (GBRT). It yields surprisingly accurate ranking results, comparable to or better than GBRT.
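The point-wise setting above can be sketched in a few lines: any regressor that maps a feature vector to a relevance score induces a ranking per query. The function name `rank_by_prediction`, the tuple layout and the toy linear scorer are assumptions made for illustration, not part of the paper.

```python
from collections import defaultdict

def rank_by_prediction(docs, predict):
    """Group documents by query and sort each group by predicted relevance.

    docs: iterable of (query_id, doc_id, feature_vector) tuples.
    predict: any callable mapping a feature vector to a relevance score.
    """
    by_query = defaultdict(list)
    for query_id, doc_id, features in docs:
        by_query[query_id].append((predict(features), doc_id))
    # Highest predicted relevance first; ties keep input order (sort is stable).
    return {
        q: [doc_id for _, doc_id in sorted(scored, key=lambda s: -s[0])]
        for q, scored in by_query.items()
    }

# Toy usage with a hypothetical scorer that just sums the features.
toy_docs = [
    ("q1", "a", [0.2, 0.1]),
    ("q1", "b", [0.9, 0.4]),
    ("q2", "c", [0.5, 0.5]),
]
toy_ranking = rank_by_prediction(toy_docs, predict=lambda x: sum(x))
```

In practice `predict` would be the trained RF, GBRT or iGBRT model; the ranking step itself does not change.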


Introduction (Cont.)

• We combine the two algorithms by first learning a ranking function with RF and using it as initialization for GBRT.

• We refer to this setting as iGBRT.


Past Work

The past years have seen many different approaches to web-search ranking:

• Adaptations of support vector machines (Joachims, 2002; Chapelle and Keerthi, 2010)

• Neural networks (Burges et al., 2005)

• Gradient boosted regression trees (GBRT) (Zheng et al., 2007b)

• Other learning paradigms (Li et al., 2007; Gao et al., 2009; Burges, 2010)


Proposed Approach

• Notation and data set

• Introduce RF.

• Introduce GBRT.

• Check the results of RF and GBRT.

• Both algorithms are combined as initialized gradient boosted regression trees (iGBRT).

• Check the results with iGBRT


Notations

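We assume data given as triples $D = \{(x_1, q_1, y_1), \dots, (x_n, q_n, y_n)\}$

$x_i$ = documents, $q_i$ = queries, $y_i$ = labels

In the point-wise setting the query is dropped: $D = \{(x_1, y_1), \dots, (x_n, y_n)\}$

$T(\cdot)$ = trained predictor

$\mathrm{Cart}(S, k, d) \approx \operatorname{argmin}_{h \in T^d} \sum_{(z_i, r_i) \in S} (h(z_i) - r_i)^2$

$S \subseteq D$, $k < f$, $d > 0$

$T^d$ = set of all CART trees of maximum depth $d$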


Data Set

The Yahoo Learning to Rank Challenge was based on two data sets:

• Set 1 = 473134 documents

• Set 2 = 19944 documents

• Five folds of the Microsoft MSLR data set.


Random forest

• The fundamental concept underlying Random Forests is bagging.

• In bagging, a learning algorithm is applied multiple times to subsets of D and the results are averaged.

• Random Forests is essentially bagging applied to CART with full depth (d = ∞), where at each split only k uniformly chosen features are evaluated to find the best splitting point.

• The construction of a single tree is independent of earlier trees, which makes Random Forests an inherently parallel algorithm.
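A minimal sketch of the idea above, in plain Python: full-depth regression trees are grown on resampled data, each split looks at only k randomly chosen features, and the tree outputs are averaged. Drawing each subset as a bootstrap sample with replacement, using midpoints as split thresholds, and all function names are assumptions of this sketch rather than details taken from the slides.

```python
import random

def sum_sq(rows):
    """Squared error of the labels in rows around their mean."""
    ys = [y for _, y in rows]
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def build_tree(rows, k, rng):
    """Grow a full-depth regression tree; each split tries k random features."""
    ys = [y for _, y in rows]
    mean = sum(ys) / len(ys)
    if len(set(ys)) == 1:
        return mean  # leaf: all labels agree
    best = None
    for j in rng.sample(range(len(rows[0][0])), k):
        values = sorted({x[j] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r for r in rows if r[0][j] <= threshold]
            right = [r for r in rows if r[0][j] > threshold]
            if not left or not right:
                continue
            sse = sum_sq(left) + sum_sq(right)
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left, right)
    if best is None:
        return mean  # the sampled features are constant here, so stop
    _, j, threshold, left, right = best
    return (j, threshold, build_tree(left, k, rng), build_tree(right, k, rng))

def predict_tree(node, x):
    while isinstance(node, tuple):
        j, threshold, left, right = node
        node = left if x[j] <= threshold else right
    return node

def random_forest(data, n_trees, k, seed=0):
    """Return a predictor that averages n_trees independently grown trees."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]  # bootstrap sample (assumed)
        trees.append(build_tree(sample, k, rng))
    return lambda x: sum(predict_tree(t, x) for t in trees) / len(trees)

# Toy data: one feature, label 0 for small values and 4 for large ones.
toy_data = [([0.1], 0.0), ([0.2], 0.0), ([0.8], 4.0), ([0.9], 4.0)]
forest = random_forest(toy_data, n_trees=25, k=1)
low, high = forest([0.15]), forest([0.85])
```

Because no tree depends on another, the loop in `random_forest` could be distributed across workers without changing the result, which is the parallelism the last bullet refers to.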


Gradient Boosted Regression Trees

• Gradient Boosted Regression Trees (GBRT) is also based on tree averaging.

• GBRT sequentially adds small trees (d = 4).

• In each iteration, the new tree to be added focuses on the samples that are responsible for the current remaining regression error.

• $T(x_i)$ = current prediction for sample $x_i$.

• We assume a continuous loss function $L(T(x_1), \dots, T(x_n))$, which reaches its minimum if $T(x_i) = y_i$ for all $i$.

• Throughout the paper we use the square loss: $L = \frac{1}{2} \sum_{i=1}^{n} (T(x_i) - y_i)^2$.

• Gradient step: $T(x_i) \leftarrow T(x_i) - \alpha \frac{\partial L}{\partial T(x_i)}$

• $\alpha$ = learning rate, $L$ = square loss
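For the square loss the gradient with respect to a prediction is simply the residual, T(xi) - yi, so the update above pulls every prediction towards its label. The sketch below applies the exact gradient step directly to a list of predictions; real GBRT instead fits a small tree to the negative gradient in each iteration, which this toy version deliberately leaves out. The function names are illustrative.

```python
def square_loss(pred, y):
    """L = 1/2 * sum_i (T(x_i) - y_i)^2."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(pred, y))

def gradient_step(pred, y, alpha):
    """One step T(x_i) <- T(x_i) - alpha * dL/dT(x_i) for the square loss."""
    # For the square loss, dL/dT(x_i) = T(x_i) - y_i.
    return [p - alpha * (p - t) for p, t in zip(pred, y)]

labels = [0.0, 2.0, 4.0]
pred = [0.0, 0.0, 0.0]
losses = [square_loss(pred, labels)]
for _ in range(20):
    pred = gradient_step(pred, labels, alpha=0.5)
    losses.append(square_loss(pred, labels))
```

With alpha = 0.5 each step halves every residual, so the loss shrinks by a factor of four per iteration in this toy setting.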


GBRT vs RF various settings for α


Why iGBRT

Why not GBRT only

• In each iteration the gradient is only approximated.

• For true convergence, the learning rate α needs to be infinitesimally small,

• requiring an unrealistically large number of iterations $M_B \gg 0$.

Why initialized with RF

• RF is known to be very resistant towards overfitting and therefore makes a good optimization starting point.

• RF is insensitive to parameter settings and does not require additional parameter tuning.
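The combination can be sketched as ordinary gradient boosting on the square loss whose starting point is an existing predictor rather than zero. Everything concrete here is an assumption for illustration: the weak learner is a depth-1 stump on a single feature (the slides use trees with d = 4), and `rf_like_init` is a hypothetical stand-in for a trained Random Forest.

```python
def fit_stump(xs, residuals):
    """Least-squares depth-1 regression tree on one feature.

    Assumes xs contains at least two distinct values.
    """
    best = None
    values = sorted(set(xs))
    for a, b in zip(values, values[1:]):
        thr = (a + b) / 2
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm

def boost(xs, ys, init, alpha, rounds):
    """Gradient boosting on the square loss, starting from init(x)."""
    preds = [init(x) for x in xs]
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + alpha * stump(x) for p, x in zip(preds, xs)]
    return lambda x: init(x) + alpha * sum(s(x) for s in stumps)

def squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys))

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 3.0, 4.0]
rf_like_init = lambda x: 1.3 * x  # hypothetical stand-in for an RF predictor
igbrt_like = boost(xs, ys, init=rf_like_init, alpha=0.5, rounds=10)
plain_gbrt = boost(xs, ys, init=lambda x: 0.0, alpha=0.5, rounds=10)
```

The only difference between the two calls is the `init` argument: boosting then refines whatever error the initial predictor leaves behind.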


Initialized Gradient Boosted Regression Trees


Results with iGBRT


Classification vs. Regression

• All our algorithms used regression to approximate the relevance of a document.

• Li et al. (2007) proposed a learning to rank paradigm that is based on classification instead of regression.

• Instead of learning a function $T(x_i) \approx y_i$, the authors utilize the fact that the original relevance scores are discrete, $y_i \in \{0, 1, 2, 3, 4\}$.

• Generate four binary classification problems indexed by $c = 1, \dots, 4$.

• The $c$-th classification problem predicts whether the document is less relevant than $c$.

• We carefully choose classifiers $T_c(\cdot)$ that return well-defined probabilities (i.e. $0 < T_c(\cdot) < 1$).

• If we define the constant functions $T_0(\cdot) = 0$ and $T_5(\cdot) = 1$, we can combine all classifiers $T_0, \dots, T_5$ to compute the probability that a document $x_i$ has a relevance of $r \in \{0, \dots, 4\}$: $P(\mathrm{rel}(x_i) = r) = P(\mathrm{rel}(x_i) < r + 1) - P(\mathrm{rel}(x_i) < r) = T_{r+1}(x_i) - T_r(x_i)$
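As a small worked example of the formula above, the four classifier outputs for one document can be turned into a distribution over the five relevance grades by differencing. Collapsing that distribution into a single score through the expected relevance is an assumption of this sketch; the slide only gives the per-grade probabilities. The function names are illustrative.

```python
def relevance_distribution(less_than_probs):
    """Turn P(rel < c) for c = 1..4 into P(rel = r) for r = 0..4."""
    # Add the constant classifiers T_0 = 0 and T_5 = 1 at both ends.
    cumulative = [0.0] + list(less_than_probs) + [1.0]
    return [cumulative[r + 1] - cumulative[r] for r in range(5)]

def expected_relevance(less_than_probs):
    """Single ranking score: sum over r of r * P(rel = r) (assumed here)."""
    dist = relevance_distribution(less_than_probs)
    return sum(r * p for r, p in enumerate(dist))

# Hypothetical classifier outputs T_1..T_4 for one document.
probs = [0.125, 0.25, 0.5, 0.75]
dist = relevance_distribution(probs)
score = expected_relevance(probs)
```

The differences are only valid probabilities when the four outputs are non-decreasing in c, which independently trained classifiers do not guarantee automatically.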


Statistics of the data sets.


Performance of GBRT, RF and iGBRT with ERR


Performance of GBRT, RF and iGBRT with NDCG


Conclusion

• We compared three algorithms with regression and classification settings.

• RF used the same parameters throughout the paper and outperforms GBRT.

• To further refine the results of RF, we introduced iGBRT.

• We demonstrated that classification tends to be a better paradigm for web-search ranking than regression.

• iGBRT in a classification setting consistently achieves state-of-the-art performance on all publicly available web-search data sets that we are aware of.


References

1. Breiman. Classification and regression trees. Chapman & Hall/CRC, 1984.

2. https://www.youtube.com/watch?v=D_2LkhMJcfY&t=223s

3. https://www.youtube.com/watch?v=DCZ3tsQIoGU&t=146s

4. http://proceedings.mlr.press/v14/chapelle11a/chapelle11a.pdf

5. https://www.youtube.com/watch?v=ErDgauqnTHk


(Group J)

Seminar Data Analytics I

International Masters Program in Data Analytics

University of Hildesheim

Summer Semester 2018

Famakin Olawole Taiwo


Mining Text Snippets For Images On The Web

Kannan, A., Baker, S., Ramnath, K., Fiss, J., Lin, D., Vanderwende, L., & Wang, X.J. (2014)

In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM.


Outline

• Introduction

• Related Work

• Snippet Mining Algorithm

• Evaluation of Snippet

• Applications

• Conclusion

• References


Introduction

What is an image

An image refers to a binary representation of visual information such as drawings, pictures, graphs, logos, or individual video frames.

Text mining

This is the process of examining massive collections of written resources to generate new information, and of transforming unstructured text into structured data for use in further analysis.

It identifies:

• Facts, relationships, and assertions that would otherwise remain buried in the mass of textual big data.


Introduction

Making more sense of this

• Harness power of text mining

• Top k snippet algorithm (proposed)

• Gain relevant and interesting information regarding an image


Introduction

Focus

• Show and implement the text mining algorithm.

• Obtain relevant and useful text snippets regarding images on the web.

• Show applications built using this algorithm.

*Note that these stories are generally not contained in the image captions (which are most often just descriptive), but the captions can help identify the most interesting stories.*


Related Works

Image caption generation

• Y. Mori, H. Takahashi, and R. Oka. Image-to-word transformation based on dividing and vector quantizing images with words.

• G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. C. Berg, and T. L. Berg. Baby talk: Understanding and generating simple image descriptions.

• R. Mason and E. Charniak. Annotation of online shopping images without labelled training examples.

Focuses on associating word tags with images.


Related Works

Document Summarization

• O. Buyukkokten, H. Garcia-Molina, and A. Paepcke. Seeing the whole in parts: text summarization for web browsing on handheld devices.

• W. T. Chuang and J. Yang. Extracting sentence segments for text summarization: a machine learning approach.

• J. Goldstein, V. Mittal, J. Carbonell, and M. Kantrowitz. Multi-document summarization by sentence extraction.

Focuses on summarizing documents by identifying key phrases and sentences that are reflective of the focus of the document.


Snippet Mining Algorithm

This algorithm is based on the notion that if an image is interesting, many people will embed it and write about it on their websites, blogs and articles.

For each image we work on, we mine the web for all the webpages containing it, in order to identify text snippets that are relevant and interesting and that also form a diverse set of text.

This leads to clustering these images into near-duplicate groups (image sets or duplicate image sets), each represented by

{MURL, PURL, HTML}

Snippet Mining Algorithm

Scalable Image Set Identification

The goal here is to cluster images so that each cluster consists of images that are near duplicate to each other.

To achieve this, we adopt a two-step clustering method using hashing techniques within MapReduce frameworks. The method is designed:

• To cover large variation within a duplicate image cluster while minimizing false positives.

• To be scalable for clustering billions of images on the web.


Snippet Mining Algorithm

Forming candidate snippets

As stated earlier, an image set is represented by {MURL, PURL, HTML}.

In addition, we parse the HTML to obtain a linear ordering of the text and image nodes (WPURL).

For each text node in WPURL, a candidate snippet is generated.
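One way to picture the linear ordering is a single pass over the page that records text and image nodes in document order. The sketch below uses Python's standard `html.parser`; the class name, the tuple layout and the choice to treat every non-empty text run as one node are assumptions for illustration, not the parsing rules of the paper.

```python
from html.parser import HTMLParser

class NodeSequence(HTMLParser):
    """Collect text and image nodes in document order."""

    def __init__(self):
        super().__init__()
        self.nodes = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr = dict(attrs)
            # Keep src and alt so image nodes can be matched against MURL later.
            self.nodes.append(("image", attr.get("src") or "", attr.get("alt") or ""))

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes.append(("text", text))

def linear_nodes(html):
    parser = NodeSequence()
    parser.feed(html)
    parser.close()
    return parser.nodes

def candidate_snippets(nodes):
    """One candidate snippet per text node, as on the slide."""
    return [node[1] for node in nodes if node[0] == "text"]

page = '<p>Before the picture.</p><img src="a.jpg" alt="A cat"><p>After it.</p>'
nodes = linear_nodes(page)
snippets = candidate_snippets(nodes)
```

A production parser would also need to skip script and style content, which this sketch would treat as ordinary text.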


Snippet Mining Algorithm

Forming candidate snippets

Representing images!!

For each image node that corresponds to MURL, we extract its associated Alt or Src text.

<MPURL,LPURL >


Snippet Mining Algorithm

Top K Snippet Selection

This is the objective function of the problem: given an image, the probability of the top snippets is the product of their relevance and their interestingness.
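A minimal sketch of the scoring idea, assuming each candidate already carries a relevance and an interestingness value in [0, 1]: the combined score is their product and the k best candidates are kept. This ignores the diversity requirement and the regularization mentioned elsewhere in the talk, and the function name and tuple layout are hypothetical.

```python
import heapq

def top_k_snippets(candidates, k):
    """Return the k candidates with the largest relevance * interestingness.

    candidates: list of (snippet, relevance, interestingness) tuples.
    """
    return heapq.nlargest(k, candidates, key=lambda c: c[1] * c[2])

toy_candidates = [
    ("relevant but dull", 0.9, 0.1),
    ("balanced", 0.5, 0.5),
    ("relevant and interesting", 0.8, 0.9),
]
best_two = top_k_snippets(toy_candidates, k=2)
```

Using a product means a snippet has to do reasonably well on both criteria; a high relevance cannot compensate for near-zero interestingness.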


Snippet Mining Algorithm

Top K Snippet Selection

This regularizes the objective function to reduce overfitting.


Snippet Mining Algorithm

Relevance and interestingness


Snippet Mining Algorithm

Measure of spam

When a snippet contains a lot of repeated words, it is less likely to be relevant or interesting.

Linguistic Features: The interestingness of a sentence often depends on its linguistic structure. We use four linguistic features:

(1) the length of the sentence, with the intuition that longer sentences are more interesting,

(2) whether the sentence is demonstrative, i.e., begins with "this" or "these",

(3) whether the sentence is first person, i.e., begins with "I" or "we", and

(4) whether the sentence is definitional, i.e., begins with a pronoun and then includes the word "is" or "are" afterwards.
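The spam measure and the four linguistic features are simple enough to compute directly from the words of a sentence. In the sketch below, measuring length in words, the particular pronoun list and the repetition formula (share of repeated words) are assumptions; the slide names the features but not their exact definitions.

```python
import re

DEMONSTRATIVES = {"this", "these"}
FIRST_PERSON = {"i", "we"}
# Assumed pronoun list for the "definitional" feature.
PRONOUNS = {"it", "he", "she", "they", "this", "these", "that", "those"}

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def linguistic_features(sentence):
    words = tokenize(sentence)
    first = words[0] if words else ""
    return {
        "length": len(words),
        "demonstrative": first in DEMONSTRATIVES,
        "first_person": first in FIRST_PERSON,
        "definitional": first in PRONOUNS
        and any(w in ("is", "are") for w in words[1:]),
    }

def repetition_ratio(snippet):
    """Share of words that repeat an earlier word; higher suggests spam."""
    words = tokenize(snippet)
    return 1 - len(set(words)) / len(words) if words else 0.0

features = linguistic_features("This is the tallest tower in the city.")
spam_score = repetition_ratio("buy buy buy now")
```

Values like these would be inputs to the learned relevance and interestingness models rather than final scores on their own.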


Evaluation of Snippet Mining Algorithm

For the purpose of evaluation, two baseline methods were adopted for comparison with the results of the proposed algorithm.

• Query-by-Image and Webpage Summarization

• Im2Text using Visual Features

Reason for comparison

*No prior work on extracting a set of text snippets for an image on the web*


Evaluation of Snippet Mining

Query-by-Image and Webpage Summarization (QbI/WS)

• Finds all occurrences of an image on the web

• Applies webpage summarization to generate snippets

Specifically, the comparison is against the approach used by images.google.com.


Evaluation of Snippet Mining

Im2Text using Visual Features

• Matches the image to a database of one million Flickr images with captions

• Transfers the captions from its best matches

Specifically, the comparison is against the approach of Ordonez, Kulkarni, and Berg, "Im2Text: Describing Images Using 1 Million Captioned Photographs".


Evaluation of Snippet Mining

Dataset

Selected popular images on the web :

• Top 10,000 textual queries were run in a popular search engine

• We picked the 50 images from the top-ranking results returned.


Evaluation of Snippet Mining Algorithm (Results)


Coverage of images

• People

• Products

• Arts and Culture

• Music and Movies

• Travel

• Science

• Personal Photos

• Foreign Language

• Commercial

• Icons

Examples of common types of images for which our algorithm either finds or does not find enough high-quality text snippets.


Applications

The database of text snippets produced by this algorithm enables several possible applications.

• The snippets can be used to improve image search relevance.

• They might also be used to filter more interesting images from the less interesting ones.

Proposed Applications

• Web Image Augmentation

• Semantic Image Browsing


Applications

Web Image Augmentation:

*Bing bar Plugin*


Applications

Semantic Image Browsing:


Conclusion

1) We have presented a scalable mining algorithm to obtain a set of text snippets for images on the web.

2) The snippets can be displayed along with image search results.

3) Potential applications can feed off the generated snippets to enhance their functionality.

4) Snippet data can be useful for improving image search relevance.

5) The algorithm cannot return relevant snippets in languages other than English.


Future works

1) Analyse the snippets in more detail, for example by clustering, to find groups of related images. The results could be used to broaden the set of snippets and concepts associated with an image, possibly leading to deeper understanding of the content of the images, and more interesting browsing experiences.

2) The algorithm could be improved to return snippets regarding personal images of people, to aid countries that do not have a system (a database of people) in place.


References

Anon, 2017. linguamatics.com. [Online] Available at: https://www.linguamatics.com/what-is-text-mining-nlp-machine-learning [Accessed 22 03 2018].

Anon, 2018. computerhope.com. [Online] Available at: https://www.computerhope.com/jargon/i/image.htm [Accessed 24 01 2018].

Christopher J. O, B., G, B. & Jurisica, I., 2013. Data integration in the life sciences. Berlin: Springer.

Kannan, A. et al., 2014. Mining Text Snippets For Images On The Web.


Agenda

1. Introduction

2. Related work

3. Selecting Responses

3.1. LSTM model

3.2. Challenges

4. Response Set Generation

4.1. Semantic intent clustering

5. Suggestion Diversity

6. Results

7. Conclusions

8. References

1. Introduction

• Provide text assistance for email reply composition.

• Targeted at mobile.

• Responses can be sent on their own.


2. Related Work

• Extracting meaning from previous message.

• Generating language.

• Grammatical transformation between responses.

• Matching style/tone.


3. Model

• Sequence-to-sequence learning model.

• First proposed in the context of machine translation.

• Recurrent neural networks (encoder-decoder)


RNN (encoder-decoder)


4. Training

• Training data is a corpus of email reply pairs.

• Both encoder and decoder are trained together (end to end).


Top Responses.


5. Challenges.

• Response quality

• How to ensure that the individual response options are always high quality in language and content.

• Utility

• How to select multiple options to show a user so as to maximize the likelihood that one is chosen.

• Scalability

• How to efficiently process millions of messages per day while remaining within the latency requirements of an email delivery system.

• Privacy

• How to develop this system without ever inspecting the data except aggregate statistics.


6. Semantic Intent Clustering

• Partition all response messages into “semantic” clusters.

• All messages within a cluster share the same semantic meaning.

• For Example:

• “Ha ha” and “oh that’s funny!” are associated with the funny cluster.


Response Message.


Diversity.

• The LSTM first processes an incoming message and then selects the best responses.

• Responses are highly rated together.

• The job of the diversity component is to select a more varied set of suggestions.
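One simple way to obtain a more varied set, sketched below, is to walk through the responses from best to worst score and keep at most one per semantic intent cluster. The greedy rule, the function name and the toy scores and cluster labels are assumptions for illustration; the slides do not spell out the exact selection procedure.

```python
def diverse_suggestions(scored_responses, cluster_of, n=3):
    """Pick the n best responses, taking at most one per semantic cluster.

    scored_responses: list of (response, score) pairs, e.g. from the LSTM.
    cluster_of: dict mapping a response to its semantic cluster label.
    """
    chosen, used = [], set()
    for response, _ in sorted(scored_responses, key=lambda rs: -rs[1]):
        # Responses without a known cluster are treated as their own cluster.
        cluster = cluster_of.get(response, response)
        if cluster in used:
            continue
        chosen.append(response)
        used.add(cluster)
        if len(chosen) == n:
            break
    return chosen

toy_scores = [
    ("Thanks!", 0.9),
    ("Thank you!", 0.85),
    ("Sounds good", 0.6),
    ("Sorry, I can't", 0.4),
]
toy_clusters = {
    "Thanks!": "thanks",
    "Thank you!": "thanks",
    "Sounds good": "agree",
    "Sorry, I can't": "decline",
}
suggestions = diverse_suggestions(toy_scores, toy_clusters)
```

Without the cluster check, the top three by score would contain two near-identical "thanks" replies and leave no room for a declining option.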


Diversity Selection


Diversity Result.


Deployment and Coverage.

• This feature is deployed in Inbox by Gmail.

• It is used to assist with more than 10% of all mobile replies.


Conclusions.

• A sequence-to-sequence model produces plausible email replies in many common scenarios when trained on an email corpus.

• Smart Reply is deployed in Inbox by Gmail and generates more than 10% of mobile replies.

• A novel end-to-end system for automatically generating short, complete email responses.

• The core of the system is a state-of-the-art deep LSTM model that can predict full responses, given an incoming email message.
