Date post: | 16-Apr-2017 |
Upload: | christianuhlcc |
Beyond Text Similarity: Tune your search for your Business Domain
Search Meetup Munich, 26.10.2016
Christian Uhl
Agenda
Moving from simple text matching towards custom scoring
• Recap: Text similarity and why this stops working in the travel domain
• Using recommendations and user interaction feedback
• Performance!
• Protect yourself against regressions
Practical scoring and text similarity are not enough this time
Text Similarity
Elasticsearch • Lucene Practical Scoring
score(q,d) = queryNorm(q) · coord(q,d) · ∑ ( tf(t in d) · idf(t)² · t.getBoost() · norm(t,d) ) (t in q)
Inverse Document Frequency
Search for “Da Vinci Paris”
Few Da Vincis in the World, but Paris occurs a lot.
But is a “Da Vinci” in Valencia more relevant than any other Hotel in Paris?
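The effect can be sketched in a few lines of Python. This is an illustration only: it uses the idf formula of Lucene's classic TF/IDF similarity, 1 + ln(numDocs / (docFreq + 1)), and ignores tf, norm, coord and queryNorm. The index size and document frequencies are made-up numbers, not figures from the talk.

```python
import math


def classic_idf(doc_freq: int, num_docs: int) -> float:
    """idf as in Lucene's classic TF/IDF similarity: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical index statistics, chosen only for illustration.
NUM_DOCS = 500_000
DOC_FREQ = {"da": 40, "vinci": 25, "paris": 6_000}


def idf_weight(term: str) -> float:
    # The practical scoring function multiplies each matching term by idf squared.
    return classic_idf(DOC_FREQ[term], NUM_DOCS) ** 2


# A "Da Vinci" hotel in Valencia matches the two rare terms but not "paris".
valencia_hotel = idf_weight("da") + idf_weight("vinci")
# Any other hotel in Paris matches only the frequent term "paris".
paris_hotel = idf_weight("paris")

print(f"Da Vinci, Valencia: {valencia_hotel:.1f}")
print(f"Some hotel, Paris:  {paris_hotel:.1f}")
```

With these assumed frequencies the rare terms dominate, so the Valencia hotel outscores every plain Paris hotel, which is rarely what a traveller typing “Da Vinci Paris” wants.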
Text Similarity
Search for “Paris”
• Paris, Illinois
• Paris, Texas
• Paris, France
• even more WTF
“I don't visit no cheese-eating surrender monkeys” – Uncle Sam
Summary
Lucene Practical Scoring is finely crafted and tuned to find occurrences in large text bodies
We do not have large text bodies
Well, s***
Bring in the real world
Change the score!
• Instead of just relying on the practical scoring function, add other parameters
• Use values from the real world that reflect the relevance of a given document in the whole document space
Our users were kind enough to provide valuable feedback about our data
• They rate and recommend things (Hotels)
• They click on things (Everywhere*)
*except ads
We also have a geospatial relation between hotels and destinations
Rescore!
• Hotels by recommendations
• Destinations by clicks and hotel count
• POIs by clicks
Don't sort by average rating!
http://www.evanmiller.org/how-not-to-sort-by-average-rating.html
• Score = (Positive ratings) - (Negative ratings)
A(200,100) = 100
B(100,1) = 99
A > B ??
• Score = Average rating = (Positive ratings) / (Total ratings)
A(200,100) = 0.67
B(3,1) = 0.75
B > A ??
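Both naive scores are easy to reproduce. The sketch below simply restates the two formulas from the slides in Python, using the same (positive, negative) pairs. The function names are made up for this illustration.

```python
def net_score(positive: int, negative: int) -> int:
    """Naive score 1: positive ratings minus negative ratings."""
    return positive - negative


def average_rating(positive: int, negative: int) -> float:
    """Naive score 2: share of positive ratings among all ratings."""
    total = positive + negative
    return positive / total if total else 0.0


# Net score: A (two thirds positive) beats B (99 of 101 positive).
print(net_score(200, 100), net_score(100, 1))

# Average rating: B with only four ratings beats A with three hundred.
print(round(average_rating(200, 100), 2), round(average_rating(3, 1), 2))
```

The first score rewards sheer volume, the second ignores how little evidence a handful of ratings provides.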
Maybe use the lower bound of the Wilson score confidence interval for a Bernoulli parameter!
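A minimal Python sketch of that lower bound, following the formula in the linked Evan Miller article. The default z = 1.96 (about 95% confidence) and the choice to return 0 for items without ratings are assumptions of this sketch, not values given in the talk.

```python
import math


def wilson_lower_bound(positive: int, negative: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the share of positive ratings."""
    n = positive + negative
    if n == 0:
        return 0.0  # assumption: unrated items get the lowest possible score
    p_hat = positive / n
    z2 = z * z
    centre = p_hat + z2 / (2 * n)
    margin = z * math.sqrt((p_hat * (1 - p_hat) + z2 / (4 * n)) / n)
    return (centre - margin) / (1 + z2 / n)


for name, pos, neg in [("A", 200, 100), ("B", 100, 1), ("B'", 3, 1)]:
    print(name, round(wilson_lower_bound(pos, neg), 2))
```

For the examples above this gives roughly 0.61 for A(200,100), 0.95 for B(100,1) and 0.30 for B(3,1): the well-rated item with many ratings wins, and the item with only four ratings drops below A.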
Don't fear the math
Doesn’t custom scoring kill performance?
Yes, it does.
We started with a script score function to compute a better score at search time. Very bad idea: 500 ms to 1 s queries, CPUs screaming for mercy.
Rescoring!
• Generate a search result with ES/Lucene Standard
• Rescore the top 40
• Fetch the top n of that
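One way these three steps could look as an Elasticsearch request body, built here as a plain Python dict. This is a sketch under assumptions: the field names `name` and `wilson_score`, the idea of storing a precomputed score on each document, and both weights are hypothetical and not taken from the talk. Only the `rescore` structure (`window_size`, `rescore_query`, `query_weight`, `rescore_query_weight`) and `function_score` with `field_value_factor` are standard Elasticsearch query DSL.

```python
import json


def build_rescore_request(user_query: str, window_size: int = 40, size: int = 10) -> dict:
    """Build a search body: cheap text match first, custom rescoring of the top hits after."""
    return {
        "size": size,  # step 3: fetch the top n of the rescored window
        # Step 1: standard ES/Lucene text match ("name" is a hypothetical field).
        "query": {"match": {"name": user_query}},
        # Step 2: rescore only the best `window_size` hits (applied per shard).
        "rescore": {
            "window_size": window_size,
            "query": {
                "rescore_query": {
                    "function_score": {
                        # Assumption: a precomputed score stored at index time.
                        "field_value_factor": {"field": "wilson_score", "missing": 0}
                    }
                },
                "query_weight": 1.0,          # hypothetical weight of the text score
                "rescore_query_weight": 2.0,  # hypothetical weight of the custom score
            },
        },
    }


print(json.dumps(build_rescore_request("Da Vinci Paris"), indent=2))
```

The expensive part now runs on at most 40 documents per shard instead of on every match, which is the point of moving from a search-time script score to rescoring.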
Protect yourself against regressions
Testing
Regression Testing
• Record ~4500 searches users did that brought in money
• Generate tests that make sure for each search term the relevant result is in the result set
• Define a threshold for OK (qualitative tests)
• Execute on CI!
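The steps above can be sketched as a small Python check. Everything here is illustrative: the recorded searches, the document ids, the in-memory `fake_search` stand-in and the threshold values are made up. A real setup would call the search service and load the recorded searches from storage.

```python
from typing import Callable, List, Sequence, Tuple

# (search term, id of the result the user ended up booking)
RecordedSearch = Tuple[str, str]


def hit_rate(recorded: Sequence[RecordedSearch],
             search: Callable[[str], List[str]],
             top_n: int = 10) -> float:
    """Share of recorded searches whose expected result is in the top_n hits."""
    if not recorded:
        return 1.0
    hits = sum(1 for term, expected in recorded if expected in search(term)[:top_n])
    return hits / len(recorded)


def passes_regression(recorded: Sequence[RecordedSearch],
                      search: Callable[[str], List[str]],
                      threshold: float = 0.95,
                      top_n: int = 10) -> bool:
    """True if the hit rate reaches the agreed threshold for OK."""
    return hit_rate(recorded, search, top_n) >= threshold


# Hypothetical stand-in for the real search backend.
FAKE_INDEX = {
    "paris": ["dest-paris-fr", "dest-paris-tx"],
    "da vinci paris": ["hotel-da-vinci-paris"],
}


def fake_search(term: str) -> List[str]:
    return FAKE_INDEX.get(term.lower(), [])


RECORDED = [
    ("Paris", "dest-paris-fr"),
    ("Da Vinci Paris", "hotel-da-vinci-paris"),
    ("Rome", "dest-rome-it"),  # not in the fake index, so this one misses
]

print(hit_rate(RECORDED, fake_search))
print(passes_regression(RECORDED, fake_search, threshold=0.6))
```

Using a threshold instead of requiring every single search to pass keeps the suite useful on CI even when a scoring change trades a few searches for many improvements.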
“Unless you're a library, you should use additional real-world data for scoring” - me