Faceting with Lucene
Block Join Query
Agenda
1. Why do we need special faceting for Block Join queries?
2. Proposed Block Join facet component.
Introducing myself
Oleg Savrasov, PhD
A programmer
Working for Grid Dynamics (griddynamics.com)
Living and working in Saint Petersburg, Russia
Online shopping
Jerrica is looking for a dress
Huge number of dresses
Facet filters help
[Figure: shop page with facet filters applied, showing a reduced number of dresses]
Tasks to be solved
● Performant Search
● Facet calculation/filtering
FacetComponent ?
A product has many SKUs
Aggregated facet counts
Facets should count products, not SKUs.
Expected facets:
COLOR
Blue : 1
Red : 1
SIZE
S : 1
M : 1
Flat documents don’t help
False positive match for +COLOR:Blue +SIZE:M
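The false positive can be reproduced with a small sketch. The SKU values below are an assumption, chosen to be consistent with the facet counts shown on these slides (one product with SKUs Blue/S, Red/S and Red/M); the matching functions are simplified stand-ins for a conjunctive query, not Lucene code.

```python
# Assumed SKUs of a single product (illustrative data, not taken from the slides).
skus = [
    {"COLOR": "Blue", "SIZE": "S"},
    {"COLOR": "Red", "SIZE": "S"},
    {"COLOR": "Red", "SIZE": "M"},
]

# Flattening: the product document gets multi-valued fields holding every SKU value.
flat_product = {
    "COLOR": {sku["COLOR"] for sku in skus},
    "SIZE": {sku["SIZE"] for sku in skus},
}


def matches_flat(doc, color, size):
    """Simplified +COLOR:x +SIZE:y match against a flattened document."""
    return color in doc["COLOR"] and size in doc["SIZE"]


def matches_any_sku(sku_list, color, size):
    """Ground truth: does a single SKU carry both values?"""
    return any(s["COLOR"] == color and s["SIZE"] == size for s in sku_list)


flat_hit = matches_flat(flat_product, "Blue", "M")
real_hit = matches_any_sku(skus, "Blue", "M")
print(flat_hit, real_hit)
```

The flattened document matches although no single SKU is both Blue and M, because flattening loses which color goes with which size.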
Separate SKU documents
q = *:*
facet.field = COLOR
facet.field = SIZE
COLOR
Blue : 1
Red : 2
SIZE
S : 2
M : 1
Wrong numbers! There is only one product.
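A minimal sketch of why per-SKU faceting overcounts. The three SKU documents are assumed data that happen to reproduce the counts above; the product identifier is hypothetical.

```python
from collections import Counter

# Assumed SKU documents, all belonging to the same (hypothetical) product.
sku_docs = [
    {"product": "dress-1", "COLOR": "Blue", "SIZE": "S"},
    {"product": "dress-1", "COLOR": "Red", "SIZE": "S"},
    {"product": "dress-1", "COLOR": "Red", "SIZE": "M"},
]


def facet_counts(docs, field):
    """Plain field faceting: one count per matching document."""
    return Counter(doc[field] for doc in docs)


color_counts = facet_counts(sku_docs, "COLOR")
size_counts = facet_counts(sku_docs, "SIZE")
product_total = len({doc["product"] for doc in sku_docs})
print(dict(color_counts), dict(size_counts), product_total)
```

Red and S are each counted twice even though only one product exists, which is the mismatch the slide points out.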
Search products only
q = *:*
fq = scope:product
facet.field = COLOR
facet.field = SIZE
COLOR : 0
SIZE : 0
No such fields in product documents.
Solr Block Join Support (since Lucene 3.4.0)
[Figure: documents indexed in blocks and ordered by docId. Each block holds child documents (Green, Blue, Yellow) followed by their parent Product document.]
Query: {!parent which="scope:product"}COLOR:Blue
[Figure: bitsets over docIds for scope:product (parent docs), COLOR:Blue (matching child docs) and the resulting ToParentQuery (matching parent docs).]
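The join in the figure can be sketched with plain Python lists standing in for bitsets. This is an illustration of the idea only, not Lucene's implementation; the index layout (three blocks, children first, parent last) and the helper name `to_parent` are assumptions.

```python
def to_parent(parent_bits, child_match_bits):
    """Mark every parent that has at least one matching child in its block."""
    result = [0] * len(parent_bits)
    pending = False  # True once a child in the current block has matched
    for doc_id, is_parent in enumerate(parent_bits):
        if is_parent:
            if pending:
                result[doc_id] = 1
            pending = False  # the next docId starts a new block
        elif child_match_bits[doc_id]:
            pending = True
    return result


# Assumed layout: None marks a parent (product) document, strings are child colors.
colors = ["Green", "Blue", "Yellow", None,
          "Green", "Yellow", None,
          "Blue", "Yellow", None]
parent_bits = [1 if c is None else 0 for c in colors]     # scope:product
child_match = [1 if c == "Blue" else 0 for c in colors]   # COLOR:Blue
parents = to_parent(parent_bits, child_match)
matching_parent_ids = [i for i, bit in enumerate(parents) if bit]
print(matching_parent_ids)
```

Because children are stored immediately before their parent, a single forward pass over docIds is enough to map child matches to parents.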
SOLR-5743 Faceting with Block Join support
● Create BlockJoinFacetComponent
● Only DocValues fields are supported
● Facet counts should correspond to the number of parent documents
● ToParentQuery is expected
Faceting over DocSet slices
[Figure: the same block layout ordered by docId (child documents Green, Blue, Yellow followed by their parent Product document), with the DocSet bits of matched documents; one block is highlighted as a DocSet slice.]
DocSet slice counts
COLOR Blue : 2
Aggregated counts
COLOR Blue : +1
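The slice-based aggregation can be sketched as follows: count facet values inside each block's slice of matched child documents, then add one to the aggregated count per distinct value per parent. This is a simplified illustration, not the BlockJoinFacetCollector code; the index layout and the function name are assumptions.

```python
from collections import Counter


def block_join_facets(values, parent_bits, child_docset):
    """Count, per facet value, the parents owning a matched child with that value."""
    totals = Counter()
    slice_counts = Counter()  # per-child counts inside the current slice
    for doc_id, is_parent in enumerate(parent_bits):
        if is_parent:
            # End of the slice: every value seen contributes +1, however often it occurred.
            totals.update(slice_counts.keys())
            slice_counts.clear()
        elif doc_id in child_docset:
            slice_counts[values[doc_id]] += 1
    return totals


# Assumed layout: None marks a parent document; the first block has two Blue children.
colors = ["Blue", "Blue", "Yellow", None,
          "Green", "Yellow", None,
          "Blue", "Green", None]
parent_bits = [1 if c is None else 0 for c in colors]
child_docset = {i for i, c in enumerate(colors) if c is not None}  # all children matched

aggregated = block_join_facets(colors, parent_bits, child_docset)
per_child = Counter(c for c in colors if c is not None)
print(dict(aggregated), dict(per_child))
```

In the first block the slice count for Blue is 2, yet the aggregated count grows by only 1, matching the numbers on the slide.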
Block Join Facet Component
BlockJoinFacetCollector
Facets counting
It works!
q = {!parent which="scope:product"}COLOR:Blue
child.facet.field = SIZE
<response>
  ...
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="SIZE">
        <int name="S">14</int>
        <int name="L">22</int>
        <int name="XL">17</int>
      </lst>
    </lst>
  </lst>
</response>
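A client-side sketch of the same exchange, using only the Python standard library: it encodes the two request parameters shown above and parses a response of the shape shown. The sample XML is a trimmed copy of the slide's response, `wt=xml` is added on the assumption that the XML response writer is wanted, and the helper name `parse_facets` is hypothetical. No request is sent, since the host, core and handler configuration are not given in the slides.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Request parameters from the slide, plus wt=xml (assumption).
params = [
    ("q", '{!parent which="scope:product"}COLOR:Blue'),
    ("child.facet.field", "SIZE"),
    ("wt", "xml"),
]
query_string = urlencode(params)  # URL-encodes braces, quotes and spaces

# Trimmed copy of the response shown on the slide.
sample_response = """<response>
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="SIZE">
        <int name="S">14</int>
        <int name="L">22</int>
        <int name="XL">17</int>
      </lst>
    </lst>
  </lst>
</response>"""


def parse_facets(xml_text):
    """Return {field: {value: count}} from a facet_counts/facet_fields section."""
    root = ET.fromstring(xml_text)
    path = "./lst[@name='facet_counts']/lst[@name='facet_fields']/lst"
    return {
        field.get("name"): {
            item.get("name"): int(item.text) for item in field.findall("int")
        }
        for field in root.findall(path)
    }


facets = parse_facets(sample_response)
print(query_string)
print(facets)
```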
The dress is found
Further improvements
● Thorough profiling
● Performance improvements
● Algorithmic improvements
References
http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene
http://blog.mikemccandless.com/2012/01/searching-relational-content-with.html
http://blog.griddynamics.com/2013/09/solr-block-join-support.html
Big thanks!
Do you have any questions?
Please vote for SOLR-5743.