codeburst

Bursts of code to power through your day. Web Development articles, tutorials, and news.

Elasticsearch by Example: Part 5

Facet queries often allow for multiple selections on a facet; we refactor our solution to accommodate.

This article is part of a series, starting with Elasticsearch by Example: Part 1, exploring the Elasticsearch database / search engine.

In the previous examples, the queries we wrote presumed a single choice per facet; in a user interface this would look like radio buttons or a drop-down list. Many sites with faceted search, however, use checkboxes that allow several choices for a facet. For example, one can choose more than one TV screen size on the BestBuy website.

With this requirement in mind, we need to do some refactoring.

Refactored Search

Refactoring the search is relatively simple. The following query finds small or medium black shirts. The should clause (at least one of the clauses it contains must be satisfied) is the new element in this query.

note: These queries can get fairly long and repetitive; luckily in most situations these queries are dynamically generated through code.

POST: ENDPOINT/shirts/shirt/_search

{
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "should": [
              {
                "nested": {
                  "path": "keyword_facets",
                  "query": {
                    "bool": {
                      "filter": [
                        { "term": { "keyword_facets.facet_name": "size" } },
                        { "term": { "keyword_facets.facet_value": "S" } }
                      ]
                    }
                  }
                }
              },
              {
                "nested": {
                  "path": "keyword_facets",
                  "query": {
                    "bool": {
                      "filter": [
                        { "term": { "keyword_facets.facet_name": "size" } },
                        { "term": { "keyword_facets.facet_value": "M" } }
                      ]
                    }
                  }
                }
              }
            ]
          }
        },
        {
          "nested": {
            "path": "keyword_facets",
            "query": {
              "bool": {
                "filter": [
                  { "term": { "keyword_facets.facet_name": "color" } },
                  { "term": { "keyword_facets.facet_value": "black" } }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
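
Since queries like the one above are usually generated by code, here is a minimal Python sketch of one way to build it. The function names and the shape of the selections argument (facet name mapped to a list of chosen values) are assumptions made for illustration; only the keyword_facets field names come from the query above. It relies on the fact that a bool query containing only should clauses requires at least one of them to match.

```python
def facet_clause(name, value):
    """Nested clause matching a single facet name/value pair."""
    return {
        "nested": {
            "path": "keyword_facets",
            "query": {
                "bool": {
                    "filter": [
                        {"term": {"keyword_facets.facet_name": name}},
                        {"term": {"keyword_facets.facet_value": value}},
                    ]
                }
            },
        }
    }


def facet_filter(name, values):
    """One value: a plain nested clause. Several values: OR them with should."""
    clauses = [facet_clause(name, value) for value in values]
    if len(clauses) == 1:
        return clauses[0]
    return {"bool": {"should": clauses}}


def build_search_body(selections):
    """selections (hypothetical shape) maps facet name -> list of chosen values."""
    filters = [
        facet_filter(name, values)
        for name, values in selections.items()
        if values  # facets with nothing selected add no filter
    ]
    return {"query": {"bool": {"filter": filters}}}


# Small or medium black shirts, as in the query above.
body = build_search_body({"size": ["S", "M"], "color": ["black"]})
```

The resulting body can be serialized with json.dumps and sent as the request body of the search.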

Aggregation Problem

As before, adding the aggregation clause returns the facets and choices (and counts) that are in the result, e.g., color (black, red) and size (S, M, L). The idea is that we could use this data to render the checkboxes in the user interface.

Thinking about our shirt example, say the user clicks on the black color checkbox to only show black shirts. The resulting aggregation block would have all the sizes of the black shirts so that we could re-render the size checkboxes correctly. Maybe there are no small black shirts; so there is no need to show the S option.

The problem, however, is that the only color in the aggregation block would be black (that is the only color in the result). If we re-render the color facet choices using the returned data, we would only show the black checkbox, which prevents the user from selecting more than one color.

It would be nice to have an approach that shows sensible options (eliminating nonsensical ones) and at the same time allows multiple choices on a facet.

Refactoring Aggregation

Buried in the document that inspired this series, On-Site Search Design Patterns for E-Commerce: Schema Structure, Data Driven Ranking & More, there is a link to a Stack Overflow thread that provides a solution.

In the previous shirt example (selecting the black shirt checkbox) the solution involves running two queries:

  • As before, return the list of all black shirts with an aggregation query.
  • Return the shirts without using any filters on color (in this case, that is all shirts) with an aggregation query.

The first query would be used for:

  • Building the list of matching items.
  • Building the list of options for the size facet.

The second query would be used for:

  • Building the list of options for the color facet.

Thinking about this specific case, the second query's aggregation result would include both black and red, since shirts of both colors are in its result.
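
To make the two-query idea concrete, the following sketch imitates it in plain Python over a small in-memory list rather than against Elasticsearch. The shirt data and helper names are hypothetical and exist only for illustration; the point is which result set each facet's options are counted from.

```python
from collections import Counter

# Hypothetical sample data: note there is no small black shirt.
SHIRTS = [
    {"color": "black", "size": "M"},
    {"color": "black", "size": "L"},
    {"color": "red", "size": "S"},
    {"color": "red", "size": "M"},
]


def matches(shirt, selections):
    """True if the shirt satisfies every facet that has selections."""
    return all(
        shirt[facet] in values
        for facet, values in selections.items()
        if values
    )


def facet_counts(shirts, facet):
    """Stand-in for an aggregation: count shirts per value of one facet."""
    return dict(Counter(shirt[facet] for shirt in shirts))


selections = {"color": ["black"]}

# Query 1: all selections applied. Gives the hits and the size options.
hits = [shirt for shirt in SHIRTS if matches(shirt, selections)]
size_options = facet_counts(hits, "size")

# Query 2: same selections minus the color filter. Gives the color options.
without_color = {f: v for f, v in selections.items() if f != "color"}
unfiltered = [shirt for shirt in SHIRTS if matches(shirt, without_color)]
color_options = facet_counts(unfiltered, "color")
```

With this data, size_options contains only M and L (there is no small black shirt), while color_options still contains both black and red, so the red checkbox stays available.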

The general pattern is that for N facets with selections we need N+1 queries: one query that applies every selection, plus one query per selected facet that applies every selection except that facet's own.

The first query is used for:

  • Building the list of matching items.
  • Building the list of options for facets that do not have selections.

The remaining queries (one per facet with selections) are used for:

  • Building the list of options for that specific facet.

A general observation is that options only appear in an aggregation result if there is at least one item that would match that option and the remaining criteria (excluding criteria involving the respective facet).
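
The N+1 pattern can be sketched as a function that produces all the request bodies at once. This is an illustrative sketch under assumptions: the function names and the plan layout are hypothetical, and the aggs argument stands in for whatever aggregation clause is already in use, passed through unchanged. The per-facet queries set size to 0 because only their aggregation results are needed, not their hits.

```python
def facet_filter(name, values):
    """OR together the chosen values of one facet (nested keyword_facets)."""
    clauses = [
        {"nested": {"path": "keyword_facets", "query": {"bool": {"filter": [
            {"term": {"keyword_facets.facet_name": name}},
            {"term": {"keyword_facets.facet_value": value}},
        ]}}}}
        for value in values
    ]
    return {"bool": {"should": clauses}}


def search_body(selections, aggs, size=None):
    """Build one request body; match everything when no filters remain."""
    filters = [facet_filter(name, values) for name, values in selections.items()]
    query = {"bool": {"filter": filters}} if filters else {"match_all": {}}
    body = {"query": query, "aggs": aggs}
    if size is not None:
        body["size"] = size
    return body


def build_query_plan(selections, aggs):
    """Return the main query plus one query per facet that has selections."""
    active = {name: values for name, values in selections.items() if values}
    plan = {"main": search_body(active, aggs), "per_facet": {}}
    for name in active:
        # Apply every selection except this facet's own.
        others = {n: v for n, v in active.items() if n != name}
        plan["per_facet"][name] = search_body(others, aggs, size=0)
    return plan


# aggs is a placeholder here; substitute the real aggregation clause.
plan = build_query_plan({"size": ["S", "M"], "color": ["black"]}, aggs={})
```

For the two selected facets above, the plan holds three bodies: the main one drives the item list and the options of facets without selections, and each per-facet body drives the options of its own facet.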

Wrap Up

We now have a fully-functional scalable strategy for doing faceted searches using the Elasticsearch database / search engine.

While there is more to mine from the article that inspired this series, On-Site Search Design Patterns for E-Commerce: Schema Structure, Data Driven Ranking & More, I will leave that for another time and place.

Published in codeburst

Written by John Tucker

Broad infrastructure, development, and soft-skill background
