How to use Slop with Phrase Search in Elasticsearch 6 using NodeJS
Introduction
This step-by-step tutorial explains how to use the slop parameter with Phrase Search in Elasticsearch. The slop parameter lets you see matches that are close, but not identical, to your actual Phrase Search. This more lenient search helps users who know only a couple of keywords of what they are searching for.
If you’re already familiar with how slop and Phrase Searches work together, skip straight to Just the Code.
How Slop Works
Let’s look at an example that uses an index called store, which represents a small grocery store. This store index contains a type called products, which lists the store’s products. To keep things simple, our example dataset contains only a handful of products with just the following fields: id, name, price, quantity, and department. The table below shows the dataset:
id | name | price | quantity | department |
---|---|---|---|---|
1 | Multi-Grain Cereal | 4.99 | 4 | Packaged Foods |
2 | 1lb Ground Beef | 3.99 | 29 | Meat and Seafood |
3 | Dozen Apples | 2.49 | 12 | Produce |
4 | Chocolate Bar | 1.29 | 2 | Packaged Foods, Checkout |
5 | 1 Gallon Milk | 3.29 | 16 | Dairy |
6 | 0.5lb Jumbo Shrimp | 5.29 | 12 | Meat and Seafood |
7 | Wheat Bread | 1.29 | 5 | Bakery |
8 | Pepperoni Pizza | 2.99 | 5 | Frozen |
9 | 12 Pack Cola | 5.29 | 6 | Packaged Foods |
10 | Lime Juice | 0.99 | 20 | Produce |
11 | 12 Pack Cherry Cola | 5.59 | 5 | Packaged Foods |
12 | 1 Gallon Soy Milk | 3.39 | 10 | Dairy |
13 | 1 Gallon Vanilla Soy Milk | 3.49 | 9 | Dairy |
14 | 1 Gallon Orange Juice | 3.29 | 4 | Juice |
Here is the JSON we used to define the mapping of our index:

```json
{
    "mappings": {
        "products": {
            "properties": {
                "name": { "type": "text" },
                "price": { "type": "double" },
                "quantity": { "type": "integer" },
                "department": { "type": "keyword" }
            }
        }
    }
}
```
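One way to create the index and load the dataset from NodeJS is sketched below. The helper names buildCreateIndexParams and buildBulkBody are made up for this example, and the sketch assumes the same legacy elasticsearch client used in app.js later in this tutorial, whose indices.create and bulk methods take parameter objects shaped like these. Only two rows of the dataset are listed to keep it short, and the client calls are left as comments because they need a running cluster.

```javascript
// Wraps the tutorial's mapping in the parameters for client.indices.create().
function buildCreateIndexParams() {
  return {
    index: "store",
    body: {
      mappings: {
        products: {
          properties: {
            name: { type: "text" },
            price: { type: "double" },
            quantity: { type: "integer" },
            department: { type: "keyword" }
          }
        }
      }
    }
  };
}

// Turns product rows into the alternating action/document entries
// that the bulk API expects.
function buildBulkBody(products) {
  const body = [];
  for (const product of products) {
    body.push({ index: { _index: "store", _type: "products", _id: product.id } });
    body.push(product);
  }
  return body;
}

// Two rows from the dataset table, in the shape shown in the search responses.
const sampleProducts = [
  { id: "5", name: "1 Gallon Milk", price: 3.29, quantity: 16, department: ["Dairy"] },
  { id: "12", name: "1 Gallon Soy Milk", price: 3.39, quantity: 10, department: ["Dairy"] }
];

// Assumed usage with a connected client (not executed here):
//   client.indices.create(buildCreateIndexParams())
//     .then(function () { return client.bulk({ body: buildBulkBody(sampleProducts) }); });
```

Keeping the request bodies in plain functions makes them easy to inspect or log before anything is sent to Elasticsearch.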
Let’s search for 1 gallon containers of milk. There are plenty to choose from according to our dataset. They are:
- 1 Gallon Milk
- 1 Gallon Soy Milk
- 1 Gallon Vanilla Soy Milk
We have another gallon of something, but it’s not milk.
- 1 Gallon Orange Juice
That’s important to note because if we search for “1 Gallon” only, our results will include the “1 Gallon Orange Juice” along with the varieties of milk.
We would also see “1 Gallon Orange Juice” in the results of a Phrase Search for “1 Gallon,” with or without slop, because the product name contains that exact phrase.
Our original goal is to display all of the “1 Gallon Milk” products in our search results. A plain Phrase Search for “1 Gallon Milk” works perfectly for the product with that exact name, but we need to do something else to see all of the available varieties of “1 Gallon Milk.”
Let’s add a slop value of 1. Here’s where the elasticity really shines. A slop value of 1 allows a deviation of one term position. In this case, “Soy” sits between “Gallon” and “Milk” in one product name. We didn’t need to know this ahead of time: a slop value of 1 lets a name with one additional word inside “1 Gallon Milk” show up in the search results. Therefore, the “1 Gallon Soy Milk” product will be included.
To get “1 Gallon Vanilla Soy Milk” to appear in our search results, we need a slop value of 2. That’s because “Vanilla” and “Soy” together put “Milk” two positions away from where the original Phrase Search of “1 Gallon Milk” expects it.
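The arithmetic behind those slop values can be illustrated with a small sketch. This is a simplified model for in-order matches, not Elasticsearch’s actual implementation: it assumes lowercase whitespace tokenization (an approximation of the standard analyzer for these product names) and that each query term appears at most once in the document. The function name requiredSlop is made up for this example.

```javascript
// Lowercase and split on whitespace: a rough stand-in for the analyzer.
function tokenize(text) {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

// Returns the smallest slop at which the phrase would match the document
// under this simplified model, or null if a query term is missing.
function requiredSlop(phrase, docText) {
  const queryTerms = tokenize(phrase);
  const docTerms = tokenize(docText);
  if (queryTerms.length === 0) {
    return null;
  }
  const offsets = [];
  for (let i = 0; i < queryTerms.length; i++) {
    const pos = docTerms.indexOf(queryTerms[i]);
    if (pos === -1) {
      return null; // every term of the phrase must be present
    }
    // How far this term sits from where the phrase expects it.
    offsets.push(pos - i);
  }
  // The spread between the largest and smallest shift is the slop needed.
  return Math.max(...offsets) - Math.min(...offsets);
}

const names = [
  "1 Gallon Milk",
  "1 Gallon Soy Milk",
  "1 Gallon Vanilla Soy Milk",
  "1 Gallon Orange Juice"
];
for (const name of names) {
  console.log(name + " -> " + requiredSlop("1 Gallon Milk", name));
}
```

Under this model the three milk products need a slop of 0, 1, and 2 respectively, and “1 Gallon Orange Juice” never matches because it lacks the term “milk.”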
Let’s revisit the single additional search term option we talked about first. Look at the example of a Phrase Search query used with a slop value of 1:
File: app.js
```javascript
var elasticsearch = require("elasticsearch");

// Connect to a local Elasticsearch node.
var client = new elasticsearch.Client({
  hosts: ["http://localhost:9200"]
});

/* Match Phrase Search With Slop */
client.search({
  index: 'store',
  type: 'products',
  body: {
    "query": {
      "match_phrase": {
        "name": {
          query: "1 Gallon Milk",
          slop: 1
        }
      }
    }
  }
}).then(function(resp) {
  console.log("Successful query!");
  console.log(JSON.stringify(resp, null, 4));
}, function(err) {
  console.trace(err.message);
});
```
Response:
```
$ node app.js
Successful query!
{
    "took": 46,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 2.5518394,
        "hits": [
            {
                "_index": "store",
                "_type": "products",
                "_id": "5",
                "_score": 2.5518394,
                "_source": {
                    "id": "5",
                    "name": "1 Gallon Milk",
                    "price": 3.29,
                    "quantity": 16,
                    "department": [
                        "Dairy"
                    ]
                }
            },
            {
                "_index": "store",
                "_type": "products",
                "_id": "12",
                "_score": 1.3851595,
                "_source": {
                    "id": "12",
                    "name": "1 Gallon Soy Milk",
                    "price": 3.39,
                    "quantity": 10,
                    "department": [
                        "Dairy"
                    ]
                }
            }
        ]
    }
}
```
The same Phrase Search used with a slop value of 2 returns one additional result:
```
$ node app.js
Successful query!
{
    "took": 10,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 3,
        "max_score": 2.5518394,
        "hits": [
            {
                "_index": "store",
                "_type": "products",
                "_id": "5",
                "_score": 2.5518394,
                "_source": {
                    "id": "5",
                    "name": "1 Gallon Milk",
                    "price": 3.29,
                    "quantity": 16,
                    "department": [
                        "Dairy"
                    ]
                }
            },
            {
                "_index": "store",
                "_type": "products",
                "_id": "12",
                "_score": 1.3851595,
                "_source": {
                    "id": "12",
                    "name": "1 Gallon Soy Milk",
                    "price": 3.39,
                    "quantity": 10,
                    "department": [
                        "Dairy"
                    ]
                }
            },
            {
                "_index": "store",
                "_type": "products",
                "_id": "13",
                "_score": 1.3125904,
                "_source": {
                    "id": "13",
                    "name": "1 Gallon Vanilla Soy Milk",
                    "price": 3.49,
                    "quantity": 9,
                    "department": [
                        "Dairy"
                    ]
                }
            }
        ]
    }
}
```
All gallons of milk showed up in our search results because we used Phrase Search along with a slop value of 2. That’s two terms of deviation from the original Phrase Search.
The benefit of using slop with Phrase Search is that you get to see the relevancy score based on your original search query. For example, we entered a Phrase Search of “1 Gallon Milk.” The exact match has a relevancy of "_score": 2.5518394, which is higher than “1 Gallon Soy Milk” with a relevancy of "_score": 1.3851595. Lastly, with an even lower relevancy of "_score": 1.3125904 in the response above, is “1 Gallon Vanilla Soy Milk,” because it is two terms away from the original Phrase Search.
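To compare scores without reading the whole response, the hits can be reduced to name and score pairs. The sketch below uses a hypothetical helper, summarizeHits, and a sample object trimmed from the slop 2 response shown earlier, so it runs without a cluster.

```javascript
// Reduces a search response to the fields needed to compare relevancy.
function summarizeHits(resp) {
  return resp.hits.hits.map(function (hit) {
    return { name: hit._source.name, score: hit._score };
  });
}

// Trimmed copy of the slop: 2 response from this tutorial.
const sampleResponse = {
  hits: {
    total: 3,
    max_score: 2.5518394,
    hits: [
      { _id: "5", _score: 2.5518394, _source: { name: "1 Gallon Milk" } },
      { _id: "12", _score: 1.3851595, _source: { name: "1 Gallon Soy Milk" } },
      { _id: "13", _score: 1.3125904, _source: { name: "1 Gallon Vanilla Soy Milk" } }
    ]
  }
};

// Print one line per hit, best match first (hits arrive sorted by score).
for (const row of summarizeHits(sampleResponse)) {
  console.log(row.score + "  " + row.name);
}
```

In app.js, the same helper could be called on resp inside the then callback in place of the full JSON.stringify dump.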
As you can see, slop is useful in many circumstances. Use slop when you want to find terms that are as close together as possible without requiring them to be exactly side by side.
You might want to give something like product descriptions a high slop value, maybe even 100. The relevancy score would place results with your exact Phrase Search terms higher, yet you would still see all the results that contain those terms.
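A small builder keeps the slop value configurable per field. This is only a sketch: the function name buildPhraseSearch is made up, and the description field and the phrase “organic whole milk” are hypothetical, since the example mapping in this tutorial has no description field.

```javascript
// Builds the parameters for client.search() with a match_phrase query.
function buildPhraseSearch(field, phrase, slop) {
  if (!Number.isInteger(slop) || slop < 0) {
    throw new RangeError("slop must be a non-negative integer");
  }
  const fieldQuery = {};
  fieldQuery[field] = { query: phrase, slop: slop };
  return {
    index: "store",
    type: "products",
    body: { query: { match_phrase: fieldQuery } }
  };
}

// Short product names: a small slop is enough.
const nameSearch = buildPhraseSearch("name", "1 Gallon Milk", 2);

// Longer text (hypothetical "description" field): a generous slop.
const descriptionSearch = buildPhraseSearch("description", "organic whole milk", 100);

console.log(JSON.stringify(descriptionSearch.body, null, 4));

// Assumed usage with a connected client (not executed here):
//   client.search(descriptionSearch).then(function (resp) { /* ... */ });
```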
Conclusion
In this tutorial, you learned how to use slop with Phrase Search in Elasticsearch using NodeJS. The slop parameter offers flexibility in searching because it allows matches that deviate from your original Phrase Search while still containing all of its terms. What’s more, when you add a high slop value to a Phrase Search over longer content, such as product descriptions, you return data that you can analyze for future decision making in marketing and inventory management, for example. Learn more about slop by reading Elasticsearch’s documentation.
Just the Code
Here’s the code example illustrating the slop parameter with a value of 2 and Phrase Search (slop is set on the match_phrase query, which is the query to use for multiple-word phrases).
File: app.js
```javascript
var elasticsearch = require("elasticsearch");

// Connect to a local Elasticsearch node.
var client = new elasticsearch.Client({
  hosts: ["http://localhost:9200"]
});

/* Match Phrase Search With Slop */
client.search({
  index: 'store',
  type: 'products',
  body: {
    query: {
      match_phrase: {
        name: {
          query: "1 Gallon Milk",
          slop: 2
        }
      }
    }
  }
}).then(function(resp) {
  console.log("Successful query!");
  console.log(JSON.stringify(resp, null, 4));
}, function(err) {
  console.trace(err.message);
});
```