How to use Slop with Phrase Search in Elasticsearch 6
Introduction
This step-by-step tutorial explains how to use the slop parameter with Phrase Search in Elasticsearch. The slop parameter lets a Phrase Search return matches that are close, but not identical, to the phrase you typed. These broader results can surface varieties of a product that an exact phrase would miss, which may help you streamline inventory management and make better marketing decisions.
If you’re already familiar with how slop and Phrase Search work together, skip ahead to the Just the Code section at the end.
How Slop Works
First, let’s set up a sample dataset. We have a store index with a products type that holds a list of grocery items. Each product has an “id,” a “name,” a “price,” the “quantity” available, and the “department” where the product is located.
| id | name | price | quantity | department |
| --- | --- | --- | --- | --- |
| 1 | Multi-Grain Cereal | 4.99 | 4 | Packaged Foods |
| 2 | 1lb Ground Beef | 3.99 | 29 | Meat and Seafood |
| 3 | Dozen Apples | 2.49 | 12 | Produce |
| 4 | Chocolate Bar | 1.29 | 2 | Packaged Foods, Checkout |
| 5 | 1 Gallon Milk | 3.29 | 16 | Dairy |
| 6 | 0.5lb Jumbo Shrimp | 5.29 | 12 | Meat and Seafood |
| 7 | Wheat Bread | 1.29 | 5 | Bakery |
| 8 | Pepperoni Pizza | 2.99 | 5 | Frozen |
| 9 | 12 Pack Cola | 5.29 | 6 | Packaged Foods |
| 10 | Lime Juice | 0.99 | 20 | Produce |
| 11 | 12 Pack Cherry Cola | 5.599 | 5 | Packaged Foods |
Here’s the mapping we created for the index:
    $ curl -H "Content-Type: application/json" -XPUT 127.0.0.1:9200/store -d '
    {
      "mappings": {
        "products": {
          "properties": {
            "name": { "type": "text" },
            "price": { "type": "double" },
            "quantity": { "type": "integer" },
            "department": { "type": "keyword" }
          }
        }
      }
    }'
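The tutorial does not show how the products were loaded into the index. As a minimal sketch, assuming a local Elasticsearch 6 node at 127.0.0.1:9200, the three milk products could be indexed through the bulk endpoint. The field values are copied from the search results shown later in this tutorial; the file name products.ndjson and the function name load_products are illustrative choices, not part of the original setup.

```shell
# Write a bulk payload: one action line followed by one document line per product.
# The values mirror the _source objects in the search results later in this tutorial.
cat > products.ndjson <<'EOF'
{"index":{"_id":"5"}}
{"id":"5","name":"1 Gallon Milk","price":3.29,"quantity":16,"department":["Dairy"]}
{"index":{"_id":"12"}}
{"id":"12","name":"1 Gallon Soy Milk","price":3.39,"quantity":10,"department":["Dairy"]}
{"index":{"_id":"13"}}
{"id":"13","name":"1 Gallon Vanilla Soy Milk","price":3.49,"quantity":9,"department":["Dairy"]}
EOF

# Send the payload to the bulk endpoint (assumed host and port).
# Wrapped in a function so the file can be inspected before anything is sent.
load_products() {
  curl -H "Content-Type: application/x-ndjson" -XPOST \
    "127.0.0.1:9200/store/products/_bulk?pretty" \
    --data-binary @products.ndjson
}
```

Calling load_products once the index exists should add the three documents. The bulk format requires each JSON object on its own line and a trailing newline, which the here-document provides.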
Let’s search for 1 gallon containers of milk. Our full dataset (the table above shows only part of it) has several to choose from:
- 1 Gallon Milk
- 1 Gallon Soy Milk
- 1 Gallon Vanilla Soy Milk
We have another gallon of something, but it’s not milk.
- 1 Gallon Orange Juice
That’s important to note because if we search for “1 Gallon” only, our results will include the “1 Gallon Orange Juice” along with the varieties of milk. Adding slop to a “1 Gallon” Phrase Search would not filter it out either, because slop only loosens how close together the terms must be; it never narrows the results.
Our original goal is to display all of the “1 Gallon Milk” products in our search results. Phrase Search works perfectly for a “1 Gallon Milk” query. We need to do something else though to see all of the available varieties of “1 Gallon Milk.”
Let’s add a slop value of 1. Here’s where the elasticity really shines. A slop value of 1 allows the matched terms to deviate from the phrase by one position. In our dataset, “Soy” happens to sit between “Gallon” and “Milk” in one product name. We didn’t need to know this ahead of time: a slop of 1 lets any single extra word appear inside “1 Gallon Milk,” so the “1 Gallon Soy Milk” product is included in the results.
To get “1 Gallon Vanilla Soy Milk” to appear in our search results, we need a slop value of 2. That’s because “Vanilla” and “Soy” together push “Milk” two positions away from where it sits in our original Phrase Search of “1 Gallon Milk.”
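One way to picture the rule described above is to compare where each query term sits in the product name with where an exact phrase would place it. The sketch below is a simplified model, not Elasticsearch’s actual scorer: it assumes lowercase whitespace tokens, assumes each query term occurs at most once in the name, and uses a hypothetical helper called required_slop.

```shell
# required_slop QUERY DOC
# Prints the smallest slop at which QUERY would match DOC under a simplified model.
# Prints "no match" and returns 1 if any query term is missing from DOC.
# Note: written for plain POSIX sh, so the helper's variables are global.
required_slop() {
  query=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  doc=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  min="" max="" i=0
  for term in $query; do
    # Find the position of this query term in the document.
    pos=-1 j=0
    for word in $doc; do
      if [ "$word" = "$term" ]; then pos=$j; break; fi
      j=$((j + 1))
    done
    if [ "$pos" -lt 0 ]; then echo "no match"; return 1; fi
    # How far the term sits from where an exact phrase would put it.
    offset=$((pos - i))
    if [ -z "$min" ] || [ "$offset" -lt "$min" ]; then min=$offset; fi
    if [ -z "$max" ] || [ "$offset" -gt "$max" ]; then max=$offset; fi
    i=$((i + 1))
  done
  # The spread between the largest and smallest offset is the slop needed.
  echo $((max - min))
}

required_slop "1 Gallon Milk" "1 Gallon Milk"              # prints 0
required_slop "1 Gallon Milk" "1 Gallon Soy Milk"          # prints 1
required_slop "1 Gallon Milk" "1 Gallon Vanilla Soy Milk"  # prints 2
```

Under this model, “1 Gallon Orange Juice” never matches a “1 Gallon Milk” Phrase Search at any slop value, because the term “milk” is missing from the name.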
Let’s start with the single extra term. Here is a Phrase Search query with a slop value of 1, followed by the response:
    $ curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
    {
      "query": {
        "match_phrase": {
          "name": { "query": "1 Gallon Milk", "slop": 1 }
        }
      }
    }'
    {
      "took" : 3,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 2,
        "max_score" : 2.5518394,
        "hits" : [
          {
            "_index" : "store",
            "_type" : "products",
            "_id" : "5",
            "_score" : 2.5518394,
            "_source" : {
              "id" : "5",
              "name" : "1 Gallon Milk",
              "price" : 3.29,
              "quantity" : 16,
              "department" : [ "Dairy" ]
            }
          },
          {
            "_index" : "store",
            "_type" : "products",
            "_id" : "12",
            "_score" : 1.3851595,
            "_source" : {
              "id" : "12",
              "name" : "1 Gallon Soy Milk",
              "price" : 3.39,
              "quantity" : 10,
              "department" : [ "Dairy" ]
            }
          }
        ]
      }
    }
The Phrase Search used with a slop value of 2 returns one additional result.
    $ curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
    {
      "query": {
        "match_phrase": {
          "name": { "query": "1 Gallon Milk", "slop": 2 }
        }
      }
    }'
    {
      "took" : 0,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 3,
        "max_score" : 2.5518394,
        "hits" : [
          {
            "_index" : "store",
            "_type" : "products",
            "_id" : "5",
            "_score" : 2.5518394,
            "_source" : {
              "id" : "5",
              "name" : "1 Gallon Milk",
              "price" : 3.29,
              "quantity" : 16,
              "department" : [ "Dairy" ]
            }
          },
          {
            "_index" : "store",
            "_type" : "products",
            "_id" : "12",
            "_score" : 1.3851595,
            "_source" : {
              "id" : "12",
              "name" : "1 Gallon Soy Milk",
              "price" : 3.39,
              "quantity" : 10,
              "department" : [ "Dairy" ]
            }
          },
          {
            "_index" : "store",
            "_type" : "products",
            "_id" : "13",
            "_score" : 1.0879787,
            "_source" : {
              "id" : "13",
              "name" : "1 Gallon Vanilla Soy Milk",
              "price" : 3.49,
              "quantity" : 9,
              "department" : [ "Dairy" ]
            }
          }
        ]
      }
    }
All gallons of milk showed up in our search results because we used Phrase Search along with a slop value of 2. That’s two terms of deviation from the original Phrase Search.
A benefit of using slop with Phrase Search is that the relevancy score still reflects how close each result is to your original query. For example, we entered a Phrase Search of “1 Gallon Milk.” The exact match has a relevancy of "_score" : 2.5518394, which is higher than “1 Gallon Soy Milk” with "_score" : 1.3851595. Lowest of all, with "_score" : 1.0879787, is “1 Gallon Vanilla Soy Milk” because it is two terms away from the original Phrase Search.
As you can see, slop is useful in many circumstances. Use slop when you want to find terms that are close together without requiring them to sit exactly side by side.
You might want to give longer text such as product descriptions a high slop value, maybe even 100. The relevancy score would still rank results containing your exact phrase highest, yet you would also see results in which those terms appear farther apart.
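To experiment with different fields and slop values, a small helper can build the request body. This is only a sketch: phrase_query is a hypothetical function name, “description” is a hypothetical field that the mapping in this tutorial does not define, and the helper assumes the phrase contains no double quotes or backslashes because it does no JSON escaping.

```shell
# phrase_query FIELD PHRASE SLOP
# Prints a match_phrase request body for the given field, phrase, and slop.
# Assumes PHRASE needs no JSON escaping (no double quotes or backslashes).
phrase_query() {
  printf '{"query":{"match_phrase":{"%s":{"query":"%s","slop":%d}}}}\n' "$1" "$2" "$3"
}

# Build a high-slop query against the hypothetical "description" field.
phrase_query description "gallon milk" 100

# To send it to a local node (assumed at 127.0.0.1:9200), something like:
# curl -H "Content-Type: application/json" -XGET \
#   "127.0.0.1:9200/store/products/_search?pretty" \
#   -d "$(phrase_query description "gallon milk" 100)"
```

Keeping the body in one function makes it easy to rerun the same phrase with slop values of 0, 1, 2, and 100 and compare how the result count and scores change.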
Conclusion
In this tutorial, you learned how to use slop with Phrase Search in Elasticsearch. The slop parameter offers flexibility in searching because it lets results deviate from your original Phrase Search while still containing all of its terms. What’s more, when you add a high slop value to a Phrase Search over longer content, such as product descriptions, you get back data that you can analyze for future decisions in areas like marketing and inventory management. Learn more about slop by reading Elasticsearch’s documentation.
Just the Code
Here’s the code example illustrating Phrase Search with a slop value of 2 (use match_phrase when the order and proximity of the words in a multi-word search term matter).
    $ curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
    {
      "query": {
        "match_phrase": {
          "name": { "query": "1 Gallon Milk", "slop": 2 }
        }
      }
    }'