How to use Pagination on a Query in Elasticsearch 6 using Curl
Introduction
When you’re doing a search in Elasticsearch, you may have a large number of results returned from your query. It can be hard to wade through such a lengthy result set; fortunately, Elasticsearch provides the ability to limit the number of results your query returns through pagination. In this step-by-step tutorial, you’ll learn how to implement pagination on a query in Elasticsearch using curl. If you’re already familiar with the concept of pagination and would prefer to dive into the sample code, feel free to skip ahead to Just the Code.
Use pagination with the "from" and "size" parameters
Let’s look at an example of using pagination on an Elasticsearch query. For our example, we’ll create a sample index called store, which represents a small grocery store. Our store index contains a type called products, which lists all of the store’s products. We’ll keep our dataset simple by including just a handful of products with a small number of fields: id, name, price, quantity, and department. The table below shows our dataset:
id | name | price | quantity | department |
---|---|---|---|---|
1 | Multi-Grain Cereal | 4.99 | 4 | Packaged Foods |
2 | 1lb Ground Beef | 3.99 | 29 | Meat and Seafood |
3 | Dozen Apples | 2.49 | 12 | Produce |
4 | Chocolate Bar | 1.29 | 2 | Packaged Foods, Checkout |
5 | 1 Gallon Milk | 3.29 | 16 | Dairy |
6 | 0.5lb Jumbo Shrimp | 5.29 | 12 | Meat and Seafood |
7 | Wheat Bread | 1.29 | 5 | Bakery |
8 | Pepperoni Pizza | 2.99 | 5 | Frozen |
9 | 12 Pack Cola | 5.29 | 6 | Packaged Foods |
10 | Lime Juice | 0.99 | 20 | Produce |
11 | 12 Pack Cherry Cola | 5.59 | 5 | Packaged Foods |
12 | 1 Gallon Soy Milk | 3.39 | 10 | Dairy |
13 | 1 Gallon Vanilla SoyMilk | 3.49 | 9 | Dairy |
14 | 1 Gallon Orange Juice | 3.29 | 4 | Juice |
The following code shows the mapping:
    curl -H "Content-Type: application/json" -XPUT "127.0.0.1:9200/store" -d '
    {
      "mappings": {
        "products": {
          "properties": {
            "name": { "type": "text" },
            "price": { "type": "double" },
            "quantity": { "type": "integer" },
            "department": { "type": "keyword" }
          }
        }
      }
    }
    '
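The tutorial lists the documents but does not show how they were loaded. One possible approach, sketched here as an assumption rather than the method actually used, is the _bulk API. The helper name bulk_payload is hypothetical, only the first two rows of the table are included, and the curl call is left commented out so the snippet runs without a cluster.

```shell
#!/bin/sh
# Hypothetical helper: prints a newline-delimited JSON payload for the _bulk API.
# Each document takes two lines: an action line, then the document itself.
bulk_payload() {
  cat <<'EOF'
{"index":{"_id":"1"}}
{"id":"1","name":"Multi-Grain Cereal","price":4.99,"quantity":4,"department":["Packaged Foods"]}
{"index":{"_id":"2"}}
{"id":"2","name":"1lb Ground Beef","price":3.99,"quantity":29,"department":["Meat and Seafood"]}
EOF
}

# Show the payload that would be sent.
bulk_payload

# To load it into the store index (assumes Elasticsearch 6 on 127.0.0.1:9200):
# bulk_payload | curl -H "Content-Type: application/x-ndjson" \
#   -XPOST "127.0.0.1:9200/store/products/_bulk?pretty" --data-binary @-
```

The remaining rows would follow the same two-line pattern. The --data-binary flag matters here because the bulk format depends on the newlines being sent unchanged.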
Now that we’ve created our dataset, we can put together a query. Let’s imagine we want to query for all of our store’s products where the price is at least $1.00 (the "gte" operator means "greater than or equal to"). While our sample dataset is small, this query would return a huge number of results for a real store:
    curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
    {
      "query": {
        "range": {
          "price": {
            "gte": 1.00
          }
        }
      }
    }
    '

    {
      "took" : 13,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 13,
        ...
        ...
        ...
We’ve trimmed down the results for size, but you can see from the total hits that our query matched 13 of the 14 documents in our dataset, which is nearly the entire dataset. In a real-world situation, the result set could be enormous. To illustrate how we can limit the returned results to a more reasonable size, we’ll use pagination to look at only the first two results. To accomplish this, all we need to do is set the "from" and "size" parameters when we search. The code shown below may seem a bit complex, but the explanations that follow will clarify what’s going on:
    curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
    {
      "from": 0,
      "size": 2,
      "query": {
        "range": {
          "price": {
            "gte": 1.00
          }
        }
      }
    }
    '
Let’s take a closer look at what we just did in the code. We’ll begin with the "size" parameter, which we set to 2. This simply tells Elasticsearch how many results to return. The "from" parameter is a little more involved: it tells Elasticsearch how many results to skip, which is how you move beyond the first page of results. It’s important to know that the "from" parameter is zero-based. This means that the first result is considered to be in position 0, the second result is in position 1, and so forth. If you have some experience with programming, this is probably a very familiar concept; otherwise, it may seem a bit counterintuitive. In this context, the zero-based approach means that you’d specify "from": 0 if you want Elasticsearch to return results starting from the very beginning. If you specified "from": 1, Elasticsearch would skip the first result and return results beginning with the second one.
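Because "from" is a zero-based offset rather than a page number, it can help to compute it from a page number. The following is a minimal sketch under that assumption; the helper names page_from and search_body are hypothetical, and pages are numbered starting at 1.

```shell
#!/bin/sh
# Hypothetical helper: converts a 1-based page number and a page size
# into the zero-based "from" offset.
page_from() {
  page=$1
  size=$2
  echo $(( (page - 1) * size ))
}

# Hypothetical helper: prints the search body for a given page.
search_body() {
  page=$1
  size=$2
  from=$(page_from "$page" "$size")
  printf '{"from": %d, "size": %d, "query": {"range": {"price": {"gte": 1.00}}}}\n' "$from" "$size"
}

search_body 1 2   # first page: "from" is 0
search_body 2 2   # second page: "from" is 2

# To run a page against a cluster (assumes Elasticsearch 6 on 127.0.0.1:9200):
# curl -H "Content-Type: application/json" -XGET \
#   "127.0.0.1:9200/store/products/_search?pretty" -d "$(search_body 2 2)"
```

Keeping the arithmetic in one small function means the rest of a script can think in page numbers while Elasticsearch still receives an offset.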
Note: Keep in mind that the default for "from" is 0, and the default for "size" is 10. Because of these default values, you don’t have to worry that thousands of results will be returned if you accidentally omit the "from" and "size" parameters; you’ll simply get the first 10 results.
Now that we’ve reviewed our query, let’s take a look at the results:
    curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
    {
      "from": 0,
      "size": 2,
      "query": {
        "range": {
          "price": {
            "gte": 1.00
          }
        }
      }
    }
    '

    {
      "took" : 16,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 13,
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "store",
            "_type" : "products",
            "_id" : "14",
            "_score" : 1.0,
            "_source" : {
              "id" : "14",
              "name" : "1 Gallon Orange Juice",
              "price" : 3.29,
              "quantity" : 4,
              "department" : [
                "Juice"
              ]
            }
          },
          {
            "_index" : "store",
            "_type" : "products",
            "_id" : "5",
            "_score" : 1.0,
            "_source" : {
              "id" : "5",
              "name" : "1 Gallon Milk",
              "price" : 3.29,
              "quantity" : 16,
              "department" : [
                "Dairy"
              ]
            }
          }
        ]
      }
    }
As you can see, our query successfully returned only the first two results. If we wanted the next two results after that (in other words, the second page of results), we’d keep "size": 2 and change the offset to "from": 2. This tells Elasticsearch that we’ve already seen results 0 and 1, so it can skip those and start with the third result, which is in position 2. Let’s run this query and make sure we get the expected results:
    curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
    {
      "from": 2,
      "size": 2,
      "query": {
        "range": {
          "price": {
            "gte": 1.00
          }
        }
      }
    }
    '

    {
      "took" : 2,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 13,
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "store",
            "_type" : "products",
            "_id" : "8",
            "_score" : 1.0,
            "_source" : {
              "id" : "8",
              "name" : "Pepperoni Pizza",
              "price" : 2.99,
              "quantity" : 5,
              "department" : [
                "Frozen"
              ]
            }
          },
          {
            "_index" : "store",
            "_type" : "products",
            "_id" : "9",
            "_score" : 1.0,
            "_source" : {
              "id" : "9",
              "name" : "12 Pack Cola",
              "price" : 5.29,
              "quantity" : 6,
              "department" : [
                "Packaged Foods"
              ]
            }
          }
        ]
      }
    }
As we expected, we got the next page of results. With this type of query, it’s easy to page through a large set of results by increasing the "from" parameter by the value of "size" for each new page.
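To make the paging arithmetic concrete, here is a small sketch that lists every "from" value needed to walk through a result set. The helper name page_offsets is hypothetical; the numbers 13 and 2 are the total hits and page size from the example above.

```shell
#!/bin/sh
# Hypothetical helper: prints every "from" offset needed to page through
# `total` hits, `size` hits at a time.
page_offsets() {
  total=$1
  size=$2
  from=0
  while [ "$from" -lt "$total" ]; do
    echo "$from"
    from=$(( from + size ))
  done
}

# 13 hits, 2 per page: seven requests, the last one returning a single hit.
page_offsets 13 2
```

Each printed value could be dropped into the "from" field of the query shown earlier. Note that Elasticsearch rejects requests where "from" plus "size" exceeds the index.max_result_window setting (10,000 by default), so this style of paging is best suited to the first pages of a result set.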
Note: You might have noticed that the results did not come back in the order in which they were indexed. For additional information on sorting in Elasticsearch, consult the Elasticsearch documentation.
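As an illustration of how a sort could be combined with pagination, the sketch below builds a request body that orders hits by price and then by quantity. This is an assumption about what a reader might want, not part of the original example; the helper name sorted_body and the choice of sort fields are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints a paginated search body with an explicit sort,
# so the order of pages is determined by the listed fields.
sorted_body() {
  from=$1
  size=$2
  printf '{"from": %d, "size": %d, "sort": [{"price": "asc"}, {"quantity": "asc"}], "query": {"range": {"price": {"gte": 1.00}}}}\n' "$from" "$size"
}

# Body for the first page, cheapest products first.
sorted_body 0 2

# To send it (assumes Elasticsearch 6 on 127.0.0.1:9200):
# curl -H "Content-Type: application/json" -XGET \
#   "127.0.0.1:9200/store/products/_search?pretty" -d "$(sorted_body 0 2)"
```

With an explicit sort, consecutive pages follow a predictable order, which makes it easier to check that no result is skipped or repeated between requests.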
Conclusion
There’s no doubt that a large result set can be difficult for users to process. Pagination allows you to limit your result set to the size of your choice, paging through your results as far as needed. With the step-by-step instructions described in this tutorial, you should have no trouble implementing pagination on a query in Elasticsearch using curl.
Just the Code
If you’re already familiar with the concept of pagination, here’s all the code you need to implement pagination on a query in Elasticsearch using curl:
    curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
    {
      "from": 0,
      "size": 2,
      "query": {
        "range": {
          "price": {
            "gte": 1.00
          }
        }
      }
    }
    '

    curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
    {
      "from": 2,
      "size": 2,
      "query": {
        "range": {
          "price": {
            "gte": 1.00
          }
        }
      }
    }
    '