How to use Pagination on a Query in Elasticsearch 6 using Curl

Introduction

When you’re doing a search in Elasticsearch, you may have a large number of results returned from your query. It can be hard to wade through such a lengthy result set; fortunately, Elasticsearch provides the ability to limit the number of results your query returns through pagination. In this step-by-step tutorial, you’ll learn how to implement pagination on a query in Elasticsearch using curl. If you’re already familiar with the concept of pagination and would prefer to dive into the sample code, feel free to skip ahead to Just the Code.

Use pagination with the from and size parameters

Let’s look at an example of using pagination on an Elasticsearch query. For our example, we’ll create a sample index called store, which represents a small grocery store. Our store index contains a type called products, which lists all of the store’s products. We’ll keep our dataset simple by including just a handful of products with a small number of fields: id, name, price, quantity, and department. The table below lists our dataset:

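| id | name                     | price | quantity | department               |
|----|--------------------------|-------|----------|--------------------------|
| 1  | Multi-Grain Cereal       | 4.99  | 4        | Packaged Foods           |
| 2  | 1lb Ground Beef          | 3.99  | 29       | Meat and Seafood         |
| 3  | Dozen Apples             | 2.49  | 12       | Produce                  |
| 4  | Chocolate Bar            | 1.29  | 2        | Packaged Foods, Checkout |
| 5  | 1 Gallon Milk            | 3.29  | 16       | Dairy                    |
| 6  | 0.5lb Jumbo Shrimp       | 5.29  | 12       | Meat and Seafood         |
| 7  | Wheat Bread              | 1.29  | 5        | Bakery                   |
| 8  | Pepperoni Pizza          | 2.99  | 5        | Frozen                   |
| 9  | 12 Pack Cola             | 5.29  | 6        | Packaged Foods           |
| 10 | Lime Juice               | 0.99  | 20       | Produce                  |
| 11 | 12 Pack Cherry Cola      | 5.59  | 5        | Packaged Foods           |
| 12 | 1 Gallon Soy Milk        | 3.39  | 10       | Dairy                    |
| 13 | 1 Gallon Vanilla SoyMilk | 3.49  | 9        | Dairy                    |
| 14 | 1 Gallon Orange Juice    | 3.29  | 4        | Juice                    |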
141 Gallon Orange Juice3.294Juice

The following command creates the store index with its mapping:

$ curl -H "Content-Type: application/json" -XPUT 127.0.0.1:9200/store -d '
> {
> "mappings": {
> "products": {
> "properties" : {
> "name": { "type": "text"},
> "price": { "type": "double"},
> "quantity": { "type": "integer"},
> "department": { "type": "keyword"}
> }
> }
> }
> }
> '
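The walkthrough does not show how the documents themselves were indexed. One way it might be done is with the bulk API, sketched below. The document shape (a string id and department as an array) is inferred from the search responses shown later; the file location and the load_products function name are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Assumed scratch location for the bulk request body (not from the article).
BULK_FILE="${TMPDIR:-/tmp}/store_products.ndjson"

# Each document takes two lines: an action line with the _id, then the source.
cat > "$BULK_FILE" <<'EOF'
{"index":{"_id":"1"}}
{"id":"1","name":"Multi-Grain Cereal","price":4.99,"quantity":4,"department":["Packaged Foods"]}
{"index":{"_id":"2"}}
{"id":"2","name":"1lb Ground Beef","price":3.99,"quantity":29,"department":["Meat and Seafood"]}
{"index":{"_id":"3"}}
{"id":"3","name":"Dozen Apples","price":2.49,"quantity":12,"department":["Produce"]}
{"index":{"_id":"4"}}
{"id":"4","name":"Chocolate Bar","price":1.29,"quantity":2,"department":["Packaged Foods","Checkout"]}
{"index":{"_id":"5"}}
{"id":"5","name":"1 Gallon Milk","price":3.29,"quantity":16,"department":["Dairy"]}
{"index":{"_id":"6"}}
{"id":"6","name":"0.5lb Jumbo Shrimp","price":5.29,"quantity":12,"department":["Meat and Seafood"]}
{"index":{"_id":"7"}}
{"id":"7","name":"Wheat Bread","price":1.29,"quantity":5,"department":["Bakery"]}
{"index":{"_id":"8"}}
{"id":"8","name":"Pepperoni Pizza","price":2.99,"quantity":5,"department":["Frozen"]}
{"index":{"_id":"9"}}
{"id":"9","name":"12 Pack Cola","price":5.29,"quantity":6,"department":["Packaged Foods"]}
{"index":{"_id":"10"}}
{"id":"10","name":"Lime Juice","price":0.99,"quantity":20,"department":["Produce"]}
{"index":{"_id":"11"}}
{"id":"11","name":"12 Pack Cherry Cola","price":5.59,"quantity":5,"department":["Packaged Foods"]}
{"index":{"_id":"12"}}
{"id":"12","name":"1 Gallon Soy Milk","price":3.39,"quantity":10,"department":["Dairy"]}
{"index":{"_id":"13"}}
{"id":"13","name":"1 Gallon Vanilla SoyMilk","price":3.49,"quantity":9,"department":["Dairy"]}
{"index":{"_id":"14"}}
{"id":"14","name":"1 Gallon Orange Juice","price":3.29,"quantity":4,"department":["Juice"]}
EOF

# Hypothetical wrapper; call it only when a node is listening on 127.0.0.1:9200.
load_products() {
  curl -H "Content-Type: application/x-ndjson" -XPOST \
    "127.0.0.1:9200/store/products/_bulk?pretty" --data-binary "@$BULK_FILE"
}

echo "Wrote $(grep -c '' "$BULK_FILE") lines to $BULK_FILE"
```

Running load_products after the index has been created should send all 14 documents in a single request. The --data-binary flag is used rather than -d because the bulk format depends on the newlines in the file being preserved.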

Now that we’ve created our dataset, we can put together a query. Let’s imagine we want to query for all of our store’s products where the price is at least $1.00 (the gte operator in the range query means “greater than or equal to”). While our sample dataset is small, this query would return a huge number of results for a real store:

$ curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
> {
> "query": {
> "range": {
> "price": {
> "gte": 1.00
> }
> }
> }
> }
> '

{
"took" : 13,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 13,
...
...
...

We’ve trimmed down the results for size, but you can see from the total hits that our query matched 13 of the 14 documents in our dataset, which is nearly all of it. In a real-world situation, the result set could be enormous. To illustrate how we can limit the returned results to a more reasonable size, we’ll use pagination to look at only the first two results. To accomplish this task, all we need to do is set the from and size parameters when we search. The code shown below may seem a bit complex, but the explanations that follow will clarify what’s going on:

$ curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
> {
> "from": 0,
> "size": 2,
> "query": {
> "range": {
> "price": {
> "gte": 1.00
> }
> }
> }
> }
> '

Let’s take a closer look at what we just did in the code. We’ll begin with the size parameter, which we set to 2. This simply tells Elasticsearch how many results to return. The from parameter is a little more involved. This parameter tells Elasticsearch how many results to skip before it starts returning them, which is what lets you go beyond the first page of results.

It’s important to know that the from parameter is zero-based. This means that the first result is considered to be in position 0, the second result is in position 1, and so forth. If you have some experience with programming, this is probably a very familiar concept; otherwise, it may seem a bit counterintuitive. In this context, the zero-based approach means that you’d specify "from": 0 if you want Elasticsearch to return results starting from the very beginning. If you specified "from": 1, Elasticsearch would skip the first result and return results beginning with the second one.

Note: Keep in mind that the default for "from" is 0, and the default for "size" is 10. Because of these default values, you don’t have to worry that thousands of results will be returned if you accidentally omit the "from" and "size" parameters. You’ll simply get the first 10 results.
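Because from is a zero-based offset rather than a page number, it can help to see the arithmetic spelled out. The sketch below assumes pages are numbered starting at 1; the page_from function name is hypothetical and not part of Elasticsearch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a 1-based page number and a page size
# into the zero-based "from" offset that Elasticsearch expects.
page_from() {
  local page=$1 size=$2
  echo $(( (page - 1) * size ))
}

page_from 1 2   # prints 0 (first page starts at the very first result)
page_from 2 2   # prints 2 (second page skips results 0 and 1)
page_from 3 2   # prints 4
```

With a page size of 2, the first page uses "from": 0, the second "from": 2, and so on.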

Now that we’ve reviewed our query, let’s take a look at the results:

$ curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
> {
> "from": 0,
> "size": 2,
> "query": {
> "range": {
> "price": {
> "gte": 1.00
> }
> }
> }
> }
> '

{
"took" : 16,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 13,
"max_score" : 1.0,
"hits" : [
{
"_index" : "store",
"_type" : "products",
"_id" : "14",
"_score" : 1.0,
"_source" : {
"id" : "14",
"name" : "1 Gallon Orange Juice",
"price" : 3.29,
"quantity" : 4,
"department" : [
"Juice"
]
}
},
{
"_index" : "store",
"_type" : "products",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"id" : "5",
"name" : "1 Gallon Milk",
"price" : 3.29,
"quantity" : 16,
"department" : [
"Dairy"
]
}
}
]
}
}

As you can see, our query successfully returned only the first two results. If we wanted the next two results after that (in other words, the second page of results), we’d set "size": 2 and "from": 2. Our "size" parameter stays the same, but you’ll notice that we now set "from": 2. This tells Elasticsearch that we’ve already seen results 0 and 1, so we can skip those and start with the third result, which is in position 2. Let’s run this query and make sure we get the expected results:

$ curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
> {
> "from": 2,
> "size": 2,
> "query": {
> "range": {
> "price": {
> "gte": 1.00
> }
> }
> }
> }
> '

{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 13,
"max_score" : 1.0,
"hits" : [
{
"_index" : "store",
"_type" : "products",
"_id" : "8",
"_score" : 1.0,
"_source" : {
"id" : "8",
"name" : "Pepperoni Pizza",
"price" : 2.99,
"quantity" : 5,
"department" : [
"Frozen"
]
}
},
{
"_index" : "store",
"_type" : "products",
"_id" : "9",
"_score" : 1.0,
"_source" : {
"id" : "9",
"name" : "12 Pack Cola",
"price" : 5.29,
"quantity" : 6,
"department" : [
"Packaged Foods"
]
}
}
]
}
}

As we expected, we got the next page of results. With this type of query, it’s easy to page through a large set of results by increasing the from parameter by the value of size for each new page.
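To make the paging pattern concrete, the sketch below prints the request body for every page of our 13 matching products, two per page. The search_body and page_bodies function names are hypothetical, and the script only builds the bodies; it does not contact Elasticsearch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the search body used in this tutorial
# for a given "from" offset and page "size".
search_body() {
  local from=$1 size=$2
  printf '{"from": %d, "size": %d, "query": {"range": {"price": {"gte": 1.00}}}}\n' "$from" "$size"
}

# Hypothetical helper: print one body per page until "from" reaches
# the total number of hits reported by the first response.
page_bodies() {
  local total=$1 size=$2 from=0
  while [ "$from" -lt "$total" ]; do
    search_body "$from" "$size"
    from=$((from + size))
  done
}

# 13 hits at 2 per page gives 7 bodies, with from = 0, 2, 4, ... 12.
page_bodies 13 2
```

Each printed line could then be passed to the same curl command used above, for example with -d "$(search_body 4 2)" in place of the inline JSON.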

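Note: You might have noticed that the results did not come back in the order in which they were indexed. Every hit in this query has the same _score of 1.0 and we did not specify a sort, so the order carries no particular meaning. For additional information on sorting in Elasticsearch, consult the Elasticsearch documentation.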
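If a predictable order matters when paging, one option is to add a sort clause to the request body. The sketch below is only an illustration: the sorted_search_body function name is hypothetical, and sorting by ascending price is an arbitrary choice that the tutorial itself does not make.

```shell
#!/usr/bin/env bash
# Hypothetical helper: the tutorial's search body plus an explicit
# sort on the price field (ascending). Sorting by price is an assumption.
sorted_search_body() {
  local from=$1 size=$2
  printf '{"from": %d, "size": %d, "sort": [{"price": "asc"}], "query": {"range": {"price": {"gte": 1.00}}}}\n' "$from" "$size"
}

# Body for the first page of two results, cheapest products first.
sorted_search_body 0 2

# To send it (requires a running node on 127.0.0.1:9200):
# curl -H "Content-Type: application/json" -XGET \
#   "127.0.0.1:9200/store/products/_search?pretty" -d "$(sorted_search_body 0 2)"
```

Products that share the same price may still appear in either order relative to each other, so a second sort field might be needed as a tiebreaker.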

Conclusion

There’s no doubt that a large result set can be difficult for users to process. Pagination allows you to limit your result set to the size of your choice, paging through your results as far as needed. With the step-by-step instructions described in this tutorial, you should have no trouble implementing pagination on a query in Elasticsearch using curl.

Just the Code

If you’re already familiar with the concept of pagination, here’s all the code you need to implement pagination on a query in Elasticsearch using curl:

$ curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
> {
> "from": 0,
> "size": 2,
> "query": {
> "range": {
> "price": {
> "gte": 1.00
> }
> }
> }
> }
> '
$ curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?pretty" -d '
> {
> "from": 2,
> "size": 2,
> "query": {
> "range": {
> "price": {
> "gte": 1.00
> }
> }
> }
> }
> '
