How to Use Aggregation to Compute the Average of a Field in Elasticsearch

Introduction

No matter what type of data you’re working with in Elasticsearch, you’ll probably encounter a need to compute an average at some point. You may need to compute the average user rating of a product, or you might be calculating average grades. Averages are also used when you’re compiling statistics about price and sales data. Regardless of the intended purpose, computing the average of a field in Elasticsearch is a simple task. In this tutorial, you’ll learn how to use aggregation to compute the average of a field in Elasticsearch. If you’re already familiar with the concept of aggregation and would prefer to skip the explanation, feel free to jump ahead to Just the Code.

Use the avg Aggregation

Let’s look at an example of how you can compute the average of a field in Elasticsearch. We’ll use an index called store, which represents a small grocery store. In this store index, we’ll have a type called products which lists the store’s products. We will keep our dataset simple by only including a handful of products with just a small number of fields: id, price, quantity, and department. The following table shows our dataset:

idnamepricequantitydepartment
1Multi-Grain Cereal4.994Packaged Foods
21lb Ground Beef3.9929Meat and Seafood
3Dozen Apples2.4912Produce
4Chocolate Bar1.292Packaged FoodsCheckout
51 Gallon Milk3.2916Dairy
60.5lb Jumbo Shrimp5.2912Meat and Seafood
7Wheat Bread1.295Bakery
8Pepperoni Pizza2.995Frozen
912 Pack Cola5.296Packaged Foods
10Lime Juice0.9920Produce
1112 Pack Cherry Cola5.5995Packaged Foods
121 Gallon Soy Milk3.3910Dairy
131 Gallon Vanilla Soy Milk3.499Dairy
141 Gallon Orange Juice3.294Juice

The mapping to create our store index and products type is shown below:

curl -H "Content-Type: application/json" -XPUT 127.0.0.1:9200/store -d '
{
  mappings: {
    products: {
      properties: {
        name: { type: "text" },
        price: { type: "double" },
        quantity: { type: "integer" },
        department: { type: "keyword" }
      }
    }
  }
}
'

Now that we’ve created our dataset and mapping, we can show how to use aggregration to compute the average price of “Dairy” products. In our dataset, there are three “Dairy” items: 1 Gallon Milk: $3.29 1 Gallon Soy Milk: $3.39 * 1 Gallon Vanilla Soy Milk: $3.49

The average should be $3.39.

The code below shows how to use aggregation to compute the average. It’s a bit complex, but the explanation that follows it will clarify what’s going on:

Request to Elasticsearch:

curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?size=0&pretty" -d '
{
"query": {
"match_phrase": { "department": "Dairy" }
},
"aggs": {
"avg_dairy_price": {
"avg": { "field": "price"}
}
}
}
'

Response:

{
"took" : 15,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"avg_dairy_price" : {
"value" : 3.39
}
}
}

You can see in the results that the avg_dairy_price was indeed $3.39. Let’s talk about how we made this happen. The first step was to use a match_phrase query on department: this limits the results to products in the “Dairy” department. Next, an aggregator was created using aggs and was given the name avg_dairy_price. Finally, avg was used to set the type of aggregator and to define which field to compute the average for. In this example, the field was price.

In the results, the value for hits shows how many results matched the query, and aggregations tells the value for our aggregator as $3.39. This value is the average for the price field. Note that the cURL contains the parameter size=0. Without this parameter, the query would return each individual “Dairy” product, instead of the aggregate information we want.

Most of the time, you’ll want to use a query with your aggregator, but you’re not required to. For example, if we wanted to calculate the average price of all our products in store, we could have left out the query JSON completely.

Conclusion

There are many situations where you may want to compute the average of a field in Elasticsearch, so it’s important to know how to accomplish the task correctly. This tutorial outlined a simple way to use aggregration to compute the average. With these step-by-step instructions, you’ll be able to add this computation to your search applications and get the information you need.

Just the Code

If you’re already familiar with the concept of aggregation, here’s all the code you’ll need to compute the average of a field in Elasticsearch:

curl -H "Content-Type: application/json" -XGET "127.0.0.1:9200/store/products/_search?size=0&pretty" -d '
{
"query": {
"match_phrase": { "department": "Dairy" }
},
"aggs": {
"avg_dairy_price": {
"avg": { "field": "price"}
}
}
}
'

Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started

Keep in the know!

Subscribe to our emails and we’ll let you know what’s going on at ObjectRocket. We hate spam and make it easy to unsubscribe.