How to Use Aggregation to Compute the Average of a Field in Elasticsearch with NodeJS


Introduction

If you’re running a NodeJS application with Elasticsearch, it won’t be long before you need to compute the average of a field in your dataset. You might need the average user rating of a song, movie, or product, for example. In Elasticsearch, averages are computed using aggregations. Aggregations have many other use cases, but this article focuses on computing an average. We will use a simple NodeJS application that interacts with Elasticsearch via the elasticsearch npm module, and we’ll walk through it step by step. If you’d rather just see the example code, skip to the Just the Code section at the end.

Note: The code will vary depending on all your system parameters but we hope to give you an idea of how this is done.

Prerequisites

Before we show you how to compute the average value of a field with Elasticsearch in JavaScript, make sure a few prerequisites are in place. There are only a few system requirements for this task:

  • NodeJS needs to be installed.
  • The elasticsearch npm module needs to be installed. A simple npm install elasticsearch should work in most cases.
  • Elasticsearch needs to be installed and running. In our example, Elasticsearch is installed locally and uses the default port of 9200. If your Elasticsearch installation is running on a different server, you’ll need to modify the host in your JavaScript code accordingly.

Getting up to Speed

For this article you should already have a NodeJS application set up, but here is our main application file, index.js, so you can see our starting point.

var elasticsearch = require("elasticsearch");

var client = new elasticsearch.Client({
  hosts: ["http://localhost:9200"]
});

client.ping(
  {
    requestTimeout: 30000
  },
  function(error) {
    if (error) {
      console.error("Cannot connect to Elasticsearch.");
      console.error(error);
    } else {
      console.log("Connection to Elasticsearch was successful!");
    }
  }
);

If you run this application using NodeJS with this command:

$ node index.js

You will get a console message letting you know whether your application connected to Elasticsearch. If it didn’t, you’ll get a failure message and you’ll need to investigate. If you haven’t created an application yet, this code is a good place to start.

Here is what the code does so far:

  • We import the elasticsearch module, which gives us all the functions we’ll need to interact with Elasticsearch.
  • We create a client that connects to the host and port where Elasticsearch is running.
  • We ping Elasticsearch through the client to verify that we have a connection. From here on, all our interactions with Elasticsearch go through client. We will replace the ping with the average calculation in the coming steps.

Use the search function on the elasticsearch client

Now instead of simply pinging Elasticsearch, we will demonstrate how to use aggregations to compute the average. Specifically we will find the average price of products in the “Dairy” department.

In our example we use an index called store, which represents a small grocery store. The store index contains the type products, which lists all the store’s products. To keep it simple, our dataset has only a small number of products with just a few fields: id, name, price, quantity, and department. Here is the dataset we used:

id  name                       price  quantity  department
1   Multi-Grain Cereal         4.99   4         Packaged Foods
2   1lb Ground Beef            3.99   29        Meat and Seafood
3   Dozen Apples               2.49   12        Produce
4   Chocolate Bar              1.29   2         Packaged Foods, Checkout
5   1 Gallon Milk              3.29   16        Dairy
6   0.5lb Jumbo Shrimp         5.29   12        Meat and Seafood
7   Wheat Bread                1.29   5         Bakery
8   Pepperoni Pizza            2.99   5         Frozen
9   12 Pack Cola               5.29   6         Packaged Foods
10  Lime Juice                 0.99   20        Produce
11  12 Pack Cherry Cola        5.59   5         Packaged Foods
12  1 Gallon Soy Milk          3.39   10        Dairy
13  1 Gallon Vanilla Soy Milk  3.49   9         Dairy
14  1 Gallon Orange Juice      3.29   4         Juice

Here is the JSON we used to define the mapping:

{
    "mappings": {
        "products": {
            "properties" : {
                "name": { "type": "text"},
                "price": { "type": "double"},
                "quantity": { "type": "integer"},
                "department": { "type": "keyword"}
            }
        }
    }
}
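
The article does not show how the dataset was loaded into the store index. As a rough sketch, one option is the bulk API, which takes alternating action and document entries. The helper name buildBulkBody and the variables below are hypothetical and only illustrate the shape of the request body; the example uses the three Dairy rows from the dataset.

```javascript
// Hypothetical helper (not from the article): turns product rows into the
// alternating action/document entries that the bulk API expects.
function buildBulkBody(products, index, type) {
  var body = [];
  products.forEach(function (product) {
    // Action entry: tells Elasticsearch where to index the next document.
    body.push({ index: { _index: index, _type: type, _id: product.id } });
    // Document entry: the fields defined in the mapping.
    body.push({
      name: product.name,
      price: product.price,
      quantity: product.quantity,
      department: product.department
    });
  });
  return body;
}

// The three Dairy rows from the dataset.
var dairyProducts = [
  { id: 5, name: "1 Gallon Milk", price: 3.29, quantity: 16, department: "Dairy" },
  { id: 12, name: "1 Gallon Soy Milk", price: 3.39, quantity: 10, department: "Dairy" },
  { id: 13, name: "1 Gallon Vanilla Soy Milk", price: 3.49, quantity: 9, department: "Dairy" }
];

var bulkBody = buildBulkBody(dairyProducts, "store", "products");
console.log(bulkBody.length); // two entries per product

// Assumption: with a connected client, the body could be sent like this.
// client.bulk({ body: bulkBody }).then(function (resp) { console.log(resp.errors); });
```

Building the body in a separate function keeps the data preparation testable without a running Elasticsearch instance.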

As you can see, there are three items in the Dairy department:

  • 1 Gallon Milk: $3.29
  • 1 Gallon Soy Milk: $3.39
  • 1 Gallon Vanilla Soy Milk: $3.49

The average price is $3.39, and we’ll verify this at the end.
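
To double-check the expected value by hand, here is a small sketch that averages the three Dairy prices in plain JavaScript, without Elasticsearch. The variable names are our own.

```javascript
// Prices of the three Dairy products from the dataset.
var dairyPrices = [3.29, 3.39, 3.49];

// Sum the prices, then divide by the number of products.
var total = dairyPrices.reduce(function (sum, price) {
  return sum + price;
}, 0);
var average = total / dairyPrices.length;

// Round to two decimals to hide floating-point noise.
console.log(average.toFixed(2)); // prints "3.39"
```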

Now let’s look at the JavaScript that implements the aggregation. We’ll dissect it afterwards.

var elasticsearch = require("elasticsearch");

var client = new elasticsearch.Client({
  hosts: ["http://localhost:9200"]
});

/* Aggregate Average */
client.search({
  index: 'store',
  type: 'products',
  body: {
      query: {
          match_phrase: {
              department: "Dairy"
          }
      },
      aggs: {
        avg_dairy_price: {
          avg:
            {
              field: "price"
            }
        }
      }
  }
}).then(function(resp) {
  console.log("Successful query! Here is the response:", resp);
}, function(err) {
  console.trace(err.message);
});

We use the search function on the client. Its function definition looks like this:

client.search([params, [callback]])

Note: The function definition has a callback parameter, but we will use promises instead because they tend to be more readable. The API returns a promise as long as you don’t provide a callback parameter.
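
For comparison, here is a minimal sketch of the callback form. The wrapper searchWithCallback and the stand-in fakeClient are hypothetical; the stand-in only mimics the (error, response) callback convention so the sketch runs without Elasticsearch. In a real application you would pass the client created earlier.

```javascript
// Hypothetical wrapper showing the callback form of client.search.
function searchWithCallback(client, params, done) {
  client.search(params, function (err, resp) {
    if (err) {
      return done(err);
    }
    done(null, resp);
  });
}

// Stand-in client (an assumption for illustration) that answers immediately
// with a response trimmed from the one shown in this article.
var fakeClient = {
  search: function (params, callback) {
    callback(null, { aggregations: { avg_dairy_price: { value: 3.39 } } });
  }
};

var callbackResult = null;
searchWithCallback(fakeClient, { index: "store", type: "products" }, function (err, resp) {
  callbackResult = err ? null : resp.aggregations.avg_dairy_price.value;
});
console.log(callbackResult); // prints 3.39
```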

We won’t go over the full list of parameter options (you can consult the documentation for that), but we give the search function everything it needs to compute the aggregation:

  • First we create the query to limit the results to products in the “Dairy” department.
  • Then we create an aggregation with aggs.
  • We give that aggregation a name, avg_dairy_price.
  • We set the aggregation type to avg so that Elasticsearch computes the average.
  • Finally, we specify the field to average with field: "price".
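
The same pieces can be assembled for any department. The sketch below is a hypothetical helper, buildAverageRequest, that builds the parameters passed to client.search; the naming scheme for the aggregation is our own choice and not part of the Elasticsearch API.

```javascript
// Hypothetical helper: builds search parameters that average `field`
// over the products in one department of the store index.
function buildAverageRequest(department, field) {
  // Derive an aggregation name such as "avg_dairy_price".
  var slug = department.toLowerCase().replace(/[^a-z0-9]+/g, "_");
  var aggName = "avg_" + slug + "_" + field;

  var aggs = {};
  aggs[aggName] = { avg: { field: field } };

  return {
    index: "store",
    type: "products",
    body: {
      // Limit the documents to one department.
      query: { match_phrase: { department: department } },
      // Average the chosen field over the matching documents.
      aggs: aggs
    }
  };
}

var dairyRequest = buildAverageRequest("Dairy", "price");
console.log(Object.keys(dairyRequest.body.aggs)[0]); // prints "avg_dairy_price"

// With a connected client: client.search(dairyRequest).then(...)
```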

Let’s see if it works. We used the promise to console.log the response on success and to log an error on failure. Let’s run the application and see what happens:

$ node index.js
Successful query! Here is the response: { took: 0,
  timed_out: false,
  _shards: { total: 5, successful: 5, skipped: 0, failed: 0 },
  hits:
   { total: 3,
     max_score: 1.2039728,
     hits: [ [Object], [Object], [Object] ] },
  aggregations: { avg_dairy_price: { value: 3.39 } } }

We get back our success message and we can verify that our aggregator gave us the value we expected:

aggregations: { avg_dairy_price: { value: 3.39 } }
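
In application code you will usually want the number itself rather than the whole response. Here is a small sketch, using a hypothetical helper named extractAverage and a sample object trimmed from the response above. It returns null when the aggregation is missing or when no documents matched, since the avg value is null in that case.

```javascript
// Hypothetical helper: reads one avg aggregation value out of a search response.
function extractAverage(resp, aggName) {
  var agg = resp.aggregations && resp.aggregations[aggName];
  // Return null if the aggregation is absent or no documents matched.
  if (!agg || agg.value === null || agg.value === undefined) {
    return null;
  }
  return agg.value;
}

// Sample response trimmed from the output shown above.
var sampleResponse = {
  took: 0,
  timed_out: false,
  hits: { total: 3, max_score: 1.2039728, hits: [] },
  aggregations: { avg_dairy_price: { value: 3.39 } }
};

console.log(extractAverage(sampleResponse, "avg_dairy_price")); // prints 3.39
```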

Conclusion

In this tutorial we demonstrated how to calculate an average using an aggregation in Elasticsearch. This was a basic example, but it can easily be built upon to perform the more complex aggregations your application requires. We hope you found this tutorial helpful and can apply it to your own application. If you have questions or this didn’t work for you, please reach out to us so we can help. Thank you.

Just the Code

If you’re already comfortable with NodeJS and aggregations here’s all the code we used to demonstrate how to find an average with Elasticsearch and NodeJS.

var elasticsearch = require("elasticsearch");

var client = new elasticsearch.Client({
  hosts: ["http://localhost:9200"]
});

/* Aggregate Average */
client.search({
  index: 'store',
  type: 'products',
  body: {
      query: {
          match_phrase: {
              department: "Dairy"
          }
      },
      aggs: {
        avg_dairy_price: {
          avg:
            {
              field: "price"
            }
        }
      }
  }
}).then(function(resp) {
  console.log("Successful query! Here is the response:", resp);
}, function(err) {
  console.trace(err.message);
});
