How to Aggregate a Filtered Set in Elasticsearch using NodeJS

Introduction

Sometimes you’d like to run an aggregation, but not over your entire dataset. Most often you’ll want to limit the data by some condition first. It’s useful to know how to get Elasticsearch to compute aggregations such as averages, sums, maximums, and minimums over a filtered set. We’ll show you exactly how to do this type of filtered aggregation in JavaScript running on NodeJS. If you’d rather just see the example code, jump to the Just the Code section at the end.

Prerequisites

Before we show you how to run a filtered aggregation with Elasticsearch in JavaScript, it’s important to make sure a few prerequisites are in place. There are only a few system requirements for this task:

* NodeJS needs to be installed.
* The elasticsearch npm module needs to be installed. A simple npm install elasticsearch should work in most cases.
* Elasticsearch also needs to be installed and running. In our example, we have Elasticsearch installed locally using the default port of 9200. If your Elasticsearch installation is running on a different server, you’ll need to modify the host in your JavaScript code accordingly.
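If your cluster is not running on localhost, one way to avoid hard-coding the address is to read it from the environment. The sketch below assumes a hypothetical ES_HOSTS environment variable holding a comma-separated list of URLs. That name is our own choice for illustration and is not something the elasticsearch module reads by itself.

```javascript
// Hypothetical helper: build the `hosts` array for the client from an
// environment variable. ES_HOSTS is an assumed name, not a built-in setting.
function resolveHosts(env) {
  // Fall back to the local default used throughout this tutorial.
  var raw = env.ES_HOSTS || "http://localhost:9200";
  return raw
    .split(",")
    .map(function (h) { return h.trim(); })
    .filter(function (h) { return h.length > 0; });
}

console.log(resolveHosts(process.env));

// Assumed usage (requires the elasticsearch npm module):
// var client = new elasticsearch.Client({ hosts: resolveHosts(process.env) });
```

Passing the environment in as an argument, rather than reading process.env inside the function, keeps the helper easy to try out with different values.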

Our Demo Data

We love to show by example. In our example for this filtered aggregation we will use an index called store, which represents a small grocery store. The store index contains the type products, which lists all the store’s products. To keep it simple, our dataset has only a small number of products with just a few fields: id, name, price, quantity, and department. Here is the data we used to create our dataset:

id  name                       price  quantity  department
1   Multi-Grain Cereal         4.99   4         Packaged Foods
2   1lb Ground Beef            3.99   29        Meat and Seafood
3   Dozen Apples               2.49   12        Produce
4   Chocolate Bar              1.29   2         Packaged Foods, Checkout
5   1 Gallon Milk              3.29   16        Dairy
6   0.5lb Jumbo Shrimp         5.29   12        Meat and Seafood
7   Wheat Bread                1.29   5         Bakery
8   Pepperoni Pizza            2.99   5         Frozen
9   12 Pack Cola               5.29   6         Packaged Foods
10  Lime Juice                 0.99   20        Produce
11  12 Pack Cherry Cola        5.59   5         Packaged Foods
12  1 Gallon Soy Milk          3.39   10        Dairy
13  1 Gallon Vanilla Soy Milk  3.49   9         Dairy
14  1 Gallon Orange Juice      3.29   4         Juice

Here is the JSON we used to define the mapping:

{
    "mappings": {
        "products": {
            "properties" : {
                "name": { "type": "text"},
                "price": { "type": "double"},
                "quantity": { "type": "integer"},
                "department": { "type": "keyword"}
            }
        }
    }
}
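The article does not show the indexing step itself. As a rough sketch, assuming the documents are loaded with the client’s bulk API, the request body could be built as shown here. The helper name buildBulkBody is hypothetical, and only two rows of the demo data are included for brevity.

```javascript
// Hypothetical helper: turn product rows into a bulk request body.
// The bulk API expects an action line followed by the document source.
function buildBulkBody(indexName, typeName, products) {
  var body = [];
  products.forEach(function (p) {
    // Action line: which index, type, and id the next document goes to.
    body.push({ index: { _index: indexName, _type: typeName, _id: p.id } });
    // Document source: only the fields defined in the mapping.
    body.push({
      name: p.name,
      price: p.price,
      quantity: p.quantity,
      department: p.department
    });
  });
  return body;
}

// Two rows taken from the demo data.
var sample = [
  { id: 5, name: "1 Gallon Milk", price: 3.29, quantity: 16, department: "Dairy" },
  { id: 12, name: "1 Gallon Soy Milk", price: 3.39, quantity: 10, department: "Dairy" }
];

var bulkBody = buildBulkBody("store", "products", sample);
console.log(JSON.stringify(bulkBody, null, 2));

// Assumed usage with a connected client (requires the elasticsearch npm module):
// client.bulk({ body: bulkBody }).then(function (resp) { /* check resp.errors */ });
```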

Our Demo Example of a Filtered Aggregation

An example for our store index is to compute the average price, but only of products in the Dairy department. Here’s the code to do it:

File index.js:

var elasticsearch = require("elasticsearch");

var client = new elasticsearch.Client({
  hosts: ["http://localhost:9200"]
});

/* Get the filtered aggregation for the average dairy price */
client.search({
  size: 0,
  index: 'store',
  type: 'products',
  body: {
    "aggs": {
      "dairy_prices": {
        "filter": { "term": { "department": "Dairy" } },
        "aggs": {
          "avg_dairy_price": { "avg": { "field": "price" } }
        }
      }
    }
  }
}).then(function(resp) {
  console.log("Successful query!");
  console.log(JSON.stringify(resp, null, 4));
}, function(err) {
  console.trace(err.message);
});

There are a few steps to dissect, so let’s go through them one by one:

* First we required the elasticsearch library, which provides the functions that make it easy to access Elasticsearch.
* Next we created a variable, client, which creates and stores our connection to Elasticsearch. From this point on we’ll use this client for all our interactions with Elasticsearch.
* We then call the search function on client to run the aggregation. We specify the index and type to search, and set size to 0 so that no individual documents are returned, only the aggregation results.
* The important part comes in the body, where we define the aggregation using the aggs keyword. We gave our aggregation the name dairy_prices. The name can be anything, but choose something that makes sense.
* The aggregation contains a filter so that we aggregate only over products in the Dairy department. Because department is mapped as a keyword, the term filter matches the exact value Dairy.
* Inside the filter, a nested aggs block requests the average by using the avg aggregation on the price field.
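The body described in these steps follows a fixed pattern: an outer named aggregation holding the filter, and an inner named aggregation holding the metric. A minimal sketch of that pattern as a reusable function follows. The function name and its option names are our own illustration, not part of the elasticsearch module. Since the introduction mentions sums, maximums, and minimums, the metric is a parameter: avg, sum, max, and min are all Elasticsearch metric aggregation types.

```javascript
// Hypothetical helper: build the request body for a filtered metric aggregation.
function buildFilteredAggBody(opts) {
  // { "term": { <filterField>: <filterValue> } }
  var term = {};
  term[opts.filterField] = opts.filterValue;

  // { <metric>: { "field": <metricField> } }, e.g. { "avg": { "field": "price" } }
  var metric = {};
  metric[opts.metric] = { field: opts.metricField };

  // Inner aggregation, keyed by the name we want in the response.
  var innerAggs = {};
  innerAggs[opts.metricName] = metric;

  // Outer aggregation: the filter plus the nested metric.
  var outerAggs = {};
  outerAggs[opts.name] = { filter: { term: term }, aggs: innerAggs };

  return { aggs: outerAggs };
}

// Rebuild the body used in this article.
var body = buildFilteredAggBody({
  name: "dairy_prices",
  filterField: "department",
  filterValue: "Dairy",
  metricName: "avg_dairy_price",
  metric: "avg",
  metricField: "price"
});
console.log(JSON.stringify(body, null, 2));

// Assumed usage with a connected client:
// client.search({ index: "store", type: "products", size: 0, body: body });
```

Swapping metric to "sum", "max", or "min" changes the computed statistic while the filter stays the same.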

Here is how we ran the code from our working directory:

$ node index.js

And here was the response:

Successful query!
{
    "took": 9,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 14,
        "max_score": 0,
        "hits": []
    },
    "aggregations": {
        "dairy_prices": {
            "doc_count": 3,
            "avg_dairy_price": {
                "value": 3.39
            }
        }
    }
}

As you can see, the doc_count for dairy_prices is 3 because there are only three dairy products, so we know our filter worked. The response then gives the average dairy price as $3.39, which matches the mean of the three dairy prices in our dataset (3.29, 3.39, and 3.49).
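To use the result in code rather than just printing it, you can read the values out of the aggregations object using the names chosen in the request. The sketch below takes a trimmed copy of the response shown above as input, and also recomputes the mean from the three Dairy prices in the demo data as a sanity check. The helper name readFilteredAvg is hypothetical.

```javascript
// Hypothetical helper: pull the document count and metric value out of a
// search response, given the aggregation names used in the request.
function readFilteredAvg(resp, aggName, metricName) {
  var bucket = resp.aggregations[aggName];
  return { count: bucket.doc_count, value: bucket[metricName].value };
}

// Trimmed copy of the response shown in this article.
var resp = {
  aggregations: {
    dairy_prices: { doc_count: 3, avg_dairy_price: { value: 3.39 } }
  }
};

var result = readFilteredAvg(resp, "dairy_prices", "avg_dairy_price");

// Sanity check: mean of the three Dairy prices from the demo data.
var dairyPrices = [3.29, 3.39, 3.49];
var localMean = dairyPrices.reduce(function (a, b) { return a + b; }, 0) / dairyPrices.length;

console.log(result.count, result.value.toFixed(2), localMean.toFixed(2));
```

The two averages should agree to two decimal places. Comparing with a small tolerance rather than strict equality avoids surprises from floating-point rounding.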

Conclusion

In this tutorial we demonstrated how to use Elasticsearch aggregations with a filtered dataset. There are many other things you can do with aggregations, and you can consult the Elasticsearch documentation to learn more about them. The documentation is also useful if you need help with syntax. We hope you found this tutorial helpful and can apply it to your specific application. If you have questions or this didn’t work for you, please reach out to us so we can help. Thank you.

Just the Code

If you’re already comfortable with NodeJS and aggregations, here’s all the code we used to demonstrate how to do a filtered aggregation in Elasticsearch using NodeJS.

var elasticsearch = require("elasticsearch");

var client = new elasticsearch.Client({
  hosts: ["http://localhost:9200"]
});

/* Get the filtered aggregation for the average dairy price */
client.search({
  size: 0,
  index: 'store',
  type: 'products',
  body: {
    "aggs": {
      "dairy_prices": {
        "filter": { "term": { "department": "Dairy" } },
        "aggs": {
          "avg_dairy_price": { "avg": { "field": "price" } }
        }
      }
    }
  }
}).then(function(resp) {
  console.log("Successful query!");
  console.log(JSON.stringify(resp, null, 4));
}, function(err) {
  console.trace(err.message);
});
