How to Find the Number of Distinct Values with Cardinality Aggregations in Elasticsearch using NodeJS

Written by Data Pilot

April 07, 2019

Elasticsearch
NodeJS

Introduction

Elasticsearch aggregations let you compute averages, sums, maximums, and minimums. They can also tell you how many distinct values a field takes across a dataset. This is called a cardinality aggregation. If this is the type of calculation you need to perform, we’ll show you how to do it step by step in JavaScript. If you’d rather just see the example code, skip ahead to the Just the Code section.

Prerequisites

Before we show you how to run a cardinality aggregation with Elasticsearch in JavaScript, it’s important to make sure a few prerequisites are in place. There are only a few system requirements for this task:

* NodeJS needs to be installed.
* The elasticsearch npm module needs to be installed. A simple npm install elasticsearch should work in most cases.
* Elasticsearch needs to be installed and running. In our example, we have Elasticsearch installed locally using the default port of 9200. If your Elasticsearch installation is running on a different server, you’ll need to modify the connection settings in your JavaScript code accordingly.
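One way to confirm that Elasticsearch is installed and running is to ping the cluster before doing anything else. The sketch below assumes the legacy elasticsearch npm client, whose ping call rejects when the node cannot be reached. The helper name isElasticsearchUp is hypothetical, and the client is passed in as an argument so the function can be tried without a live cluster.

```javascript
// Hypothetical helper: resolves to true if the cluster answers a ping,
// false if the ping fails or times out.
// `client` is assumed to be a legacy `elasticsearch` client, created with:
//   const elasticsearch = require('elasticsearch');
//   const client = new elasticsearch.Client({ host: 'localhost:9200' });
async function isElasticsearchUp(client, timeoutMs = 3000) {
  try {
    await client.ping({ requestTimeout: timeoutMs });
    return true;
  } catch (err) {
    return false;
  }
}
```

If your cluster runs elsewhere, only the host value passed to the client needs to change.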

Using the cardinality aggregation

We love to show by example. In our example we use an index called store, which represents a small grocery store. The store index contains the type products, which lists all the store’s products. To keep it simple, our dataset has only a small number of products with just a few fields: id, name, price, quantity, and department. Here is the data we indexed:

id	name	price	quantity	department
1	Multi-Grain Cereal	4.99	4	Packaged Foods
2	1lb Ground Beef	3.99	29	Meat and Seafood
3	Dozen Apples	2.49	12	Produce
4	Chocolate Bar	1.29	2	Packaged Foods, Checkout
5	1 Gallon Milk	3.29	16	Dairy
6	0.5lb Jumbo Shrimp	5.29	12	Meat and Seafood
7	Wheat Bread	1.29	5	Bakery
8	Pepperoni Pizza	2.99	5	Frozen
9	12 Pack Cola	5.29	6	Packaged Foods
10	Lime Juice	0.99	20	Produce
11	12 Pack Cherry Cola	5.59	5	Packaged Foods
12	1 Gallon Soy Milk	3.39	10	Dairy
13	1 Gallon Vanilla Soy Milk	3.49	9	Dairy
14	1 Gallon Orange Juice	3.29	4	Juice
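The table can be loaded in a single request with the bulk API. The following is a minimal sketch, not the exact script used for this tutorial: toBulkBody is a hypothetical helper, and storing the department field as an array (so that Chocolate Bar carries both Packaged Foods and Checkout) is an assumption about how the comma-separated cell was indexed.

```javascript
// Rows copied from the table above. Departments are stored as arrays
// (an assumption) so a product can belong to more than one department.
const products = [
  { id: 1, name: 'Multi-Grain Cereal', price: 4.99, quantity: 4, department: ['Packaged Foods'] },
  { id: 2, name: '1lb Ground Beef', price: 3.99, quantity: 29, department: ['Meat and Seafood'] },
  { id: 3, name: 'Dozen Apples', price: 2.49, quantity: 12, department: ['Produce'] },
  { id: 4, name: 'Chocolate Bar', price: 1.29, quantity: 2, department: ['Packaged Foods', 'Checkout'] },
  { id: 5, name: '1 Gallon Milk', price: 3.29, quantity: 16, department: ['Dairy'] },
  { id: 6, name: '0.5lb Jumbo Shrimp', price: 5.29, quantity: 12, department: ['Meat and Seafood'] },
  { id: 7, name: 'Wheat Bread', price: 1.29, quantity: 5, department: ['Bakery'] },
  { id: 8, name: 'Pepperoni Pizza', price: 2.99, quantity: 5, department: ['Frozen'] },
  { id: 9, name: '12 Pack Cola', price: 5.29, quantity: 6, department: ['Packaged Foods'] },
  { id: 10, name: 'Lime Juice', price: 0.99, quantity: 20, department: ['Produce'] },
  { id: 11, name: '12 Pack Cherry Cola', price: 5.59, quantity: 5, department: ['Packaged Foods'] },
  { id: 12, name: '1 Gallon Soy Milk', price: 3.39, quantity: 10, department: ['Dairy'] },
  { id: 13, name: '1 Gallon Vanilla Soy Milk', price: 3.49, quantity: 9, department: ['Dairy'] },
  { id: 14, name: '1 Gallon Orange Juice', price: 3.29, quantity: 4, department: ['Juice'] }
];

// Hypothetical helper: the bulk API expects an action line followed by
// the document itself, once per document.
function toBulkBody(rows, index = 'store', type = 'products') {
  const body = [];
  for (const { id, ...doc } of rows) {
    body.push({ index: { _index: index, _type: type, _id: id } });
    body.push(doc);
  }
  return body;
}

// With a legacy `elasticsearch` client this would be sent as:
//   client.bulk({ body: toBulkBody(products) });
```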

Here is the json we used to define the mapping:

{
  "mappings": {
    "products": {
      "properties": {
        "name": { "type": "text" },
        "price": { "type": "double" },
        "quantity": { "type": "integer" },
        "department": { "type": "keyword" }
      }
    }
  }
}
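To apply this mapping from NodeJS, the index can be created with the mapping as the request body. This is a sketch that assumes the legacy elasticsearch npm client and an Elasticsearch version that still accepts a named mapping type such as products (6.x; mapping types were deprecated in 7.0). The function name createStoreIndex is hypothetical.

```javascript
// Same mapping as above, expressed as a JavaScript object.
const storeMapping = {
  mappings: {
    products: {
      properties: {
        name: { type: 'text' },
        price: { type: 'double' },
        quantity: { type: 'integer' },
        // keyword fields are not analyzed, so each department is kept
        // as one exact value, which is what we want to count.
        department: { type: 'keyword' }
      }
    }
  }
};

// Hypothetical helper: creates the `store` index with the mapping.
// `client` is assumed to be a legacy `elasticsearch` client instance.
function createStoreIndex(client) {
  return client.indices.create({ index: 'store', body: storeMapping });
}
```

The keyword type matters here: with a text field, the department names would be split into individual terms rather than kept as exact values.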

In our grocery store, an example of a cardinality aggregation is finding the number of distinct departments.

Here’s the code to run a cardinality aggregation and find the number of departments:

File index.js:

There are a few steps to dissect, so let’s go through them one at a time:

* First we required the elasticsearch library, which provides the functions that make it easy to access Elasticsearch.
* Next we created a variable, var client, which creates and stores our connection to Elasticsearch. From this point on we use this client for all our interactions with Elasticsearch.
* We then call the search function on client to run the aggregation. We specify the index and type to perform the search on.
* The important part comes in the body, where we define the aggregation using the aggs keyword.
* We gave our aggregation the name department_count. The name can be anything, but choose something that makes sense.
* We specify the type of aggregation as cardinality.
* Lastly, we specify the field the aggregation should evaluate for distinct values, in this case department.
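The steps above can be sketched as follows. This is a minimal reconstruction under stated assumptions rather than the exact index.js: it assumes the legacy elasticsearch npm client, the function name countDepartments is hypothetical, and size: 0 is an assumption added so that only the aggregation result is returned. The client is passed in as an argument so the function can be exercised without a running cluster.

```javascript
// Search request containing the cardinality aggregation.
const departmentCountRequest = {
  index: 'store',
  type: 'products',
  body: {
    size: 0, // assumption: skip the matching documents, keep only the aggregation
    aggs: {
      // department_count is our own name for the aggregation
      department_count: {
        // cardinality counts the distinct values of the given field
        cardinality: { field: 'department' }
      }
    }
  }
};

// Hypothetical helper: runs the search, prints the response, and
// returns the distinct count.
async function countDepartments(client) {
  const response = await client.search(departmentCountRequest);
  console.log('Successful query!');
  console.log(JSON.stringify(response, null, 2));
  return response.aggregations.department_count.value;
}

// Usage, assuming `npm install elasticsearch` and a node on localhost:9200:
//   const elasticsearch = require('elasticsearch');
//   const client = new elasticsearch.Client({ host: 'localhost:9200' });
//   countDepartments(client).catch(console.error);
```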

And here was the response:

$ node index.js
Successful query!
{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 13,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "department_count": {
      "value": 8
    }
  }
}

As you can see, it gave us a value of 8 for our department_count. We can verify this by looking back at our data and counting the distinct departments: Packaged Foods, Meat and Seafood, Produce, Checkout, Dairy, Bakery, Frozen, and Juice.
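The same check can be done in plain JavaScript, without Elasticsearch, by collecting the department column into a Set. The array below is copied from the department column of the dataset table; treating the Chocolate Bar entry as two separate departments is an assumption that matches the list above.

```javascript
// Department column of the dataset, one entry per product (ids 1 to 14).
const departmentsPerProduct = [
  ['Packaged Foods'],
  ['Meat and Seafood'],
  ['Produce'],
  ['Packaged Foods', 'Checkout'],
  ['Dairy'],
  ['Meat and Seafood'],
  ['Bakery'],
  ['Frozen'],
  ['Packaged Foods'],
  ['Produce'],
  ['Packaged Foods'],
  ['Dairy'],
  ['Dairy'],
  ['Juice']
];

// A Set keeps each value once, so its size is the distinct count.
const distinctDepartments = new Set();
for (const departments of departmentsPerProduct) {
  for (const department of departments) {
    distinctDepartments.add(department);
  }
}

console.log(distinctDepartments.size); // 8
```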

Conclusion

In this tutorial we demonstrated how to use an Elasticsearch cardinality aggregation to count the distinct values of a specific field in a dataset. There are many other things you can do with aggregations, and you can consult the Elasticsearch documentation to learn more about them. The documentation is also useful if you need help with syntax. We hope you found this tutorial on the cardinality aggregation helpful and can apply it to your specific application. If you have questions or this didn’t work for you, please reach out to us so we can help. Thank you.

Just the Code

If you’re already comfortable with NodeJS and aggregations here’s all the code we used to demonstrate how to do a cardinality aggregation in Elasticsearch using NodeJS.
