MongoDB Group by Multiple Fields Using Aggregation Function

Written by Data Pilot

July 07, 2019

MongoDB

Have a Database Problem? Speak with an Expert for Free
Get Started >>

Introduction

In this article I will introduce you to the Mongodb group by multiple fields. To understand better first we will learn about the aggregation function.

MongoDB Aggregation Function

Aggregation functions are used in GROUP BY clauses to aggregate grouped data. Aggregate functions perform better when used with the GROUP BY clause. However, the use of aggregate function is not limited to grouped queries.

When an aggregate function is used in a query without the GROUP BY clause, the aggregate function aggregates the entire result set (all rows that match the WHERE clause). If you do not use the GROUP BY clause, some aggregate functions in the SELECT list can only be used with other aggregate functions. That’s why the aggregate function must use the GROUP BY clause to connect to the list in the SELECT list.

For example, instead of using the GROUP BY clause, AVG in a SELECT list can only correspond to SUM, but it cannot correspond to a specific column.

The MongoDB aggregation operation is used for batch operations on the data. After the collection is grouped by condition, a series of operations such as sum, average, and others are performed. Aggregation operations can perform complex collections operations, especially for math statistics and data mining.

The input of the aggregation operation in MongoDB is the collection document. The output can be one or more documents. MongoDB offers a very powerful aggregation operation that can be divided into three categories:

Aggregation pipeline
Aggregation operation for single use
MapReduce programming model

Mongodb Group by Multiple Fields

The $ group operator is an aggregator that returns a new document. It has its own operator, we can get the field of the current document by $ symbol + field name. To understand the MongoDB group by multiple fields first, let’s have a look at a list of all operators that can be used in $ group:

$ sum – Returns the sum of all numeric fields.
$ avg – Calculates the average between numeric fields.
$ min – Returns the minimum value from the numeric field
$ max – Get the maximum value from the numeric field.
$ push – inserts the field value into the result field
$ addToSet – Inserts a value into an array of the resulting document, but does not create duplicates.
$ first – Gets only the first document from the grouped ones, usually for sorting.
$ last – Returns the last document.

Aggregation pipeline

In multi-threaded POSIX mode, there is a method called a pipeline, whose data element stream is executed sequentially by a set of threads in the specified order. The aggregation pipeline consists of a stage. After processing the document in one stage, the aggregation pipeline transfers the processing result to the next stage.

Polymer tube function Filter the document to find out which documents meet the criteria. Convert the document and change the output form of the document.

Each level of the aggregation pipeline is defined using stage operators, and each stage operator can use expression operators to calculate the sum, average, concatenation, or line breaks before each level. The result is returned at the end and the returned result can be directly output or saved to the collection.

Processing flow

Db.collection.aggregate () can use several channels at the same time for data processing.
Db.collection.aggregate () uses MongoDB’s native operations to efficiently aggregate and support operations such as GroupBy in SQL without having to write custom JS routines.
Each phase of the pipeline limits 100 MB of memory. If a single-node pipeline exceeds the limit, MongoDB generates an error. To process large amounts of data, set the allowDiskUse property to true to write data to a temporary file for an aggregated pipe node and allow a memory limit of 100 MB.
Db.collection.aggregate () can be applied to a series of slices, but the result cannot be lost in the series of slices. MapReduce can be applied to a series of slices, and the result can be lost in a series of slices.
Db.collection.aggregate () returns a cursor, the data is stored in memory and can be managed directly as MongoShell.
The output of db.collection.aggregate () can only be saved in one document, and the BSON document size is limited to 16 MB.

Grouping method

Using the group method is similar to using the GROUP BY clause in SQL. The group method has three parameters:

Key: Displays the key of the group
Initial: Initializes the document field that represents the document group
Reduce: A function that returns the number of elements. This function uses two parameters as arguments: the current element and the aggregate result document for the current group.
Keyf: optional parameter. Use a function other than the key argument to represent the return key object.
Cond: optional parameter. Specifies that the condition should return true, otherwise the document will not participate in the grouping. If this parameter is not specified, all documents participate in the grouping.
Finalize: optional parameter. This represents a function that is triggered before returning a grouped result.

For example: `js > db.users.group ({key: {name : true}, initial: {total : 0}, reduce : function (curr, res){res.total += 1}}) `

Let’s take a look at the expression. The key parameter specifies that the group will be grouped by name: key: {name: true}

The value of the initial parameter initializes the fields that represent the result document for the group. In this case, the initial value of the total field is set. This field indicates the number of elements in the group. Since there may not be any elements, we initialize to zero.

The reduce parameter represents a function where the curr parameter points to the current object in the group and res represents the current group. If you find another object with a specific value for the name field, add that document to the group and increase the total value in the res document by 1.

The use of $group

The $group uses _id to specify the key name to be grouped, using the custom field statistics. Using aggregate operation will clear the concept of Mongodb group by multiple fields.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39

db.users.aggregate({
$match : { age: { $gte : 18 } }
},{
$group : { _id:$username, count:{$sum:1} }
});

// Mongodb group by multiple fields
db.users.aggregate({
$match: {age: {$gte:18} }},
$group: {_id:{username:$username, age:$ge}, 'count':{$sum:1} }
})

// $sum:val Add val to each document
// $avg:val Average the value of each document
db.users.aggregate({
$group: { _id:$username, count:{$avg:$age} }
})

db.users.aggregate({
$group: { _id:$username, count:{$max:$age} }
})

db.users.aggregate({
$group: {_id:$username, count:{$min:$age} }
})

// $first:val Become first in the group
db.users.aggregate({
$group:{_id:$username, count:{$first: $age} }
})
db.users.aggregate({
$group:{_id:$username, count:{$last: $age} }
})
db.users.aggregate({
$group: {_id:$username, count:{$addToSet: $age} }
})
db.users.aggregate({
$group:{_id:$username, count:{$push: $age} }
})

Mongodb group by multiple fields using Aggregate operation

First, the key on which the grouping is based is selected and then the collection is divided into groups according to the selected key value. You can then create a final document by aggregating the documents in each group.

The group does not support fragmented clusters and cannot perform shard clustering. If you need distributed support, you must use aggregate or mapReduce.

1
2
3
4
5
6
7
8
9
10
11
12
13

db.collection.group(document)
{
# Grouping field
key:{key1, key2:1},
# Query conditions
cond:{},
# Aggregate function
reduce:function(current, result){},
# initialization
initial:{},
# Counting Return Function Set
finalize:function(){}
}

Calculate the number of items in each column

1
2
3
4
5
6
7
8
9

SELECT COUNT(*) FROM goods GROUP BY category_id;
db.goods.group({
key:{category_id:1},
cond:{},// all together
reduce:function(current, result){// Corresponding to the current line，result Corresponding to multiple rows in the group
result.total += 1;
},
initial:{total:0}
})

Conclusion

This article is written for the developers to understand the Mongodb group by multiple fields using aggregate function. If you find this article useful keep visiting Object Rocket for more database related articles.

If you need advice or help on managing your database security, scalability, or backups please don’t hesitate to contact us at Object Rocket.

Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started

MongoDB Group by Multiple Fields Using Aggregation Function

Introduction

MongoDB Aggregation Function

Mongodb Group by Multiple Fields

Aggregation pipeline

Processing flow

Grouping method

The use of $group

Mongodb group by multiple fields using Aggregate operation

Conclusion

Pilot the ObjectRocket Platform Free!

Keep in the know!

Services

Platform

Company

Resources

Support