# How to use Postgres Group By

## Introduction

This article teaches how to use the Postgres Group By statement. During our lesson, we’ll explore use of the PostgreSQL HAVING clause, as well as the AVG function, COUNT function, MIN function, MAX function, and SUM function, which are all aggregate functions.

## Prerequisites

- A base understanding of how to write SQL for Postgres.
- Some knowledge of the use of common SQL queries, including SELECT, FROM, and WHERE (which is like HAVING) clauses. If you have no exposure to “HAVING”, that is fine because it is part of what we are here to learn.
- Understanding of what
*function*,*integer*, and*string*do and how they work. - The meaning of the word “aggregate”:

## Using PostgreSQL GROUP BY

The Postgres *GROUP BY* command groups a set of values so that we can enact group-level (aggregate) functions like Avg(), Bit_Or(), Bit_And(), Count(), Max(), Min(), and Sum().

The GROUP BY statement in PostgreSQL is used in conjunction with SELECT to group the rows (records) in a Postgres table that have a specific data profile. The purpose can be to return aggregates that apply to the group(s) and/or remove redundancy.

In Postgres, GROUP BY comes after the WHERE or HAVING clause and before the ORDER BY.

## PostgreSQL GROUP BY syntax

id_distinct

, An_Aggregate_Function([name_column])

FROM

tbl_name_of_table

WHERE OR HAVING

condition(s)

GROUP BY

id_distinct;

ORDER BY

[name_column TO sort];

### Analysis of Group By syntax

In the above query, _Some_Aggregate*Function()* represents functions that act upon a *set* of values, as mentioned in the section above titled “Using PostgreSQL GROUP BY”.

## HAVING vs WHERE

*HAVING* and *WHERE* occupy similar but different roles.

- HAVING is used to filter values from a group. This – instead of WHERE – is what you will typically use when using aggregate functions in your SQL.
- WHERE is used to filter records from a result. This occurs
*before*any groupings are made.

## Why and when will we use the GROUP BY clause?

When we need to group data, we use *GROUP BY* to either remove redundancy or utilize an aggregate function. Below, we will show examples. A simple rule is to use WHERE before GROUP BY and HAVING after GROUP BY. This will work for *most* situations.

## Postgres Group By example

We’ve named our initial PostgreSQL table as tbl_functions:

t_name_function | t_category | t_operates_on | i_difficulty | i_uses |
---|---|---|---|---|

min | Aggregate | number | 4 | 8 |

max | Aggregate | number | 4 | 8 |

avg | Aggregate | number | 4 | 12 |

count | Aggregate | all | 4 | 14 |

sum | Aggregate | number | 4 | 20 |

char_length | Singular | string | 3 | 32 |

lower | Singular | string | 2 | 21 |

upper | Singular | string | 2 | 20 |

length | Singular | string | 3 | 45 |

### Table columns above explanation

- _t_name
*function*is the name we’ve stored in Postgres. - _t
*category*keeps track of whether the function in this row is for aggregate or singular use. If “Aggregate”, then we know this function uses GROUP BY. - _t_operates
*on*is whether this function’s parameter requires a string or number. - _i
*difficulty*is a measurement of the learning curve required to learn this function. - _i
*uses*is a percentage as integer of how often the function is used in the company’s code base.

### Postgres Group By SQL

Now we will use GROUP BY on the table, leaving out any functions at first.

t_category

FROM

tbl_functions

GROUP BY

t_category;

The Postgres SQL above returns:

t_category |
---|

Singular |

Aggregate |

Why is that? With the SQL above, we told PostgreSQL:

- Look at the “t_category” column only.
- Return only unique values from that column.

Let’s go through the same process with another column in our table:

t_operates_on

FROM

tbl_functions

GROUP BY

t_operates_on;

The above SQL returns:

t_operates_on |
---|

number |

all |

string |

Note that in the above returned data, the duplicates in the _t_operates*on* column were aggregated?

Let’s increase the depth of your understanding by adding in aggregate functions, starting with just one. We’ll build the Postgres SQL to use the GROUP BY clause along with the AVG aggregate function to see the average i_difficulty to learn and use any of the functions in our company applications.

t_category

, Avg(i_difficulty) AS avg_of_difficulty

FROM

tbl_functions

GROUP BY

t_category;

Analysis

- (1) SELECT: Relays to Postgres that we want an average of all values in the _i
*difficulty*column and to name the results to be avg_of_difficulty. The AVG aggregate function requires GROUP BY. - (2)
*FROM*: The Postgres source data is tbl_functions. - (3)
*GROUP BY*: Tells PostgreSQL that we want grouping of the _t*category*column in tbl_functions.

The above query returns:

t_category | avg_of_difficulty |
---|---|

Aggregate | 4 |

Singular | 2.5 |

As you can see, AVG(i_difficulty) gave us an average of all the values in the i_difficulty column.

To answer a question that might be on your mind of “What is used more, Aggregate functions or Singular-type?” we’ll use another example – and instead of AVG – we will use another aggregate function; SUM.

t_category

, SUM(i_uses) AS sum_of_uses

FROM

tbl_functions

GROUP BY

t_category;

Output:

t_category | sum_of_uses |
---|---|

Aggregate | 62 |

Singular | 118 |

## Use HAVING to filter

What if we only want to return a “sum_of_uses” based on when the function category is “Aggregate”? This is where we add the “HAVING” clause:

t_category

, SUM(i_uses) AS sum_of_uses

FROM

functions

-- TWO NEW LINES OF SQL:

HAVING

t_category = 'Aggregate'

GROUP BY

t_category;

Returns:

t_category | sum_of_uses |
---|---|

Aggregate | 62 |

Here we see the sum function added up all numbers in the i_uses column HAVING the value of “Aggregate” in the t_category column of our PostgreSQL table.

## Extras

- You may wish to investigate Postgres’ ROLLUP, which is a GROUP BY subclause. It is used to define multiple sets of columns you want to group.
- We invite you to look for our other articles using PostgreSQL’s
*GROUP BY*, along with related aggregate functions like AVG, COUNT, MIN, MAX, and SUM. - Are you curious why we named some of our variables and columns with a prefix of “i
*” or “t*“? In this article, we used “i*” to mean integer and “t*” to mean text or string. Here is a short tutorial on that topic.

## Conclusion

In this tutorial we learned how to use the Postgres Group By clause. We studied use of the PostgreSQL HAVING clause, as well as AVG and SUM functions, which are aggregate functions.

## Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started