Postgres Aggregate Functions

Have a Database Problem? Speak with an Expert for Free
Get Started >>

Introduction

In this lesson we learn how to use the following Postgres aggregate functions: AVG function and SUM function, which are both aggregate functions in PostgreSQL. We also explore a statement required for filtering of aggregated data; “HAVING”.

Prerequisites

  • An understanding of how SQL queries fit into the Postgres database system and the means to execute them.
  • Understanding of common query types and clauses, including SELECT, FROM, and HAVING (which is like WHERE but for aggregate data, as you will learn). If you have no understanding of how “HAVING” works, you have come to the right place as we’ll be learning it here.

Essential to understanding and using aggregate functions in Postgres is learning to GROUP or aggregate table data, so we will begin by studying how the GROUP BY command works.

Postgres GROUP BY clause

The Postgres GROUP BY statement aggregates a set of rows so that we can use group-based (aggregate) functions like Avg, Count, Min, Max, and Sum. The GROUP BY statement is used with SELECT to group the records (rows) in a Postgres table or view that have a specific data “look”. The purpose can be to return values that apply to the group(s) and/or to remove dupes.

Note: GROUP BY is placed after the WHERE or HAVING statements and before ORDER BY.

Syntax of Postgres GROUP BY

1
2
3
4
5
6
7
8
9
10
11
SELECT
    value_distinct
    , Your_Aggregate_Function([name_of_column])
FROM
    tbl_table_name_here_
WHERE OR HAVING
    [SOME condition(s) met]
GROUP BY
    value_distinct;
ORDER BY
    [name_column TO sort];

Analysis

In the above Group By syntax example, _Your_AggregateFunction() represents functions that act upon a set of rows, as mentioned above.

HAVING or WHERE?

HAVING and WHERE occupy similar but different roles. HAVING can be essential to filtering your aggregated records.

  • We use HAVING to limit records in a group. This – instead of WHERE – is what you will typically use when using aggregate functions in your SQL.
  • Use WHERE to limit records from any non-aggregated set. This occurs before any groupings are made.

We’ll learn in a bit more detail about HAVING later in this tutorial.

Why and when will we use the GROUP BY clause?

When we need to group data, we use GROUP BY to either remove redundancy or utilize an aggregate function. Below, we will show examples. A simple rule is to use WHERE before GROUP BY and HAVING after GROUP BY. This will work for most situations.

Postgres Group By example

We created and filled the PostgreSQL table as “tbl_functions”:

t_name_functiont_catt_operates_oni_difficultyi_uses
minAggregatenumber48
maxAggregatenumber48
avgAggregatenumber412
countAggregateall414
sumAggregatenumber420
char_lengthSingularstring332

Analysis

  • _t_namefunction is the name we’ve stored in Postgres.
  • _tcat keeps track of whether the function in this row is for aggregate or singular use. If “Aggregate”, then we know this function uses GROUP BY.
  • _t_operateson is whether this function’s parameter requires a string or number.
  • _idifficulty is a measurement of the learning curve required to learn this function.
  • _iuses is a percentage as integer of how often the function is used in the company’s code base.

Postgres Group By SQL

Now we will use GROUP BY on the “tbl_functions” Postgres table, first choosing to leave out use of functions.

1
2
3
4
5
6
SELECT
    t_cat
FROM
    tbl_functions
GROUP BY
    t_cat;

The Postgres query above returns:

t_cat
Singular
Aggregate

Why is that? With the SQL above, we told PostgreSQL:

  • Look at the “t_cat” column only.
  • Return only unique values from that column.

Let’s go through the same process with another column in our table:

1
2
3
4
5
6
SELECT
    t_operates_on
FROM
    tbl_functions
GROUP BY
    t_operates_on;

The above SQL returns:

t_operates_on
number
all
string

Note that in the above returned data, the duplicates in the _t_operateson column were aggregated?

Let’s increase the difficulty by adding in an aggregate function, starting with AVG. We’ll build the Postgres SQL to use the GROUP BY clause along with the AVG aggregate function to see the average i_difficulty to learn and use any of the functions in our company applications.

First, though:

The Postgres AVG function

PostgreSQL’s AVG function returns the average of a group of numbers.

Postgres AVG function syntax

1
i_average := AVG(group_of_values);

Postgres AVG function example

Our initial dataset is a table called tbl_languages.

id_technologyt_namei_dif
0Python8
1Java7
2C#9
3JavaScript8

The exercise is to find the average difficulty – see the “i_dif” column – of all four technology ages in the table you see above.

Note for the below sql example. This is where we use our knowledge of SQL GROUPing into the mix. You can think of it this way: If you want to average more than one number, you must look at the entire group of values.

1
2
3
4
5
6
    SELECT
        AVG(i_dif)
    FROM
        tbl_languages
    GROUP BY
        i_dif;

How does the above work? We’ll go piece by piece:

  • (1) SELECT: We are telling Postgres’ SQL engine that we want an average of all values (because no WHERE statement) in the _idif column.
  • (2) FROM: Specifying _tbllanguages as our data source.
  • (3) GROUP BY: Informing PostgreSQL that we are aggregating rows. The AVG() function requires aggregation of the data.

The above query returns the integer, “8”.

Another AVG example

1
2
3
4
5
6
7
SELECT
    t_cat
    , Avg(i_dif) AS avg_of_dif
FROM
    tbl_functions
GROUP BY
    t_cat;

Analysis

  • (1) SELECT: Informs Postgres that we want an average of all values in the _idif column and to name the results to be avg_of_dif. The AVG function requires GROUP BY.
  • (2) FROM: The Postgres source data is tbl_functions.
  • (3) GROUP BY: Tells PostgreSQL that we want grouping of the _tcat column in tbl_functions.

The above query returns:

t_catavg_of_dif
Aggregate4
Singular2.5

As you can see, AVG(i_dif) returned an average of all the numbers in the i_dif column.

The Postgres SUM aggregate function

The PostgreSQL SUM() function gives us the sum of a group of numbers.

Sum function syntax

1
i_sum_total := SUM(group_of_numeric_values);

Sum function example

Our test dataset is a table called tbl_coding_languages.

id_techt_namei_people_know_it
0Python6
1Java1
2C#3
3JavaScript4

The ipeople_know_it column represents the number of developers in your firm who know this language. We want to get a total of all the coders in the company who know these languages. This is where we use GROUP BY. You can think of it this way: If you want to sum _more than one number, you have to add up the group of numbers, since there is no point to retrieving the sum of one number as it will equal itself.

1
2
3
4
5
6
SELECT
    SUM(i_people_know_it)
FROM
    tbl_coding_languages
GROUP BY
    i_people_know_it;

Analysis

  • (1) SELECT: Relay to Postgres that we want a sum of all numbers (there is no HAVING filter) in the _i_people_knowit column.
  • (2) FROM: Specify _tbl_codinglanguages as our source for data.
  • (3) GROUP BY: Instructs PostgreSQL that we are grouping. As mentioned above, the SUM() function REQUIRES aggregation of data.

The query returns “14”.

1
2
3
4
5
6
7
SELECT
    t_cat
    , SUM(i_uses) AS sum_of_uses
FROM
    tbl_functions
GROUP BY
    t_cat;

Output:

t_catsum_of_uses
Aggregate63
Singular119

Use HAVING to filter

What if we want to return only a “sum_of_uses” from “tbl_functions” based on when the function category is “Aggregate”? This is where we add the “HAVING” statement to filter, much like the way WHERE works:

1
2
3
4
5
6
7
8
9
SELECT
    t_cat
    , SUM(i_uses) AS sum_of_uses
FROM
    tbl_functions
HAVING
    t_cat = 'Aggregate'
GROUP BY
    t_cat;

Returns:

t_catsum_of_uses
Aggregate63

Here we see the sum function added up all numbers in the i_uses column HAVING the value of “Aggregate” in the t_cat column of our PostgreSQL table.

Extras

Conclusion

We learned how and when to use Postgres aggregate functions such as AVG function and SUM function, which are aggregate functions for PostgreSQL. We also explored the “HAVING” clause in more than one example. Code samples and images included.

Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started

Keep in the know!

Subscribe to our emails and we’ll let you know what’s going on at ObjectRocket. We hate spam and make it easy to unsubscribe.