How to Use PostgreSQL LAG Function

Have a Database Problem? Speak with an Expert for Free
Get Started >>

Introduction

In this article we will learn how to use PostgreSQL LAG function. Please review the following prerequisites or if you already have PostgreSQL up and running you can jump straight into the article.

Prerequisites

  • Ensure that PostgreSQL server is properly installed, configured and running on the background.

For Linux and Windows systems you can download PostgreSQL here

  • To start PostgreSQL server use a LINUX machine use the following command.
1
sudo service postgresql start
  • To verify if the service is running use the following command.
1
service postgresql status
  • The result should look something like the following:
1
2
3
4
5
6
7
8
9
● postgresql.service - PostgreSQL RDBMS
   Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor prese
   Active: active (exited) since Thu 2019-08-01 14:51:20 PST; 36min ago
  Process: 1230 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
 Main PID: 1230 (code=exited, status=0/SUCCESS)

Aug 01 14:51:20 user-UX330UAK systemd[1]: Starting PostgreSQL RDBMS...
Aug 01 14:51:20 user-UX330UAK systemd[1]: Started PostgreSQL RDBMS.
lines 1-8/8 (END)
  • To start, stop and restart PostgreSQL server in a Windows machine do the following instruction.
  1. Open Control Panel
  2. Open Administrative Tools
  3. Open Services
  4. Find the PostgreSQL Server service
  5. Start, Stop or Restart the service

Restarting PostgreSQL server in a Windows Machine

What is PostgreSQL LAG Function

LAG() is a PostgreSQL function allows access to a row that comes before the present row at a defined physical offset. This means that from the current row the LAG() function is able to access data of the previous row, or the row before the previous row, and so on.

The LAG() function can be beneficial for comparing data from the current and previews table rows.

Below is the syntax for the LAG() function.

1
2
3
4
5
LAG(expression [,offset [,default_value]])
OVER (
    [PARTITION BY partition_expression, ... ]
    ORDER BY sort_expression [ASC | DESC], ...
)

Let’s discuss the above statement part by part.

  1. The expression will be evaluated to the row that comes before this row at a fixed offset. It will be a column, expression, or subquery. A single value must be returned by the expression and cannot be a window function.

  2. The offset is a value that is composed of a number that specifies number of rows that comes before the present row from which to access the information. The offset may be an expression, subquery, or a column. The offset will be defaulted to one (1) should you not specify it.

  3. When the offset goes beyond the scope of the partition the LAG() function will return the default_value. Should we take out default_value the function will be returning NULL.

  4. Using PARTITION BY will divide the rows into partitions to where the LAG() function will be applied. As a default, the entire result set will be treated as an individual partition when the PARTITION BY clause was omitted.

  5. ORDER BY simply specifies the order of the rows for every partition to where the LAG() function is specified.

PostgreSQL LAG() function examples

Now that we have explain the syntax part by part we will now put that overview understanding to practice by taking some few examples.

Before we proceed with the example we will be creating a table to work with.

Let’s use the below query to create a table named team_stats.

1
2
3
4
5
6
7
CREATE TABLE team_stats(
    team_name VARCHAR(100) NOT NULL,
    team_id INT NOT NULL,
    year_game SMALLINT NOT NULL CHECK (year_game > 0),
    points DECIMAL(12,2) CHECK (points >= 0),
    PRIMARY KEY (team_name,year_game)
);

After creating the above gamers_stats table, we will now insert few rows in the table.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
INSERT INTO
    team_stats(team_name, team_id, year_game, points)
VALUES
    ('Luke SkyWalker',1,2018,180000),
    ('Mike SkyScraper',2,2018,170000),
    ('Jack Sparrow',3,2018,140000),
    ('Jackie Chan',4,2018,300000),
    ('Jet li',5,2018,300000),
    ('Luke SkyWalker',1,2019,160000),
    ('Mike SkyScraper',2,2019,140000),
    ('Jack Sparrow',3,2019,150000),
    ('Jackie Chan',4,2019,28000),
    ('Jet Li',5,2019,250000),
    ('Luke SkyWalker',1,2020,190000),
    ('Mike SkyScraper',2,2020,170000),
    ('Jack Sparrow',3,2020,180000),
    ('Jackie Chan',4,2020,290000),
    ('Jet Li',5,2020,270000);

PostgreSQL Example How to Use LAG() function in Result Set

In this example we will be using the LAG() function to return the total points of the current year (2019) and the previous year (2018):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
WITH cte AS (
   SELECT
      year_game,
      SUM(points) points
   FROM team_stats
   GROUP BY year_game
   ORDER BY year_game
)
SELECT
   year_game,
   points,
   LAG(points,1) OVER (
      ORDER BY year_game
   ) previous_year_points
FROM
   cte;

The result should resemble something like the following:

1
2
3
4
5
6
 year_game |   points   | previous_year_points
-----------+------------+----------------------
      2018 | 1090000.00 |
      2019 |  728000.00 |           1090000.00
      2020 | 1100000.00 |            728000.00
(3 ROWS)

In the above query, the CTE will return the total points summarized by year. Then, the outer query Will be using the LAG() function to return the total points of the previous year for each row. The first row has NULL in the previous_year_points column because there is no previous year of the first row.

In this another example we are using three common table expressions to return the points difference or variance between the current and previous years:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
WITH cte AS (
   SELECT
      year_game,
      SUM(points) points
   FROM team_stats
   GROUP BY year_game
   ORDER BY year_game
), cte2 AS (
   SELECT
      year_game,
      points,
      LAG(points,1) OVER (
         ORDER BY year_game
      ) previous_year_points
   FROM
      cte
)  
SELECT
   year_game,
   points,
   previous_year_points,  
   (previous_year_points - points) variance
FROM
   cte2;

The result should resemble something like the following.

1
2
3
4
5
6
 year_game |   points   | previous_year_points |  variance
-----------+------------+----------------------+------------
      2018 | 1090000.00 |                      |
      2019 |  728000.00 |           1090000.00 |  362000.00
      2020 | 1100000.00 |            728000.00 | -372000.00
(3 ROWS)

Example How to Use PostgreSQL LAG() function in Partitions

In this example we will be using the LAG() function to compare the total points of the current year with the total points of the previous year of each team.

1
2
3
4
5
6
7
8
9
10
SELECT
   year_game,
   points,
   team_id,
   LAG(points,1) OVER (
      PARTITION BY team_id
      ORDER BY year_game
   ) previous_year_points
FROM
   team_stats;

The result should resemble something like the following.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
 year_game |  points   | team_id | previous_year_points
-----------+-----------+---------+----------------------
      2018 | 180000.00 |       1 |
      2019 | 160000.00 |       1 |            180000.00
      2020 | 190000.00 |       1 |            160000.00
      2018 | 170000.00 |       2 |
      2019 | 140000.00 |       2 |            170000.00
      2020 | 170000.00 |       2 |            140000.00
      2018 | 140000.00 |       3 |
      2019 | 150000.00 |       3 |            140000.00
      2020 | 180000.00 |       3 |            150000.00
      2018 | 300000.00 |       4 |
      2019 |  28000.00 |       4 |            300000.00
      2020 | 290000.00 |       4 |             28000.00
      2018 | 300000.00 |       5 |
      2019 | 250000.00 |       5 |            300000.00
      2020 | 270000.00 |       5 |            250000.00
(15 ROWS)
  1. The PARTITION BY will distribute the rows into teams (or partitions) specified by column team_id.
  2. ORDER BY will be sorting the rows in each team by years in ascending order.
  3. Finally, the LAG() function will be applied to each partition to return the points of the previous year.

Conclusion

In this tutorial, you have learned how to use the PostgreSQL LAG() function. We hope you’re able to apply what you’ve learned here to your issue or application.

Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started

Keep in the know!

Subscribe to our emails and we’ll let you know what’s going on at ObjectRocket. We hate spam and make it easy to unsubscribe.