Using a Postgres Subquery

Introduction

In this training session, we learn the best ways of using a Postgres subquery. During this lesson, we will use and learn about nested select in Postgres, using a subquery with select, update, and insert into. We’ll also look at how “IN” and “NOT IN” work in a subquery in PostgreSQL. A simple way of describing a PostgreSQL subquery is “A query within a query.”

Prerequisites

  • A basic grasp of how to create queries for a PostgreSQL database (or other SQL-compliant databases like Cockroach or MS SQL Server) using one of the many free tools available, such as PG Admin, dBeaver, or with programming languages like Javascript, Python, Java, PHP, C#, ASP.Net, VB.Net, Node, B4X, Classic ASP, etc.) that provide connections to your database, as well as a method for using SQL queries with your database to get data or make changes to your data.
  • Sense of how common SQL statements like SELECT, FROM, and WHERE work because we will build on those statements in this training session.

What is a Postgres Inner Select?

An inner select or inner query is otherwise known as a subquery in Postgres.

In a Postgres application we may sometimes want to retrieve data from a subset of data created on the fly, insert data into a table based on a subset of data, or update a table of data based on a data subset. We’ll be exploring all these ways of using a Postgres subquery (inner select).

Postgres subquery syntax

Subquery in WHERE syntax

In this syntax example, we will show the most simple form of a Postgres subquery, placing the inner select in the WHERE clause.

SELECT
    column_field_01
    , column_field_02
    , column_field_03
FROM
    tbl_data
WHERE
    column_field_01 IN -- this can also be "NOT IN", "EXISTS, an operator like "=", "<", and others.
    (
    SELECT
        column_field_01
    FROM
        tbl_data
    WHERE
        [condition]
    )
ORDER BY column_field_01

Analysis

In the code example above, we show two sets of data that are pulled from the same table; “tbl_data”. The simple view here is that the “inner” nested data set – “inner query” – the one in the WHERE clause, runs first, and then the outer query is filtered based on the results of the data returned by the nested query. The IN statement within that WHERE clause is important. It is the command that tells PostgreSQL’s SQL engine to pull only rows from the table where the value in column_field_01 exists in the rows returned by the inner select SQL.

Subquery in FROM syntax

SELECT
    column_field_01
    , column_field_02
    , column_field_03
    , nested.column_field_04
FROM
    (
    SELECT
        column_field_04
    FROM
        tbl_data
    WHERE
        [condition]
    ) AS inner_query

Analysis

Building on what you learned above, the same principle applies here. In this example, we gave a name to the inner query, “inner_query”, so that we can refer to it from the outer query at the top of the full SQL. Note that we have enclosed the inner select query in parenthesis. This is a must.

Prior to creating a real world example for study, let’s look at the syntax for using a subquery with INSERT INTO.

Subquery in INSERT syntax

INSERT INTO
    tbl_data
    (
    column_field_01
    , column_field_02
    )
    VALUES
    (
    SELECT
        column_field_03
        , column_field_04
    FROM
        tbl_other
    WHERE
        [condition]
    )

Analysis

In the INSERT INTO type query above, we started by telling Postgres to add some rows to “tbl_data”. Next, we are deciding which columns (column_field_01 and column_field_02) that we wish to fill with the two VALUES returned by the subquery that follows and is enclosed in parentheses. Finally, it is important to notice that the inner select pulls its data from “tbl_other”, utilizing a potential condition in the WHERE clause at the end of the subquery. Notice here one way we departed from the previous examples is that we are using two tables. Not a whole lot more complex, but important to note.

Why use a Postgres Subquery

Here’s a problem that is fairly typical in one form or another where subqueries can come in handy. If a database developer didn’t know subqueries existed, they might believe they need to go through the convoluted process of retrieving data from a table into a list or array, manage the array in some way such as adding items, removing items, and/or sorting, and then use that array to add specific rows back into a table. Luckily, a subquery (nested select) provides a far more efficient method for doing that. We’ll start by scripting the SQL, which will be structurally similar to Subquery in INSERT syntax above.

Insert data using subquery in Postgres

First step is to set up two tables with test data:

tbl_tech_used

t_tech_namet_tech_cati_rating
JavascriptLanguage94
PostgreSQLSQL database91
PythonLanguage93
MS SQL ServerSQL database92
C#Language92

The above table is the table used by the company to track which technologies are being used by the company. The table below is filled with potential technologies. It’s important to notice the bottom table includes tech that is already in the top table. So when we are building our INSERT INTO query using a NESTED SELECT, we want to be sure to keep duplicates out. This is a use case where nested selects excel.

tbl_tech_proposed

t_tech_namet_tech_cati_rating
JavascriptLanguage94
MongoNoSQL database85
MySQLSQL database50
PostgreSQLSQL database91
PythonLanguage93
PHPLanguage71
JavaLanguage74
C++Language72
C#Language93

Writing the PostgreSQL SQL

INSERT INTO
    tbl_tech_used
    (
    t_tech_name
    , t_tech_cat
    , i_rating
    )
    VALUES
    (
    SELECT
        t_tech_name
        , t_tech_cat
        , i_rating
    FROM
        tbl_tech_proposed
    WHERE
        i_rating > 75
        AND
        t_tech_name NOT IN
        (
            SELECT
                t_tech_name
            FROM
                tbl_tech_used
        )
    )

Analysis

The first thing you may notice with our example above is that we have a nested SELECT in another nested SELECT! Why did we do this? Because we want to be sure we do not add a row to tbl_tech_used that is already in that table. Let’s take it line by line:

  • INSERT INTO: This is where we set up which columns in “tbl_tech_used” get populated with data from the first (outer) of our SELECT statements below.
  • VALUES: This is part of the INSERT functionality.
  • SELECT (OUTER): Determines which fields (columns) are pulled from “tbl_tech_proposed”.
  • WHERE: This line filters results of our subquery by two requirements. (1) “i_rating” must be larger than 75. (2) “t_tech_name” must not already exist in “tbl_tech_used”.
  • NOT IN … SELECT (INNER): This is the part of our query that makes sure we don’t copy duplicate records from “tbl_tech_proposed” into “tbl_tech_used”.

After we run that query, we get the following resulting dataset:

tbl_tech_used

t_tech_namet_tech_cati_rating
JavascriptLanguage94
PostgreSQLSQL database91
PythonLanguage93
MS SQL ServerSQL database92
C#Language93

Miscellaneous

  • Are you curious why we used a prefix of “i” or “t” in some columns? In this article, we used “i” to mean integer and “t” to mean text or string. Here is a Naming conventions on that topic.

Conclusion

We learned in this training tutorial some common / popular ways of using a Postgres subquery. During this lesson, we used and learned about nested select in Postgres, using subqueries with select, update, and insert into. We also looked at how “IN” and “NOT IN” work in a subquery in PostgreSQL. Finally, we saw evidence that a simple way of describing a PostgreSQL subquery is “A query within a query.” Code samples were included.

Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started

Keep in the know!

Subscribe to our emails and we’ll let you know what’s going on at ObjectRocket. We hate spam and make it easy to unsubscribe.