Using Nested Select in Postgres SQL

Introduction

In this article, we’ll explore how to use nested select in Postgres SQL. During this tutorial, we’ll use the following structure:

  • What? What do Nested Select statements do and what is the syntax?
  • How? How do we best use this clause in our PostgreSQL SQL commands?
  • Why? When would we make use of this statement? We’ll learn via using a realistic use case. Includes use of the Postgres “WHERE” clause. We’ll also include exploration of “INSERT INTO” and “NOT IN”.

Prerequisites

  • A beginner-level of understanding of how to write SQL for Postgres (or similar relational databases like MS SQL Server, Oracle, and MySQL) using one of the many available tools out there, such as the PG Administration tool, dBeaver, or with script or programming languages like Python, Javascript, Java, PHP, C#, ASP.Net, VB.Net, Ruby, Node.js, B4X, Classic ASP, etc.) that provide a database connection, as well as a method for sending PL/SQL queries to our database tables, to get data or make changes to your data.
  • Comprehension of the use of common PL/SQL statements, including SELECT, FROM, and WHERE statements.
  • Knowledge of what function, integer, and string are and how they work on a beginner level.

What is nested select statement?

Nested select statements are otherwise known as “subqueries”.

Sometimes in our Postgres-based application we may need to pull data from a subset of data we create on the fly, update a table based on a subset of data, or insert to a table based on a data subset. Later in this article, we’ll play with some whys and ways of doing this.

First, let’s study the syntax of this powerful SQL tool, from a few perspectives:

Syntax 1: Subqueries in WHERE

SELECT
    column_1
    , column_2
    , column_3
FROM
    tbl_data
WHERE
    column_1 IN -- this can also be "NOT IN", "EXISTS, an operator like "=", "<", and others.
    (
    SELECT
        column_1
    FROM
        tbl_data
    WHERE
        [condition]
    )
ORDER BY column_1

Syntax 1 Analysis

In the syntax example above, we are looking at two data sets that are both pulled from the same table, “tbl_data”. The high level view is that the “inner” nested data set, the one in parentheses, runs first and the “outer” query is filtered based on the results of that inner, nested one returns. The IN you see here is important. It’s telling PostgreSQL, “Only pull records from tbl_data where the value in column_1 exists in the recordset returned by the nested select query. We’ll later see some ways this can be useful.

Syntax 2: Subqueries in FROM

SELECT
    column_1
    , column_2
    , column_3
    , nested.column_4
FROM
    (
    SELECT
        column_4
    FROM
        tbl_data
    WHERE
        [condition]
    ) AS nested

Syntax 2 Analysis

In this syntax example, we are giving a name to our “inner” nested select query, “nested”, so that we can refer to it in our outer select at the top of the overall SQL. Note that again, we are enclosing the nested select query in parenthesis. This is a must.

Before we do a real world example, let’s see the syntax for using nested select statements with INSERT INTO.

Syntax 3: Subquery with INSERT INTO

INSERT INTO
    tbl_data
    (
    column_1
    , column_2
    )
    VALUES
    (
    SELECT
        column_3
        , column_4
    FROM
        tbl_other
    WHERE
        [condition]
    )

Syntax 3 Analysis

In the INSERT INTO above, we begin by telling Postgres to add rows to “tbl_data”. Next, we are determining which columns (column_1 and column_2) we want to fill with the two respective VALUES returned by the nested SELECT statement that follows and is encapsulated in parentheses. Finally, note that the nested select pulls its data from “tbl_other”.

How and why we use nested select statements

Here’s a real world situation. If a developer didn’t know about nested selects, in some situations they might believe they need to pull data from a table into an array, manipulate that array in some way (perhaps sorting), and then use that array to add rows into their first table. Fortunately, nested selects (subqueries) provide us with a far more efficient way of accomplishing the task. We’ll dive in by writing the SQL, which will be structurally similar to “Syntax 3” above.

First, let’s set up two tables with test data:

tbl_technologies_used

t_name_techt_category_techi_rating
JavascriptLanguage95
PostgreSQLSQL database90
PythonLanguage90
MS SQL ServerSQL database90
C#Language92

The above table is the table used by the company to track which technologies are being used by the company. The table below is filled with potential technologies. It’s important to notice the bottom table includes tech that is already in the top table. So when we are building our INSERT INTO query using a NESTED SELECT, we want to be sure to keep duplicates out. This is a use case where nested selects excel.

tbl_technologies_proposed

t_name_techt_category_techi_rating
JavascriptLanguage95
MongoNoSQL database85
MySQLSQL database50
PostgreSQLSQL database90
PythonLanguage90
PHPLanguage70
JavaLanguage75
MS SQL ServerSQL database90
C#Language92
C++Language88
dBaseFlat database25

Writing the PostgreSQL query

INSERT INTO
    tbl_technologies_used
    (
    t_name_tech
    , t_category_tech
    , i_rating
    )
    VALUES
    (
    SELECT
        t_name_tech
        , t_category_tech
        , i_rating
    FROM
        tbl_technologies_proposed
    WHERE
        i_rating > 75
        AND
        t_name_tech NOT IN
        (
            SELECT
                t_name_tech
            FROM
                tbl_technologies_used
        )
    )

Analysis

The first thing you may notice with our example above is that we have a nested SELECT in another nested SELECT! Why did we do this? Because we want to be sure we do not add a row to tbl_technologies_used that is already in that table. Let’s take it line by line:

  • INSERT INTO: This is where we set up which columns in “tbl_technologies_used” get populated with data from the first (outer) of our SELECT statements below.
  • VALUES: This is part of the INSERT functionality.
  • SELECT (OUTER): Determines which fields (columns) are pulled from “tbl_technologies_proposed”.
  • WHERE: This line filters results of our subquery by two requirements. (1) “i_rating” must be larger than 75. (2) “t_name_tech” must not already exist in “tbl_technologies_used”.
  • NOT IN ... SELECT (INNER): This is the part of our query that makes sure we don’t copy duplicate records from “tbl_technologies_proposed” into “tbl_technologies_used”.

After we run that query, we get the following resulting dataset:

tbl_technologies_used | t_name_tech | t_category_tech | i_rating | |—————|—————–|———:| | Javascript | Language | 95 | | PostgreSQL | SQL database | 90 | | Python | Language | 90 | | MS SQL Server | SQL database | 90 | | C# | Language | 92 | | C++ | Language | 88 |

Tidbits

  • If you wonder why we prefixed some of our fields, variables, and columns with “i” or “t“? In this article, we used “i” to mean integer and “t” to mean text or string. Here is a short tutorial on that topic.

Conclusion

Here in this article we learned here how to use NESTED SELECT in order to query Postgres with SQL that will accomplish various tasks that require subsorting of data. These are also known as subqueries. We also created a real world example to help round out your understanding of how nest selects in your use of SQL in PostgreSQL database work.

Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started

Keep in the know!

Subscribe to our emails and we’ll let you know what’s going on at ObjectRocket. We hate spam and make it easy to unsubscribe.