Self-JOIN in PostgreSQL

Introduction

In this article we will discuss about the self-join principle of PostgreSQL, but first we have to see what a Join is and how it is used in database systems. After we explain the JOIN basics we’ll show you how to do the self-join in PostgreSQL.

JOIN

In database management systems we often have a need to merge the results of two tables in their column fields rather than the row fields. For the row field merging of two tables we basically merge the rows of two tables having same columns using a UNION or UNION ALL statement, but what happens when you need to merge the tables with different columns to get additional insights. For this specific purpose a database system provides the facility of JOINS. The PostgreSQL syntax for join is as follows:

SELECT [T1.Column1…. T1.ColumnN , T2.Culumn1…. T2.ColumnN]
FROM Table1 T1
[INNER, LEFT, RIGHT, FULL] JOIN Table2 T2
ON T1.foreignKey = T2. Primary Key.

JOIN have different types and constraints. For example, in order for join to work we need to have at least one column as a foreign key in one table and that same column field should be a primary key in another table.

Example: Consider two tables as S1 and S2 with respective fields S1 → [CNIC, Name, Contact, StudentID] S2→ [StudentID, Name, Degree]

Here in S1 the primary key is CNIC and in S2 the primary key is StudentID. Now take a look that the primary key in S2 is in S1 as foreign key, so these two tables can be joined.

Types of joins are illustrated as the following diagrams:

Image from Gyazo

Select All the records from two tables where the join condition meets. This will only return those fields which have same where called equal join in most database condition. This is called equal join in most database management systems.

Image from Gyazo

Select All records from table B along with table A where join condition met if met at all. This will return all records of table B but will only return those records of table A which meets the condition.

Image from Gyazo

Select all records from table A along with records from table B where the join condition is met. This will return all records of table A but will only return those records of table B which meets the condition.

Image from Gyazo

Select All records from table A and all records from table B regardless of the join condition. So, this will be used to join the two tables without any conditions required to be met.

Self-JOIN

Self-join in PostgreSQL is quite easy to understand if you understood the above join explanation. Self-join is basically a query to merge same tables with different aliases. In order to perform a self-join, you have to reference the same table twice but with different aliases to remove a confusion between tables.

The self-join can be viewed as a joining of two copies of the same table. Set the comparison and eliminate the condition of same rows. The syntax for self-join is a same as normal joins. The only difference is that this time the T2 table will also be the same table as T1:

SELECT [T1. Column1…. T1. ColumnN, T2. Culumn1…. T2. ColumnN]
FROM Table1 T1
[INNER, LEFT, RIGHT, FULL] JOIN Table1 T2
ON T1. foreign Key = T2. Primary Key.

Let’s take an example. Consider one table of employees having multiple fields including supervisor field for each employee this supervisor will also be an employee so each supervisor and employee will be in the same table so if we want to check which employee reports to which supervisor, we will make a self-join query on to the employee table:

CREATE TABLE Employees (
   employeeID INT PRIMARY KEY,
   name VARCHAR (255) NOT NULL,
   superviserID INT,
   FOREIGN KEY (superviserID)
   REFERENCES Employees (employeeID)
   ON DELETE CASCADE
);
INSERT INTO Employees (
   employeeID,
   name,
   superviserID
)
VALUES
   (1, 'Asmar', NULL),
   (2, 'Ali', 1),
   (3, 'Hassan', 1),
   (4, 'Anna', 2),
   (5, 'Sau', 2),
   (6, 'Kelsie', 3),
   (7, 'Tory', 3),
   (8, 'Salley', 3);

Now the self-join query:

SELECT
    e.name Employee, m.name Superviser
FROM
    Employees e
JOIN Employees m ON m. employeeID = e. superviserid
ORDER BY
    Superviser;

The result set will look like this

Image from Gyazo

This query referenced the Employees tables two times. First as an employee and second as a Superviser. It uses table aliases e for employee and m for the Superviser. The join query finds Employee/Superviser pair by matching the EmployeeID and superviserid columns from the same table.

Conclusion

The self-join is particularly useful for making pairs in the same table without using any sub query or making a copy of the same table which are not the most efficient ways to achieve pairwise comparisons.

Pilot the ObjectRocket platform free for 30 Days

It's easy to get started. Imagine the time you'll save by not worrying about database management. Let's do this!

PILOT FREE FOR 30 DAYS

Keep in the know!

Subscribe to our emails and we’ll let you know what’s going on at ObjectRocket. We hate spam and make it easy to unsubscribe.