SQL EXCEPT vs NOT IN - What's the Difference?
Introduction
In this article, we will be discussing the difference between the “except” clause and the “not in” operator of the PostgreSQL database. We hope to clear up the difference between the two so you know when to use which.
Let’s start off by stating that EXCEPT is a clause and NOT IN is an operator. Now it begs the question, what exactly are clauses and operators in PostgreSQL:
Clauses
Clauses in the PostgreSQL are keywords that have definite meaning (similar to their English language counterpart) and purpose. The purpose is self-explanatory by the name of the clause. For example, the WHERE clause is used to query a specific condition. GROUP BY clause is used to group the data on a specific attribute. ORDER BY clause is used to order the data by a specific attribute, etc.
Some clauses in PostgreSQL can only be used with another clause and they usually enhance the functionality of the clause they are being used with. For example, the HAVING clause can only be used with a group by clause and it restricts the results of the group by clause on a condition. The FROM clause is mandatory with the SELECT statement to specify from which table the data needs to be selected.
PostgreSQL EXCEPT Clause
In the real world, when we use the word ‘except’, we are defining an exception from what we stated in the statement preceding the ‘except’ and the statement succeeding describes that exception. For example, “I like all the fruits except oranges and jackfruits”. Now in this statement, “I like all the fruits” is the original statement but “oranges and jackfruits” are not included in the fruits the person likes.
In PostgreSQL, EXCEPT works in pretty much the same way. EXCEPT is used to restrict the results from a SELECT FROM WHERE query on the query after the EXCEPT clause. The general syntax is:
1 2 3 4 5 6 7 8 9 | SELECT column1 [, column2 ] FROM table1 [, table2 ] QUERY A [WHERE condition ] EXCEPT SELECT column1 [, column2 ] FROM table1 [, table2 ] QUERY B [WHERE condition] |
Now, this syntax is identical to the one we use in the English language, so it is very easy to understand. The above Venn diagram makes it even more clear to understand what exactly an EXCEPT clause will do. It would give all the results of query A except those which are in query B too.
Consider the following example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 | CREATE TABLE customers( c_id INT PRIMARY KEY, c_name TEXT NOT NULL, c_age INT NOT NULL, c_address CHAR(50), c_salary INT NOT NULL ); INSERT INTO customers( c_id, c_name, c_age, c_address, c_salary ) VALUES (0, 'JOHN CONNER', 19, '23 BOULEVARD ST.', 20000), (1, 'ADAM SMITH', 21, '28 BOULEVARD ST.', 25000), (2, 'BLAKE JONES', 23, '22 BOULEVARD ST.', 30000), (3, 'BRIDGETTE JOHNSON', 18, '21 BOULEVARD ST.', 40000), (4, 'CMAILIA BLACK', 20, '27 BOULEVARD ST.', 35000), (5, 'AMY WINEHOUSE', 20, '23 BOULEVARD ST.', 20600), (6, 'CHRIS JONES', 23, '01 BOULEVARD ST.', 37000); CREATE TABLE orders( o_id INT PRIMARY KEY, o_date TEXT NOT NULL, c_id INT FOREIGN KEY REFERENCES customers(c_id), o_amount INT NOT NULL ); INSERT INTO orders( o_id, o_date, c_id, o_amount ) VALUES (0, '09-01-2018', 5, 3000), (1, '11-02-2019', 1, 2110), (2, '20-03-2015', 0, 4500), (3, '25-03-2019', 6, 1010), (4, '26-12-2018', 0, 350); |
When the above code is executed, we get the following tables:
Now if we want to see which customers did not purchase anything, we can do that by:
1 2 3 4 5 | SELECT c_id FROM customers EXCEPT SELECT DISTINCT c_id FROM orders; |
The EXCEPT statement reduces to the result from the upper query to only those that are not in the lower query as evident by the output.
Operators
The operators in PostgreSQL are used to perform simple operations such as arithmetic, logic or comparison operations. So we have Arithmetic operators, logical operators, and comparison operators in PostgreSQL. Currently, we will only focus on Logic Operations.
Usually, logic operations are used as conjunctions between two queries. There are many logic operators in PostgreSQL. These include:
- AND
- OR
- NOT
- ANY
- IN
- EXISTS etc.
PostgreSQL NOT IN operator
The NOT IN operator is a combination of a NOT and IN operator. So we need to know first what is a NOT and an IN operation. The NOT operator reverses the meaning and functionality of any operator that succeeds it. It is also similar to what NOT is used in the English language. For example, “The room was empty” and “The room was not empty” have the opposite meaning. In the same way, NOT IN has entirely different meaning and functionality as compared to the IN operator.
The IN operator is used to match a set of values to some other set of values. The general syntax is:
1 | VALUE IN (value1, value2, ...) |
OR
1 2 3 4 5 | SELECT column1 [, column2 ] FROM table1 [, table2 ] QUERY A WHERE column1 IN ( SELECT column1 [, column2 ] FROM table2 [, table2 ]); QUERY B |
When combined with NOT, the NOT IN operator checks if some value does not exist in some set of values. i.e. It returns all the results from the preceding values that are not in the succeeding set of values. Or we can say it limits the results of query A to only those results that are only in A and not in B. So we can say that it works the same way as the EXCEPT clause. It is just another way of achieving the same functionality.
For example, in the customers and orders database example, we found the customers who have not ordered anything using except clause. We can do the same using the NOT IN operator as:
1 2 3 4 5 | SELECT c_id FROM customers WHERE c_id NOT IN ( SELECT DISTINCT c_id FROM orders); |
This would give the same results as we got with the EXCEPT clause:
Conclusion
There is absolutely no difference in the EXCEPT clause and NOT IN operator. They both serve the same purpose and they are simply two different ways of achieving the same functionality. The queries written using EXCEPT can be rewritten using NOT IN operator. It is just a matter of personal preference.
Pilot the ObjectRocket Platform Free!
Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.
Get Started