Foreign Keys in Postgres
In this tutorial we’ll study the purpose and use of foreign keys in Postgres. Prior to the advent of relational databases, data was stored pretty much as you see in spreadsheets, where duplicate data like names, for example, were common and even necessary. Before we talk about and show you how to use Postgres foreign keys, we’ll illustrate why foreign keys are necessary by looking at the prevalent type of databases prior to the invention of the relational database model that PostgreSQL is built upon. This approach will give you a deeper, lower-level understanding of foreign keys.
Relational database vs flat
We’ll create a hypothetical use case: Start with a table named “tbl_orders” in a flat database system that has a row for every purchase. When a user buys more than once, their name, phone, email, address, etc. is duplicated with each order. There are other drawbacks to using the old flat database system, but you probably get the idea. Especially when you see how the relational database model works.
Postgres relational database
This is where relational databases show their primary advantage over the flat model! How? Simply put, we create another table called “tbl_users” that has a unique ID for each customer in the PostgreSQL database. Next, we set up foreigh keys or one-to-many table relationships between tbl_orders and tbl_users by linking the ID (key column) in tbl_users to a (optimally it is indexed) column in tbl_orders called ID_user using a “foreign key”. Note: You can name this column anything you want. We used our preferred naming convention. Now, with our basic relational database, when that same customer buys a product, the ID matching that user is placed in the ID_user column, instead of their full name and contact information. This method – as compared to the flat model – saves potential for typos, as well as space (text vs. integer) in your database.
- Basics of how to write Postgres SQL.
- Comprehension of common SQL statements, including SELECT, FROM, GROUP BY, HAVING, WHERE, and ORDER BY statements.
- Knowledge of what index, string, integer, and text are and how/when to use them.
There is a tight correlation between normalization and use of foreign keys. Let’s explore relational database normalization.
What is normalization?
Let’s cultivate an understanding of normalization by looking at a table from pre-normalized times where flat databases were the cutting edge.
|110||Textbook: How to design a database||James Wonka||224 Olacer Ave, Austin, TX||$15.95||2019-11-15|
|185||License: MS Windows 10||Fred Dewman||5213 Python Cir, Austin, TX||$49.25||2019-11-16|
|186||License: MS Office 365||Fred Dewman||5213 Python Cir, Austin, TX||$39.10||2019-11-16|
|201||Course: Compression Algorithms||Stan McCran||5 Postgres Drive, Austin, TX||$185.95||2019-11-17|
Notice the similarities to a spreadsheet?
See here that Fred Dewman ordered two items, so he shows up two times in the flat table above. Unfortunately, in a flat database system like you see above, this means quite a bit of information gets duplicated; as we see here with his contact information. Another way data duplication happens: What if someone else ordered the “License: MS Windows 10” product, then the information we see in the “t_product” column, would also be duped. See the inefficiency of that method? Especially when you think about how we are leaving out columns that would be necessary to an order tracking system, including product properties such as size, color, weight, and other options. And with the customer data, we want info like phone number and email address. Maybe even a customer rating or link to another table that tracks any times your customer service department talked with that customer.
How to normalize
Now if we want to normalize that table, the minimum would be to turn it into three tables. We recommend the five you see here:
- tbl_cart_details (not necessarily needed, so not shown in the image below.)
Here’s an image of those tables, using arrows to represent foreign keys to show relationships between tables, which are always indexes. Foreign keys are a special kind of index. What is the significance of the direction of the arrows? In this case, the arrows point to the table that is the “master” for looking up information. Meaning, the table getting pointed to is the one being depended on to provide data. So, for example, notice the arrow pointing from the “id_user” column in tbl_cart to the “id” column in tbl_users. This relationship fixes the issue we saw above where we get duplicates of user information for every new order. After the image below, we’ll do more analysis, so you can more deeply understand how normalization benefits your Postgres database efficiency.
Another arrow (“foreign key”) to understand is the one pointing from the “id_product” column in the “tbl_cart” table to the “id” column in the “tbl_products” table. This allows us to replace a string of text in tbl_purchases above with an id pointing to tbl_products, so we can reduce data duplication, which affords us a reduction of database size – sometimes huge, reduction of potential typos and other errors, and increase in overall efficiency.
The final arrow to understand in the above diagram is the one pointing from the “id_user” column in tbl_purchases to the “id” column in tbl_users. Much like the relationship between “tbl_cart” and “tbl_users”, the relationship we are showing you between tbl_purchases and tbl_users allows us to remove duplicate customer data from tbl_purchases.
The database schema diagram
Note that we have explained some of the above relationships diagram with many other fields who’s use may not be obvious:
- users.id_session: In some applications we have more than one user at a time using the app. We can use session variables to track and keep users distinct from each other. For example, if a user is added to tbl_users, after adding that user, we want an easy way to determine the new “id” generated in that table, we can use “id_session” to find out via a SELECT SQL command.
- cart.id_coupon: Two reasons for this column: (1) Extend this lesson on your own by creating a table called “tbl_coupons” and relating it to both tbl_cart and tbl_purchase_details; and (2) In order to give you an “assist” in building a shopping cart system of your own, where you may not have first thought of having coupons as an option for customers.
- products.id_category: This is another column, like “tbl_cart.id_coupon”, where we are coaching you to create a table called tbl_categories and a boost for creating your own future shopping cart application.
- products.id_sub_cat: This is another column, like “products.id_category”, where we are provide some options; create a table called tbl_categories_sub (alternative to adding this new table: add id_parent column to your tbl_categories) and a boost for creating a e-commerce solution.
Query to create a foreign key
id INTEGER NOT NULL,
id_user INTEGER REFERENCES tbl_users(id),
PRIMARY KEY (id, id_user)
Notice we used the Postgres REFERENCES key word to set up our foreign key above that points from tbl_users to tbl_products using the primary key id column in tbl_users and the indexed id_user column in tbl_products.
There are other types of joins and foreign keys that we did not study here. Please feel free to continue studying the various kind of joins that exist.
In this tutorial we learned the purpose and use of foreign keys in Postgres. We also studied how databases worked before relational databases were invented, building an understanding of the differences and benefits to PostgreSQL normalization, which depends on the use of foreign keys, if done in the most efficient manner.
Pilot the ObjectRocket Platform Free!
Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.Get Started