Foreign Keys in CockroachDB
In this document we will learn to normalize with foreign keys in a CockroachDB database via one-to-many relationships. Before relational databases were put into use, we stored data similar to how you have seen in a spreadsheet, where you often have duplicate data like – say – customer names. Imagine a table called “tblOrders” that has a row for each order. When a customer buys more than once, you would get their name, address, phone, etc. duplicated each time. This is where relational databases show their benefits! To get there, we create another table called “tblUsers” that has a unique indexed ID field for every user in our CockroachDB database. We then create “one-to-many” table relationships – otherwise known as Foreign Keys – linking the orders table and the users table by pointing the ID in the tblUsers table to a column in the tblOrders table called something like idUser using a “foreign key”. Once we’ve done that, any time that same user buys a product, the ID representing that user is placed in the idUser column, instead of their full name. See how this saves space (text vs. integer), as well as potential for typos?
In this document, we will investigate and learn how to normalize CockroachDB databases in a variety of situations where one-to-many relationships between two or more tables are useful and/or necessary. We’ll use the following structure in this article:
- What? What is normalization? What do one-to-many table relationships look like?
- How? How do we normalize a Cockroach database and what are some situations where we would?
- A basic knowledge and/or curiosity of how to write SQL for Cockroach (or other popular relational databases like Postgres, MS SQL Server, and MySQL) using an IDE like Dbeaver, or with server-side scripting or programming languages like Python, PHP, Java, .Net, C#, Node.js, Ruby, B4X, etc. that provide a database connection, as well as a method for sending SQL queries to our database tables, to get or make changes to data.
- Comprehension of the use of common SQL commands, including SELECT, FROM, and WHERE commands.
What is normalization?
Let’s increase our knowledge of how normalization works by examining a data table from pre-historic- I mean- pre-normalized times.
|110||Book: Database Design||Susie Query||2842 Postgres, Austin, TX||$15.92||2019-12-15|
|185||License: MS Windows 10||Stan Freman||5633 Python Blvd, Austin, TX||$58.94||2019-12-16|
|186||License: MS Office 365||Stan Freman||5633 Python Blvd, Austin, TX||$52.00||2019-12-16|
|201||Course: Compression Algos||Scott McSQL||29 Cockroach Ave, Austin, TX||$180.00||2019-12-17|
The first important thing to notice here is that Stan Freman ordered 2 items, so he appears twice in our tblOrders table. Unfortunately, in a flat database system like you see here above, this means information gets replicated; in this case his name and address. Another way data duplication could happen here: Imagine if someone else ordered the “License: MS Windows 10” item, then that information, which we see in the “txtProduct” column, would also be replicated. See how inefficient that method can be? Especially if you consider we have left out many other columns that would be necessary to an order tracking system, including product properties such as weight, color, size, etc. And for the customers, we’d need columns like email address and phone number.
Next we want to normalize that table, so the minimum would be to turn it into three tables. We recommend the five you see here:
- tblUsers (users)
- tblProducts (products)
- tblCart (cart)
- tblPurchases (purchases)
- tblPurchasesDetails (purchases_details)
Here’s a graphic view of those five tables, with joins and foreign keys represented as arrows to show relationships between the tables, which are always indexed. What about the direction of the arrows you see here? The arrows point to the table that is the “depended on” for retrieving information. Meaning, the table getting pointed to is the one being depended on to give the data. Notice the arrow pointing from the “id_user” column in the “cart” table to the “id” column in the “users” table. This relationship fixes the issue we saw above where we have to repeat user information for every order. After the image below, we will do more analysis, so you can more deeply understand how normalization benefits your Cockroach database efficiency.
Another join arrow to understand is the one pointing from the “idProduct” column in the “tblCart” table to the “id” column in the “tblProducts” table. This allows us to replace text in our “tblOrders” (now tblPurchases in our new normalized database) table with an id pointing to “tblProducts”, so we can reduce repetition to zero, which means a reduction in overall database size, reduction of potential typos, and increase in overall efficiency, which usually means greater speed and easier maintenance.
The final arrow to understand in the graphic above is the one pointing from the “idUser” column in the “tblPurchases” table to the “id” column in the “tblUsers” table. Much like the relationship between “tblCart” and “tblUsers”, this relationship between “purchases” and “users” allows us to remove repetitive user data from “tblPurchases”.
There are other types of “join” and “foreign key” (arrows), that we did not outline here. Please feel free to continue your study of this topic by looking into and experimenting with the various other kinds of database joins that exist in the Cockroach ecosystem. We intend to provide more on this topic in the future as we want to be your primary CockroachDB resource.
In this tutorial we learned (a) why, how and when to normalize your Cockroach database with foreign keys, otherwise known as one-to-many relationships, for an increase in accuracy, ease of maintenance, and efficiency. We also provided a real world shopping cart example to help deepen your knowledge of how foreign keys work for database normalization.
Pilot the ObjectRocket Platform Free!
Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.Get Started