Postgres and Python Multidimensional Array

Introduction

In this tutorial, we will learn how to use both the Postgres and Python Multidimensional Array, specifically the Array data type in both Python and Postgres SQL. We will explore the multidimensional array from a few perspectives, including:

  • What? What is the overview and and syntax for using the Array data type?
  • Why? In what situations and how do we use Arrays in our PostgreSQL queries? There are many ways to use this. We’ll begin by sharing the workings of a single dimension array and then once we understand that, we’ll learn to create and use 2-dimension arrays.

What is an array and how does it work?

An array is essentially a group or list of items. In the database and programming world, arrays can save work and increase our efficiency. You can call “0, 2, 9, 23, 5, 16” an array of integers or numeric items. You can call “Python, Postgres, Integer, String, SQL, Query” an array of strings or array of text items. “Items” is often used to describe an individual in the group that is an array. That’s the general concept. Now let’s look at how Postgres and Python multidimensional arrays differ.

Python Array Data Type

Python arrays (otherwise known as “lists”) are a datatype that have multidimensional functionality. That may sound daunting. We’ll take it one step at a time. First, we’ll create a very simple one-dimensional array:

arr_dogs = ["Collie", "German Shepherd", "Beagle"]

And now we’ll see how to retrieve data from the array we just created:

t_dog_type = arr_dogs[1]
print(t_dog_type)

“t_dog_type” equals “German Shepherd” because array counting begins at zero. So “1” means the second item in the list.

Let’s loop through our array so you can see all the values:

arr_dogs = ["Collie", "German Shepherd", "Beagle"]
for t_dog_type in arr_dogs
    print(t_dog_type)

Here are the results of the loop you see above:

Collie
German Shepherd
Beagle

We can also add/remove items from a Python list (array) using the append and pop methods, respectively.

Now that you have a good understanding of how Python lists / arrays work, let’s learn how to use a multidimensional array. Let’s say we want to track types (like we were) AND colors.

arr_dogs = [["Collie", "German Shepherd", "Beagle"], ["Blonde", "Grey", "Brown"]]
for i in range(len(arr_dogs))
    print("My " + arr_dogs[0][i] + " is " + arr_dogs[1][i])

Note: Python’s “len” function gives us the size of the array in number of items.

The results:

My Collie is Blonde
My German Shepherd is Grey
My Beagle is Brown

Here’s how it works:

arr_stuff = [["array 0 item 0", "array 0 item 1", "array 0 item 2"], ["array 1 item 0", "array 1 item 1", "array 1 item 2"]]
print(arr_stuff[0,1]) # gives us "array 0 item 1"

IMPORTANT NOTE about Python Arrays and Lists: For simplicity-sake, we’ve been treating them here as if they are the same thing. Mostly, they are. But there is one significant difference to keep in mind: If you create an array as we’ve shown you above, you can store multiple data types all within the same array. For example: arr_elements = [5, “Tesla”, 2.3]. Notice how we mixed an integer with a string and a real number? If you use “import array as arr” and then use arr_elements = arr.array(‘i’, [5, “Tesla”, 2.3]), you will get an error. Note: The “i” you see in that array creation stands for “integer”. So yeah, you have to set the type of the entire array when using the array module instead of lists.

Now that we understand how to create arrays / lists in Python, let’s look at how they work in PostgreSQL.

Postgres Array Data Type

Postgres has a data type called Array. So yes, within one cell (cross section between a row and column), you can actually have – instead of an integer or text data type – an array data type. Is your mind blown yet? This can be a difficult concept for some, so let’s take it slow.

CREATE TABLE tbl_users (
    t_name_user text
    , arr_i_grade INTEGER[]
)

In the SQL above, you may recognize the text data type that is being used for “t_name_user”. The column we named “arr_i_grade” is probably new for you as it is an array data type. The brackets (“[]”) are how we tell Postgres “type this column as an array.” The way we named the column has no relevance to Postgres. We named it that way because of our own needs for clarity and understanding.

We’ll generate some data from the table filled with some test data so we can get a visual on how an array works in PostgreSQL.

SELECT
    t_name_user text
    , arr_i_grade INTEGER[]
FROM
    tbl_users

Gives us…

t_name_userarr_i_grade
Steve40, 65
Bill88, 80, 85
Sharon82, 85, 89
Todd90, 89, 95, 94
George60, 75
Sal77
Fred74, 72, 79, 78

Notice: The actual data returned from PostgreSQL included the “{}” brackets that we removed from here so as to give you a less cluttered and easier to understand view of the actual data.

Now we’ll retrieve data from the table, pulling only part of the array, so you can get an even deeper perspective on how an array works in Postgres.

SELECT
    t_name_user text
    , arr_i_grade INTEGER[1]
FROM
    tbl_users

Gives us:

t_name_userarr_i_grade
Steve40
Bill88
Sharon82
Todd90
George60
Sal77
Fred74

Analysis

The “[1]” in “arr_i_grade integer[1]” told the SQL query interpretation engine to give us the first item in the “arr_i_grade” column and that column happens to be an array.

Using arrays in the WHERE clause

Beginning with our original table:

t_name_userarr_i_grade
Steve40, 65
Bill88, 80, 85
Sharon82, 85, 89
Todd90, 89, 95, 94
George60, 75
Sal77
Fred74, 72, 79, 78
SELECT
    t_name_user text
    , arr_i_grade INTEGER[1]
FROM
    tbl_users
WHERE
    arr_i_grade IN (79, 82, 88, 90)

In the above SQL, let’s examine the WHERE / IN condition above. It tells Postgres, “Return only rows where one of the numbers in the arr_i_grade field is in the list that includes 79, 82, 88, and 90.”

That SQL will return:

t_name_userarr_i_grade
Bill88
Sharon82
Todd90
Fred79

Notice an important difference between Python lists and Postgres arrays is that the index for items in Python begins with “0” while in PostgreSQL arrays it begins with “1”.

Now that you understand the basics of 1-dimensional arrays, it’s time to go deeper.

Postgres Multidimensional Array

Just like we saw with Python, Postgres also offers multidimensional arrays. Looking back at our first example, we tracked the grades each user has accumulated in tests they’ve taken. What if, for each grade, we want to also track the year the test was taken in? That’s where a second dimension comes in handy.

DROP TABLE tbl_users;

CREATE TABLE tbl_users (
    t_name_user text
    , arr_i_grade INTEGER[][]
)

Notice the similarity with creating lists (arrays) in Python?

Now we have recreated the test table, but with one difference. arr_i_grade integer[] has changed to arr_i_grade integer[][]. Adding the second set of brackets tells Postgres that our data type is still an array but now our Array of integers has two dimensions. As we did before, we’ll look at this data visually, so we can see the two dimensions in action.

Filled with data, our table looks like…

t_name_userarr_i_grade
Steve{2019, 40}, {2019, 65}
Bill{2018, 88}, {2018, 80}, {2019, 85}
Sharon{2017, 82}, {2018, 85}, {2019, 89}
Todd{2018, 89}, {2019, 95}, {2019, 94}
George{2019, 60}, {2019, 75}
Sal{2019, 77}
Fred{2018, 72}, {2018, 79}, {2019, 78}

Note: We’ve removed the extra curly brackets that might make the lesson more confusing.

Studying the data above, we can glean the following:

  • Steve scored 40 on a test in 2019 and then 65 on a test in 2019.
  • Bill scored 88 on a test in 2018 and then 80 on another test in 2018. He then scored 85 on a test in 2019.
  • Sharon scored 82 in 2017, 85 in 2018, and 89 in 2019.
  • Todd scored 89 in 2018, 95 in 2019, and then 94 in his next test in 2019.
  • George scored a 60 on his first 2019 test and a 75 on his second.

Miscellaneous

Are you curious as to why we named some columns and variables with a prefix of “tbl_”, “arri“, “i“, or “t“? In this tutorial, we used “arri” to mean array of integers, “i” to mean integer and “t” to mean text. Here is a tutorial that explains why, including how that practice can increase your efficacy as a programmer.

Conclusion

In this tutorial, we learned how to use both Postgres and Python multidimensional array and created some examples together to aid in understanding. We saw that Python offers two kinds of array, one with strong typing called “array” and a more flexible one called “list”. For Python, we focused mainly on using lists, as they are more closely related to PostgreSQL’s array type.

Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started

Keep in the know!

Subscribe to our emails and we’ll let you know what’s going on at ObjectRocket. We hate spam and make it easy to unsubscribe.