Retrieve Blob Datatype from Postgres with Python

Introduction

In this tutorial we will learn how to retrieve the Blob datatype from Postgres with Python. We’ll begin by looking at the more simple way we might handle something like an image in our web database application by storing the image data in a file on the web server and only storing the file name and path in PostgreSQL. After looking at that method, we’ll study how to store and retrieve an image directly in and from Postgres. Finally, we will compare these two methods and look at the pros and cons.

Get image file name from Postgres

The most simple method to retrieve an image to display on a web page is to store the image in a folder on the server and store the path to that image in a text column in PostgreSQL. One benefit would be increased portability because the blob datatype is not available or works differently in other databases. We’ll talk more on this later in this tutorial.

Get blob datatype from Postgres

The other method we’ll discuss here is to retrieve a blob datatype from PostgreSQL and then display that image or other object that is defined by a large amount of data, such as video, music, or other large file or document. Below we will go deeper into this methodology.

Prerequisites

Following are a few technologies that will be useful for you to aid in understanding this tutorial. On to the pontification!

Python coding

We used Python 3.8 for this article, which is the latest as of the time when we wrote this. You are not obliged to use the latest version of Python for this tutorial, however, if you want to, you can get it here free. We advise staying as up to date as you can.

Writing SQL

Understanding of how to write basic queries with SQL. In this tutorial, we stick to the most simple SQL, including the SELECT statement and WHERE clause. For managing our PostgreSQL database, we use the dBeaver free database management tool for ease, error checking, and greater efficiency than the PG Admin software.

Naming conventions

On naming conventions shows how and why you may want to prefix variables, column (field) names, table names, etc. as you see done in this article. For example, naming “tvariable” with the “t” prefix to set it as a “text” (string) object and “tbl_” before table names in order to easily distinguish those objects as tables. The tutorial goes a bit deeper, as well, talking about how to name variables based on a logical hierarchy system.

Open() for Python

Before learning to code for the two primary ways mentioned above for retrieving blob data from Postgres, we’ll take a quick look at the Python open function. It can be used to create a file (w), open for write (w), or open for read (r). Open() requires Python’s “psycop2g” library.

Syntax of Open

file = open('[path/filename.extension]', '[r OR w]')

Analysis: We supply the “open” function with a file name and either “r” or “w”, which stands for read or write. The open function either fills a Python array type with the data from the file designated as the “path/filename.extension” OR, if we chose “w” (write), creates and/or writes data to that file.

Get file name from Postgres

The first method we’ll discuss here is to retrieve an image to display on a web page by storing only the file name in the database. As discussed above, a benefit to this approach is increased portability because the blob datatype is not available or may work differently in other database systems. Another portability issue is database size. Storing blobs in your database can quickly cause a huge amount of bloat because blobs tend to be much larger than other datatypes. For example, a boolean type is 1 byte and a text type might be as large as 255 bytes, where a blob type for a small image might be 30,000 bytes!

Example for retrieve file name

import psycopg2

# database credentials here
# database connection here

def retrieve_filename(id_product, t_path_for_read):
    # read BLOB from Postgres
    s = ""
    s += "SELECT t_name_file"
    s += " FROM tbl_products"
    s += " WHERE"
    s += " ("
    s += " id_product = (%id_product)"
    s += " )"
    db_cursor.execute(s, [id_product])
    myRow = db_cursor.fetchone()
    t_name_file = myRow[0]
    myFile = open(t_path_for_read + '\' + t_name_file, 'r')
    x = myFile.read()

Analysis

  • import psycopg2: The psycopg2 Python library allows us to easily create connections and cursors for PostgreSQL.
  • database credentials: At the end of this tutorial, we’ll provide code for this.
  • database connection: At the bottom of this article, we’ll provide the full code.
  • def retrive_filename: This function has the code for retrieving the filename (t_name_file) from Postgres. The function has 2 parameters; id_product and t_path_for_save. The SQL SELECT statement uses id_product in the WHERE clause to filter for a specific product. t_path_for_save is used to specify for Python what folder to save the data in.
  • myRow = db_cursor.fetchone(): This retrieves the file name from Postgres and stores it in a local Python list variable called myRow.
  • t_name_file: “t_name_file” gets its value from the first item in myRow.
  • open: We use the open function to read the file named based on t_path_for_read and the file name stored in the t_name_file variable.

Get Blob datatype from Postgres

A different way to store and retrieve large data – blob datatype in this case – for a web database app is to store the data for that blob object directly in a column in the database. Most – but not all – large database systems support this data type. Earlier, we called the “store in folder on web server” approach as the “simple approach,” however, some might argue that the method to store and retrieve as a Blob datatype from Postgres with Python is the more simple and even elegant approach. Let’s think about why.

Note: One small benefit to using Python for this process of retrieving a blob from Postgres is Python’s flexible variable definition structure. As you will see below, all we need to do is fill a variable and the fact that we are filling it with blob data will automatically define the variable as a type that can handle the image data or whatever kind of data it happens to be.

Example of Python blob retrieval

import psycopg2

# database credentials here
# database connection here

def retrieve_blob(id_product, t_path_for_save):
    # read BLOB from Postgres
    s = ""
    s += "SELECT t_name_file, blob_data"
    s += " FROM tbl_products"
    s += " WHERE"
    s += " ("
    s += " id_product = (%id_product)"
    s += " )"
    db_cursor.execute(s, [id_product])
    blob_data = db_cursor.fetchone()
    open(t_path_for_save + '\' + blob_data[0], 'wb').write(blob_data[1])

Analysis

  • def retrive_blob: We’ve encapsulated our code for retrieving blob data from Postgres into this function, which has two parameters; id_product and t_path_for_save. If you study the SQL SELECT statement in our example, you will see id_product is plugged into that script’s WHERE clause, which we’ll discuss later in this analysis. The t_path_for_save parameter is used after we retrieve the data in order to tell Python what folder to place the data in.
  • blob_data = db_cursor.fetchone(): This retrieves the blob data from Postgres and stores it in a local Python variable called “blob_data” as a list type, meaning column 0 (t_name_file) is stored as blob_data[0] and column 1 (blob_data) is stored as blob_data[1].
  • open: This Python function is used to create a file on the server named based on t_path_for_save and the file name, which is stored in blob_data[0]. This open function also uses the write method to store the actual contents of blob_data[1] into this new file.

Source code to get blob from Postgres with Python

import flask # Python library for various functionalities
import psycopg2 # Python library for database connection

# connect to the Postgres database
t_host = "PostgreSQL database host address" # this will be either "localhost", a domain name, or an IP address.
t_port = "5432" # default port for postgres server
t_dbname = "database name"
t_user = "database user name"
t_pw = "password"
db_conn = psycopg2.connect(host=t_host, port=t_port, dbname=t_dbname, user=t_user, password=t_pw)
db_cursor = db_conn.cursor()

@app.route("/main")

def retrieve_blob(id_product, t_path_for_save):
    # read BLOB from Postgres
    s = ""
    s += "SELECT t_name_file, blob_data"
    s += " FROM tbl_products"
    s += " WHERE"
    s += " ("
    s += " id_product = (%id_product)"
    s += " )"
    try:
        db_cursor.execute(s, [id_product])
        blob_data = db_cursor.fetchone()
        open(t_path_for_save + '\' + blob_data[0], 'wb').write(blob_data[1])
        db_cursor.close()
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if db_conn is not None:
            db_conn.close()

def main():
    retrieve_blob(42, "c:\inetpub\mySiteName")

Conclusion

In this tutorial we explored two of the most popular methods to retrieve images or other large files from PostgreSQL. We learned how to retrieve only the name of the file in the database and then how to retrieve the blob datatype from Postgres with Python. A necessary ingredient to both of these methods is the open function from Python’s psycopg2 library. Finally, we provided examples, analysis, and commented source code throughout the lesson.

Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started

Keep in the know!

Subscribe to our emails and we’ll let you know what’s going on at ObjectRocket. We hate spam and make it easy to unsubscribe.