Memory Management with the Psycopg2 Adapter for PostgreSQL (Part 2)

Memory management in the psycopg2 Python adapter for PostgreSQL

A running Python script using the psycopg2 adapter for PostgreSQL can cause some memory issues if you fail to take certain precautions or setup your app correctly. The following article will provide some tips to help you better manage your psycopg2 application’s memory and system resources. This is part two of the memory management in psycopg2 article series.

NOTE: PostgreSQL also has a memory management library called SPI you can connect to to assist in streamlining your psycopg2 Python application’s efficiency.

Prerequisites to using the psycopg2 adapter for PostgreSQL

  • The code in this article has only been tested on Python 3, and Python 2.7 is now deprecated and losing support, so it’s recommended that you have a version of Python greater than 3.4 installed, and that you run your scripts using python3. Use the python3 -V command in a terminal window to verify that it’s installed.

  • You’ll need the PIP3 package manager for Python 3 working on your machine or server in order to install the psycopg2 library using the pip3 command. Install the adapter using the following command in a UNIX terminal or Windows command prompt: pip3 install psycopg2.

  • The PostgreSQL server should be installed and running, and you should have some database and table record data you can test the code on.

Install the psutil for library for Python 3

Some installations of Python may not come with psutil pre-installed. You can use the pip3 list command to check what packages are already installed, or use the following command, in a UNIX terminal, to install it:

sudo python3 -m pip install psutil

Or you can also install it with just the pip3 install commands, like in the following example:

pip3 install psutil

Create a Python script and import libraries to connect to PostgreSQL

The following Python code imports the psycopg2 adapter’s sql and connect libraries, as well as the psutil, sys, and gc libraries to help manage your system’s memory:

# import the psycopg2 database adapter for PostgreSQL
from psycopg2 import connect, sql

# import Python's 'sys' and 'psutil' libraries
import sys, psutil

# import Python's garbage collection library
import gc

Declare a string for the PostgreSQL table name and SQL statement

Declare a string for the PostgreSQL table name so that it can be inserted into a SQL statement string concatenation later:

# declare a string object for the table name
table_name = "some_table"

The following code uses the LIMIT SQL keyword to restrict the number of records returned. This will help to reduce the application’s memory use in the case fetching records from extremely large PostgreSQL tables:

# concatenate a SQL string from table name and columns
sql_statement = "SELECT * FROM {} LIMIT 1000;".format( table_name )

# print the concatenated SQL string
print ("sql_statement:", sql_statement)

Using Python’s ‘with’ statement to manage memory and connect to PostgreSQL

Python’s with statement allows you to execute a block of code, inside of an indentation, in order to close all of the objects within it, and free up those resources, after the code has been executed.

Connect to PostgreSQL inside of a ‘with’ indentation

The code in this section will nest a bunch of code to connect to PostgreSQL and execute transactions inside of a with indentation block. Here’s some example code to connect to a PostgreSQL server using the as Python keyword to assign an alias (conn) for the connection:

# use Python's 'with' statement to release the memory afterwards
try:
    with connect (
        dbname = "some_database",
        user = "objectrocket",
        host = "localhost",
        password = "mypass"
    ) as conn:

NOTE: Make sure to pass the correct parameters or else PostgreSQL will raise an error.

Use the ‘with’ and ‘as’ Python keywords while using PostgreSQL cursor

The next step is to nest another with indentation code block in order to create a cursor object, and execute the SQL string declared earlier, inside of the connection indentation:

        with conn.cursor() as cursor:
            # have sql.SQL() return a sql.SQL object
            sql_object = sql.SQL(
                # pass SQL string to sql.SQL()
                sql_statement
            ).format(
                # pass the identifier to the Identifier() method
                sql.Identifier( table_name )
            )

NOTE: It’s recommended that you use the sql.SQL() and sql.Identifier() methods, instead of just passing a concatenated string to the execute() method, to help prevent SQL injection attacks.

Setting the cursor’s name attribute to

Psycopg2 will load all of the query into memory if a name isn’t specified for the cursor object. Here’s an example of how you can do that to reduce how much memory the application uses:

with conn.cursor(name='memory_test') as cursor:

Lower the cursor object’s itersize integer value

You can also reduce the cursor’s itersize attribute (default value is 1000) to help save on resources by reducing the batch size of each fetch iteration:

# reduce the itersize attribute value (default is 1000)
cursor.itersize = 100

Call the psutil function while executing a PostgreSQL statement

Execute the sql.SQL object by passing it to psycopg’s execute() method. Now you can call the get_avail_memory() Python function (explained and constructed in part one of this series) inside of the nested with indentation block to get an idea of how much memory the PostgreSQL app is using:

            # execute the sql.SQL() object
            cursor.execute ( sql_object )

            # use the fetchall() method to return a list of all the data
            table_data = cursor.fetchall()
            print ("table_data:", len(table_data))

            # call the psutil function to check available memory
            print (get_avail_memory())
           
except Exception as err:
    print ("\npsycopg2 error:", err)

Conclusion

Save the code and then execute the script using the python3 command followed by the script name. You should see a response in your terminal or command prompt window that looks something like the following:

sql_statement: SELECT * FROM some_table LIMIT 1000;
table_data: 1000
available memory: 43.05%
memoryview cursor: <memory at 0x7f7153692a08>
available memory: 43.04%
connection.closed: 0
cursor.closed: True
memoryview() cursor error: cursor already closed

Screenshot of a Python script running in a terminal window returning memory information while running a psycopg2 app

At this point you should have an arsenal of tip and methods you can use to better manage your system’s memory and resources while running a psycopg2 PostgreSQL application. Check out part 1 of this article series for a more general overview of memory management in Python.

Just the Code

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# import the psycopg2 database adapter for PostgreSQL
from psycopg2 import connect, sql

# import Python's 'sys' and 'psutil' libraries
import sys, psutil

# import Python's garbage collection library
import gc

# declare a string object for the table name
table_name = "some_table"

# concatenate a SQL string from table name and columns
sql_statement = "SELECT * FROM {} LIMIT 1000;".format( table_name )

# print the concatenated SQL string
print ("sql_statement:", sql_statement)

# a function that returns a string of the system's available memory
def get_avail_memory():
    # call the psutil virtual_memory() method
    mem = psutil.virtual_memory()

    # calculate the available memory
    avail_mem = ((mem.total - mem.used) / mem.total)*100

    # concatenate a string for the memory
    avail_mem = "available memory: {0:.2f}%".format( avail_mem )
    return avail_mem


# use Python's 'with' statement to release the memory after complete
try:
    with connect (
        dbname = "some_database",
        user = "objectrocket",
        host = "localhost",
        password = "mypass"
    ) as conn:
        with conn.cursor(name='memory_test') as cursor:
            # have sql.SQL() return a sql.SQL object
            sql_object = sql.SQL(
                # pass SQL string to sql.SQL()
                sql_statement
            ).format(
                # pass the identifier to the Identifier() method
                sql.Identifier( table_name )
            )

            # reduce the itersize attribute value (default is 1000)
            cursor.itersize = 100

            # execute the sql.SQL() object
            cursor.execute ( sql_object )

            # use the fetchall() method to return a list of all the data
            table_data = cursor.fetchall()
            print ("table_data:", len(table_data))

            # call the psutil function to check available memory
            print (get_avail_memory())

            # have memoryview return the memory address for the cursor
            cursor_bytes = bytes(cursor)
            print ("memoryview cursor:", memoryview(cursor_bytes))

except Exception as err:
    print ("\npsycopg2 error:", err)

# explicitly invoke Python's garbage collection
gc.collect()

# return the available memory using the psutil function
print (get_avail_memory())

# check if the PostgreSQL connection and cursor objects are closed
print ("connection.closed:", conn.closed) # 0 == open
print ("cursor.closed:", cursor.closed) # False == open

try:
    # should return error stating: "cursor error: cursor already closed"
    cursor_bytes = bytes(cursor)
    print ("memoryview cursor:", memoryview(cursor_bytes))
except Exception as err:
    print ("memoryview() cursor error:", err)

# connect to PostgreSQL again
print ("\nconnecting to PostgreSQL")
conn = connect (
    dbname = "some_database",
    user = "objectrocket",
    host = "localhost",
    password = "mypass"
)

# instantiate a cursor object from the connection
cursor = conn.cursor()

# check if the cursor is closed
if cursor.closed == False:
    print ("closing cursor")
    cursor.close()

# check if the connection is closed
if conn.closed == 0:
    print ("closing connection")
    conn.close()

Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started

Keep in the know!

Subscribe to our emails and we’ll let you know what’s going on at ObjectRocket. We hate spam and make it easy to unsubscribe.