From Postgres to CSV with Python

Introduction

In this article we study how to export data from Postgres into a CSV file with a Python script, using Python’s built-in “open” function and psycopg2’s “copy_expert” function to create a comma-separated values text file and move data into it, along with PostgreSQL’s “COPY TO” command to pull the data we want out of a Postgres query. Here is how the lesson will go:

  • What? We’ll study the following functions to use for exporting a text file and copying that data out of a PostgreSQL table: open(), copy_expert(), “COPY TO”, render_template(), and format().
  • How? After learning how the primary pieces (functions) here work, we will script a Python application that uses all of these Postgres and Python functions to extract data from a database into a CSV file.

Prerequisites

  • You will want to have a basic understanding of Postgres, whether you’re utilizing the pgAdmin tool (free) or another database admin tool like DBeaver (also free).
  • Ability to write applications in a language like Python, JavaScript, PHP, C#, Java, ASP.NET, VB.NET, Node.js, Ruby, etc. that can write to a PostgreSQL database. Here, we are using Visual Studio Code to write Python. It is not necessary for you to use the same tools we do.
  • Optional and probably useful: an understanding of naming conventions, which explains why we prefix our variables, column names, table names, etc. as you see us doing in this article. For example, naming a variable “t_variablename” with a leading “t” marks it as a “text” (string) object, and the “tbl_” prefix before table names clearly marks those objects as tables.

Before writing our Python app, we’ll learn how to use the functions mentioned above. After learning how each function works, we’ll then build a chunk of code using that function to create a portion of our export application.

Postgres COPY TO function

COPY { tbl_my_table [ ( column_1, column_2, etc ) ] | ( t_query ) } TO { 't_my_path_and_file' | STDOUT } [ WITH ( parameter, ... ) ]
  • “tbl_my_table”: This refers to the Postgres source table you want to pull data from.
  • “column_1”: Optional column names to be read. If no columns are specified, all columns will be copied.
  • “t_query”: A SELECT query can go in this spot, in place of the table name and column list.
  • “t_my_path_and_file”: A path and file name can go here, telling PostgreSQL where to save the file.
  • “STDOUT”: You can specify the data pulled from Postgres to be sent as output for Python to gather up.
  • “parameters”: This optional field can have any of the following:
    • FORMAT: The format to be read or written, including text, CSV, or binary. Text is default.
    • OIDS: Determines whether you are copying the OID for each row. (An error will occur if OIDS is specified for a table that does not have OIDs, or when query copying). This parameter is Boolean (true/false or 1/0).
    • DELIMITER: The character that separates columns within each line of the file. The default delimiter is a tab in text format and a comma in CSV format. This is not used when you choose “binary” for the “FORMAT” parameter.
    • NULL: The text that represents a null value. If your “FORMAT” is “text”, “\N” (backslash-N) is the default. If you set “FORMAT” to “CSV”, the default is an unquoted empty string. If you do not want to distinguish between nulls and empty strings, we recommend using an empty string here. “NULL” is another parameter you can’t use when “FORMAT” is set to “binary”.
    • HEADER: Indicates that the file includes a header line (row) mapping to columns. When outputting, the first line has column names from the data table you specified. Upon input, the first line is automatically ignored. This parameter is used only when “FORMAT” is set to “CSV”.
    • QUOTE: Dictates the character used for quoting when a value is quoted. Double-quote is default. Must be a single character. Again, only if “FORMAT” is set to “CSV”.
    • ESCAPE: Indicates the character that must be present before a character that matches the “QUOTE” parameter value. Default is whatever that parameter value is. It needs to be a single character. Again, only if “FORMAT” is set to “CSV”.
    • FORCE_QUOTE: Forces quoting of non-NULL values in each of the columns you specify. NULL output is still never quoted. If “*” is used, non-NULL values will be quoted in all of the fields (columns). This option is only possible with “COPY TO”, not “COPY FROM”, and only if “FORMAT” is set to “CSV”.
    • FORCE_NOT_NULL: Tells Postgres not to match values from the specified columns against the null string. In the default case where the null string is empty, this means that empty values are read as zero-length strings rather than NULLs, even when not quoted. This option is only possible with “COPY FROM”, not “COPY TO”, and only if “FORMAT” is set to “CSV”.
    • ENCODING: Determines the encoding for your file. Default is the client encoding.
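Before moving on, it helps to see the shape of a COPY statement we will actually send. Here is a minimal sketch (the table name “tbl_users” is the hypothetical one we use throughout this article) that assembles a “COPY ... TO STDOUT” statement as a Python string, the same form we hand to copy_expert() later:

```python
# The query whose results we want exported; tbl_users is our example table.
t_query = "SELECT * FROM tbl_users"

# Wrap the query in parentheses and ask for CSV output with a header row.
t_copy_sql = "COPY ({0}) TO STDOUT WITH CSV HEADER".format(t_query)

print(t_copy_sql)  # COPY (SELECT * FROM tbl_users) TO STDOUT WITH CSV HEADER
```

Note that the embedded query carries no trailing semicolon, because the whole string is itself one SQL statement.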

Python’s OPEN function

This built-in Python function can create a file (“w”), open a file for writing (“w”), or open one for reading (“r”). It does not depend on any library; psycopg2 comes into play only for the database side. Here’s the syntax:

with open('[path/filename.extension]', '[r OR w]') as f_output:

Analysis

  • “open(‘path-to/text-file.csv’, ‘[r OR w]’)”: “open” is built into Python, so no import is required for it (psycopg2 is imported separately for the database work). We supply “open” with a file name and either “r” or “w”, which means read or write. In read mode, “open” returns a file object whose lines we can iterate over; in write mode (“w”), it creates the file if needed and opens it for writing.
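Here is a minimal, self-contained sketch of both modes, using a temporary file so it runs anywhere (the file name “open_demo.csv” is just for illustration):

```python
import os
import tempfile

# A scratch path; in this article's code this role is played by t_path_n_file.
t_path = os.path.join(tempfile.gettempdir(), "open_demo.csv")

# "w" creates the file (or truncates an existing one) and opens it for writing.
with open(t_path, "w") as f_output:
    f_output.write("id,name\n")
    f_output.write("1,Alice\n")

# "r" opens the same file for reading; the file object yields its lines.
with open(t_path, "r") as f_input:
    lines = f_input.readlines()

print(lines[0].strip())  # id,name
```

The “with” form closes the file automatically when the block ends, which is why the export code later in this article uses it.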

Now to put our newfound understanding of the Postgres “COPY” command and Python’s “open” and “copy_expert” functions into operation! The following Python code prioritizes ease of understanding and readability over everything your application might need. Not to worry: at the end of this tutorial we will provide you with the full, working source code.

# create a query to specify which values we want from the database.
s = ""
s += "SELECT *"
s += " FROM "
s += "tbl_users"

# set up our database connection.
conn = psycopg2.connect(...)  # fill in your connection parameters here
db_cursor = conn.cursor()

# Use the COPY function on the SQL we created above.
SQL_for_file_output = "COPY ({0}) TO STDOUT WITH CSV HEADER".format(s)

# Set up a variable to store our file path and name.
t_path_n_file = r"c:\data\users.csv"
with open(t_path_n_file, 'w') as f_output:
    db_cursor.copy_expert(SQL_for_file_output, f_output)

Analysis

  • “SELECT *”: We’re asking for all available columns from the “tbl_users” table.
  • “No semicolon at the end?”: Correct. The query is embedded inside the “COPY ... TO” statement we build next, so it must not end with a semicolon.
  • “SQL_for_file_output”: Create an object and fill it with data. This object will be used next in the “Open” and “Copy_Expert” functions.
  • “with open”: Create a CSV file and open it up for writing, using “t_path_n_file”.
  • “copy_expert”: Use the database cursor we created earlier, pass in the “SQL_for_file_output” statement, and point the output at the CSV file that is standing by as “f_output”.
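To see the open()/copy_expert() handshake in isolation, here is a sketch in which a small stub class stands in for a real psycopg2 cursor; the stub writes canned CSV text where a real cursor would stream the query results from Postgres:

```python
import os
import tempfile

class StubCursor:
    """Stands in for a psycopg2 cursor; writes canned CSV instead of real results."""
    def copy_expert(self, sql, file_obj):
        # A real cursor would execute the COPY statement in `sql` against
        # Postgres and stream its STDOUT output into file_obj.
        file_obj.write("id,name\n1,Alice\n")

SQL_for_file_output = "COPY (SELECT * FROM tbl_users) TO STDOUT WITH CSV HEADER"
t_path_n_file = os.path.join(tempfile.gettempdir(), "users_demo.csv")

db_cursor = StubCursor()
with open(t_path_n_file, "w") as f_output:
    db_cursor.copy_expert(SQL_for_file_output, f_output)

with open(t_path_n_file, "r") as f_input:
    print(f_input.read())
```

The point of the sketch is the shape of the call: copy_expert() takes the SQL text and any writable file-like object, which is exactly what “with open(..., 'w')” hands it.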

Now that we have built and scrutinized the critical and potentially new functions needed for our application, let’s look at a full listing of the source code you can copy directly into your projects:

Full Source Code in Python

from flask import Flask, render_template
import psycopg2

app = Flask(__name__)

t_host = "Your Postgres database host address" # either a domain name, an IP address, or "localhost"
t_port = "5432" # This is the default postgres port
t_dbname = "database name"
t_user = "database user name"
t_pw = "password"
db_conn = psycopg2.connect(host=t_host, port=t_port, dbname=t_dbname, user=t_user, password=t_pw)
db_cursor = db_conn.cursor()

@app.route("/export")
def csv_export():
    s = ""
    s += "SELECT *"
    s += " FROM "
    s += "tbl_users"

    # (we reuse the db_cursor created from our database connection above.)

    # Use the COPY function on the SQL we created above.
    SQL_for_file_output = "COPY ({0}) TO STDOUT WITH CSV HEADER".format(s)

    # Set up a variable to store our file path and name.
    t_path_n_file = r"c:\data\users.csv"

    # Trap errors for opening the file or running the COPY
    try:
        with open(t_path_n_file, 'w') as f_output:
            db_cursor.copy_expert(SQL_for_file_output, f_output)
    except (OSError, psycopg2.Error) as e:
        t_message = "Error: " + str(e) + "\n query we ran: " + s + "\n t_path_n_file: " + t_path_n_file
        return render_template("error.html", t_message = t_message)

    # Success!

    # Clean up: Close the database cursor and connection
    db_cursor.close()
    db_conn.close()

    # Send the user on to some kind of informative screen.
    # (a plain string works for this sketch; a real app might render a template)
    return "Export complete"


Conclusion

In this tutorial we learned how to export data from Postgres to a CSV file with Python. We combined the Postgres “COPY” command with Python’s format, open, and copy_expert functions. We also used render_template() so that we could do some error reporting, which required some “try and except” error checking. We explained the parts first, and then dove in and created a fully operational Python application that exports data from Postgres. Finally, we shared all the source code here for you to use in your projects.
