How to Create PostgreSQL Test Data

Introduction

When you’re developing applications or trying to evaluate query performance, it can be helpful to have some test data on hand. However, creating a large dataset manually can be tedious and time-consuming. With PostgreSQL, it’s easy to create a test dataset containing randomly-generated values with just a couple of simple commands. In this article, we’ll show you how to generate PostgreSQL test data using the psql command-line interface.

Prerequisites

Before going any further with this tutorial, you’ll need to make sure you have a PostgreSQL database cluster working on your machine. You can use the command systemctl status postgresql on a Linux machine to check if it’s installed. It’s also important to have Docker installed on your machine and to have some basic knowledge of how to use a Docker container– we’ll be using a container to hold our PostgreSQL test data.

Postgres Docker container

A Docker container is used for the development and deployment of applications. With a container, applications can run faster and become portable and scalable.

To create and run a Docker container, you’ll need to use the commands shown below:

1
docker run --name objectrocket -e POSTGRES_PASSWORD=1234 -p 5432:5432 -d postgres

The Docker container will automatically pull the image for PostgreSQL, and it will be used as our container.

Let’s review some of the options available for a Docker command:

  • The --name option will provide the name of the container you will run and use.

  • The -e option sets an environment variable within the Docker container; in our example, we set the PostgreSQL password.

  • You can use the -p option to bind the 5432:5432 port to the localhost.

  • The -d option will be used to launch the container in the background along with pulling the PostgreSQL image.

To see a list of available running containers, use this command:

1
docker ps

The output will look something like this:

1
2
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
16bff39ba7de postgres "docker-entrypoint.s…" 10 seconds ago Up 6 seconds 0.0.0.0:5432->5432/tcp objectrocket

Now that we’ve reviewed some Docker fundamentals, let’s create a table in the psql command-line console for PostgreSQL.

To access psql on your machine, open your terminal and use the following command:

1
sudo -u postgres psql -h localhost -p 5432

The table we’re going to create will hold our generated PostgreSQL test data.

Create a table in Postgres

Now we’ll use the postgres database to create a table.

To create a table in PostgreSQL, we use the CREATE TABLE command. The basic syntax is shown below:

1
CREATE TABLE TABLE_NAME(COLUMN + datatype + constraints [optional]);

Here’s the statement we’ll use to create the table for our test data:

1
2
3
4
5
6
7
CREATE TABLE users(
    id BIGINT GENERATED ALWAYS AS IDENTITY,
    PRIMARY KEY(id),
    hash_firstname TEXT NOT NULL,
    hash_lastname TEXT NOT NULL,
    gender VARCHAR(6) NOT NULL CHECK (gender IN ('male', 'female'))
);

PostgreSQL test data

Next, we’ll INSERT values into the table using a random function to generate the data:

1
2
INSERT INTO users(hash_firstname, hash_lastname, gender)
SELECT md5(RANDOM()::TEXT), md5(RANDOM()::TEXT), CASE WHEN RANDOM() < 0.5 THEN 'male' ELSE 'female' END FROM generate.series(1, 10000);

NOTE: This function uses md5, which is an algorithm used for hash functions that produces a 128-bit hash value.

Let’s check out the results from the data that has been generated:

Using the SELECT statement, we can count the total rows of data in the users table:

1
2
3
4
5
SELECT COUNT(*) FROM users;
 COUNT
-------
 10000
(1 ROW)

We can also count the number of rows where the value of gender is equal to male :

1
2
3
4
5
SELECT COUNT(*) FROM users WHERE gender = 'male';
 COUNT
-------
  5048
(1 ROW)

We can do the same for the female data:

1
2
3
4
 COUNT
-------
  4952
(1 ROW)

Finally, let’s select and display all the data from the users table. Since we’re dealing with a large amount of data, we’ll use the LIMIT clause to return just a few rows:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
SELECT * FROM users LIMIT 15;
 id |          hash_firstname          |          hash_lastname           | gender
----+----------------------------------+----------------------------------+--------
  1 | 6f649335d8f4dc2510fa5a1f7eb49d2e | 3742f76184a441fecb00e7a9ed655cd4 | male
  2 | 36fcde01f52ad3ed359c0fa89cf07664 | 24660b13820dedaae22ccb4d71422b99 | male
  3 | f68a8fdbeee077491d0015fd37c78fcf | daa6f64b3eb60a9b6494c39a666c7c71 | male
  4 | 166fbca935e7175880b8e34440487c3c | 682027f66518552ab04f1105e2a0eae7 | female
  5 | 4972fcafc6ae949646762d52184f900d | 666becc1d8f50993d5d4dc435e8bc534 | male
  6 | 86631b9a3cc4eae00619acc0ed2d781e | 094d7867fbe572c0cbed3b5996262fe6 | female
  7 | c60c876550035c5f36045b765ad64ef6 | 48e731262757b8176ee2d593eedd94d3 | male
  8 | daf7c30dba0679f0859e6358c9a2da53 | e44e73102cdc3059a3e6edc483faad2e | male
  9 | 1b657d7e4c37c08b5c72572fdeaf0a9c | b5a48a1c44b9247cc3d9d4f14a4efdc4 | male
 10 | 6e26230bb9df62a8e44797a0cc37ea05 | 942977ab1fd00a9ea665668d13cf9a7d | female
 11 | 0533a975b5bfda8149fec58e213fd796 | 30ca1aa2cdbaa440897c4bd2e68ab5c3 | male
 12 | f5d1e6d98831f50cc80f115c6baebf76 | 857404ddaf714e7284f7a441d91a225f | female
 13 | f1233af1bd103ac3c23331681a5d425d | dcf43368115c0786aee36ffad469dae7 | female
 14 | d82253175a06100a842f8db37bf7fea8 | a2e8f9d83fb88e3490b6d76d330e6b5b | male
 15 | ddd01e9d271fa4a1e063bf55db0b537c | 511c5e95fe3ee8cfca797a30131305cd | male
(15 ROWS)

Conclusion

When you need some PostgreSQL test data but don’t want to waste time manually inserting records, it’s easy to create a dataset using randomly-generated values. In this article, we walked you through the process of creating a test dataset in PostgreSQL. With our examples to guide you, you’ll be able to generate your own dataset to use in testing and development.

Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started

Keep in the know!

Subscribe to our emails and we’ll let you know what’s going on at ObjectRocket. We hate spam and make it easy to unsubscribe.