TimescaleDB Data Retention
Introduction
If you’re using TimescaleDB to store time-series data, you know how quickly this type of data can accumulate in your database. New data is constantly being added, and older data becomes less relevant as time goes on. It often makes sense to free up disk space and delete older records; however, using a typical DELETE
statement to remove huge amounts of data can be a slow and costly process. Fortunately, there’s a more efficient way to get the job done. In this article, we’ll take a closer look at TimescaleDB data retention and explain how to create efficient and effective data retention policies.
Prerequisite
In order to follow along with this tutorial, you’ll need to have the following prerequisites in place:
You’ll need to install and configure PostgreSQL 11 on your machine.
You’ll need some basic PostgreSQL knowledge.
You’ll also need to install and set up TimescaleDB.
What Is Data Retention?
Data retention refers to the continued storage of data for a certain period of time as required by an organization. An organization’s specific requirements for data retention will depend on a number of factors:
- state and federal regulations compliance
- the ability to recover critical business data when needed
In order to maximize the benefits of data retention while still optimizing storage on your system, it’s important to develop a well-planned set of guidelines defining what data will be archived and how long it will be stored.
TimescaleDB Automatic Data Retention Policies
The TimescaleDB enterprise edition provides a framework that automates data management processes, including data retention policies. We can perform different operations that involve TimescaleDB data retention policies. It’s possible to create, remove and even alter a policy that automatically triggers the drop_chunk
command, depending on how a specific schedule was defined.
Adding Retention Policies
To add a policy on a TimescaleDB hypertable, we can use the add_drop_chunks_policy
command. The syntax of this command is shown below:
1 | SELECT add_drop_chunks_policy(<table_name>, INTERVAL <time_frame>); |
Let’s look at this command in more detail:
- First, we call the
SELECT
clause, followed by theadd_drop_chunks_policy
command. - We then specify the name of the target table. After that, we specify the
interval
option and define a time frame for when the policy will be executed.
Here’s an example of the add_drop_chunks_policy
command in action:
1 | SELECT add_drop_chunks_policy('sample_table', INTERVAL '36 hours'); |
This statement deletes all the chunks that are older than the given time frame, which is ’36 hours’.
Removing Retention Policies
In the previous example, we learned how to add a data retention policy. Now, let’s discuss how to remove a retention policy using the remove_drop_chunks_policy
command. The syntax of the command looks like this:
1 | SELECT remove_drop_chunks_policy(<table_name>); |
In this command, we call the SELECT
clause, followed by the command remove_drop_chunks_policy
. We then provide the name of the target table.
We can see how the command works in the following example:
1 | SELECT remove_drop_chunks_policy('sample_table'); |
This statement removes any existing retention policy associated with the sample_table
.
TimescaleDB provides a helpful option named if_exists
to use in conjunction with remove_drop_chunks_policy
. When this option is set to true, it prevents TimescaleDB from returning an error if no policy exists for the hypertable in question. The basic syntax looks like this:
1 | SELECT remove_drop_chunks_policy(<table_name>, if_exists => TRUE); |
Notice that we added the if_exists
option and set its value to true
.
Let’s try using this option in an example:
1 | SELECT remove_drop_chunks_policy('sample_table', if_exists => TRUE); |
Any policy connected to the table sample_table
will be deleted if it exists, but no error will be returned if it’s not found.
NOTE Keep in mind that this functionality is only available in the Enterprise Edition. Performing this process with a non-enterprise license will result in the following error:
1 | ERROR: cannot execute an enterprise function with an invalid enterprise license |
Conclusion
Time-series data can accumulate quickly in a database, and it’s important to know how to manage it effectively. In this article, we provided an overview of TimescaleDB data retention policies, and we showed you how to both add and remove retention policies for a table. With these examples and instructions to get you started, you’ll be able to set up data retention policies in your own TimescaleDB database.
Pilot the ObjectRocket Platform Free!
Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.
Get Started