How to install and use Curator Python in Elasticsearch
If you have indices stored in Elasticsearch, Python’s Curator module is a tool you will love. Just as a museum curator manages the displays, Curator helps you manage all of your indices. Curator Python for Elasticsearch allows you to create your own scripts to perform various tasks. The Elasticsearch Curator Python API supports Python versions 2.7 and later and is currently compatible with the 5.x Elasticsearch versions.
- While the Python code will function on Windows machines, this article explains how to install and use the Curator module on UNIX-based machines, such as macOS or Linux.
- Python 3 is recommended, as Python 2 is deprecated and reached end of life in January 2020. A list of all Python 3 dependencies for the curator library is available on the repository’s Debian page.
- Be sure you have the most recent stable release of Python 3 on your machine.
- Commands for the pip3 package manager follow this format (ending with a module name):
pip3 install examplemodule
- The Elastic Stack must be installed and running properly on your machine for you to be able to import data to your target cluster. You will need to have an index created on your cluster to place the Python data into.
- Use a private key to establish SSH access to your server.
Install the Curator Module
- Use Python 3’s pip3 package manager to install Curator:
pip3 install elasticsearch-curator
If Curator is already installed on your machine, use pip’s freeze command to check the currently installed version:
It should return something similar to this:
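The same check can also be done from Python itself. The sketch below is a minimal example, assuming Python 3.8 or later (for importlib.metadata) and assuming the package is published on PyPI under the name elasticsearch-curator; the helper name installed_version is hypothetical and not part of Curator.

```python
from importlib import metadata


def installed_version(package="elasticsearch-curator"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this Python environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "Curator is not installed")
```

Because the lookup happens inside the running interpreter, it reports the version for that interpreter's environment, which is useful when several Python installations coexist on one machine.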
Once Curator is installed, use the --help option to view a list of all the commands:
curator --help
The default location of the configuration file is:
~/.curator/curator.yml
The . in front of the directory name denotes that the directory is hidden. If there is no hidden Curator directory in your home (~) directory (use ls -a to list hidden files and folders in a directory), you can create the directory and the YAML configuration file yourself:
sudo touch curator.yml
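The directory and file can also be created from Python. The following is a sketch only: the function name ensure_curator_config is hypothetical, the .curator/curator.yml location is assumed to be Curator's default, and the starter contents (a local host on port 9200, INFO logging) are assumed values that you should adjust for your cluster.

```python
from pathlib import Path

# Assumed minimal starter configuration; adjust hosts and port for your cluster.
STARTER_CONFIG = """\
client:
  hosts:
    - 127.0.0.1
  port: 9200

logging:
  loglevel: INFO
  blacklist: ['elasticsearch', 'urllib3']
"""


def ensure_curator_config(home=None):
    """Create <home>/.curator/curator.yml if it is missing and return its path."""
    home = Path(home) if home else Path.home()
    config_dir = home / ".curator"
    config_dir.mkdir(parents=True, exist_ok=True)
    config_file = config_dir / "curator.yml"
    if not config_file.exists():
        # Never overwrite a configuration that is already in place.
        config_file.write_text(STARTER_CONFIG)
    return config_file
```

Calling ensure_curator_config() with no argument targets your own home directory; passing a different directory is handy for trying the function out without touching your real configuration.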
The most essential fields to configure in curator.yml are the "hosts:" and "port:" fields. Make sure Elasticsearch is running on the domain and port you specify.
You can use a text editor, like nano, to edit the file:
The default configuration file is as follows:
# Leave a key empty if there isn’t a value given. None will be a string,
# not a Python "NoneType"
blacklist: ['elasticsearch', 'urllib3']
The blacklist: field should be one of the following three:
- An empty Python array ([])
- An array of log-handler strings
- A blank space
Refer to Elastic’s documentation on Curator’s configuration file for more information.
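Once the host and port are set in the configuration file, it can help to confirm that something is actually listening at that address before running Curator. The sketch below uses only Python's standard library; the function name is_port_open and the default values (127.0.0.1 and 9200) are assumptions for illustration. It only checks that a TCP connection can be opened, not that the service behind it is Elasticsearch.

```python
import socket


def is_port_open(host="127.0.0.1", port=9200, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    print("reachable" if is_port_open() else "not reachable")
```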
Perform YAML File Actions
- Create a YAML file called test.yml and edit it with nano:
sudo nano test.yml
Contents of test.yml
Find indexes older than 15 days that match a filter pattern
- filtertype: pattern
- filtertype: age
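To make the effect of combining a pattern filter with an age filter concrete, here is a plain-Python illustration of the selection logic. It is not Curator's implementation: the function select_old_indices, the logstash- prefix, and the sample index names are all hypothetical, and it assumes the date follows the prefix directly in the index name.

```python
from datetime import datetime, timedelta


def select_old_indices(names, prefix, days, now, timestring="%Y.%m.%d"):
    """Return names that start with prefix and carry a date older than `days` days."""
    cutoff = now - timedelta(days=days)
    selected = []
    for name in names:
        # Pattern filter: keep only names with the wanted prefix.
        if not name.startswith(prefix):
            continue
        # Age filter: parse the date that follows the prefix.
        try:
            stamp = datetime.strptime(name[len(prefix):], timestring)
        except ValueError:
            continue  # No parsable date, so the index is skipped.
        if stamp < cutoff:
            selected.append(name)
    return selected


names = ["logstash-2019.01.01", "logstash-2019.01.20", "metrics-2019.01.01"]
old = select_old_indices(names, "logstash-", 15, now=datetime(2019, 1, 25))
# old == ['logstash-2019.01.01']
```

Only the first index passes both filters: the second is too recent and the third does not match the prefix.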
Save the file and execute the action with the following command:
Create a new Python script
- Create a Python script in your terminal with the following command:
Edit the Python Script
- Use a terminal editor, like nano, to edit the new script:
When finished, save the file with CTRL + O, then exit nano with CTRL + X.
Import the Curator modules:
- Generate a new client instance of Elasticsearch:
- Create a new instance of an index list using the IndexList class:
- Note that this class has different methods for filtering data for each type of available filter.
- The documentation provides complete list of all the filter methods.
- Filter by age and
index_list.filter_by_age(source='name', direction='older', timestring='%Y.%m.%d', unit='days', unit_count=20)
delete_indices = curator.DeleteIndices(index_list)
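Put together, the steps in this section might look like the sketch below. It assumes the Curator 5.x Python API and the elasticsearch client library are installed; the function name, the localhost:9200 address, and the logstash- prefix are hypothetical placeholders. It performs a dry run only, so nothing is deleted, and error handling (for example, when no indices match) is omitted.

```python
import logging

# Curator reports dry-run results through the logging module.
logging.basicConfig(level=logging.INFO)


def dry_run_delete_old_indices(host="localhost", port=9200,
                               prefix="logstash-", days=20):
    """Log which indices a delete action would remove, without deleting them."""
    # Imported here so this file can be loaded before the libraries are installed.
    import elasticsearch
    import curator

    client = elasticsearch.Elasticsearch([{"host": host, "port": port}])
    index_list = curator.IndexList(client)

    # Narrow the list by name prefix, then by the date in the index name.
    index_list.filter_by_regex(kind="prefix", value=prefix)
    index_list.filter_by_age(source="name", direction="older",
                             timestring="%Y.%m.%d", unit="days",
                             unit_count=days)

    delete_indices = curator.DeleteIndices(index_list)
    delete_indices.do_dry_run()  # Replace with do_action() to really delete.
    return index_list.indices
```

Starting with do_dry_run() is a deliberate choice: it lets you review the affected indices in the log before switching to do_action().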
unit_count is the number of units (days, in this example) used in the age calculation. It is multiplied by the number of seconds in one unit to work out the age threshold.
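The arithmetic behind that threshold can be sketched in a few lines. This is an illustration rather than Curator's own code: the SECONDS_PER_UNIT table and the age_cutoff function are hypothetical names, and only fixed-length units are covered.

```python
# Assumed seconds-per-unit table for fixed-length units.
SECONDS_PER_UNIT = {
    "seconds": 1,
    "minutes": 60,
    "hours": 3600,
    "days": 86400,
    "weeks": 604800,
}


def age_cutoff(now_epoch, unit, unit_count):
    """Return the epoch timestamp that lies unit_count units before now_epoch."""
    return now_epoch - unit_count * SECONDS_PER_UNIT[unit]


# 20 days is 20 * 86400 = 1,728,000 seconds.
cutoff = age_cutoff(2_000_000, "days", 20)
# cutoff == 272000
```

With direction='older', an index whose timestamp falls before this cutoff would be selected.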
In this tutorial you learned how to install, configure and use the Curator module on UNIX-based machines. As Curator is written in Python, it will work with almost all operating systems. Be aware, Curator only requires access to a client node in the Elasticsearch cluster to function, and does not have to be directly installed on the nodes in the cluster. Remember to use Python 3, as you may encounter compatibility issues with Python 2 and outdated libraries.