Use Python To Query Elasticsearch In A Terminal Window

Written by Data Pilot

July 26, 2019

Elasticsearch
Python


Introduction

If you’re using Elasticsearch to store and search your data, it’s important to know that you can query your data from a number of different sources. One way to query documents is to pass an Elasticsearch index name and a query term to a Python script from a terminal window and have Python print the Elasticsearch API response. In this article, we’ll walk through an example of having Python query Elasticsearch from a terminal window.

Prerequisites

Let’s review a few prerequisites that need to be taken care of before we jump into the Python code:

First, the Elasticsearch cluster should be installed and running on the same server or machine running the Python script that will be making the query requests. Elasticsearch typically runs on the default port of 9200, and you can use cURL to get a response from the cluster using the following command:

curl -XGET localhost:9200

You’ll need to make sure PIP for Python 3 is installed and working properly. Use the command pip3 --version to see if PIP is installed.
You’ll need to have access to a UNIX-based terminal for the Elasticsearch cluster’s server or machine. In Linux you can press CTRL+Alt+T to open a new terminal window; otherwise, you can just open a new Finder window in macOS, navigate to the “Utilities” folder and open Terminal.app from there:

Screenshot of a Finder window in macOS showing the Terminal app in the Utilities folder

You’ll also need to install the Python client library for Elasticsearch using pip3 if you haven’t done so already:

pip3 install elasticsearch

You can use the pip3 show elasticsearch command to get additional information on the Elasticsearch library for Python.

Terminal screenshot using cURL and PIP3 to get information on Elasticsearch

Last but not least, you’ll need to have a few documents stored in an Elasticsearch index that you can query. Make sure the index name is spelled correctly when you call the Python script and pass it as an argument.
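The cURL check from the prerequisites can also be done from Python itself. The sketch below uses only the standard library; the function name fetch_cluster_info is our own invention, and it assumes the cluster answers plain HTTP on localhost:9200 without authentication, as in the rest of this article:

```python
import http.client
import json
import urllib.request


def fetch_cluster_info(url="http://localhost:9200", timeout=2.0):
    """Return the cluster's root JSON document as a dict, or None on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers refused connections, timeouts, and HTTP error codes;
        # ValueError covers a response body that is not valid JSON
        return None


if __name__ == "__main__":
    info = fetch_cluster_info()
    if info is None:
        print("No Elasticsearch response on localhost:9200")
    else:
        print("Cluster version:", info.get("version", {}).get("number"))
```

If the function returns None, it’s worth fixing the connection before going any further, since every later step depends on it.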

Create a Python script that will make requests to Elasticsearch

Once all the prerequisites are in place, we can start working on our Python script. In a terminal window, navigate to your Elasticsearch project folder or somewhere on the main node or server that runs the cluster, and create a new Python script that will be used to make Elasticsearch requests. Our script will be named query.py:

touch query.py

Import the packages needed to make requests to the Elasticsearch cluster

You can write your script in an IDE that supports Python syntax, or you can use a terminal-based editor like nano or vim. The first code we’ll add to our script is used to import all the packages we’ll need:

#!/usr/bin/env python3
#-*- coding: utf-8 -*-

# use sys for system arguments passed to script
import sys, json

# import the Elasticsearch client library
from elasticsearch import Elasticsearch

The standard-library sys and json modules, along with Elastic’s elasticsearch client library, are all we need in our script to query Elasticsearch.

Instantiate an Elasticsearch client in the Python script

Next, let’s have the Elasticsearch class return a client instance. When you do this, don’t forget to pass the appropriate domain name and port in the URL string. With the 7.x client library you can call Elasticsearch() with no arguments if your cluster is on the default localhost:9200, but 8.x releases of the library require the host to be passed explicitly:

# create a client instance of Elasticsearch
client = Elasticsearch("http://localhost:9200")
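If your cluster is not on the default host and port, a small helper can keep the URL string in one place. This is only a sketch: build_es_url and connect are hypothetical names of our own, and connect assumes the elasticsearch library is installed and that the client’s ping() method is enough of a health check for your needs:

```python
def build_es_url(host="localhost", port=9200, scheme="http"):
    """Assemble the URL string passed to the Elasticsearch constructor."""
    if not 0 < int(port) < 65536:
        raise ValueError("port out of range: %r" % (port,))
    return "%s://%s:%d" % (scheme, host, int(port))


def connect(url):
    """Create a client and report whether the cluster answers a ping."""
    # imported here so the URL helper works without the library installed
    from elasticsearch import Elasticsearch

    client = Elasticsearch(url)
    return client, client.ping()
```

For example, connect(build_es_url()) targets http://localhost:9200 and returns the client along with True or False, depending on whether the ping succeeded.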

Get the Elasticsearch index name from one of the system arguments passed

When we call our Python script, we’ll have to pass some arguments to it so that Elasticsearch will know what to query and what index to query. Each argument should be delimited by a space.

In this example, after the name of the script itself, we’ll pass the name of the Elasticsearch index that we want to query. Later in the script, you’ll see that we also pass two additional arguments that represent the field name and the query itself:

# second arg [1] is for the Elasticsearch index's name
try:
    INDEX_NAME = sys.argv[1]
except IndexError as err:
    print("IndexError - Script needs the Elasticsearch index name:", err)
    # exit with a non-zero status so the shell can detect the failure
    sys.exit(1)

Declare a Python function that will make a query request to Elasticsearch

In the function shown below, we check to make sure the specified index exists at the beginning because the query won’t work if the index doesn’t exist. This function needs to have a filter Python dictionary passed to it as a parameter:

def make_query(filter):
    index_exists = client.indices.exists(index=INDEX_NAME)

    # check if the index exists
    if index_exists:
        print("INDEX_NAME:", INDEX_NAME, "exists.")
        print("FILTER:", filter, "\n")

Make an API query request to Elasticsearch in a try-except block

Next, we’ll call the Elasticsearch client’s search() method in a try-except block. We’ll pass the INDEX_NAME variable as the index parameter and the filter query dictionary as the body parameter:

        try:
            # use JSON for literal string arguments
            #filter = json.loads(filter)

            # pass filter query to the client's search() method
            response = client.search(index=INDEX_NAME, body=filter)
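One caveat about the call above: the 8.x releases of the Python client deprecate the body parameter in favor of top-level keyword arguments such as query. The sketch below shows what that style might look like; search_match is a hypothetical helper of our own, and it assumes a client version that accepts the query keyword:

```python
def search_match(client, index_name, field_name, value):
    """Run a match query using the keyword-argument call style."""
    # the "query" wrapper key from the body dict becomes the keyword itself
    return client.search(
        index=index_name,
        query={"match": {field_name: value}},
    )
```

With this helper, search_match(client, INDEX_NAME, "String Field", "object") would send the same match query as the body-based call in our script.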

Iterate the API response returned by Elasticsearch

The function will iterate over the response object that’s returned by the Elasticsearch cluster, and it will print the document "hits" that match the query:

            # print the query response
            # (len() here counts the keys of response["hits"], not the documents)
            print('response["hits"]:', len(response["hits"]))
            print('response TYPE:', type(response))

            # iterate the response hits
            print("\nDocument %d hits:" % response['hits']['total']['value'])
            for hit in response['hits']['hits']:
                print(hit["_source"])
        except Exception as err:
            print("search(index) ERROR", err)
            response = {"error": err}
    # return an empty dict if index doesn't exist
    else:
        response = {}

    return response
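To make the shape of the response easier to see, here is a small standalone sketch that pulls the total and the _source documents out of a response. The summarize_hits name and the sample dictionary are hypothetical, trimmed to the keys our script reads, and the code assumes the response behaves like a plain Python dict:

```python
def summarize_hits(response):
    """Return (total, sources) from a search response dict."""
    hits = response.get("hits", {})
    total = hits.get("total", 0)
    # Elasticsearch 7+ reports the total as {"value": N, "relation": ...};
    # earlier versions used a plain integer
    if isinstance(total, dict):
        total = total.get("value", 0)
    sources = [hit.get("_source", {}) for hit in hits.get("hits", [])]
    return total, sources


# hypothetical response, trimmed to the keys used above
sample = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": 0.5,
        "hits": [
            {"_id": "1", "_source": {"String Field": "Python & Object Rocket"}},
            {"_id": "2", "_source": {"String Field": "Object Rocket tech tutorials"}},
        ],
    }
}

total, sources = summarize_hits(sample)
print("Document %d hits:" % total)
for source in sources:
    print(source)
```

Note that len(sample["hits"]) is 3 here because the "hits" object has three keys (total, max_score, and hits). That is the number our script prints next to response["hits"], so it should not be read as the number of matching documents.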

Declare the main() function and check the query arguments passed in the terminal window

The next step is to declare a main() function, which will run when the script is executed from the terminal. We’ll get the sys.argv list of arguments passed in the terminal, then call the list’s pop() method with an index of 0. We do this to remove the Python script name, which is always the first item in the list, because we won’t be needing it:

def main():

    # declare variable for system arguments list
    sys_args = sys.argv

    # remove Python script name from args list
    sys_args.pop(0)

Make sure that there are exactly three arguments passed

In order for our Elasticsearch query to work, there need to be exactly three arguments after the script name when the script is run:

    # quit the script if there are not exactly 3 arguments
    if len(sys_args) != 3:
        print("Three arguments needed. You provided:", sys_args)
        print("First argument is index, and next two are the field and query.")
        # exit with a non-zero status so the shell can detect the failure
        sys.exit(1)
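As an alternative to counting the items in sys.argv by hand, the standard library’s argparse module can enforce the three positional arguments and print a usage message for us. The following is only a sketch of that approach, not the method our script uses, and the argument names index, field, and query are our own choice:

```python
import argparse


def parse_args(argv=None):
    """Parse the index name, field name, and query string."""
    parser = argparse.ArgumentParser(
        description="Run a match query against an Elasticsearch index."
    )
    parser.add_argument("index", help="name of the Elasticsearch index")
    parser.add_argument("field", help="document field to search")
    parser.add_argument("query", help="text to match in that field")
    # argv=None makes argparse read sys.argv[1:], leaving sys.argv untouched
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args(["some_index", "String Field", "object"])
    print(args.index, args.field, args.query)
```

When the wrong number of arguments is passed, argparse prints the usage text and exits with a non-zero status, so the manual length check is no longer needed.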

Pass a filter dict to the make_query() function

If we verify that there were three system arguments passed to the script via the terminal command, then we proceed to create a new filter Python dictionary object from the argument values:

    else:
        # get the field name and query value from sys args
        field_name = sys_args[1]
        value = sys_args[2]

        # pass the field name and value args to filter dict
        filter = {
            'query': {
                'match': {
                    field_name: value
                }
            }
        }

Then, we pass the filter dict to the make_query() function that was declared earlier outside of the main() function:

        # pass the filter dict to the make_query() function
        resp = make_query(filter)
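If you’d like to inspect the request body without contacting the cluster, the dictionary can be built in a small function and printed as JSON. This is a minimal sketch; build_match_query is a hypothetical helper name, and the field and value are the sample ones used in this article:

```python
import json


def build_match_query(field_name, value):
    """Return the request body for a match query on one field."""
    return {"query": {"match": {field_name: value}}}


body = build_match_query("String Field", "object")
# print the same JSON that the client sends to Elasticsearch
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into Kibana’s Console after a GET some_index/_search line to run the same query by hand.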

Have the Python interpreter call the main() function when the script is run

The final step in creating our script is to instruct the Python interpreter to call the main() function when the file is run as a script:

if __name__ == "__main__":
    main()

Conclusion

Now that our script is complete, all we need to do is run it. To do this, open a terminal window and use the python3 command to run the Python script, passing three arguments delimited by spaces: the Elasticsearch index name, the field being queried, and the partial match query string. Make sure to enclose an argument in quotation marks if it is made up of more than one word.

# python3 script_name.py "Index Name" "Field Name" "search for this"
python3 query.py some_index "String Field" object

Assuming that there are no errors and there are matching documents in Elasticsearch, the output should look something like this:

INDEX_NAME: some_index exists.
FILTER: {'query': {'match': {'String Field': 'object'}}}

response["hits"]: 3
response TYPE: <class 'dict'>

Document 2 hits:
{'String Field': 'Python & Object Rocket', ' Integer Field': ' 31415926535', ' Boolean Field': ' FALSE'}
{'String Field': 'Object Rocket tech tutorials', ' Integer Field': ' 16180339', ' Boolean Field': ' FALSE'}

Make Elasticsearch queries by passing arguments to a Python script in a terminal prompt

Terminal screenshot passing arguments to a Python script to make an Elasticsearch query

Use Kibana to search all of the documents in an Elasticsearch index

If you’re running the Kibana service on the cluster, you can open Kibana’s Console UI and verify the documents in the some_index index. Just navigate to the Dev Tools section and make the following request, which searches the index without a query and returns its documents (ten at a time by default):

GET some_index/_search

Screenshot of Kibana getting all the documents in an Elasticsearch index

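Notice that the search query is not case-sensitive, nor is it an exact match: a query of just “object” returned the two documents shown in the screenshot. Assuming the field uses the default text mapping, this is because a match query analyzes both the query string and the field’s text, and the default standard analyzer lowercases the text and splits it into individual words.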
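To get a feel for why “object” matches both documents, here is a rough pure-Python imitation of that behavior. It is only an approximation for illustration, not Elasticsearch’s actual analyzer, and the rough_tokens and rough_match names are our own:

```python
import re


def rough_tokens(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def rough_match(query, field_value):
    """True if any query token also appears among the field's tokens."""
    return bool(set(rough_tokens(query)) & set(rough_tokens(field_value)))


print(rough_tokens("Python & Object Rocket"))  # ['python', 'object', 'rocket']
print(rough_match("object", "Object Rocket tech tutorials"))  # True
print(rough_match("object", "objects"))  # False
```

The last line shows the limit of this kind of matching: whole words are compared after lowercasing, so “object” does not match “objects” in this simplified model.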

Just the Code

We’ve looked at quite a bit of Python code throughout this tutorial. Shown below is the example script in its entirety:

#!/usr/bin/env python3
#-*- coding: utf-8 -*-

# use sys for system arguments passed to script
import sys, json

# import the Elasticsearch client library
from elasticsearch import Elasticsearch

# create a client instance of Elasticsearch
client = Elasticsearch("http://localhost:9200")

# The script name is the first arg [0], but the second arg [1]
# can be used for the Elasticsearch index's name
try:
    INDEX_NAME = sys.argv[1]
except IndexError as err:
    print("IndexError - Script needs the Elasticsearch index name:", err)
    # exit with a non-zero status so the shell can detect the failure
    sys.exit(1)

def make_query(filter):
    index_exists = client.indices.exists(index=INDEX_NAME)

    # check if the index exists
    if index_exists:
        print("INDEX_NAME:", INDEX_NAME, "exists.")
        print("FILTER:", filter, "\n")

        try:
            # use JSON for literal string arguments
            #filter = json.loads(filter)

            # pass filter query to the client's search() method
            response = client.search(index=INDEX_NAME, body=filter)

            # print the query response
            # (len() here counts the keys of response["hits"], not the documents)
            print('response["hits"]:', len(response["hits"]))
            print('response TYPE:', type(response))

            # iterate the response hits
            print("\nDocument %d hits:" % response['hits']['total']['value'])
            for hit in response['hits']['hits']:
                print(hit["_source"])
        except Exception as err:
            print("search(index) ERROR", err)
            response = {"error": err}
    # return an empty dict if index doesn't exist
    else:
        response = {}

    return response

def main():

    # declare variable for system arguments list
    sys_args = sys.argv

    # remove Python script name from args list
    sys_args.pop(0)

    # quit the script if there are not exactly 3 arguments
    if len(sys_args) != 3:
        print("Three arguments needed. You provided:", sys_args)
        print("First argument is index, and next two are the field and query.")
        sys.exit(1)

    else:
        # get the field name and query value from sys args
        field_name = sys_args[1]
        value = sys_args[2]

        # pass the field name and value args to filter dict
        filter = {
            'query': {
                'match': {
                    field_name: value
                }
            }
        }

        # pass the filter dict to the make_query() function
        resp = make_query(filter)

# have interpreter call the main() func
if __name__ == "__main__":
    main()
