How to Parse the Elasticsearch Query Data Returned by the Explain API In Python
Introduction
The Explain API in the Elastic Stack (ELK) provides detailed information about how Elasticsearch scores a query. It computes a score “explanation” for a query against one particular document, which shows whether or not that document matches the query and why. This tutorial covers how to call the Explain API from Python with the Elasticsearch low-level client and how to parse the information it returns.
Prerequisites for Parsing Elasticsearch Query Data with Explain API in Python
- Confirm the Elasticsearch cluster is running on the server by executing the following cURL request in a terminal window:
    curl -XGET localhost:9200
The response should be a JSON object that includes the cluster name and version details.
- The Elasticsearch client must be installed on the machine that makes the API requests to the cluster. Use the pip3 package manager for Python 3 to install the low-level client:

    pip3 install elasticsearch
- The Elasticsearch index must contain at least one document to query and to pass to the Elasticsearch.explain() method call. The following JSON shows the document used in this tutorial for the example query:

    {
        "_index" : "some_index",
        "_type" : "_doc",
        "_id" : "12345",
        "_score" : 1.0,
        "_source" : {
            "string field" : "Object Rocket articles",
            "integer field" : 42,
            "boolean field" : false,
            "timestamp" : "2019-08-10 19:57:39.195582"
        }
    }
How to Import the Elasticsearch Client and Exceptions Libraries
Import Python’s built-in json library. It allows the script to format the JSON response returned by Elasticsearch so that it is easier to read:

    # use the JSON library to prettify Elasticsearch JSON responses
    import json
Now import the Elasticsearch low-level client as well as the client’s exceptions library, which will catch any API errors returned by Elasticsearch:

    # import the Elasticsearch client library
    from elasticsearch import Elasticsearch, exceptions
How to Declare a Client Instance and an Elasticsearch Query Dictionary
This section will explain how to use the Elasticsearch library to create a client instance that can be used to make API calls to the cluster.
Execute the following command to have the Elasticsearch library return a client object for API calls:
    # declare a client instance of the Python Elasticsearch library
    client = Elasticsearch("http://localhost:9200")
How to declare a Python dictionary object for the Elasticsearch query
The explain() method takes the query as a Python dict object that follows the Elasticsearch Query DSL. The following code declares the query object ahead of the method call so that it can be passed in later:

    # declare a query dict object for the explain() method call
    query = {"query": {"match": {"string field": "Object"}}}
    print("query:", query)
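If several fields or search terms need explaining, the query dict can be produced by a small function instead of being typed out each time. The following is a minimal sketch under that assumption; build_match_query is a hypothetical helper name, not part of the Elasticsearch client:

```python
import json


def build_match_query(field, text):
    """Return a query body that runs a full-text match of `text` on `field`."""
    return {"query": {"match": {field: text}}}


# same query as the one declared in this tutorial
query = build_match_query("string field", "Object")

# json.dumps() shows the JSON that the dict serializes to
print(json.dumps(query))
```

Because the result is a plain dict, it can be passed to explain() in exactly the same way as a hand-written query.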
How to Have the explain() Method Return a Result ‘dict’ Object
Call the explain() method on the client instance and pass it the index name as a string, the document’s _id and the query body.
Execute the following command:

    result = client.explain(index="some_index", id=12345, body=query)
How to use the json.dumps() method to indent and pretty-print the Elasticsearch API response
The json library’s dumps() method has an indent parameter that makes the JSON response returned by Elasticsearch easier to read. Execute the following command:

    # print the JSON response from Elasticsearch with indentations
    print(json.dumps(result, indent=4))
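As a minimal sketch of what the indent parameter does, the snippet below formats a hand-written dictionary that stands in for an explain() response. The sample_result values are illustrative assumptions, not output captured from a cluster, and sort_keys is an optional extra that the tutorial itself does not use:

```python
import json

# illustrative stand-in for an explain() response (values are made up)
sample_result = {
    "_index": "some_index",
    "_id": "12345",
    "matched": True,
    "explanation": {
        "value": 0.5,
        "description": "weight(string field:object in 0) [PerFieldSimilarity], result of:",
        "details": [],
    },
}

# indent=4 puts each key on its own line; sort_keys=True orders keys alphabetically
pretty = json.dumps(sample_result, indent=4, sort_keys=True)
print(pretty)

# dumps() converts Python's True into JSON's lowercase true
print('"matched": true' in pretty)
```

The formatted string is only for display. Keep working with the original dict when reading values from the response.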
How to use the Elasticsearch Client’s ‘exceptions’ Library to Catch API Errors
The following example uses the Elasticsearch client’s exceptions library to catch an API error returned by the cluster so that the error object’s info dictionary can be parsed:

    # a request for a missing index or document ID raises a 404 NotFoundError
    try:
        result = client.explain(index="some_index", id=12345, body=query)
    except exceptions.NotFoundError as error:
        # the elasticsearch.exceptions.NotFoundError object has an "info" dict attribute
        print(type(error))
How to parse the error object returned by Elasticsearch
The following code, placed inside the except block, assigns the error object’s info attribute to result so that the result object holds dictionary values regardless of whether or not the API call succeeded:

        # get the dict attr to parse and evaluate
        result = error.info
        print("explain() ERROR:", error)
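The info dictionary can be reduced to a single readable line before it is logged or printed. The sketch below assumes the error body uses "error" and "status" keys, with "type" and "reason" nested under "error". summarize_error_info is a hypothetical helper, and sample_info is hand-written rather than captured from a cluster:

```python
def summarize_error_info(info):
    """Return a short, printable summary of an Elasticsearch error body."""
    if not isinstance(info, dict):
        # some failures may carry a plain string instead of a dict
        return str(info)

    error = info.get("error")
    if isinstance(error, dict):
        # structured error: report its type and human-readable reason
        return "{} ({}): {}".format(
            info.get("status"), error.get("type"), error.get("reason")
        )
    if error is not None:
        # unstructured error: report it as-is alongside the status
        return "{}: {}".format(info.get("status"), error)

    # no "error" key: fall back to whatever the body says about the match
    return "no error details; matched={}".format(info.get("matched"))


# illustrative error body for a missing index (hand-written)
sample_info = {
    "error": {
        "type": "index_not_found_exception",
        "reason": "no such index [some_index]",
    },
    "status": 404,
}
print(summarize_error_info(sample_info))
```

Using get() throughout means the helper returns a summary even when the body has a different shape than expected.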
How to Parse the Python Dictionary Returned by the API Call
When the API call succeeds, the result object has nested values of ["explanation"]["description"] that explain how the document was scored. An error response may not include an "explanation" key, so the code below uses the dict’s get() method to avoid a KeyError:

    # access the "explanation" and "description" keys
    explanation = result.get("explanation", {})
    print("explanation description:", explanation.get("description"))
    # e.g. --> explanation description: weight(string field:object in 0) [PerFieldSimilarity], result of:
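The "explanation" value is itself nested: each node carries "value", "description" and a "details" list of child nodes. The following sketch walks that tree recursively. walk_explanation is a hypothetical helper, and the sample tree uses made-up numbers and shortened descriptions rather than real cluster output:

```python
def walk_explanation(node, depth=0):
    """Yield (depth, value, description) for a node and each nested detail."""
    yield depth, node.get("value"), node.get("description", "")
    for child in node.get("details", []):
        yield from walk_explanation(child, depth + 1)


# illustrative explanation tree (descriptions and numbers are made up)
sample_explanation = {
    "value": 0.5,
    "description": "weight(string field:object in 0) [PerFieldSimilarity], result of:",
    "details": [
        {
            "value": 0.5,
            "description": "score, computed from:",
            "details": [
                {"value": 2.0, "description": "boost", "details": []},
                {"value": 0.5, "description": "idf", "details": []},
                {"value": 0.5, "description": "tf", "details": []},
            ],
        }
    ],
}

for depth, value, description in walk_explanation(sample_explanation):
    # indent each line by its depth to mirror the nesting
    print("{}{} <- {}".format("  " * depth, value, description))
```

A generator keeps the traversal separate from the printing, so the same function can feed a log, a table or a test.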
How to confirm if the query response matched a document on the Elasticsearch index
The response’s "matched" key holds a boolean value, which the Python client returns as True or False, depending on whether or not the query matched the document. Execute the following command:

    # print the "matched" key of the dict object
    print("query match:", result.get("matched"))
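The match flag and the score can be read together in one step. The sketch below assumes the score sits under ["explanation"]["value"]; match_summary is a hypothetical helper, and both sample dicts are hand-written stand-ins:

```python
def match_summary(result):
    """Return (matched, score) from an explain()-style dict, tolerating missing keys."""
    matched = bool(result.get("matched", False))
    # "explanation" may be absent, for example in an error body
    score = result.get("explanation", {}).get("value")
    return matched, score


# hand-written stand-ins for a matching response and an error body
hit = {
    "matched": True,
    "explanation": {"value": 0.5, "description": "sample", "details": []},
}
miss = {"error": {"type": "index_not_found_exception"}, "status": 404}

print(match_summary(hit))
print(match_summary(miss))
```

Returning None for a missing score keeps "no score available" distinct from a score of 0.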
Conclusion
This tutorial covered how to call the Elasticsearch Explain API from Python and how to parse the query data it returns. The article specifically explained how to import the Elasticsearch client and exceptions libraries and how to have the explain() method return a result dict object. The tutorial also covered how to use the json.dumps() method to indent and pretty-print the Elasticsearch API response, how to use the Elasticsearch client’s exceptions library to catch API errors and how to parse the Python dictionary returned by the API call. Remember that an error response may not contain the nested ["explanation"]["description"] values, so check for those keys before reading them.
Just the Code
    #!/usr/bin/env python3
    #-*- coding: utf-8 -*-

    # use the JSON library to prettify Elasticsearch JSON responses
    import json

    # import the Elasticsearch client library
    from elasticsearch import Elasticsearch, exceptions

    # declare a client instance of the Python Elasticsearch library
    client = Elasticsearch("http://localhost:9200")

    # declare a query dict object for the explain() method call
    query = {"query": {"match": {"string field": "Object"}}}
    print("query:", query)

    # a request for a missing index or document ID raises a 404 NotFoundError
    try:
        result = client.explain(index="some_index", id=12345, body=query)
    except exceptions.NotFoundError as error:
        # the elasticsearch.exceptions.NotFoundError object has an "info" dict attribute
        print(type(error))

        # get the dict attr to parse and evaluate
        result = error.info
        print("explain() ERROR:", error)

    # print the JSON response from Elasticsearch with indentations
    print(json.dumps(result, indent=4))

    # access the "explanation" and "description" keys (may be absent on an error)
    explanation = result.get("explanation", {})
    print("explanation description:", explanation.get("description"))
    # e.g. --> explanation description: weight(string field:object in 0) [PerFieldSimilarity], result of:

    # print the "matched" key of the dict object
    print("query match:", result.get("matched"))