Elasticsearch Python Index Example

Introduction

Elasticsearch is a fast, distributed search and analytics engine, and Python is a convenient language for integrating it with other systems. This tutorial shows you how to index an Elasticsearch document using the Python client library and walks through a complete Elasticsearch Python index example, from declaring the client to verifying that the document was stored.

If you’re familiar with indexing an Elasticsearch document using Python, you can jump to Just the Code.

Prerequisites

  • Install and run Elasticsearch. To confirm that it is running, open a terminal window and make an HTTP request with curl -XGET "localhost:9200". Alternatively, go to localhost:9200 in the address bar of a browser tab.

  • Install Python for your OS. When finished, enter python --version (or python3 --version) in a terminal window to confirm which Python version is installed. Use Python 3, because Python 2.7 has reached end of life.

  • Install the Elasticsearch client library for Python, for example with pip3 install elasticsearch.
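The version check can also be made from inside Python. The following is a minimal sketch that uses only the standard library; the function name is_python3 is made up for this illustration.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the interpreter's major version is 3 or later."""
    return version_info[0] >= 3


# report the running interpreter's version and whether it is suitable
print("Python version:", sys.version.split()[0])
print("Python 3 or later:", is_python3())
```

Placing a check like this at the top of a script gives a clear message early instead of a syntax error later.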

Check the Elasticsearch cluster for indexes

You don’t need to create an index before you index a document. If the target index doesn’t exist, Elasticsearch creates it dynamically when you index the document.

  • List all Elasticsearch indexes on the cluster with this cURL GET request:
curl -XGET "localhost:9200/_cat/indices"
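The same listing can be requested from Python. The sketch below uses only the standard library and assumes the cluster answers on localhost:9200 over plain HTTP; it asks for JSON output with the format=json query parameter. The helper names and the sample rows are made up for illustration, and fetch_indices() is defined but not called because it needs a running cluster.

```python
import json
from urllib.request import urlopen


def cat_indices_url(host="localhost:9200"):
    """Build the _cat/indices URL, asking for JSON instead of plain text."""
    return "http://" + host + "/_cat/indices?format=json"


def fetch_indices(host="localhost:9200"):
    """Request the index list from a running cluster (needs network access)."""
    with urlopen(cat_indices_url(host)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def index_names(rows):
    """Extract the sorted index names from parsed _cat/indices rows."""
    return sorted(row["index"] for row in rows)


# made-up rows in the shape returned by _cat/indices?format=json
sample_rows = [
    {"health": "yellow", "status": "open", "index": "some_index"},
    {"health": "green", "status": "open", "index": ".kibana_1"},
]
print(index_names(sample_rows))
```

With a cluster running, index_names(fetch_indices()) would return the names of the real indexes.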

Create a Python script for the index API call

  • In a terminal on a UNIX-based system, use the mkdir command to create a project directory on your server.

  • Use the touch command to create the Python script (a .py file).

  • Here are the commands:

mkdir elastic-project

cd elastic-project

touch index_doc.py

NOTE: If you receive permission errors, run the commands with sudo.
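On systems without mkdir and touch, the same setup can be done from Python. This is a minimal sketch using pathlib from the standard library; the function name create_project is hypothetical, and the demonstration runs in a temporary directory so it leaves nothing behind.

```python
import tempfile
from pathlib import Path


def create_project(parent_dir, project_name="elastic-project", script_name="index_doc.py"):
    """Create the project directory and an empty script file inside it."""
    project_dir = Path(parent_dir) / project_name
    project_dir.mkdir(parents=True, exist_ok=True)  # similar to mkdir -p
    script_path = project_dir / script_name
    script_path.touch(exist_ok=True)  # similar to touch
    return script_path


# demonstrate in a throwaway temporary directory
with tempfile.TemporaryDirectory() as tmp:
    script = create_project(tmp)
    print(script.name, "exists:", script.exists())
```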

Edit the script

  • Open the Python script in a text editor. This example uses nano:
nano index_doc.py

Import the Elasticsearch client library

  • Use an import statement to add the Elasticsearch client library to the Python script:

# import the Elasticsearch low-level client library
from elasticsearch import Elasticsearch

Import the datetime library for timestamping

  • The document indexed in this Elasticsearch Python index example includes a timestamp field, so the script needs Python’s datetime library. The timestamp records when the document was created:
# import the datetime lib for an index timestamp
import datetime

Declare a Python client instance for Elasticsearch

  • Declare a client instance. In the hosts parameter, pass the domain name or IP address of your Elasticsearch server along with its port:
# domain name, or server's IP address, goes in the 'hosts' list
# (newer client versions expect a full URL such as "http://localhost:9200")
client = Elasticsearch(hosts=["localhost:9200"])

List the attributes of the Elasticsearch client

  • Use Python’s built-in dir() function to list the attributes and methods of the client instance:
dir(client)
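The list returned by dir() also contains many internal names that start with an underscore. A minimal sketch for filtering them out follows; FakeClient is a hypothetical stand-in so the example runs without a cluster, and with a real client you would pass the client object instead.

```python
def public_attributes(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


class FakeClient:
    """Hypothetical stand-in for the Elasticsearch client."""

    def index(self):
        pass

    def info(self):
        pass

    def _private(self):
        pass


print(public_attributes(FakeClient()))
```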

Get the version of the Elasticsearch cluster

  • Use the info() method of the client object to get information about the cluster the client is connected to, including its version:
# print the Elasticsearch client instance
print(client)

# info() describes the connected cluster; its "version" key holds the version details
print("Elasticsearch client version:", client.info()["version"], "\n")
  • You should see a dict object similar to this one in the output:
Elasticsearch client version: {'number': '7.2.0', 'build_flavor': 'default', 'build_type': 'tar', 'build_hash': '508c38a', 'build_date': '2019-06-20T15:54:18.811730Z', 'build_snapshot': False, 'lucene_version': '8.0.0', 'minimum_wire_compatibility_version': '6.8.0', 'minimum_index_compatibility_version': '6.0.0-beta1'}
  • Besides the index() method, the Elasticsearch client library has an extensive list of methods available. Here’s what the list of attributes looks like:

Screenshot of the Elasticsearch Python client's attributes in IDLE

Declare the Elasticsearch index name and document _id

  • Declare the index name and the document’s _id as global variables so you can pass them to the index() method.
# declare the Elasticsearch index name and document _id
INDEX_NAME = "some_index"
DOC_ID = 1234

Check whether the index exists

  • Use the indices.exists() method to find out whether the Elasticsearch index already exists:
# ask the cluster whether the Elasticsearch index exists
index_exists = client.indices.exists(index=INDEX_NAME)

# report when the Elasticsearch index does not exist yet
if not index_exists:
    print("INDEX_NAME:", INDEX_NAME, "does not exist.")

NOTE: When you make an index API call, Elasticsearch creates the index if it doesn’t already exist.
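The existence check can be wrapped in a small helper that returns a message instead of printing it. This is a sketch under the assumption that the object passed in offers indices.exists(index=...), as the client used in this tutorial does; FakeClient and FakeIndices are hypothetical stand-ins so the example runs without a cluster.

```python
def report_index(client, index_name):
    """Return a message describing whether index_name exists on the cluster."""
    if client.indices.exists(index=index_name):
        return "Index '" + index_name + "' already exists."
    return "Index '" + index_name + "' does not exist; indexing a document will create it."


class FakeIndices:
    """Hypothetical stand-in for client.indices."""

    def __init__(self, names):
        self.names = set(names)

    def exists(self, index):
        return index in self.names


class FakeClient:
    """Hypothetical stand-in for the Elasticsearch client."""

    def __init__(self, names):
        self.indices = FakeIndices(names)


print(report_index(FakeClient(["some_index"]), "some_index"))
print(report_index(FakeClient([]), "some_index"))
```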

Declare the Python dictionary for the Elasticsearch document

  • The index() method of the client takes a Python dict object as the document body. The keys of the Python dictionary correspond to the fields of the Elasticsearch document.

  • This example document shows what the dictionary should look like:

# create a document body for the new Elasticsearch document
doc_body = {
    "str field": "Object Rocket",
    "int field": 1234,
    "bool field": True,
    "time field": datetime.datetime.now()
}
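The dictionary is converted to JSON before it is sent to the cluster, and a datetime value has no native JSON form. The sketch below uses only the standard library and is not the client's own serializer; it shows one way such a value can end up as an ISO 8601 string. The function name json_default is made up for this illustration, and a fixed timestamp replaces datetime.now() so the output is repeatable.

```python
import datetime
import json


def json_default(value):
    """Convert date and datetime values to ISO 8601 strings; reject anything else."""
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.isoformat()
    raise TypeError("Cannot serialize " + type(value).__name__)


doc_body = {
    "str field": "Object Rocket",
    "int field": 1234,
    "bool field": True,
    # fixed value for a repeatable example
    "time field": datetime.datetime(2019, 6, 20, 15, 54, 18),
}

# json.dumps calls json_default for every value it cannot serialize itself
print(json.dumps(doc_body, default=json_default))
```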

How to avoid a mapper_parsing_exception

  • When you index your document, the index() method might raise a RequestError with an HTTP 400 status and a mapper_parsing_exception. To avoid this, make sure the field types in the Python dict match the mapping of the existing index.
elasticsearch.exceptions.RequestError: RequestError(400, 'mapper_parsing_exception', 'failed to parse')
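One way to catch such mismatches before the request is sent is to compare the document against the Python types you expect for each field. This is only a sketch: find_type_mismatches and the expected_types table are hypothetical, and the check does not read the real mapping from the cluster.

```python
def find_type_mismatches(doc, expected_types):
    """Return (field, expected, actual) tuples for fields whose Python type differs."""
    mismatches = []
    for field, expected in expected_types.items():
        if field in doc and not isinstance(doc[field], expected):
            mismatches.append((field, expected.__name__, type(doc[field]).__name__))
    return mismatches


# hypothetical expectations for the example index
expected_types = {"str field": str, "int field": int, "bool field": bool}

good_doc = {"str field": "Object Rocket", "int field": 1234, "bool field": True}
bad_doc = {"str field": "Object Rocket", "int field": "not a number", "bool field": True}

print(find_type_mismatches(good_doc, expected_types))
print(find_type_mismatches(bad_doc, expected_types))
```

An empty list means every checked field has the expected Python type; anything else is worth fixing before calling index().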

Call the index() method of the client object

  • The index() method of the client object returns the Elasticsearch cluster’s API response.

  • This example uses a try-except block to catch exceptions raised by the API call. If the call succeeds, it prints the response to the terminal.

# use a try-except block for any Elasticsearch index() exceptions
try:
    # call the Elasticsearch client's index() method
    response = client.index(
        index=INDEX_NAME,
        # doc_type is deprecated in Elasticsearch 7.x and removed in 8.x
        doc_type="_doc",
        id=DOC_ID,
        body=doc_body,
        request_timeout=45
    )

    # print the API response returned by the Elasticsearch client
    print("response:", response)

except Exception as err:
    print("Elasticsearch index() ERROR:", err)
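If later code depends on the response, it helps to know whether the call failed, because the response variable is never assigned when index() raises. The following sketch wraps the call in a hypothetical helper, safe_index, that returns either the response or the error; FakeClient is a stand-in so the example runs without a cluster, and the keyword arguments mirror the ones used in this tutorial.

```python
def safe_index(client, index_name, doc_id, doc_body):
    """Call client.index() and return (response, error); exactly one is None."""
    try:
        response = client.index(index=index_name, id=doc_id, body=doc_body)
        return response, None
    except Exception as err:
        return None, err


class FakeClient:
    """Hypothetical stand-in for the Elasticsearch client."""

    def __init__(self, fail=False):
        self.fail = fail

    def index(self, index, id, body):
        if self.fail:
            raise RuntimeError("simulated failure")
        return {"_index": index, "_id": str(id), "result": "created"}


print(safe_index(FakeClient(), "some_index", 1234, {"int field": 1234}))
print(safe_index(FakeClient(fail=True), "some_index", 1234, {})[1])
```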

Know what the Elasticsearch API response should look like

  • The Elasticsearch cluster returns a response dict object that you can review to determine whether the index() call worked properly:
response: {'_index': 'some_index', '_type': '_doc', '_id': '1234', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 0, '_primary_term': 1}

Evaluate the Elasticsearch API response returned

  • A successful index call reports at least one successful shard write in the integer value of response["_shards"]["successful"].
# evaluate the response dict object to check that the API call was successful
if response["_shards"]["successful"] >= 1:
    print("INDEX CALL WAS A SUCCESS:", response["_shards"]["successful"])
else:
    print("INDEX CALL FAILED")
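A slightly stricter check can also look at the result field and the failed shard count. The sketch below runs against a copy of the sample response shown in this tutorial; the helper name index_succeeded is made up, and which conditions count as success is a design choice rather than a rule of the API.

```python
def index_succeeded(response):
    """Return True when the response reports a write and no failed shards."""
    shards = response.get("_shards", {})
    return (
        response.get("result") in ("created", "updated")
        and shards.get("successful", 0) >= 1
        and shards.get("failed", 0) == 0
    )


# copy of the sample response from this tutorial
sample_response = {
    "_index": "some_index", "_type": "_doc", "_id": "1234", "_version": 1,
    "result": "created",
    "_shards": {"total": 2, "successful": 1, "failed": 0},
    "_seq_no": 0, "_primary_term": 1,
}

print(index_succeeded(sample_response))
```

Using .get() with defaults keeps the check from raising a KeyError when a field is missing from the response.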

Confirm that the document was indexed with a cURL HTTP request

  • In a terminal window, verify that the document was indexed by making a cURL HTTP request:
curl -XGET "http://localhost:9200/some_index/_doc/1234"
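The same request can be made from Python with the standard library. This sketch assumes the cluster answers on localhost:9200 over plain HTTP; the helper names are hypothetical, and fetch_document() is defined but not called here because it needs a running cluster.

```python
import json
from urllib.error import HTTPError
from urllib.request import urlopen


def document_url(index_name, doc_id, host="localhost:9200"):
    """Build the URL of a single document, matching the cURL request."""
    return "http://{}/{}/_doc/{}".format(host, index_name, doc_id)


def fetch_document(index_name, doc_id, host="localhost:9200"):
    """Return the parsed JSON for the document, or None on an HTTP 404."""
    try:
        with urlopen(document_url(index_name, doc_id, host)) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except HTTPError as err:
        if err.code == 404:
            return None
        raise


print(document_url("some_index", 1234))
```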

Check with Kibana that the document was indexed

  • In a browser, navigate to port 5601 on your server.

  • Next, select Dev Tools.

  • Then make the following GET HTTP request in the Kibana console:

GET some_index/_doc/1234

Screenshot of Kibana Dev Tools Console getting Elasticsearch document by _id

Conclusion

This tutorial explained how to use Python to index an Elasticsearch document. You learned how to declare a client instance, make an index call, check the response for errors, and confirm with cURL and Kibana that the document was indexed.

Just the Code

Here is the complete Elasticsearch Python index example.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# import the Elasticsearch low-level client library
from elasticsearch import Elasticsearch

# import the datetime lib for an index timestamp
import datetime

# domain name, or server's IP address, goes in the 'hosts' list
# (newer client versions expect a full URL such as "http://localhost:9200")
client = Elasticsearch(hosts=["localhost:9200"])

# print the attributes of the Elasticsearch client
print(dir(client))

# print the Elasticsearch client instance
print(client)

# info() describes the connected cluster; its "version" key holds the version details
print("Elasticsearch client version:", client.info()["version"], "\n")

# declare the Elasticsearch index name and document _id
INDEX_NAME = "some_index"
DOC_ID = 1234

# ask the cluster whether the Elasticsearch index exists
index_exists = client.indices.exists(index=INDEX_NAME)

# report when the Elasticsearch index does not exist yet
if not index_exists:
    print("INDEX_NAME:", INDEX_NAME, "does not exist.")

# create a document body for the new Elasticsearch document
doc_body = {
    "str field": "Object Rocket",
    "int field": 1234,
    "bool field": True,
    "time field": datetime.datetime.now()
}

# use a try-except block for any Elasticsearch index() exceptions
try:
    # call the Elasticsearch client's index() method
    response = client.index(
        index=INDEX_NAME,
        # doc_type is deprecated in Elasticsearch 7.x and removed in 8.x
        doc_type="_doc",
        id=DOC_ID,
        body=doc_body,
        request_timeout=45
    )

    # print the API response returned by the Elasticsearch client
    print("response:", response)

    # evaluate the response dict object to check that the API call was successful
    if response["_shards"]["successful"] >= 1:
        print("INDEX CALL WAS A SUCCESS:", response["_shards"]["successful"])
    else:
        print("INDEX CALL FAILED")

except Exception as err:
    print("Elasticsearch index() ERROR:", err)
