Get the mapping of an Elasticsearch index in Python


Introduction to the _mapping schema for Elasticsearch

The _mapping of an index in Elasticsearch is essentially the schema for its documents. It lays out the index’s fields and their data types, and it serves as the blueprint for how field data is stored and organized when documents are indexed.
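As a rough illustration, a mapping can be pictured as a nested dictionary of field names and their types. The field names below (`title`, `published`, `views`) are hypothetical, and the layout assumes the typeless mapping body used by Elasticsearch 7 and later.

```python
# Hypothetical mapping body for illustration only; the field names are made up.
example_mapping = {
    "properties": {
        "title": {"type": "text"},
        "published": {"type": "date"},
        "views": {"type": "integer"},
    }
}

# Each key under "properties" is a field; its value describes how it is stored.
for field, definition in example_mapping["properties"].items():
    print(field, "->", definition["type"])
```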

Getting the mapping for an Elasticsearch index in Python

This article will show you how to install and run Elasticsearch so that you can import the client library in Python and retrieve and parse the data for an index’s _mapping schema. Make sure you already have an index created (with a mapping schema) before testing out the code in this article.

Getting the mapping of an Elasticsearch index in Python

This article will demonstrate how you can use the low-level Python client for Elasticsearch to return an index’s _mapping schema in the form of a nested Python dictionary, using the indices.get_mapping() method of the client library.

Prerequisites to using the Python low-level client for Elasticsearch

  • Install the Elasticsearch service on your system or server. On an Ubuntu flavor of Linux you can download the latest, stable Debian package and install it with the dpkg -i command.

  • If you’re using macOS you can install the Elasticsearch service using Homebrew. First, tap into the repository with the brew tap elastic/tap command to update the Elasticsearch “formula”, and then brew install elastic/tap/elasticsearch-full to install the latest, default distribution. Use the curl localhost:9200 command to ensure that the cluster is running.

Verify that Python 3 and PIP3 are installed

  • The code in this article was designed with Python 3 in mind, and it hasn’t been tested on Python 2.7. Use the python3 -V command to verify that Python 3 is installed.

  • You’ll need to use Python’s PIP3 package manager to install the Elasticsearch low-level client for Python. Use the pip3 -V command to check if PIP is installed, and you can input pip3 list to see which packages are already installed.

Install the Elasticsearch low-level client using PIP3 for Python

  • To install the Elasticsearch client for Python, use the pip3 command: pip3 install elasticsearch. If you’d like to upgrade an existing installation of the client, add the --upgrade flag: python3 -m pip install --upgrade elasticsearch.

Screenshot of a terminal returning version information for Elasticsearch and Python
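The same version check can also be done from inside Python. This is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); the helper name `installed_version` is hypothetical and not part of the Elasticsearch client.

```python
import sys
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string for a package, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


print("Python version:", sys.version.split()[0])
print("elasticsearch client version:", installed_version("elasticsearch"))
```

A result of `None` for the client suggests that pip3 install elasticsearch has not yet been run for this interpreter.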

Import the Python packages for Elasticsearch

# import the elasticsearch client library
from elasticsearch import Elasticsearch

# import the exceptions library for Elasticsearch
from elasticsearch import exceptions

# import Python's json library to format JSON responses
import json

Connect to the Elasticsearch cluster in Python

try:

    # declare a client instance of the Python Elasticsearch library
    client = Elasticsearch("http://localhost:9200")

    # call the client's info() method to get the cluster's information
    elastic_info = client.info()
    print ("Cluster info:", json.dumps(elastic_info, indent=4 ))

except Exception as err:
    print ("Elasticsearch client ERROR:", err)
    client = None
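The cluster's version matters later on, because the layout of the mapping response differs between major releases. The following sketch pulls the major version out of a response shaped like the one `info()` returns; the helper `major_version` and the trimmed-down `sample_info` dict are hypothetical stand-ins for a live cluster response.

```python
def major_version(info_response):
    """Return the cluster's major version number as an int."""
    number = info_response["version"]["number"]  # a string such as "7.10.2"
    return int(number.split(".")[0])


# Hypothetical, trimmed-down response used in place of a live cluster
sample_info = {"name": "node-1", "version": {"number": "7.10.2"}}
print("major version:", major_version(sample_info))
```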

Get all of the Elasticsearch cluster’s indices using Python

# proceed only if the client connection succeeded
if client is not None:

    # returns a dict whose keys are the names of all the cluster's indices
    all_indices = client.indices.get_alias(index="*")

    # print all the attributes for client.indices
    print (dir(client.indices), "\n")

    # iterate over the index names
    for ind in all_indices:
        """
        DO STUFF WITH INDEX NAMES HERE
        """

Get the mapping for each Elasticsearch index using Python

The next step is to get each index’s _mapping schema (as a Python dict object) so that its attribute data can be parsed and its field names accessed.

Get the mapping schema for the Elasticsearch indices

The client.indices.get_alias() call from earlier retrieves all of the cluster’s indices, including the default Kibana indices (if any). The following code checks whether the string "kibana" appears in each index name so that those indices can be skipped:

        # skip indices with 'kibana' in name
        if "kibana" not in ind.lower():
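The same filtering can be tried without a running cluster. This is a minimal sketch, assuming the response is a dict keyed by index name; the helper `user_indices` and the sample data are hypothetical. Skipping names that begin with a dot is an extra assumption here, based on the convention that system and hidden indices are dot-prefixed.

```python
def user_indices(alias_response):
    """Return sorted index names, skipping Kibana and dot-prefixed indices."""
    return sorted(
        name
        for name in alias_response
        if "kibana" not in name.lower() and not name.startswith(".")
    )


# Hypothetical response shaped like the dict returned by indices.get_alias()
sample_aliases = {
    ".kibana_1": {"aliases": {".kibana": {}}},
    "employees": {"aliases": {}},
    "books": {"aliases": {}},
}
print(user_indices(sample_aliases))
```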

Pass the index name to the client’s ‘indices.get_mapping()’ method to get its schema

            try:
                # print the Elasticsearch index name
                print ("\nindex name:", ind)

                # returns dict object of the index _mapping schema
                raw_data = client.indices.get_mapping( index=ind )
                print ("get_mapping() response type:", type(raw_data))

Get the dict object’s keys to access the mapping’s fields

                # returns dict_keys() obj in Python 3
                mapping_keys = raw_data[ ind ]["mappings"].keys()
                print ("\n_mapping keys():", mapping_keys)

                # get the index's doc type (typed mappings, as returned by Elasticsearch 6.x)
                doc_type = list(mapping_keys)[0]
                print ("doc_type:", doc_type)
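Taking the first key as the document type works for typed mappings, but Elasticsearch 7 and later return typeless mappings by default, where "properties" sits directly under "mappings". The sketch below tolerates both layouts; the helper `get_properties` and the sample responses are hypothetical and only mimic the shape of a get_mapping() response.

```python
def get_properties(mapping_response, index_name):
    """Return the 'properties' dict for an index from either mapping layout."""
    mappings = mapping_response[index_name]["mappings"]
    if "properties" in mappings:
        return mappings["properties"]  # typeless layout (7.x and later)
    # typed layout (6.x): look one level down, under the document type
    for value in mappings.values():
        if isinstance(value, dict) and "properties" in value:
            return value["properties"]
    return {}  # the index has no mapped fields yet


# Hypothetical responses in the typed and typeless layouts
typed = {"books": {"mappings": {"_doc": {"properties": {"title": {"type": "text"}}}}}}
typeless = {"books": {"mappings": {"properties": {"title": {"type": "text"}}}}}

print(get_properties(typed, "books"))
print(get_properties(typeless, "books"))
```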

Access the dict object’s field attributes using the _doc type

                # get the schema by accessing index's _doc type attr
                schema = raw_data[ ind ]["mappings"][ doc_type ]["properties"]
                print (json.dumps(schema, indent=4))
                print ("\n# of fields in _mapping:", len(schema))
                print ("all fields:", list(schema.keys()) )

            except exceptions.NotFoundError as err:
                print ("exceptions.NotFoundError error for", ind, "--", err)
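Counting the top-level keys does not include fields nested inside object fields. As one possible way to parse the schema further, the sketch below walks a "properties" dict recursively and maps dotted field paths to their types. The helper `field_types` and the sample schema are hypothetical.

```python
def field_types(properties, prefix=""):
    """Map dotted field paths to their types, descending into sub-objects."""
    found = {}
    for name, definition in properties.items():
        path = prefix + name
        if "properties" in definition:
            # object or nested field: recurse into its own properties
            found.update(field_types(definition["properties"], path + "."))
        else:
            found[path] = definition.get("type", "object")
    return found


# Hypothetical schema with one object field
sample_schema = {
    "title": {"type": "text"},
    "author": {
        "properties": {
            "name": {"type": "keyword"},
            "born": {"type": "date"},
        }
    },
}
print(field_types(sample_schema))
```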

Get the Elasticsearch index’s mapping template

            # try and see if index has a _mapping template
            try:
                # returns dict object of the index template with this name
                template = client.indices.get_template( name=ind )
                print ("template schema:", json.dumps(template, indent=4))

            except exceptions.NotFoundError as err:
                print ("get_template() error for", ind, "--", err)
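Note that get_template() looks up a template by the template's own name, so passing an index name only succeeds when a template happens to share that name. The sketch below instead assumes a dict of all legacy templates, keyed by template name and each carrying an `index_patterns` list (the layout used since Elasticsearch 6.0), and finds the ones whose patterns match an index name. The helper `matching_templates` and the sample data are hypothetical, and `fnmatchcase` is only an approximation of Elasticsearch's wildcard matching.

```python
from fnmatch import fnmatchcase


def matching_templates(templates, index_name):
    """Return sorted names of templates whose index_patterns match the index."""
    names = []
    for name, body in templates.items():
        patterns = body.get("index_patterns", [])
        if any(fnmatchcase(index_name, pattern) for pattern in patterns):
            names.append(name)
    return sorted(names)


# Hypothetical data shaped like a response listing all legacy templates
sample_templates = {
    "logs_template": {"index_patterns": ["logs-*"], "mappings": {}},
    "books_template": {"index_patterns": ["books*"], "mappings": {}},
}
print(matching_templates(sample_templates, "logs-2020"))
```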
