When to Use the keyword vs. text Datatype in Elasticsearch

Introduction

When you’re working with data in Elasticsearch, it’s important to understand your options for storing and handling string values. Elasticsearch has two core datatypes that can store string data: text and keyword. It’s easy to get these two types confused, but this tutorial will help set the record straight. In this article, we’ll look at some important differences between these types and discuss when to use a keyword vs. a text datatype in Elasticsearch.

Keyword vs Text – Full vs. Partial Matches

The primary difference between the text datatype and the keyword datatype is that text fields are analyzed at the time of indexing, and keyword fields are not. In other words, text fields are broken down into their individual terms at indexing to allow for partial matching, while keyword fields are indexed as is. For example, a text field containing the value “Roosters crow everyday” would have each of its components indexed as a separate term. With the default standard analyzer, those terms are also lowercased: “roosters”, “crow”, and “everyday”. A match query on any of those terms would return the document. If the same string were stored as a keyword type, however, it would not be broken down; only a search for the exact string “Roosters crow everyday”, with the same capitalization, would return it as a result.

Analysis also affects sorting. Because text fields are indexed as individual terms, they can’t be used for sorting or aggregations by default. A keyword field, on the other hand, can be sorted alphabetically in the typical fashion.
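
To see the difference for yourself, you can use the _analyze API, which returns the terms an analyzer would produce for a given string. The two requests below are a minimal sketch, assuming the built-in standard analyzer (the default for text fields) and the built-in keyword analyzer. The keyword analyzer is used here only to mimic how a keyword field treats its value as a single term:

```
POST _analyze
{
  "analyzer": "standard",
  "text": "Roosters crow everyday"
}

POST _analyze
{
  "analyzer": "keyword",
  "text": "Roosters crow everyday"
}
```

The first request should return three lowercased tokens (“roosters”, “crow”, and “everyday”), while the second should return the entire string as one unmodified token.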

Both of these datatypes can prove valuable depending on the situation. Let’s look at some common use cases for each:

Use Cases for text datatype

One useful application of the text datatype is for product descriptions. Imagine a user searching for pajamas: chances are, they’ll simply use “pajamas” as their search term, and you’ll want your results to include every product that has the word “pajamas” somewhere in its description.

Use Cases for keyword datatype

The keyword datatype can come in handy for cases where a user will be querying for exact matches. A good example would be a “state” field. A user will search for “North Carolina” but not for the word “North” by itself. Email addresses are also good candidates for the keyword datatype for similar reasons. Keyword fields are also the better fit when you need to sort, filter, or aggregate on a value.

Creating an Index with text and keyword Datatypes

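Now that we’ve discussed the differences between the text and keyword datatypes, let’s look at some sample code that shows how to create an index containing fields of these types. Our index will be called "demo_index", and it will have two fields: "state" and "product_description". The "state" field will have the keyword datatype, and the "product_description" field will have the text datatype. Note that the request uses the Elasticsearch 6.x mapping syntax, which nests the field definitions under the "_doc" mapping type; mapping types were removed in Elasticsearch 7.0, so on 7.0 and later "properties" sits directly under "mappings":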

PUT demo_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "state": {
          "type": "keyword"
        },
        "product_description": {
          "type": "text"
        }
      }
    }
  }
}
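
With the index in place, a quick way to check the behavior is to index one document and query each field. The requests below are a sketch rather than a definitive test: the document ID and the field values are made up for illustration, and the ?refresh parameter is there only so the document is searchable right away.

```
PUT demo_index/_doc/1?refresh
{
  "state": "North Carolina",
  "product_description": "Soft cotton pajamas with a button-down top"
}

GET demo_index/_search
{
  "query": {
    "match": { "product_description": "pajamas" }
  }
}

GET demo_index/_search
{
  "query": {
    "term": { "state": "North Carolina" }
  }
}

GET demo_index/_search
{
  "query": {
    "term": { "state": "North" }
  }
}
```

The match query on "product_description" should find the document from a single word, because that field was analyzed into individual terms. The first term query on "state" should also find it, since the value matches the stored keyword exactly. The last query should return no hits, because “North” on its own is not the full value of the keyword field.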

The keyword and text datatypes haven’t always been part of Elasticsearch. Originally, Elasticsearch provided just a single string datatype, and users could set an option called "index" to either "analyzed" or "not_analyzed" in their mapping to specify whether a string should be broken down into its individual terms upon indexing or simply indexed as is. However, this construct sometimes led to confusion, as some options available for a string type only made sense for one of the two use cases. For this reason, Elastic introduced the keyword and text datatypes with the release of Elasticsearch 5.0. Backward compatibility with the old string datatype was removed with the release of Elasticsearch 6.0, so it is no longer possible to create an index that uses that datatype.

Conclusion

When you’re creating a new index in Elasticsearch, it’s important to understand your data and choose your datatypes with care. Before creating the mapping for an index, it’s helpful to know how users might be searching for data in a specific field; this is especially true when you’re dealing with string data where partial matching may be needed. With the explanations provided in this tutorial, you’ll know when to use a keyword vs a text datatype in Elasticsearch.
