Elasticsearch 6 Improvements

Introduction

If you’re running an earlier version of Elasticsearch, you may be wondering whether it’s worth upgrading to Elasticsearch 6. Fortunately, we can help you make the call. In this article, we’ll give you an overview of Elasticsearch 6 improvements and discuss how they impact performance, efficiency and recovery.

Streamlining Upgrades

One downside of a major upgrade is usually having to do a full restart of your cluster; however, this won’t be a problem if you’re upgrading from the latest 5.x version of Elasticsearch (5.6) to Elasticsearch 6. Instead, you’ll be able to perform a rolling upgrade, restarting one node at a time with no downtime for your cluster. The one exception applies if you’re currently using X-Pack Security without SSL/TLS. The 6.0 version of X-Pack Security requires TLS between nodes, so if you haven’t already enabled it, you’ll need to perform a full cluster restart to do so.

Another way that Elastic has made upgrades go more smoothly is through cross-cluster search. Once you upgrade to Elasticsearch 6, the new version will be able to read indices that were created in a 5.x version of the product but not indices that were created in 2.x. Fortunately, that doesn’t mean you’ll have to re-index all of these older indices: you can elect to keep them in a cluster running 5.x and use cross-cluster search to query your 6.x and 5.x clusters in a single request.
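As a rough sketch of what that looks like in practice, the Python snippet below builds the two pieces a cross-cluster search needs: a cluster settings body that registers the older cluster under an alias, and a search target that addresses remote indices as alias:index. The alias "legacy", the seed address and the index names are made up for illustration, and the helper functions are hypothetical. The setting name assumes the 6.0-era search.remote.* form, which later 6.x releases renamed to cluster.remote.*, so check the documentation for your exact version.

```python
import json


def remote_cluster_settings(alias, seeds):
    """Body for PUT _cluster/settings that registers a remote cluster.

    Assumes the 6.0-era `search.remote.<alias>.seeds` setting name;
    later 6.x releases use `cluster.remote.<alias>.seeds` instead.
    """
    return {"persistent": {"search": {"remote": {alias: {"seeds": list(seeds)}}}}}


def search_target(local_indices, remote_indices):
    """Build the index portion of a search URL.

    `remote_indices` maps a cluster alias to index names on that cluster;
    each remote index is addressed as `<alias>:<index>`.
    """
    targets = list(local_indices)
    for alias, names in remote_indices.items():
        targets.extend(f"{alias}:{name}" for name in names)
    return ",".join(targets)


# Hypothetical alias, seed node (transport port) and index names.
settings = remote_cluster_settings("legacy", ["10.0.0.5:9300"])
target = search_target(["logs-2018"], {"legacy": ["logs-2016"]})

print("PUT /_cluster/settings")
print(json.dumps(settings, indent=2))
print(f"GET /{target}/_search")
```

The settings body and the search request would both be sent to the 6.x cluster, which then fans the query out to the 5.x cluster on your behalf.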

Improving Efficiency

Elasticsearch 6 introduces sequence IDs to streamline the process of shard recovery. In previous versions of Elasticsearch, if a node was disconnected from its cluster for some reason, every shard on that node had to be re-synced in a lengthy, costly process; even a rolling node restart was slow to complete. With sequence IDs, a shard can replay only the specific operations it is missing, dramatically speeding up recovery and restarts.
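The idea can be illustrated with a toy model in Python. This is not how Elasticsearch implements recovery internally; it is a simplified sketch that assumes every operation on the primary carries an increasing sequence number and that the replica remembers the highest one it has applied. The function names and the shape of the operation log are invented for the example.

```python
def missing_operations(primary_log, replica_checkpoint):
    """Return the primary's operations that the replica has not applied yet,
    ordered by sequence number."""
    missing = [op for op in primary_log if op["seq_no"] > replica_checkpoint]
    return sorted(missing, key=lambda op: op["seq_no"])


def recover(replica_docs, primary_log, replica_checkpoint):
    """Replay only the missing operations instead of copying the whole shard."""
    for op in missing_operations(primary_log, replica_checkpoint):
        if op["type"] == "index":
            replica_docs[op["id"]] = op["source"]
        elif op["type"] == "delete":
            replica_docs.pop(op["id"], None)
        replica_checkpoint = op["seq_no"]
    return replica_docs, replica_checkpoint


primary_log = [
    {"seq_no": 0, "type": "index", "id": "a", "source": {"v": 1}},
    {"seq_no": 1, "type": "index", "id": "b", "source": {"v": 1}},
    {"seq_no": 2, "type": "index", "id": "a", "source": {"v": 2}},
    {"seq_no": 3, "type": "delete", "id": "b"},
]

# The replica went offline after applying operations 0 and 1.
replica_docs = {"a": {"v": 1}, "b": {"v": 1}}
replayed = missing_operations(primary_log, replica_checkpoint=1)
docs, checkpoint = recover(replica_docs, primary_log, replica_checkpoint=1)

print(len(replayed), docs, checkpoint)
```

In this model the replica catches up by replaying two operations rather than re-copying every document, which is the saving the paragraph above describes.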

Another change that boosts efficiency in a big way is the debut of index sorting. Typically, a large part of the work in each search request goes into sorting the matching documents in order to return the top 10 hits. Index sorting flips this workflow around, having sorting take place at index time instead of search time. With this process in place, a search can stop as soon as it has collected enough hits. Although index sorting clearly offers significant performance wins, it’s not applicable to every situation. When you sort your documents at index time, you’ll need to choose the same sort order that will serve as your primary sort order at search time; fields like name, timestamp and price would work well in this scenario. If you plan to sort on relevance (_score), index sorting isn’t going to make sense; it’s also not the right solution for searches with aggregations, which still need to visit every matching document.
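Here is a minimal sketch of the two request bodies involved, built as plain Python dictionaries. It assumes a date field named timestamp and a mapping type named doc, both chosen only for illustration; index.sort.field and index.sort.order are the index settings that control sort order at index time. The search body sorts on the same field and turns off total hit counting, which the Elasticsearch documentation describes as the condition for terminating a search early; verify the track_total_hits option against the documentation for your version.

```python
import json


def sorted_index_body(field, order="desc", field_type="date", type_name="doc"):
    """Body for PUT /<index> that sorts documents on `field` at index time.

    Index sort settings can only be supplied when the index is created.
    """
    return {
        "settings": {"index": {"sort.field": field, "sort.order": order}},
        "mappings": {type_name: {"properties": {field: {"type": field_type}}}},
    }


def top_hits_query(field, order="desc", size=10):
    """Search body whose sort matches the index sort, so the search can
    stop once `size` hits per segment have been collected."""
    return {"size": size, "sort": [{field: order}], "track_total_hits": False}


create_body = sorted_index_body("timestamp")
search_body = top_hits_query("timestamp")

print(json.dumps(create_body, indent=2))
print(json.dumps(search_body, indent=2))
```

The important design point is that the sort clause in the search has to match the index sort; a search sorted on a different field gets no benefit from the index-time ordering.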

Indexing Changes and Search Features

If you’re migrating to Elasticsearch 6.0, there’s one key adjustment to be aware of in advance: indices can now have only one mapping type. Indices created in version 5.x that have multiple mapping types will still function as expected, but new indices can only be created with a single mapping type. This move is part of a gradual process to eliminate mapping types across the board. While the crew at Elastic understands that bigger changes aren’t always easy on users, their position is that restricting each index to one mapping type avoids the sparse data that results when unrelated types share an index, which should improve storage efficiency and performance.
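One common way to adapt a multi-type 5.x index is to fold its types into a single type and record the old type name in an ordinary keyword field. The Python sketch below shows that idea on a mappings dictionary. The helper name, the type name doc, the discriminator field name type and the sample user and tweet mappings are all assumptions made for this example, and the helper deliberately refuses to merge when two types define the same field differently.

```python
def merge_mapping_types(old_mappings, type_name="doc", type_field="type"):
    """Fold several 5.x-style mapping types into one 6.x-compatible type.

    The former type name is meant to be stored per document in `type_field`.
    Raises ValueError if two types define the same field differently.
    """
    merged = {type_field: {"type": "keyword"}}
    for old_type, mapping in old_mappings.items():
        for field, definition in mapping.get("properties", {}).items():
            if field in merged and merged[field] != definition:
                raise ValueError(
                    f"field {field!r} in type {old_type!r} conflicts with an earlier definition"
                )
            merged[field] = definition
    return {type_name: {"properties": merged}}


# Hypothetical 5.x index with two mapping types.
old_mappings = {
    "user": {"properties": {"name": {"type": "text"}, "email": {"type": "keyword"}}},
    "tweet": {"properties": {"content": {"type": "text"}, "posted": {"type": "date"}}},
}

new_mappings = merge_mapping_types(old_mappings)
print(new_mappings)
```

Documents re-indexed into the new index would then carry "type": "user" or "type": "tweet" as a regular field, and searches would filter on that field instead of on the mapping type. The other option is simply to give each former type its own index.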

In addition to this significant change, there are a few other new features related to search and indexing. Two new field types have been introduced: ip_range and icu_collation_keyword. The ip_range field type enables users to index a range of IP addresses, either IPv4 or IPv6, while the icu_collation_keyword type, which is provided by the ICU analysis plugin, offers support for sorting Elasticsearch documents in a language-specific order.
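To make the two field types more concrete, here is a hedged sketch of a mapping and a sample document, again as Python dictionaries. The field names allowed_ips and name, the type name doc and the German collation settings are illustrative assumptions. The icu_collation_keyword field is only available when the ICU analysis plugin is installed on the cluster.

```python
import json

# Hypothetical mapping using both new field types.
mappings = {
    "doc": {
        "properties": {
            # A range of IP addresses stored as a single field value.
            "allowed_ips": {"type": "ip_range"},
            # Full-text field with a sub-field used only for sorting
            # according to German collation rules.
            "name": {
                "type": "text",
                "fields": {
                    "sort": {
                        "type": "icu_collation_keyword",
                        "index": False,
                        "language": "de",
                        "country": "DE",
                    }
                },
            },
        }
    }
}

# A document supplies the range with lower and upper bounds.
document = {
    "name": "Böhm",
    "allowed_ips": {"gte": "10.0.0.0", "lte": "10.0.0.255"},
}

# Sorting on the sub-field applies the language-specific order.
search_body = {"query": {"match_all": {}}, "sort": ["name.sort"]}

print(json.dumps(mappings, indent=2, ensure_ascii=False))
```

Keeping the collation field as a sub-field of name means the same source value can serve both full-text search and locale-aware sorting.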

The _all field has also been deprecated, and it is disabled by default for indices created in 6.0. This special field acted as a “catch-all”, concatenating the values of all the other fields in a document; this allowed users to easily search all fields at once in a query. Unfortunately, this convenience didn’t come for free: the _all field eats up a lot of storage space. It’s still possible to search all fields in an index by using a query_string or simple_query_string query that targets every field.
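As a small sketch, the Python helper below builds both kinds of query body with a wildcard field pattern so that the query spans every eligible field. The helper name and the search text are made up for the example. In 6.x these queries are documented as falling back to all fields when none is named, so the explicit "*" is there mainly to make the intent visible; confirm the behaviour for your version before relying on it.

```python
import json


def search_all_fields(text, simple=False):
    """Build a search body that queries every eligible field without _all.

    Uses `simple_query_string` when `simple` is True, otherwise `query_string`.
    """
    if simple:
        return {"query": {"simple_query_string": {"query": text, "fields": ["*"]}}}
    return {"query": {"query_string": {"query": text, "default_field": "*"}}}


full_body = search_all_fields("rocket AND launch")
simple_body = search_all_fields("rocket launch", simple=True)

print(json.dumps(full_body, indent=2))
print(json.dumps(simple_body, indent=2))
```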

Conclusion

It’s clear that the release of Elasticsearch 6 brought with it some important changes. A few of the improvements, such as the ability to sort at index time, are tied to the product’s move to the Lucene 7 engine; other changes simply reflect a push toward better performance and overall efficiency. With this overview of Elasticsearch 6 improvements, you can understand the benefits of upgrading and make an informed decision about whether it’s the right time to take the plunge.
