OpenSearch 3.0 Now Generally Available, with a Focus on Vector Database Performance and Scalability

The OpenSearch Software Foundation has announced the general availability of OpenSearch 3.0, the first major release in three years and the first since the project joined the Linux Foundation. This version introduces native support for the Model Context Protocol (MCP), along with pull-based data ingestion and gRPC support, aimed at improving scalability and integration.

OpenSearch was launched in 2021 by AWS as a fork of Elasticsearch 7.10, following Elastic’s license change. With performance as a key focus of this release, OpenSearch 3.0 delivers up to 9.5x faster vector search compared to version 1.3, thanks to support for GPU acceleration and more efficient indexing.

OpenSearch 3.0 upgrades to Apache Lucene 10 and introduces enhancements to data ingestion, transport, and management. James McIntyre, senior product marketing manager at AWS, Saurabh Singh, engineering leader at AWS, and Jiaxiang (Peter) Zhu, senior system development engineer at AWS, explain:

The latest version of Apache Lucene offers significant improvements in performance, efficiency, and vector search functionality. These types of improvements pave the way for larger vector and search deployments, enabling AI workloads to scale exponentially over time.

Lucene 10 introduces improvements in both I/O and search parallelism, and requires JVM version 21 or later, which results in some breaking changes and prompted the major version update. Elasticsearch, which reverted to an open source model under the AGPL license last year, recently released version 9.0.0-rc1, which also supports the latest version of Lucene.

The latest OpenSearch release also adds support for gRPC and pull-based ingestion, and introduces reader-writer separation. This allows indexing and search workloads to be configured independently, ensuring consistent, high-performance operation for each. McIntyre, Singh, and Zhu add:

Benefiting from underlying HTTP/2 infrastructure, gRPC supports multiplexing and bidirectional data streams, enabling clients to send and receive requests concurrently over the same TCP connection. Performance gains can be especially pronounced for users working with large and complex queries, where the overhead of deserializing requests can compound when using JSON.
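The serialization point can be illustrated with a rough sketch. The Python snippet below compares the size of a query vector encoded as JSON text with the same vector packed as 32-bit floats. The packed form is only a stand-in for a binary wire format such as the Protocol Buffers encoding commonly used with gRPC; it is not the OpenSearch gRPC API, and the 768-dimensional vector and the function names are assumptions made for illustration.

```python
import json
import struct


def json_payload(vector):
    """Encode a query vector as JSON text, as a REST request body would carry it."""
    return json.dumps({"vector": vector}).encode("utf-8")


def binary_payload(vector):
    """Pack the same vector as little-endian 32-bit floats.

    Assumption: this stands in for a binary wire format; it is not
    actual Protocol Buffers and omits field tags and length prefixes.
    """
    return struct.pack(f"<{len(vector)}f", *vector)


# Hypothetical 768-dimensional embedding; the values are arbitrary.
vector = [i / 768 for i in range(768)]

json_bytes = json_payload(vector)
binary_bytes = binary_payload(vector)

print(f"JSON:   {len(json_bytes)} bytes")
print(f"Binary: {len(binary_bytes)} bytes")
```

The packed form always takes four bytes per dimension, while the JSON form spends many characters on each decimal number and must be parsed back from text on the server. Actual gains depend on the query shape and on the real protocol definitions, which this sketch does not model.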

OpenSearch now also supports index type detection and integrates the dynamic data management framework Apache Calcite, enabling iterative query building and exploration. This is achieved by incorporating the query builder into OpenSearch SQL and PPL. In a popular thread on Hacker News, Joe Johnston writes:

Elastic still has the edge on features. Especially Kibana has a lot more features than Amazon's fork (...) A lot of my consulting clients seem to prefer Opensearch lately. That's mainly because of the less complicated licensing and the AWS support.

Comparing OpenSearch and Elasticsearch, user Macha adds:

One thing that Opensearch misses that would have been very nice to have on a recent project is enrich processors.

OpenSearch is open source under the Apache 2.0 license. More details about the latest release are available in the release notes on GitHub.
