Elastic, the Search AI Company, has made two Jina rerankers available on the Elastic Inference Service (EIS), a GPU-accelerated inference-as-a-service that makes it easy to run fast, high-quality inference without complex setup or hosting. These rerankers bring low-latency, high-precision multilingual reranking to the Elastic ecosystem.

Rerankers improve search quality by reordering results based on semantic relevance, surfacing the most accurate matches for a query. They improve relevance across aggregated, multi-query results without reindexing or pipeline changes, which makes them valuable for hybrid search, RAG, and context-engineering workflows where better context boosts downstream accuracy. The two new Jina reranker models are optimized for different production needs:

Jina Reranker v2 (jina-reranker-v2-base-multilingual)
Built for scalable, agentic workflows.

  • Low-latency inference with strong multilingual performance.
  • Selects the SQL tables and external functions that best match user queries.
  • Scores documents independently to handle arbitrarily large candidate sets (see the sketch after this list).
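
A minimal sketch of how that independent scoring can be used through the Elasticsearch rerank inference API: because each document is scored on its own, a candidate set of any size can be split into batches, scored per batch, and merged into one global ordering. The cluster URL, API key, and the inference endpoint ID `jina-reranker-v2` below are placeholders, not the actual EIS identifiers.

```python
"""Sketch: reranking a large candidate set with Jina Reranker v2 on EIS.

Assumptions: cluster URL, API key, and the endpoint ID "jina-reranker-v2"
are placeholders -- substitute the values configured in your deployment.
"""
import requests

ELASTIC_URL = "https://my-deployment.es.example.com"  # placeholder cluster URL
API_KEY = "..."                                        # placeholder API key
INFERENCE_ID = "jina-reranker-v2"                      # hypothetical endpoint ID
HEADERS = {"Authorization": f"ApiKey {API_KEY}", "Content-Type": "application/json"}


def rerank_batch(query: str, docs: list[str]) -> list[tuple[str, float]]:
    """Score one batch of documents with the Elasticsearch rerank inference API."""
    resp = requests.post(
        f"{ELASTIC_URL}/_inference/rerank/{INFERENCE_ID}",
        headers=HEADERS,
        json={"query": query, "input": docs},
    )
    resp.raise_for_status()
    # Each result references an input document by index and carries its relevance score.
    return [(docs[r["index"]], r["relevance_score"]) for r in resp.json()["rerank"]]


def rerank_all(query: str, docs: list[str], batch_size: int = 32) -> list[tuple[str, float]]:
    """Score an arbitrarily large candidate set batch by batch, then merge globally."""
    scored: list[tuple[str, float]] = []
    for start in range(0, len(docs), batch_size):
        scored.extend(rerank_batch(query, docs[start : start + batch_size]))
    # Independent per-document scores are directly comparable across batches.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```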

Jina Reranker v3 (jina-reranker-v3)
Optimized for high-precision shortlist reranking.

  • Low-latency inference and efficient deployment in production settings.
  • Strong multilingual performance; maintains stable top-k rankings under permutation.
  • Cost-efficient, cross-document reranking: v3 reranks up to 64 documents together in a single inference call, reasoning across the full candidate set to improve ordering when results are similar or overlapping (see the sketch after this list).
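
For shortlist reranking inside a single search request, Elasticsearch's text_similarity_reranker retriever can hand the top hits from a first-stage query to the reranker. The sketch below assumes a hypothetical inference endpoint ID of `jina-reranker-v3`, placeholder index and field names, and sets rank_window_size to 64 to match v3's single-call limit.

```python
"""Sketch: high-precision shortlist reranking with Jina Reranker v3 via the
text_similarity_reranker retriever. Index name, field name, credentials, and
the endpoint ID "jina-reranker-v3" are placeholders for illustration."""
import requests

ELASTIC_URL = "https://my-deployment.es.example.com"  # placeholder cluster URL
API_KEY = "..."                                        # placeholder API key
HEADERS = {"Authorization": f"ApiKey {API_KEY}", "Content-Type": "application/json"}

query_text = "How do I rotate API keys without downtime?"

search_body = {
    "retriever": {
        "text_similarity_reranker": {
            # First-stage retrieval (BM25 here) supplies the candidate shortlist.
            "retriever": {
                "standard": {"query": {"match": {"content": query_text}}}
            },
            "field": "content",                  # document field sent to the reranker
            "inference_id": "jina-reranker-v3",  # hypothetical EIS endpoint ID
            "inference_text": query_text,        # query the reranker scores against
            "rank_window_size": 64,              # shortlist reranked in one call
        }
    },
    "size": 10,
}

resp = requests.post(
    f"{ELASTIC_URL}/my-index/_search", headers=HEADERS, json=search_body
)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["content"][:80])
```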

https://ir.elastic.co/news/news-details/2026/Elastic-Adds-High-Precision-Multilingual-Reranking-to-Elastic-Inference-Service-with-Jina-Models/default.aspx