Elastic, the Search AI Company, has made two Jina rerankers available on the Elastic Inference Service (EIS), a GPU-accelerated inference-as-a-service that makes it easy to run fast, high-quality inference without complex setup or hosting. These rerankers bring low-latency, high-precision multilingual reranking to the Elastic ecosystem.

Rerankers improve search quality by reordering results based on semantic relevance, surfacing the most accurate matches for a query. They improve relevance across aggregated, multi-query results without reindexing or pipeline changes, which makes them valuable for hybrid search, RAG, and context-engineering workflows where better context boosts downstream accuracy. The two new Jina reranker models are optimized for different production needs:

Jina Reranker v2 (jina-reranker-v2-base-multilingual)
Built for scalable, agentic workflows.

  • Low-latency inference with strong multilingual performance.
  • Selects the SQL tables and external functions that best match user queries.
  • Scores documents independently to handle arbitrarily large candidate sets (see the sketch after this list).
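
A minimal sketch of how that independent scoring can be used through the Elasticsearch rerank inference API: because each document is scored on its own, a candidate set of any size can be split into batches, scored per batch, and merged into one global ordering. The cluster URL, API key, and the inference endpoint ID `jina-reranker-v2` below are placeholders, not the actual EIS identifiers.

```python
"""Sketch: reranking a large candidate set with Jina Reranker v2 on EIS.

Assumptions: cluster URL, API key, and the endpoint ID "jina-reranker-v2"
are placeholders -- substitute the values configured in your deployment.
"""
import requests

ELASTIC_URL = "https://my-deployment.es.example.com"  # placeholder cluster URL
API_KEY = "..."                                        # placeholder API key
INFERENCE_ID = "jina-reranker-v2"                      # hypothetical endpoint ID
HEADERS = {"Authorization": f"ApiKey {API_KEY}", "Content-Type": "application/json"}


def rerank_batch(query: str, docs: list[str]) -> list[tuple[str, float]]:
    """Score one batch of documents with the Elasticsearch rerank inference API."""
    resp = requests.post(
        f"{ELASTIC_URL}/_inference/rerank/{INFERENCE_ID}",
        headers=HEADERS,
        json={"query": query, "input": docs},
    )
    resp.raise_for_status()
    # Each result references an input document by index and carries its relevance score.
    return [(docs[r["index"]], r["relevance_score"]) for r in resp.json()["rerank"]]


def rerank_all(query: str, docs: list[str], batch_size: int = 32) -> list[tuple[str, float]]:
    """Score an arbitrarily large candidate set batch by batch, then merge globally."""
    scored: list[tuple[str, float]] = []
    for start in range(0, len(docs), batch_size):
        scored.extend(rerank_batch(query, docs[start : start + batch_size]))
    # Independent per-document scores are directly comparable across batches.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```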

Jina Reranker v3 (jina-reranker-v3)
Optimized for high-precision shortlist reranking.

  • Low-latency inference and efficient deployment in production settings.
  • Strong multilingual performance; maintains stable top-k rankings under permutation.
  • Cost-efficient, cross-document reranking: v3 reranks up to 64 documents together in a single inference call, reasoning across the full candidate set to improve ordering when results are similar or overlapping (see the sketch after this list).
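
For shortlist reranking inside a single search request, Elasticsearch's text_similarity_reranker retriever can hand the top hits from a first-stage query to the reranker. The sketch below assumes a hypothetical inference endpoint ID of `jina-reranker-v3`, placeholder index and field names, and sets rank_window_size to 64 to match v3's single-call limit.

```python
"""Sketch: high-precision shortlist reranking with Jina Reranker v3 via the
text_similarity_reranker retriever. Index name, field name, credentials, and
the endpoint ID "jina-reranker-v3" are placeholders for illustration."""
import requests

ELASTIC_URL = "https://my-deployment.es.example.com"  # placeholder cluster URL
API_KEY = "..."                                        # placeholder API key
HEADERS = {"Authorization": f"ApiKey {API_KEY}", "Content-Type": "application/json"}

query_text = "How do I rotate API keys without downtime?"

search_body = {
    "retriever": {
        "text_similarity_reranker": {
            # First-stage retrieval (BM25 here) supplies the candidate shortlist.
            "retriever": {
                "standard": {"query": {"match": {"content": query_text}}}
            },
            "field": "content",                  # document field sent to the reranker
            "inference_id": "jina-reranker-v3",  # hypothetical EIS endpoint ID
            "inference_text": query_text,        # query the reranker scores against
            "rank_window_size": 64,              # shortlist reranked in one call
        }
    },
    "size": 10,
}

resp = requests.post(
    f"{ELASTIC_URL}/my-index/_search", headers=HEADERS, json=search_body
)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["content"][:80])
```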

https://ir.elastic.co/news/news-details/2026/Elastic-Adds-High-Precision-Multilingual-Reranking-to-Elastic-Inference-Service-with-Jina-Models/default.aspx