Elastic, a Search AI Company, made two Jina Rerankers available on Elastic Inference Service (EIS), a GPU-accelerated inference-as-a-service that makes it easy to run fast, high-quality inference without complex setup or hosting. These rerankers bring low-latency, high-precision multilingual reranking to the Elastic ecosystem.
Rerankers improve search quality by reordering results based on semantic relevance, helping surface the most accurate matches for a query. They also improve relevance across aggregated, multi-query result sets without reindexing or pipeline changes, which makes them valuable for hybrid search, RAG, and context-engineering workflows where better context boosts downstream accuracy. The two new Jina reranker models are optimized for different production needs:
Jina Reranker v2 (jina-reranker-v2-base-multilingual)
Built for scalable, agentic workflows.
- Low-latency inference with strong multilingual performance.
- Able to select the SQL tables and external functions that best match user queries.
- Scores documents independently to handle arbitrarily large candidate sets.
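To make the integration concrete, here is a minimal sketch of creating a rerank inference endpoint for this model through Elasticsearch's `_inference` API, using plain HTTP from Python. The cluster URL, API key, the `elastic` service name, and the exact `model_id` string are assumptions for illustration; check the EIS documentation for the identifiers your deployment expects.

```python
import requests

# Assumptions: adjust the cluster URL and API key for your deployment.
ES_URL = "https://localhost:9200"
HEADERS = {
    "Authorization": "ApiKey <YOUR_API_KEY>",
    "Content-Type": "application/json",
}

# Create a rerank inference endpoint backed by EIS.
# The service name "elastic" and the model_id are illustrative; use the
# identifiers listed in the EIS documentation for your environment.
resp = requests.put(
    f"{ES_URL}/_inference/rerank/jina-v2-reranker",
    headers=HEADERS,
    json={
        "service": "elastic",
        "service_settings": {"model_id": "jina-reranker-v2-base-multilingual"},
    },
)
resp.raise_for_status()
print(resp.json())
```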
Jina Reranker v3 (jina-reranker-v3)
Optimized for high-precision shortlist reranking.
- Low-latency inference and efficient deployment in production settings.
- Strong multilingual performance; top-k rankings stay stable regardless of the order in which candidates are supplied.
- Cost-efficient, cross-document reranking: v3 reranks up to 64 documents together in a single inference call, reasoning across the full candidate set to improve ordering when results are similar or overlapping.
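Once an endpoint exists, reranking is a single call with a query and a list of candidate texts. The same call shape applies to either model, with v3 accepting up to 64 documents per request. The sketch below reuses the hypothetical endpoint id from the previous example and assumes the rerank task type's documented response shape (an ordered list of `index`/`relevance_score` entries); verify both against your Elasticsearch version.

```python
import requests

# Same assumed connection details as in the previous sketch.
ES_URL = "https://localhost:9200"
HEADERS = {
    "Authorization": "ApiKey <YOUR_API_KEY>",
    "Content-Type": "application/json",
}

query = "How do I rotate API keys without downtime?"
candidates = [
    "Rotating API keys: create the new key, migrate clients, then revoke the old one.",
    "Our quarterly report highlights revenue growth across all regions.",
    "Zero-downtime deployments with rolling restarts and health checks.",
    # ... up to 64 candidates per call when the endpoint is backed by jina-reranker-v3
]

# Rerank the candidates against the query; the endpoint id matches whichever
# rerank endpoint was created earlier.
resp = requests.post(
    f"{ES_URL}/_inference/rerank/jina-v2-reranker",
    headers=HEADERS,
    json={"query": query, "input": candidates},
)
resp.raise_for_status()

# Each result carries the original candidate index and a relevance score,
# ordered from most to least relevant.
for item in resp.json()["rerank"]:
    print(item["index"], round(item["relevance_score"], 3), candidates[item["index"]][:60])
```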

