Vector database company Pinecone announced Pinecone Serverless, a new architecture and serverless experience intended to cut costs and eliminate infrastructure management so companies can bring better GenAI applications to market faster. Regardless of which LLM they choose, companies can improve the quality of their GenAI applications simply by making more data (or “knowledge”) available to it. Pinecone Serverless includes:

  • Separation of reads, writes, and storage reduces costs for all types and sizes of workloads.
  • Architecture with vector clustering on top of blob storage provides low-latency, fresh vector search over practically unlimited data sizes at a low cost.
  • Indexing and retrieval algorithms built from scratch to enable fast and memory-efficient vector search from blob storage without sacrificing retrieval quality.
  • Multi-tenant compute layer provides efficient, on-demand retrieval for thousands of users. This enables a serverless experience in which developers don’t need to provision, manage, or think about infrastructure, as well as usage-based billing that lets companies pay only for what they use (see the sketch after this list).

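The serverless experience means developers create an index and read or write vectors entirely through the API, with no pods or capacity to provision. Below is a minimal sketch of that workflow using the Pinecone Python client (v3, the version released alongside Serverless); the index name, dimension, region, and vector values are illustrative assumptions rather than details from the announcement.

```python
# Minimal sketch of the serverless workflow with the Pinecone Python client (v3).
# The index name, dimension, region, and vectors below are illustrative assumptions.
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")

# Creating a serverless index requires only a cloud and region;
# there are no pods, replicas, or capacity settings to manage.
pc.create_index(
    name="genai-knowledge",    # hypothetical index name
    dimension=1536,            # must match the embedding model's output size
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-west-2"),
)

index = pc.Index("genai-knowledge")

# Writes and reads are separate operations, billed by usage rather than
# by provisioned capacity.
index.upsert(vectors=[
    {"id": "doc-1", "values": [0.1] * 1536, "metadata": {"source": "faq"}},
])

results = index.query(vector=[0.1] * 1536, top_k=3, include_metadata=True)
print(results)
```
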
Pinecone Serverless launches with integrations for Anthropic, Anyscale, Cohere, Confluent, Langchain, Pulumi, and Vercel. It is available today in public preview in AWS cloud regions, with Azure and GCP availability to follow.

https://www.pinecone.io/blog/serverless/