Curated for content, computing, and digital experience professionals

Month: May 2024

Perplexity introduces Perplexity Pages

Snippets from the Perplexity blog…

You’ve used Perplexity to search for answers, explore new topics, and expand your knowledge. Now, it’s time to share what you learned. Meet Perplexity Pages, your new tool for easily transforming research into visually stunning, comprehensive content. Whether you’re crafting in-depth articles, detailed reports, or informative guides, Pages streamlines the process so you can focus on sharing your knowledge with the world.

Pages lets you effortlessly create, organize, and share information. Search any topic, and instantly receive a well-structured, beautifully formatted article. Publish your work to our growing library of user-generated content and share it directly with your audience with a single click. What sets Perplexity Pages apart?

  • Customizable: Tailor the tone of your Page to resonate with your target audience, whether you’re writing for general readers or subject matter experts.
  • Adaptable: Easily modify the structure of your article—add, rearrange, or remove sections to best suit your material and engage your readers.
  • Visual: Elevate your articles with visuals generated by Pages, uploaded from your personal collection, or sourced online.

Pages is rolling out to users now. Log in to your Perplexity account and select “Create a Page” in the library tab.

https://www.perplexity.ai/page/new

Sinequa releases new generative AI assistants

Sinequa announced the availability of Sinequa Assistants: enterprise generative AI assistants that integrate with enterprise content and applications to augment and transform knowledge work. Sinequa’s Neural Search complements GenAI and provides the foundation for Sinequa’s Assistants. Its capabilities go beyond RAG’s conventional search-and-summarize paradigm, intelligently executing complex, multi-step activities, all grounded in facts, to augment the way employees work.

Sinequa’s Assistants leverage all company content and knowledge to generate contextually relevant insights and recommendations. Optimized for scale with three custom-trained small language models (SLMs), Sinequa Assistants help ensure accurate conversational responses on any internal topic, complete with citations and traceability to the original source.

Sinequa Assistants work with any public or private generative LLM, including Cohere, OpenAI, Google Gemini, Microsoft Azure OpenAI, and Mistral. The Sinequa Assistant framework includes ready-to-go Assistants along with tools to define custom Assistant workflows, so customers can use an Assistant out of the box or tailor and manage multiple Assistants from a single platform. These Assistants can be adapted to specific business scenarios and deployed and updated quickly without code or additional infrastructure. Domain-specific assistants for scientists, engineers, lawyers, financial asset managers, and others are available.

https://www.sinequa.com/company/press/sinequa-augments-companies-with-release-of-new-generative-ai-assistants

Siteimprove launches new product features

Siteimprove, a platform that helps brands stand out with accessible, high-performing digital content experiences, launched new capabilities designed to turn large amounts of data into actionable, easy-to-understand insights, increase cross-organizational collaboration, facilitate confident decision-making, and deliver tangible outcomes. Today’s launch includes four initiatives:

  • Top Paths – To identify high-performing content
    • With Top Paths, marketers can understand the impact of their content on conversion metrics to focus on what moves the needle.
  • Visitor Engagement Score – the Digital Certainty Index (DCI) score of engagement
    • 95 percent of website visits are non-converting, but that doesn’t mean they fail to deliver value. With Visitor Engagement Score, marketers can now measure visitor engagement with their content beyond conversions to better understand the full impact of their organizations’ content across the customer journey.
  • No-Code Event Tracking – event configuration without the hassle
    • With No-Code Event Tracking, marketers can now set up events quickly and with full transparency, without technical expertise.
  • Sites Progress – tell a convincing and accurate story of the progress across your entire website
    • With Sites Progress, digital marketing teams can now understand and communicate the progress they are making across all of their sites by consolidating data in a single, easy-to-understand view.

https://www.siteimprove.com/hello/new-and-siteimproved-2024-q2-product-release

Tonic.ai launches secure unstructured data lakehouse for LLMs

Tonic.ai launched a secure data lakehouse for LLMs, Tonic Textual, to enable AI developers to securely leverage unstructured data for retrieval-augmented generation (RAG) systems and large language model (LLM) fine-tuning. Tonic Textual is a data platform designed to eliminate integration and privacy challenges ahead of RAG ingestion or LLM training. Leveraging its expertise in data management and realistic synthesis, Tonic.ai has developed a solution that transforms siloed, messy, and complex unstructured data into protected, AI-ready formats ahead of embedding, fine-tuning, or vector database ingestion. With Tonic Textual, developers can:

  1. Build, schedule, and automate unstructured data pipelines that extract and transform data into a standardized format convenient for embedding, ingesting into a vector database, or pre-training and fine-tuning LLMs. Textual supports TXT, PDF, CSV, TIFF, JPG, PNG, JSON, DOCX and XLSX out-of-the-box.
  2. Detect, classify, and redact sensitive information in unstructured data, and re-seed redactions with synthetic data to maintain the semantic meaning. Textual leverages proprietary named entity recognition (NER) models trained on a diverse data set spanning domains, formats, and contexts to ensure sensitive data is identified and protected.
  3. Enrich your vector database with document metadata and contextual entity tags to improve retrieval speed and context relevance in RAG systems.
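The redact-and-re-seed step in item 2 can be sketched in a few lines. This is an illustrative toy, not Tonic Textual’s API: Textual uses trained NER models, while the sketch below uses two regex patterns and made-up synthetic values simply to show the data flow.

```python
import re

# Toy detect -> redact -> re-seed pipeline. Real NER covers names, addresses,
# account numbers, etc.; here we only catch emails and US-style phone numbers.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

# Hypothetical synthetic stand-ins of the same entity type, so the text keeps
# its semantic shape for downstream embedding or fine-tuning.
SYNTHETIC = {
    "EMAIL": "jane.doe@example.com",
    "PHONE": "555-010-0000",
}

def redact_and_reseed(text: str) -> str:
    """Replace each detected entity with a synthetic value of the same type."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(SYNTHETIC[label], text)
    return text

doc = "Contact Bob at bob.smith@acme.io or 212-555-1234 about the contract."
print(redact_and_reseed(doc))
# -> Contact Bob at jane.doe@example.com or 555-010-0000 about the contract.
```

The point of re-seeding rather than masking is that a vector built from “call 555-010-0000” stays closer in meaning to the original than one built from “call [REDACTED]”.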

https://www.tonic.ai/textual

Gilbane Advisor 5-22-24 — Text + KG embeddings, floppies!

This week we feature articles from Sunila Gollapudi, and Leontien Talboom & Chris Knowles.

Additional reading comes from Heather Hedden, Cassie Kozyrkov, and Jim Clyde Monge.

News comes from Elastic, DataStax, Flatfile, and Foxit & Straker Translations.

Note: We’ll be off next week, back on June 5th.

All previous issues are available at https://gilbane.com/gilbane-advisor-index


Opinion / Analysis

Combine text embeddings and knowledge (graph) embeddings in RAG systems

Sunila Gollapudi provides a good introduction and how-to, suitable for both technical and not-so-technical readers.

“In this article, I am excited to present my experiments combining Text Embeddings and Knowledge (Graph) Embeddings and observations on RAG performance. I will start by explaining the concept of Text and Knowledge Embeddings independently, using simple open frameworks, then, we will see how to use both in RAG applications.” (15 min)
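The core idea, fusing a text embedding with a knowledge-graph (entity) embedding before retrieval, can be sketched in a few lines. This is a toy illustration with made-up vectors; a real system would use a sentence encoder for the text half and a trained KG-embedding model (e.g. TransE) for the entity half.

```python
import numpy as np

def fuse(text_vec: np.ndarray, kg_vec: np.ndarray) -> np.ndarray:
    """Unit-normalize each embedding, then concatenate into one vector."""
    t = text_vec / np.linalg.norm(text_vec)
    k = kg_vec / np.linalg.norm(kg_vec)
    return np.concatenate([t, k])

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 4-dim text embeddings and 2-dim entity embeddings.
query = fuse(np.array([0.9, 0.1, 0.0, 0.2]), np.array([1.0, 0.0]))
doc_a = fuse(np.array([0.8, 0.2, 0.1, 0.1]), np.array([0.9, 0.1]))  # same entity as query
doc_b = fuse(np.array([0.8, 0.2, 0.1, 0.1]), np.array([0.0, 1.0]))  # different entity

# doc_a and doc_b have identical text halves; the KG half breaks the tie.
print(cosine(query, doc_a) > cosine(query, doc_b))
```

Normalizing each half before concatenation keeps either modality from dominating the similarity score; weighting the halves is a natural next knob to tune.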

https://towardsdatascience.com/combine-text-embeddings-and-knowledge-graph-embeddings-in-rag-systems-5e6d7e493925

Raw flux streams and obscure formats: Further work around imaging 5.25-inch floppy disks

I’m sure the subject has some of you shaking your heads for any number of reasons. But for those connected with digital preservation efforts, this case-study/lessons-learned piece from Leontien Talboom & Chris Knowles at Cambridge University could be very helpful. Some of the comments may also be useful. The just-curious may be shocked at the complexity involved. (8 min)

https://digitalpreservation-blog.lib.cam.ac.uk/raw-flux-streams-and-obscure-formats-further-work-around-imaging-5-25-inch-floppy-disks-5a2cf2e5f0d1

More Reading

All Gilbane Advisor issues


Content technology news

DataStax launches new Hyper-Converged Data Platform

Brings OpenSearch and Apache Pulsar to HCD Platform; DataStax Enterprise 6.9 enables self-managed data workloads for GenAI.
https://www.datastax.com/press-release/datastax-launches-new-hyper-converged-data-platform-giving-enterprises-the-complete-modern-data-center-suite-ceeded-for-ai-in-production

Elastic announces Search AI Lake to scale low latency search

Architecture optimized for real-time, low-latency applications including search, retrieval augmented generation (RAG), observability, and security.
https://ir.elastic.co/news/news-details/2024/Elastic-Announces-First-of-its-kind-Search-AI-Lake-to-Scale-Low-Latency-Search/default.aspx

Flatfile unveils new AI-powered data transformation features

Data transformation and data migration capabilities for business users, data analysts, systems integration teams, and enterprise developers.
https://flatfile.com/news/flatfile-unveils-ai-powered-data-transformation/

Foxit partners with Straker Translations

The collaboration adds translation capabilities to Foxit’s eSignature services, enabling users to translate and sign documents in multiple languages.
https://www.foxit.com | https://www.straker.ai

All content technology news


The Gilbane Advisor is authored by Frank Gilbane and is ad-free, cost-free, and curated for content, computing, web, data, and digital experience technology and information professionals. We publish recommended articles and content technology news most Wednesdays. We do not sell or share personal data.

Subscribe | View online | Editorial policy | Privacy policy | Contact

Elastic announced Search AI Lake to scale low latency search

Elastic, a Search AI company, today announced Search AI Lake, a cloud-native architecture optimized for real-time, low-latency applications including search, retrieval augmented generation (RAG), observability, and security. Search AI Lake also powers the new Elastic Cloud Serverless offering. All operations, from monitoring and backup to configuration and sizing, are managed by Elastic; users just bring their data and choose Elasticsearch, Elastic Observability, or Elastic Security on Serverless. Benefits include:

  • Fully decoupling storage and compute enables scalability and reliability using object storage, while dynamic caching supports high throughput, frequent updates, and interactive querying of large data volumes.
  • Multiple enhancements maintain query performance even when the data is safely persisted on object stores.
  • By separating indexing and search at a low level, the platform can automatically scale to meet the needs of a wide range of workloads.
  • Users can leverage a native suite of AI relevance, retrieval, and reranking capabilities, including a native vector database integrated into Lucene, open inference APIs, semantic search, and first- and third-party transformer models, which work with the array of search functionalities.
  • Elasticsearch’s query language, ES|QL, is built in to transform, enrich, and simplify investigations with fast concurrent processing irrespective of data source and structure.
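To give a flavor of that last point, an ES|QL query pipes data through transformation stages; the index and field names below are hypothetical:

```esql
FROM logs-*
| WHERE status_code >= 500
| STATS errors = COUNT(*) BY host.name
| SORT errors DESC
| LIMIT 5
```

Each stage filters, aggregates, or reshapes the previous stage’s output, which is what lets one query language serve search, observability, and security investigations alike.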

https://ir.elastic.co/news/news-details/2024/Elastic-Announces-First-of-its-kind-Search-AI-Lake-to-Scale-Low-Latency-Search/default.aspx

DataStax to launch new Hyper-Converged Data Platform

DataStax announced the upcoming launch of DataStax HCDP (Hyper-Converged Data Platform), in addition to the upcoming release of DataStax Enterprise (DSE) 6.9. Both products enable customers to add generative AI and vector search capabilities to their self-managed, enterprise data workloads. DataStax HCDP is designed for modern data centers and Hyper-Converged Infrastructure (HCI) to support the breadth of data workloads and AI systems. It supports on-premises enterprise data systems built to AI-enable data and is designed for enterprise operators and architects.

The combination of OpenSearch’s Enterprise Search capabilities, with the high-performance vector search capabilities of the DataStax cloud-native, NoSQL Hyper-Converged Database, enables users to speed RAG and knowledge retrieval applications into production.

Hyper-converged streaming (HCS) built with Apache Pulsar is designed to provide data communications for a modern infrastructure. With native support of inline data processing and embedding, HCS brings vector data to the edge, allowing for faster response times and enabling event data for better contextual generative AI experiences.

HCDP provides rapid provisioning and data APIs built around the DataStax one-stop GenAI stack for enterprise retrieval-augmented generation (RAG), and it’s all built on the open-source Apache Cassandra platform.

https://www.datastax.com/press-release/datastax-launches-new-hyper-converged-data-platform-giving-enterprises-the-complete-modern-data-center-suite-ceeded-for-ai-in-production

Foxit partners with Straker Translations

Foxit Software, a provider of PDF products and services, today announced a strategic partnership with Straker Translations, integrating Straker’s AI-powered language translation technology into the Foxit ecosystem. Foxit and Straker’s collaboration provides on-demand, accurate translation capabilities to Foxit’s eSignature services, enabling users to seamlessly translate and sign documents in multiple languages.

Straker’s integration within the Foxit eSignature solution will be valuable for Foxit users across critical sectors such as finance, legal, insurance, tax accounting, healthcare, and biotech, where precision and accessibility in documentation are essential.

This integration ensures that Foxit’s diverse international user base can engage with legal documents in their native language, enhancing understanding and compliance while simplifying the signing process for documents that cross linguistic borders. The addition of Straker’s translation technology not only streamlines the workflow but also enhances legal compliance and reduces the risk of misunderstandings in global transactions.

https://www.foxit.com | https://www.straker.ai


© 2024 The Gilbane Advisor
