This week we feature articles by Amber Case and by Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot, & Ross Anderson.
Additional reading comes from Raiza Martin & Steven Johnson, Jerry Liu, Sara Fischer, and Nikhil Simha.
News comes from Adobe, CrafterCMS, and Bloomreach.
👉 It’s time for our annual summer break. Our next issue will be published on September 6. I hope you all have a pleasant, and not-too-hot-or-rainy, August.
All previous issues are available at https://gilbane.com/gilbane-advisor-index
Opinion / Analysis
The curse of recursion: Training on generated data makes models forget
Training on generated data is an iffy proposition at best, yet it is also difficult to avoid, or even detect. This is not an unknown problem, but it is certainly under-appreciated. This paper argues that the problem is unavoidable and explains why. There is a lot of math, but the narrative is well written, making the paper useful even for the math-squeamish. (28 min)
“Model Collapse is a degenerative process affecting generations of learned generative models, where generated data end up polluting the training set of the next generation of models; being trained on polluted data, they then mis-perceive reality.”…
How to identify “Truthy” tech trends
Amber Case provides an entertaining, and at least mostly true, analysis of truthiness in tech. It’s also likely that most of us know multiple people who would benefit from reading it. (8 min)
- “An AI-first notebook, grounded in your own documents” Introducing NotebookLM via Google Labs
- LLM-powered knowledge workers… Data Agents via LlamaIndex blog
- AP strikes news-sharing and tech deal with OpenAI via Axios
- Case study… Chronon — A declarative feature engineering framework via The Airbnb Tech Blog
Content technology news
Adobe Firefly supports prompts in over 100 languages
Users can now generate high-quality images, create text effects, streamline workflows and improve productivity in their language of choice.
CrafterCMS releases version 4.1
The updated CMS includes several new content authoring features for creating content-centric digital experiences, and a switch to OpenSearch.
Bloomreach supports OAuth 2.0 authentication
Businesses can integrate Bloomreach Engagement with third-party applications that require OAuth through webhooks.
The Gilbane Advisor is authored by Frank Gilbane and is ad-free, cost-free, and curated for content, computing, web, data, and digital experience technology and information professionals. We publish recommended articles and content technology news weekly. We do not sell or share personal data.