This week we feature articles from Lauren Hinkel, and from Michihiro Yasunaga, Jure Leskovec, & Percy Liang.
Additional reading comes from Antoine Craske, Petr Korab, and Ben Lorica & Kenn So.
News comes from Crafter, Siteimprove, MongoDB, and Foxit.
Reminder: If you’ve missed any recent issues you can see them here.
Opinion / Analysis
LinkBERT: improving language model training with document link
A challenge with most common LM pretraining strategies is that they model a single document at a time. That is, one would split a text corpus into a list of documents and draw training instances for LMs from each document independently. Treating each document independently may pose limitations because documents often have rich dependencies with each other… Models that train without these dependencies may fail to capture knowledge or facts that are spread across multiple documents.
Michihiro Yasunaga, Jure Leskovec, & Percy Liang describe LinkBERT, a pretraining method that builds a graph of multiple documents with link information to address this limitation. Links to the paper and code are included. (9 min).
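For readers who want a concrete picture of the difference, here is a minimal Python sketch. It is not the authors' code: the toy corpus, the function names, and the "[SEP]" separator string are all assumptions made for illustration. It only contrasts drawing training text from each document on its own with pairing a document with one it links to.

```python
import random

# Hypothetical toy corpus: document id -> (text, ids of documents it links to).
corpus = {
    "doc_a": ("Text of document A.", ["doc_b"]),
    "doc_b": ("Text of document B.", []),
    "doc_c": ("Text of document C.", []),
}


def build_link_graph(corpus):
    """Return an adjacency map: doc id -> linked doc ids that exist in the corpus."""
    return {
        doc_id: [target for target in links if target in corpus]
        for doc_id, (_, links) in corpus.items()
    }


def single_document_instances(corpus):
    """Baseline: every training instance comes from one document only."""
    return [text for text, _ in corpus.values()]


def linked_instances(corpus, graph, rng):
    """Link-aware: place a document and one linked document in the same instance."""
    instances = []
    for doc_id, (text, _) in corpus.items():
        neighbors = graph[doc_id]
        if neighbors:
            other = rng.choice(neighbors)
            # "[SEP]" is an assumed separator, used here only for readability.
            instances.append(text + " [SEP] " + corpus[other][0])
        else:
            # No outgoing link: fall back to the single-document instance.
            instances.append(text)
    return instances


graph = build_link_graph(corpus)
print(single_document_instances(corpus))
print(linked_instances(corpus, graph, random.Random(0)))
```

In the second list, document A appears together with document B, so a model trained on it sees both texts in one context. See the article and paper for how LinkBERT actually selects and uses linked documents.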
Hallucinating to better text translation
Moving from Stanford to MIT and UC San Diego… Lauren Hinkel describes another machine learning enhancement method, this one focused on improving language translation. It uses a transformer that “hallucinates” an image based on the text, and that image is then used for multimodal translation. (4 min).
- Case study: Airbnb’s microservices architecture journey to quality engineering via QE Unit
- Trends and opportunities in Distributed computing for AI: a status report via Gradient Flow
- Necessary skill… Text network analysis: theory and practice via Towards Data Science
Content technology news
CrafterCMS releases version 4.0S
Includes new content management and authoring capabilities that enable composability of all types of content-rich digital experiences.
MongoDB unveils vision for a developer data platform
Providing development teams with a wider set of use cases, servicing more of the data lifecycle, optimizing for modern architectures…
Siteimprove launches Prepublish
Siteimprove Prepublish technology makes it easier for marketing departments to optimize content within their DXP or CMS.
Foxit integrates PDF Editor with Microsoft Teams and Office 365
The Teams and Office 365 integration allows delivery of PDF documents with increased speed and collaboration.
The Gilbane Advisor is curated by Frank Gilbane for content technology, computing, and digital experience professionals. The focus is on strategic technologies. We publish recommended articles and content technology news weekly. We do not sell or share personal data.