Google introduced “ToTTo: A Controlled Table-To-Text Generation Dataset”, an open-domain table-to-text generation dataset created using a novel annotation process (via sentence revision), along with a controlled text generation task that can be used to assess model hallucination. ToTTo (shorthand for “Table-To-Text”) consists of 121,000 training examples, along with 7,500 examples each for development and test. Due to the accuracy of its annotations, this dataset is suitable as a challenging benchmark for research in high-precision text generation. The dataset and code are open-sourced on GitHub.
In the last few years, research in natural language generation, used for tasks like text summarization, has made tremendous progress. Yet, despite achieving high levels of fluency, neural systems can still be prone to hallucination (i.e., generating text that is understandable, but not faithful to the source), which can prevent these systems from being used in many applications that require high degrees of accuracy.
While assessing the faithfulness of generated text to the source content can be challenging, it is often easier when the source content is structured (e.g., in tabular format). Moreover, structured data can also test a model’s ability for reasoning and numerical inference. However, existing large-scale structured datasets are often noisy (i.e., the reference sentence cannot be fully inferred from the tabular data), making them unreliable for the measurement of hallucination in model development.
AI inside Inc. announced a new service that adds multilingual support to its AI-OCR product “DX Suite”, through the release of a new AI engine that can recognize English, Traditional Chinese, Thai, and Vietnamese characters. With this release, AI inside will begin its global expansion, starting with the Asian market. By providing “DX Suite”, AI inside has helped improve the operational efficiency and productivity of companies and municipalities in Japan with high-accuracy recognition of both printed and handwritten characters. AI inside has also been developing a foreign-language recognition AI engine in order to expand the availability of “DX Suite” beyond Japan to other countries around the world. This foreign-language AI engine has now achieved commercially viable accuracy in recognizing English, Traditional Chinese, Thai, and Vietnamese characters, and the company is now making the multilingual service available on the cloud version of “DX Suite”. Current users of the “DX Suite” cloud version can use the multilingual service without any additional registration.
Cloudflare, Inc. announced the release of Cloudflare Pages, a new website development platform. Cloudflare Pages is JAMstack-compatible and offers security, scalability, pricing, and performance. Cloudflare Pages provides developers a simpler, faster, and more collaborative way to build websites for free. Performance on the web has always been a battle against the speed of light—accessing a site from London that is served from Seattle, WA means every single asset request has to travel over seven thousand miles. Cloudflare Pages helps with the web performance battle, building entire sites directly onto the edge of the Internet, and closer to the end-users. With Cloudflare Pages, developers can focus on building brilliant websites rather than spending time on integrating disparate systems. Pages integrates seamlessly with GitHub to simplify the development process, to collect and integrate feedback from multiple stakeholders, and to deploy those changes quickly to the edge.
Triton Digital announced they have expanded the multilingual capabilities of the Omny Studio podcast management platform to six languages. In addition to English, French, Spanish, and Portuguese, the platform is now available in German and Italian. In addition to a multilingual CMS, the Omny Studio platform also supports the translation of embed players that match users’ browser language, which includes both German and Italian.
Microsoft announced a number of updates for Macs and new versions of Microsoft 365 for Mac apps that run natively on Macs with M1. The Office apps (Outlook, Word, Excel, PowerPoint, and OneNote) will take full advantage of the performance improvements on new Macs. The new apps are Universal, so they will continue to run on Macs with Intel processors, and they have been redesigned to match the new look of macOS Big Sur. Microsoft Teams is currently available in Rosetta emulation mode on Macs with M1 and in the browser. Microsoft says it is working on universal app support for M1 Macs and will share more news as the work progresses.
The new Outlook for Mac is redesigned to match the look of macOS Big Sur, and there is an updated Office Start experience for Word, Excel, PowerPoint, and OneNote for Mac that incorporates the Fluent UI design system. There is now support for iCloud accounts in the new Outlook for Mac. Other office productivity tools include natural language search, data extraction from photos to Excel, voice command additions, additional synchronization and sharing tools, a new modern commenting experience in Word for Mac, and Microsoft Information Protection sensitivity labels to classify and protect data through manual and automatic content labeling. For more details and availability see:
Glue Collaboration, provider of collaborative, real-time VR software services, announced a new release of Glue that enables greater immersion and frictionless interaction for remote teams as they co-create, learn, plan and share. Glue provides shared virtual environments where dispersed participants can come together as if they were face to face in a real physical space. Appealing to people’s visual, haptic and auditory senses, Glue provides a level of immersion in remote meetings simply not possible with conventional video conferencing software.
Glue introduced new expressive avatars that use artificial intelligence and advanced graphics to more closely mimic people’s behavior and features to make communication feel as natural as it does in the real world. Using the new built-in avatar configurator, users can also create their own avatar, adjusting face shape and features, hair and clothing as well as customizing colors. Millions of permutations are possible. The new operating system comes with speech-to-text technology, a new whiteboard for ideation, now also accessible to non-VR Glue users, as well as a camera that zooms and shoots in the resolution users choose. Glue has also made improvements to the way users manage their teams, files and spaces.
nVoq Incorporated announced general availability of nVoq.Voice, the newest offering from the nVoq Platform of medically infused speech recognition solutions. nVoq.Voice is a highly accurate, HIPAA-compliant speech-to-text solution that enables clinicians to create a comprehensive patient note in seconds. nVoq.Voice offers clinicians the simplicity they’ve come to enjoy from consumer solutions, coupled with the enterprise-grade security and reliability they need to meet HIPAA and other compliance standards.
Vodori, creator of cloud-based software that helps life science companies get regulated content to market, announced two new products: Pepper Folio, a sales enablement platform, and Pepper Insights, an embedded analytics solution. These new cloud software applications extend the capabilities of Vodori’s Pepper Cloud Product Suite, which delivers a complete content management solution to life science companies. The Pepper Cloud family of products helps life science companies streamline how marketing, medical, legal and regulatory professionals work together.
Once advertising, promotional, and scientific content has been approved in Pepper Flow, it is automatically available in Pepper Folio, so sales reps and MSLs always have access to the latest content for engaging healthcare providers and key opinion leaders. When in the field, reps and MSLs can use Pepper Folio to curate content collections, eDetail on the spot, and share content after engagements to stay connected. Throughout the content lifecycle, teams can pull a wide range of data sets from Pepper Insights, from average content review times to which content sales reps are using most, to drive high-value content creation and optimize their internal processes.