Yext, Inc. announced “Milky Way,” the latest upgrade to the natural language processing (NLP) algorithm that powers Yext Answers, Yext’s site search product. Headlining this milestone update is the adoption of BERT (Bidirectional Encoder Representations from Transformers). Developed by Google, BERT is an open source machine learning framework for NLP designed to better understand user searches. By leveraging BERT within Named Entity Recognition (a process that locates named entities mentioned in unstructured text and classifies them into predefined categories), Yext Answers improves its ability to distinguish locations from other types of entities, including people, jobs, and events. The update includes:
- Improved Named Entity Recognition: By leveraging BERT, Yext Answers can now better understand the contextual relationship between search terms. Answers will return a more relevant result by taking into account the correct classification, whether a location, person or product.
- Improved Location Detection: The update leaves behind location biasing. Now, Yext Answers will filter through locations stored by a business in their Yext knowledge graph to surface the best match.
- Updated Healthcare Taxonomy: More than 3,000 new healthcare-related synonyms, conditions, treatments, and procedures have been added to the algorithm’s taxonomy.
- Improved Stemming and Typo Tolerance.
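As a rough illustration of how typo-tolerant matching against a business's stored locations could work, here is a minimal sketch using Python's standard `difflib` module. It is not Yext's implementation: the location names, the `match_location` helper, and the 0.8 similarity cutoff are all assumptions made for the example.

```python
import difflib

# Hypothetical location names a business might store; not real Yext data.
STORED_LOCATIONS = ["Brooklyn", "Manhattan", "Jersey City", "Hoboken"]


def match_location(query, locations=STORED_LOCATIONS, cutoff=0.8):
    """Return the stored location that best matches some span of the query, or None."""
    lowered = {loc.lower(): loc for loc in locations}
    tokens = query.lower().split()
    best, best_score = None, cutoff
    # Try every contiguous span of tokens so multi-word names can match.
    for start in range(len(tokens)):
        for end in range(start + 1, len(tokens) + 1):
            span = " ".join(tokens[start:end])
            for key, original in lowered.items():
                # ratio() is a similarity score in [0, 1]; 1.0 means identical.
                score = difflib.SequenceMatcher(None, span, key).ratio()
                if score >= best_score:
                    best, best_score = original, score
    return best


print(match_location("dentist in brookyln"))    # misspelled location
print(match_location("jobs near jersey city"))  # multi-word location
print(match_location("best pizza"))             # no location in the query
```

Restricting candidates to the stored list means a misspelled query can only resolve to a location the business actually has, which is the spirit of filtering rather than biasing.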
Google-affiliated researchers released the Language Interpretability Tool (LIT), an open source, framework-agnostic platform and API for visualizing, understanding, and auditing natural language processing models. It focuses on questions about AI model behavior, like why models made certain predictions and why they’re performing poorly with input corpora. LIT incorporates aggregate analysis into a browser-based interface that’s designed to enable explorations of text generation behavior. The tool set is architected so that users can hop between visualizations and analysis to test hypotheses and validate those hypotheses over a data set. New data points can be added on the fly and their effect on the model visualized immediately, while side-by-side comparison allows for two models or two data points to be visualized simultaneously. And LIT calculates and displays metrics for entire data sets to spotlight patterns in model performance, including the current selection, manually generated subsets, and automatically generated subsets.
LIT works with any model that can run from Python, the Google researchers say, including TensorFlow, PyTorch, and remote models on a server. And it has a low barrier to entry, with only a small amount of code needed to add models and data. The team cautions that LIT doesn’t scale well to large corpora and that it’s not “directly” useful for training-time model monitoring. But they say that in the near future, the tool set will gain features like counterfactual generation plugins, additional metrics and visualizations for sequence and structured output types, and a greater ability to customize the UI for different applications.
H/T VentureBeat: https://venturebeat.com/2020/08/14/google-open-sources-lit-a-toolset-for-evaluating-natural-language-models/
Lexalytics announced that Zignal Labs, creator of the Impact Intelligence platform for measuring the evolution of opinion in real time, has added the Lexalytics Salience engine to extend its platform’s natural language processing (NLP) and text analytics capabilities. The addition is meant to help marketers, communicators, and analysts gain a greater understanding of perceptions across traditional and social media. With Lexalytics, Zignal’s customers across industries can understand what people are saying about products, services, or current events, categorize discussions into separate groupings and themes, and evaluate the sentiment of media coverage across multiple languages.
Neofonie announced that TXTWerk – Text mining for SAP solutions, a framework application, is now available for trial and online purchase on SAP App Center, the digital marketplace for SAP partner offerings. TXTWerk is delivered online as a subscription service and integrates with SAP and third-party software through the API management capabilities of SAP Cloud Platform Integration Suite. TXTWerk extracts metadata from texts, turning unstructured text into structured data. By combining machine learning techniques with rule-based approaches, TXTWerk can read and understand texts quickly. Whether 1,000 or 10 billion documents need to be processed, TXTWerk recognizes the most important keywords, people, places, organizations, events, and key concepts and links them to sources such as knowledge graphs or internal company data. The framework also includes artificial intelligence (AI) processes for classification into customer-defined classes, sentiment analysis of texts, phrase and role recognition, and automatic linking of entities according to specially defined relations. In addition to the AI processes, TXTWerk comes with a knowledge graph with over seven million entries.
Luminoso’s new deep learning model understands documents using multiple layers of attention, a mechanism that identifies which words are relevant to get context around a specific concept as expressed by a word or phrase. This model is capable of identifying the author’s sentiment for each individual concept they’ve written about, as opposed to providing an analysis of the overall sentiment of the document.
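To make the idea of attention concrete, the sketch below computes a single step of dot-product attention: each word gets a weight reflecting how relevant it is to a given concept. This is a generic textbook formulation, not Luminoso's model, and the two-dimensional word vectors are invented purely for illustration.

```python
import math


def attention_weights(concept_vec, word_vecs):
    """Softmax over dot-product scores; a higher weight means a more relevant word."""
    scores = [sum(c * w for c, w in zip(concept_vec, vec)) for vec in word_vecs]
    peak = max(scores)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vectors, made up for this example (real models learn these).
words = ["battery", "dies", "quickly", "screen"]
vectors = [[1.0, 0.0], [0.9, 0.3], [0.7, 0.2], [0.0, 1.0]]
concept = [2.0, 0.0]  # stand-in for the concept "battery"

weights = attention_weights(concept, vectors)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.3f}")
```

With these toy values, the words surrounding "battery" receive more weight than "screen", so a sentiment judgment about the battery would draw mostly on "dies" and "quickly".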
Using Concept-Level Sentiment, users will be able to:
- Effectively analyze mixed feedback — Concept-level sentiment analysis is critical for capturing and understanding the voice of the customer (VoC). For example, product reviews rarely contain just one type of feedback, and it’s important to tease apart the good from the bad. Getting a polarity for each of the topics in an open-ended survey response is critical for understanding what works and what doesn’t for your customers.
- Quickly surface buried feedback — Uncovering negative comments in overwhelmingly positive open-ended survey responses is critical for better understanding customers and employees. For instance, in voice of the employee (VoE) surveys, employee feedback can be overwhelmingly positive and delivered in an upbeat way in an effort to soften criticisms. Concept-Level Sentiment in Luminoso enables users to quickly identify and understand “buried” feedback, such as negative points in an overwhelmingly positive HR survey.
- Intuitively aggregate concept sentiment across an entire dataset — For instance, after responses to a mobile app market research survey are loaded into Luminoso Daylight, a user can get a distribution of positive, negative, and neutral opinions about every aspect of the mobile experience across all of its mentions in the dataset.
- Analyze customer and employee feedback across multiple languages — Global organizations often receive customer and employee feedback in multiple languages. With Luminoso, users can analyze the sentiment of concepts, natively in 15 languages.
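The aggregation described in the list above can be sketched in a few lines. The example assumes a hypothetical upstream step has already labeled each concept mention with a polarity; the `mentions` data and the `sentiment_distribution` helper are illustrative and are not part of Luminoso Daylight's API.

```python
from collections import Counter, defaultdict

POLARITIES = ("positive", "negative", "neutral")

# Hypothetical (concept, polarity) pairs, one per mention in survey responses.
mentions = [
    ("login", "negative"),
    ("login", "negative"),
    ("design", "positive"),
    ("login", "neutral"),
    ("design", "positive"),
    ("notifications", "negative"),
]


def sentiment_distribution(mentions):
    """Return, for each concept, the share of its mentions with each polarity."""
    counts = defaultdict(Counter)
    for concept, polarity in mentions:
        counts[concept][polarity] += 1
    distribution = {}
    for concept, counter in counts.items():
        total = sum(counter.values())
        distribution[concept] = {p: counter[p] / total for p in POLARITIES}
    return distribution


dist = sentiment_distribution(mentions)
for concept, shares in dist.items():
    print(concept, shares)
```

Keeping one distribution per concept, rather than one score per document, is what lets a negative theme surface even when most responses are positive overall.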