
Category: Semantic technologies

Our coverage of semantic technologies goes back to the early 1990s, when search engines focused on structured data in databases were looking to add support for searching unstructured or semi-structured data. This early Gilbane Report, Document Query Languages – Why is it so Hard to Ask a Simple Question?, analyzed the challenge at the time.

Semantic technology is a broad topic that includes all natural language processing, as well as the semantic web, linked data processing, and knowledge graphs.


Speech recognition

In computer science, speech recognition (SR) is the translation of spoken words into text. It is also known as “automatic speech recognition”, “ASR”, “computer speech recognition”, “speech to text”, or just “STT”. Some SR systems use “training” where an individual speaker reads sections of text into the SR system.

Microsoft adds Hindi to Text Analytics service to strengthen Sentiment Analysis

Microsoft announced the addition of Hindi as the latest language in its Text Analytics service, supporting businesses and organizations with customer sentiment analysis. Text Analytics is part of Microsoft Azure Cognitive Services. Using the service, organizations can analyze Hindi text for clues about positive, neutral, or negative sentiment to find out what people think of their brand or a topic. The Text Analytics service can be used with any textual input, or with audio input or feedback in combination with the Azure Speech-to-Text service. Microsoft’s Text Analytics service uses the latest AI models to analyze content in Hindi, applying Natural Language Processing (NLP) for text mining and text analysis. The functionality provided by Text Analytics includes sentiment analysis, opinion mining, key phrase extraction, language detection, named entity recognition, and personally identifiable information (PII) detection. Sentiment analysis currently supports more than 20 languages, including Hindi.

Microsoft Text Analytics’ Sentiment Analysis feature evaluates text and returns confidence scores between 0 and 1 for positive, neutral, and negative sentiment for each document and for each sentence within it. The service also provides sentiment labels (such as “negative”, “neutral”, and “positive”) at the sentence and document level based on the highest confidence score. It can be accessed from the Azure cloud or on-premises using containers, which helps brands detect positive and negative tonality in customer reviews, social media and call center conversations, and forum discussions, among other channels, no matter where their data resides.
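For illustration, here is a minimal sketch of calling the Sentiment Analysis feature with the azure-ai-textanalytics Python SDK; the endpoint, key, and sample Hindi sentence are placeholders, not values from the announcement.

```python
# Minimal sketch: analyze Hindi sentiment with the Azure Text Analytics SDK.
# The endpoint, key, and sample text below are placeholders.
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

# "This phone is very good, but the battery drains quickly."
documents = ["यह फ़ोन बहुत अच्छा है, लेकिन बैटरी जल्दी खत्म हो जाती है।"]
results = client.analyze_sentiment(documents, language="hi")

for doc in results:
    if doc.is_error:
        continue
    # Document-level label plus confidence scores between 0 and 1.
    print(doc.sentiment, doc.confidence_scores)
    for sentence in doc.sentences:
        # Sentence-level labels, useful for mining opinions in mixed reviews.
        print(sentence.text, sentence.sentiment, sentence.confidence_scores)
```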

https://news.microsoft.com/en-in/microsoft-adds-hindi-to-its-text-analytics-service-to-strengthen-sentiment-analysis-support-for-businesses/

Google introduces Document AI platform for document processing

Google Cloud announced the new Document AI (DocAI) platform, a unified console for document processing. Transforming documents into structured data increases the speed of decision making for companies, unlocking business value and helping develop better experiences for customers. Historically, doing this at scale hasn’t been efficient. DocAI is designed to help businesses use Artificial Intelligence (AI) and machine learning to automate these processes. Today, the DocAI platform is available in preview, enabling you to:

  • Ensure your data is accurate and compliant: Automate and validate all your documents to streamline compliance workflows, reduce guesswork, and keep data accurate and compliant.
  • Make better business decisions: Improve operational efficiency by extracting structured data from unstructured documents and making that available to your business applications and users.
  • Use your data to meet customer expectations: Leverage insights to meet customer expectations and improve CSAT, advocacy, lifetime value, and spend.

With the new DocAI platform, you can access all parsers, tools and solutions (e.g. Lending DocAI, Procurement DocAI) with a unified API, enabling a document solution from evaluation to deployment. It allows creation and customization of document processing workflows. Data extraction is now easier because the specialized parsers on the platform are built with Google Cloud’s predefined taxonomy, without the need to perform additional data mapping or training. General parsers such as OCR (Optical Character Recognition), Form parser, and Document splitter are publicly accessible. You can also request access to specialized parsers such as W9, 1040, W2, 1099-MISC, 1003, invoice, and receipts.
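As a rough illustration of the unified API, the sketch below sends a local PDF to a processor with the google-cloud-documentai Python client; the project, location, processor ID, and file name are placeholders for values from your own setup.

```python
# Sketch: process a document with a Document AI processor via the
# google-cloud-documentai client. Project, location, processor ID,
# and the input file are placeholders.
from google.cloud import documentai

client = documentai.DocumentProcessorServiceClient()
name = client.processor_path("my-project", "us", "my-processor-id")

with open("invoice.pdf", "rb") as f:
    raw_document = documentai.RawDocument(
        content=f.read(), mime_type="application/pdf"
    )

request = documentai.ProcessRequest(name=name, raw_document=raw_document)
result = client.process_document(request=request)

# Specialized parsers return structured entities extracted from the document.
for entity in result.document.entities:
    print(entity.type_, entity.mention_text, entity.confidence)
```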

https://cloud.google.com/blog/products/ai-machine-learning/google-cloud-announces-document-ai-platform

Expert.ai expands upon cloud offering

Expert.ai announced an enhanced release of its cloud-based Natural Language API, which it will be presenting today at API World. The new expert.ai NL API features include:

  • Relation extraction to express the connections between elements of a sentence and accurately answer questions like “who did what, when?” and “what caused what, to whom?”
  • Sentiment analysis considering the intrinsic positivity or negativity of the concepts expressed in text, based on the words used (polarity) and how relevant we judge them (intensity)
  • A new geographic taxonomy to identify and disambiguate countries and some other administrative divisions (e.g., San Jose, CA, USA vs. San Jose, Costa Rica)

To learn more about the expert.ai NL API, now available for free testing, visit the developer portal and sign up to start developing intelligent applications today.
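As a sketch only, the following assumes the expertai-nlapi Python client (with credentials supplied via the EAI_USERNAME and EAI_PASSWORD environment variables); the resource names and output fields used here are assumptions based on the feature list above and may differ from the actual API contract.

```python
# Hedged sketch of calling the expert.ai NL API with the expertai-nlapi
# client; resource names and output fields below are assumptions.
from expertai.nlapi.cloud.client import ExpertAiClient

client = ExpertAiClient()
text = "Michael Jordan was one of the best basketball players of all time."

# Relation extraction: who did what, to whom, when.
relations = client.specific_resource_analysis(
    body={"document": {"text": text}},
    params={"language": "en", "resource": "relations"},
)

# Sentiment analysis: polarity weighted by the relevance of each concept.
sentiment = client.specific_resource_analysis(
    body={"document": {"text": text}},
    params={"language": "en", "resource": "sentiment"},
)

print(len(relations.relations))      # assumed: list of extracted relations
print(sentiment.sentiment.overall)   # assumed: overall polarity score
```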

https://developer.expert.ai, https://expert.ai, https://apiworld.co/

Sentiment analysis

Sentiment analysis (also known as opinion mining or emotion AI) refers to the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice of the customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine.

Sensory unveils VoiceHub portal

Sensory unveiled VoiceHub, an online portal that enables developers to quickly create wake word models and voice control command sets for prototyping and proof-of-concept purposes. VoiceHub allows users to select languages and model sizes through drop-down menus, and provides developers with free tools to immediately create custom wake words and voice command sets for their applications. These projects take just moments to put together, and some models are trained and downloadable within an hour of submission. VoiceHub outputs wake word and voice command set models either in a form compatible with a companion Android application for quick prototyping, or as code for specific target DSPs for more advanced proof-of-concept testing. The tools allow developers to create wake word models, either custom branded or based on today’s most popular voice assistant platforms, and command set models targeting a desired memory footprint. This makes VoiceHub suitable for applications ranging from ultra-low-power, resource-limited wearables to high-power, high-performance appliances on the edge.

Based on Sensory’s TrulyHandsfree technology, VoiceHub supports numerous languages for testing voice control across global product lines. Since VoiceHub trains voice models similarly to TrulyHandsfree, the wake word and voice control models created in VoiceHub are accurate and in most cases suitable for mass production. VoiceHub users can expect a steady stream of updates and new features, including support for more languages, expanded DSP platform support, and the ability to quickly develop large vocabulary natural language models. At launch, the platform supports DSP platforms from Ambiq, Analog Devices, Cirrus, Cypress, DSPG, Fortemedia, Knowles, Motorola, NXP, Qualcomm, Renesas, ST Micro, and TI.

https://www.sensory.com/voicehub

Microsoft details T-ULRv2 model that can translate between 94 languages

From Kyle Wiggers at VentureBeat

The same week Facebook open-sourced M2M-100, an AI model that can translate between over 100 languages, Microsoft detailed an algorithm of its own — Turing Universal Language Representation (T-ULRv2) — that can interpret 94 languages. The company claims T-ULRv2 achieves the top results in XTREME, a natural language processing benchmark created by Google, and will use it to improve features like Semantic Search in Word and Suggested Replies in Outlook and Teams ahead of availability in private preview via Azure.

T-ULRv2, a joint collaboration between Microsoft Research and the Microsoft Turing team, contains a total of 550 million parameters, or internal variables that the model leverages to make predictions. (By comparison, M2M-100 has around 15 billion parameters). Microsoft researchers trained T-ULRv2 on a multilingual data corpus from the web that consists of the aforementioned 94 languages. During training, the model learned to translate by predicting masked words from sentences in different languages, occasionally drawing on context clues in pairs of translations like English and French.
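T-ULRv2 itself is not publicly available, but the masked-word objective described above can be illustrated with a public multilingual model such as xlm-roberta-base via the Hugging Face transformers pipeline; the model and example sentence here are stand-ins, not Microsoft’s.

```python
# Illustration of masked-token prediction in a multilingual model;
# xlm-roberta-base stands in, since T-ULRv2 is not a public checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="xlm-roberta-base")

# The model fills in the masked word from multilingual context,
# e.g. a French sentence: "Paris is the ___ of France."
for prediction in fill_mask("Paris est la <mask> de la France."):
    print(prediction["token_str"], round(prediction["score"], 3))
```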

As Microsoft VP Saurabh Tiwary and assistant managing director Ming Zhou note in a blog post, the XTREME benchmark covers 40 languages spanning 12 families and 9 tasks that require reasoning about varying levels of syntax. The languages are selected to maximize diversity, coverage in existing tasks, and availability of training data, and the tasks cover a range of paradigms including sentence text classification, structured prediction, sentence retrieval, and cross-lingual question answering. For models to be successful on the XTREME benchmarks, then, they must learn representations that generalize to many standard cross-lingual transfer settings.

The jury is out on T-ULRv2’s potential for bias and its grasp of general knowledge. Some research suggests benchmarks such as XTREME don’t measure models’ knowledge well and that models like T-ULRv2 can exhibit toxicity and prejudice against demographic groups. But the model is in any case a step toward Microsoft’s grand “AI at scale” vision, which seeks to push AI capabilities by training algorithms with increasingly large amounts of data and compute. Already, the company has used its Turing family of models to bolster language understanding across Bing, Office, Dynamics, and its other productivity products.

T-ULRv2 will power current and future language services available through Azure Cognitive Services, Microsoft says. It will also be available as a part of a program for building custom applications, which was announced at Microsoft Ignite 2020 earlier this year. Developers can submit requests for access.

https://venturebeat.com/2020/10/20/microsoft-details-t-urlv2-model-that-can-translate-between-94-languages/, https://www.microsoft.com/en-us/research/blog/microsoft-turing-universal-language-representation-model-t-ulrv2-tops-xtreme-leaderboard/

Tisane Labs adds Wikidata extraction feature on Microsoft Azure

Tisane Labs, a supplier of text analytics AI solutions, announced a new feature in its Tisane API, already available on Microsoft Azure Marketplace and AppSource. With the new feature, Tisane API now allows tagging and extraction of Wikidata entities, complementing the capabilities provided by Azure Cognitive Services and supporting nearly 30 languages. Users can easily obtain Wikidata IDs from Tisane’s JSON response, providing the ability to annotate text with images, GPS coordinates, important dates, third-party references, and whatever else the ever-growing, open Wikidata database contains. Tisane API runs in the cloud using Azure API Management, with a simple REST interface that can be called from any popular programming platform. Tisane Labs provides a range of tailored plans for its clients, including a free plan and the option of a custom on-premises installation.
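A rough sketch of calling the API over REST is below; the /parse path, request fields, and subscription-key header follow common Azure API Management conventions but are assumptions here, as is the response handling, so check Tisane’s documentation for the exact contract.

```python
# Hedged sketch: call the Tisane API over REST and inspect the JSON
# response for Wikidata IDs. The URL path, request body, and response
# fields are assumptions based on the description above.
import requests

response = requests.post(
    "https://api.tisane.ai/parse",  # assumed endpoint path
    headers={"Ocp-Apim-Subscription-Key": "<your-key>"},  # Azure APIM-style key
    json={"language": "en", "content": "The Eiffel Tower is in Paris.", "settings": {}},
)
response.raise_for_status()
print(response.json())  # entities in the response may carry Wikidata IDs
```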

https://tisane.ai
