From Kyle Wiggers at VentureBeat
The same week Facebook open-sourced M2M-100, an AI model that can translate between over 100 languages, Microsoft detailed an algorithm of its own, Turing Universal Language Representation (T-ULRv2), that can interpret 94 languages. The company claims T-ULRv2 achieves the top results in XTREME, a natural language processing benchmark created by Google, and says it will use the model to improve features like Semantic Search in Word and Suggested Replies in Outlook and Teams ahead of availability in private preview via Azure.
T-ULRv2, a collaboration between Microsoft Research and the Microsoft Turing team, contains a total of 550 million parameters, or internal variables that the model leverages to make predictions. (By comparison, M2M-100 has around 15 billion parameters.) Microsoft researchers trained T-ULRv2 on a multilingual data corpus from the web that consists of the aforementioned 94 languages. During training, the model learned to translate by predicting masked words from sentences in different languages, occasionally drawing on context clues in pairs of translations like English and French.
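To make the masked-word idea concrete, here is a minimal sketch of how a training example might be built from a pair of translations. It is not Microsoft's code: the function names, the `[MASK]` and `[SEP]` markers, the whitespace tokenization, and the 15% masking rate are all assumptions chosen for illustration (production models typically work on subword tokens and use more elaborate masking rules).

```python
import random

MASK = "[MASK]"  # placeholder the model must fill in (assumed marker)
SEP = "[SEP]"    # separator between the two languages (assumed marker)


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Hide a random subset of tokens and record what was hidden.

    Returns the masked token list and a dict mapping each masked
    position to its original token (the prediction targets).
    """
    if rng is None:
        rng = random.Random()
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        # Never mask the separator; it only marks the language boundary.
        if tok != SEP and rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets


def make_translation_pair_example(src_tokens, tgt_tokens, mask_prob=0.15, rng=None):
    """Join a sentence and its translation, then mask across both.

    A word hidden in one language can be guessed from context in the
    other, which is the "context clue" the paragraph above describes.
    """
    joined = list(src_tokens) + [SEP] + list(tgt_tokens)
    return mask_tokens(joined, mask_prob, rng)


if __name__ == "__main__":
    english = "the cat sleeps on the sofa".split()
    french = "le chat dort sur le canapé".split()
    masked, targets = make_translation_pair_example(
        english, french, mask_prob=0.3, rng=random.Random(0)
    )
    print(" ".join(masked))
    print(targets)
```

Feeding single-language sentences through `mask_tokens` alone corresponds to the monolingual case, while `make_translation_pair_example` corresponds to the occasional use of translation pairs. A model would then be trained to recover the entries in `targets` from the masked sequence.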
As Microsoft VP Saurabh Tiwary and assistant managing director Ming Zhou note in a blog post, the XTREME benchmark covers 40 languages spanning 12 families and 9 tasks that require reasoning about varying levels of syntax. The languages are selected to maximize diversity, coverage in existing tasks, and availability of training data, and the tasks cover a range of paradigms including sentence classification, structured prediction, sentence retrieval, and cross-lingual question answering. For models to be successful on the XTREME benchmark, then, they must learn representations that generalize to many standard cross-lingual transfer settings.
The jury is out on T-ULRv2’s potential for bias and its grasp of general knowledge. Some research suggests benchmarks such as XTREME don’t measure models’ knowledge well and that models like T-ULRv2 can exhibit toxicity and prejudice against demographic groups. But the model is in any case a step toward Microsoft’s grand “AI at scale” vision, which seeks to push AI capabilities by training algorithms with increasingly large amounts of data and compute. Already, the company has used its Turing family of models to bolster language understanding across Bing, Office, Dynamics, and its other productivity products.
T-ULRv2 will power current and future language services available through Azure Cognitive Services, Microsoft says. It will also be available as part of a program for building custom applications, which was announced at Microsoft Ignite 2020 earlier this year. Developers can submit requests for access.