Archive for machine translation

MultiCorpora Unveils MultiTrans 4.4

MultiCorpora has announced version 4.4 of MultiTrans. The new version delivers WordAlign technology, which allows users to instantly retrieve translated terminology from previously translated documents. This advancement in language technology was made possible through collaborative development efforts with the National Research Council of Canada. Version 4.4 also enables components of machine translation to be integrated into the software suite, offering additional translation options for organizations that consider machine translation part of their business model. MultiCorpora has also leveraged Oracle’s technology to recycle translations from over 250 file formats and shorten file conversion times. These new MultiTrans features dovetail with the turn-key, fully integrated workflow processes previously released in version 4.3 of MultiTrans (2007). http://www.multicorpora.com

The Google Effect on Cross-Language Search

As the Internet continues to redefine “ubiquitous,” the issue of cross-language search becomes more critical. It’s a pervasive challenge with extreme scalability requirements. Hard to imagine, but the Internet will be “full” (that is, out of unallocated IPv4 addresses) by about 2010, according to the American Registry for Internet Numbers. ARIN’s recommendation to move to IPv6 demonstrates the potential breadth of information overload.

Organizations such as the Europe-based Cross-Language Evaluation Forum (CLEF) moved beyond discussion and into in-depth testing of cross-language search many years ago. With its “Leaping over Language Barriers” announcement, Google has moved beyond experimentation and toward productization of its cross-language search feature.

  • The Wall Street Journal’s Jessica Vascellaro weighs in here, and includes commentary on rival strategies from Yahoo and Microsoft.
  • Google Blogoscoped weighs in here.
  • Clay Tablet’s Ryan Coleman weighs in here.
  • Global by Design’s John Yunker has a review here.
  • And from Google themselves, here’s the beta UI, the FAQ, and the “unveiling” at the company’s Searchology event held earlier this month.

IMO, any discussion of what the interconnected world “looks like” in the future, whether focused on [fill in your label here] 2.0, social networking, customer experience, global elearning, etc., (should) eventually drill down to translation and localization issues. Once we’re at that level of conversation, there are more challenges to discuss: the ongoing evolution of automated translation, the balance between human and machine translation, the conundrum of rich media and image translation, and, as Kaija will always remind us, the quality and context of search results as opposed to merely the quantity.

As a researcher, I’ve used Google’s “translate this” functionality and Yahoo’s Babel Fish (originally AltaVista’s) numerous times to “get the gist” of a non-English article. But my reliance on the results has been more for sanity-checking trends than for factual data gathering. Inconsistencies skew the truth. I just can’t trust it. Can we trust this? Time will tell. Is it a step in the right direction for the masses? No doubt.