Curated for content, computing, and digital experience professionals

Day: August 16, 2007

New Solutions for a Multilingual World

I have said this many times before, and will say again: the world is multilingual, and more and more people are working daily in a multilingual environment. In companies, this multilingual environment is not only about translation, but about working with customers and colleagues whose native language is different from one’s own. That can lead to a lot of miscommunication, and I think that nobody has even started to measure the real costs or missed sales arising from it.

Communication starts with terminology, and that is where I see a lot of needs (and opportunities) for new solutions. Corporate terminology – “that which we call a widget by any other name goes in other companies” – is something that I think benefits from active input from corporate experts. Wikis seem an interesting way to enhance corporate communication, so I emailed Greg Lloyd, CEO of Traction Software, to ask whether he has seen wikis used for handling multilingual issues.

Traction Software has been in the corporate blog/wiki business since July 2002, and has 250+ corporate customers. According to Greg, Traction’s TeamPage is best described in terms of Doug Engelbart’s NLS/Augment model, re-imagined for the Web (more at Traction Roots | Doug Engelbart).

KP: Do your customers use wikis to handle multilingual issues, such as terminology?
GL: We have an international pharma customer who wanted to provide an interactive online glossary of terms that have specialized meanings. For example, in writing a new drug application, many terms have specialized meanings and interpretations dictated by regulatory authorities in the U.S., Europe and other regions.
At this customer, glossary definitions are usually written by people with specialized experience in new drug applications and similar filings, but the glossaries are intended for working reference by everyone in the company – not limited to those who deliver translations. The company has offices around the world, but most working communication is in English or French. A majority of employees have very good reading knowledge of both languages, but aren’t necessarily aware of some specialized meanings and interpretations – including those which change as new regulations are issued.

We developed a “Glossary skin” to address this need. The Glossary skin is a Traction “skin,” or UI presentation layer, that in this case provides a specialized and simplified Glossary view of the underlying blog/wiki data stored in the TeamPage Journal. It gives users versatile tools for handling terminology, such as looking up glossary terms, term definitions, and guidance on how to use a term, plus the ability to comment on a term or ask questions about it. All terms are in both English and French. Changes and additions can be tracked with standard blog/wiki features, and users can also subscribe to RSS/Atom feeds for updates. These are just a few of the functionalities of the solution.
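As a way to picture the kind of bilingual glossary entry described above, here is a minimal sketch in Python. The field names (`term_en`, `term_fr`, `guidance`, `comments`) and the sample regulatory term are illustrative assumptions, not Traction’s actual data model:

```python
from dataclasses import dataclass, field

@dataclass
class GlossaryEntry:
    """One glossary entry, carrying both English and French forms."""
    term_en: str
    term_fr: str
    definition: str
    guidance: str = ""
    comments: list = field(default_factory=list)

class Glossary:
    def __init__(self):
        self._entries = {}

    def add(self, entry: GlossaryEntry):
        # Index the entry under both its English and French terms,
        # case-insensitively, so a lookup in either language finds it.
        self._entries[entry.term_en.lower()] = entry
        self._entries[entry.term_fr.lower()] = entry

    def lookup(self, term: str):
        return self._entries.get(term.lower())

glossary = Glossary()
glossary.add(GlossaryEntry(
    term_en="marketing authorization",
    term_fr="autorisation de mise sur le marché",
    definition="Regulatory approval to sell a drug in a given region.",
    guidance="Interpretation differs between U.S. and European filings.",
))

entry = glossary.lookup("Autorisation de mise sur le marché")
print(entry.term_en)  # marketing authorization
```

The point of indexing each entry under both language forms is that any employee can start from whichever term they encountered, then read the definition, regulatory guidance, and comments attached to the single shared entry.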

KP: Do the wiki glossaries integrate with other glossaries or localization tools, such as translation memories?
GL: For the Glossary Wiki there are no special translator tools built in. I believe that general-purpose translation tools will likely be best integrated in a loosely coupled, mashup style. I haven’t seen requests for industry-specific glossaries from customers, but I think there may be a business opportunity.

KP: What kind of feedback have you received from your customer? Have there been requests for special functionalities?
GL: The pharma customer is very happy with the result, which is used company-wide. We’ve also demonstrated the Glossary skin to customers in Japan and other countries. Several have expressed interest and are piloting use of the Glossary skin, primarily for developing and delivering specialized glossaries for internal working communication as well as translating deliverables.

The ability for global enterprises to create interactive Glossaries for working communication among employees, suppliers and other stakeholders seems to be getting the most interest. Many global companies use English as a standard for internal communication, but the ability to add comments or questions in other languages is a big plus. The ability to create and deliver interactive Web glossaries in Japanese, Chinese, Arabic, Hebrew, etc. as well as European and other Asian languages is also very useful.

Traction uses UTF-8 Unicode to store, search and deliver content written in any combination of European and Asian alphabets in any blog/wiki space (or in the same page), so a multi-lingual global glossary is easy to deliver and can be simple to author using the standard Web browser interface.
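The claim above – that UTF-8 lets content in any combination of scripts live in one space or even one page – can be illustrated with a short Python sketch. The sample terms are made up for illustration:

```python
# Glossary terms in several scripts, coexisting in one structure.
terms = {
    "en": "pharmacovigilance",
    "fr": "pharmacovigilance",
    "ja": "ファーマコビジランス",
    "ar": "اليقظة الدوائية",
    "he": "ניטור תרופתי",
}

# All scripts serialize into a single UTF-8 byte stream for storage...
blob = "\n".join(terms.values()).encode("utf-8")

# ...and decode back losslessly for search and display.
restored = blob.decode("utf-8").split("\n")
assert restored == list(terms.values())

# Substring search works uniformly across scripts on the decoded text,
# which is why one index can serve a multilingual glossary page.
page = " / ".join(terms.values())
print("ファーマコビジランス" in page)  # True
```

Because UTF-8 encodes every script into one byte stream, storage, search and delivery need no per-language code paths, which is what makes a mixed-script glossary page “easy to deliver.”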

KP: What have been the biggest advantages your customers have received from using a wiki to create a glossary, instead of using a specialized terminology management tool?
GL: The biggest advantages are: 1) Simple access using a Web browser, particularly when the wiki has a specialized skin that makes the Glossary application work with no training; 2) Simple group editing and history using the wiki edit model; 3) Simple integration of comments and feedback; 4) Simple, scalable and secure deployment corporate-wide.

KP: Corporate wikis seem to be an interesting way to share information and expertise. Do you see them also being used for translation work?
GL: Yes, I can certainly see how the Glossary skin could be extended to support other wiki per-page translation models. At present the Glossary skin implementation is available to TeamPage customers as a Traction Skin Definition Language (SDL) plug-in. We’ll be packaging it along with its SDL source code as a free plug-in example later this summer. We’ll work with customers and partners to determine how to best provide translation wikis powered by Traction TeamPage.

Sharepoint and Search

Sharepoint repositories are a prime content target for most search engines in the enterprise search arena, judging from the number of announcements I’ve previewed from search vendors in the last six months. The list of vendors offering Sharepoint search enablement is long and growing.

Almost a year ago I began using a pre-MOSS version of Sharepoint to collect documents for a team activity. Ironically, the project was the selection, acquisition, and implementation of a (non-Sharepoint) content management system to manage a corporate intranet, extranet, and hosted public Web site. The version of Sharepoint that was “set up” for me was strictly out of the box. Not being a developer, I was still able to muddle my way through setting up the site, establishing users, and posting announcements and categories of content to which I uploaded about fifty or sixty documents.

The most annoying discovery was the lack of a default search option. Later updating to MOSS solved the problem but at the time it was a huge aggravation. Because I could not guarantee a search option would appear soon enough, I had to painstakingly create titles with dates in order to give team members a contextual description as they would browse the site. Some of the documents I wanted to share were published papers and reviews of products. Dates were not too relevant for those, so I “enhanced” the titles with my own notations to help the finders select what they needed.

These silly “homemade” solutions are not uncommon when a tool does not anticipate how we would want to use it. They persist as ways to handle our information storage and retrieval challenges. Since the beginning of time humans have devised ways to store things that they might want to re-use at some point in the future. Organizing for findability is an art as much as it is a science. Information science only takes one so far in establishing the organizing criteria and assigning those criteria to content. Search engines that rely strictly on the author’s language will leave a lot of relevant content on the shelf, for the same reasons as using Dewey Decimal classification without the complementary card catalog of subject topics. The better search engines exploit every structured piece of data or tagged content associated with a document, and that includes all the surrounding metadata assigned by “categorizers.” Categorizers might be artful human indexers or automated processes. Search engines with highly refined, intelligent categorizers bring even more sophistication to the search experience by enabling semantically rich finding.
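The card-catalog point above can be made concrete with a toy sketch: a search that matches only the author’s own words misses documents that a categorizer has tagged with the query term. The field names, weighting, and sample documents are illustrative assumptions:

```python
# Two documents whose body text never says "earnings", but which a
# categorizer (human indexer or automated process) has tagged with it.
docs = [
    {"id": 1, "body": "Quarterly results for the widget line.",
     "tags": ["finance", "earnings"]},
    {"id": 2, "body": "Notes from the offsite.",
     "tags": ["strategy", "earnings"]},
]

def search(query, use_tags=False):
    """Return ids of matching docs; optionally consult categorizer tags."""
    hits = []
    for doc in docs:
        in_body = query in doc["body"].lower()
        in_tags = query in (t.lower() for t in doc["tags"])
        if in_body or (use_tags and in_tags):
            hits.append(doc["id"])
    return hits

print(search("earnings"))                 # [] -- author text alone misses both
print(search("earnings", use_tags=True))  # [1, 2] -- tags recover them
```

Body-only matching is the Dewey-shelf-without-a-card-catalog failure mode; folding in categorizer metadata is what the better engines do.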

But back to Sharepoint, which does now have an embedded search option: I’ve heard more than one expert comment that it is unlikely to be the “search” of choice for Sharepoint. That is why we have so many search vendors scrambling to promote their own Sharepoint search. This is probably because the organizing framework around contributing content to Sharepoint is so loosey-goosey that an aggregation of many Sharepoint sites across the organization will be just what we’ve experienced with all these other homegrown systems – a dump full of idiosyncratic organizing tricks.

What you want to do, thoughtfully, is assess whether the search engine you need will search only Sharepoint repositories OR both structured and unstructured repositories across a much larger domain of content types and applications. It will be interesting to evaluate the options that are out there for searching Sharepoint gold mines. Key questions: Is a product targeting only Sharepoint sites or diverse content? How will content across many types of repositories be aggregated and reflected in organized results displays? How will the security models of the various repositories interact with the search engine? Answering these three questions first will quickly narrow your list of candidates for enterprise search.

Mashing-up the Wikipedia Code

Now here’s an interesting tidbit from the BBC, courtesy of my daughter (who’s a graduate student in London): Wikipedia ‘shows CIA page edits.’ It seems that staffers at the CIA, the Democratic National Campaign Committee, the Vatican, and many other well known institutions (who may be trying to remain nameless) have been ‘caught’ sprucing up various Wikipedia articles. (Well of course this is a tart British take on the matter!)
And the secret sauce that pulls back the curtain? Revealed at the end of the article, a simple mashup that links the IP addresses of contributors to an article (obtained through the “history” page) with a directory of organizations owning IP addresses. Both are publicly available. The results are hardly surprising.
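The mashup described above amounts to a simple join: take the IP addresses of anonymous edits from an article’s history page, and match them against a directory of organizations’ published IP ranges. A minimal sketch in Python, where the organization names, ranges, and edit IPs are all made up for illustration:

```python
import ipaddress

# A directory of organizations and their registered IP ranges
# (hypothetical names and RFC 5737 documentation ranges).
org_ranges = {
    "Example Agency": ipaddress.ip_network("198.51.100.0/24"),
    "Example Committee": ipaddress.ip_network("203.0.113.0/24"),
}

# IP addresses harvested from an article's "history" page
# (anonymous edits are recorded under the editor's IP).
anonymous_edit_ips = ["198.51.100.42", "192.0.2.7", "203.0.113.9"]

def attribute(ip_str):
    """Return the organization owning this IP, or None if unknown."""
    ip = ipaddress.ip_address(ip_str)
    for org, net in org_ranges.items():
        if ip in net:
            return org
    return None

for ip in anonymous_edit_ips:
    print(ip, "->", attribute(ip))
```

Both inputs are publicly available, which is the whole point: no special access is needed to “pull back the curtain,” only the willingness to join two open datasets.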
The point is that when information is so widely and freely available, we have to begin to worry about the sources of information and how it is presented. There’s not a lot of anonymity on the public web — and quite possibly this is a good thing. But building community also includes notions of trust, expertise, and terms of reference. For example, when starting eBay, Pierre Omidyar came up with the notion of “rate the buyer” and “rate the seller” as a way of organically building trust within the community of eBayers . . . and the rest is history.
I hate to admit it but perhaps Ronald Reagan said it best: “Trust but verify.” What’s interesting is that mashing up sources and IP addresses provides a whole new dimension to verification. I wonder what else is possible? Let’s start a discussion — comments?

© 2024 The Gilbane Advisor
