Year: 2008

Resources & Opportunity: W3C’s ITS Interest Group

Cross-post from the Globalization blog.
At the end of March, the W3C announced the launch of the Internationalization Tag Set (ITS) Interest Group (IG) as a forum to foster a community of users that promotes the tag set’s adoption and further development. As with Unicode’s CLDR initiative, the emphasis on community interaction and collaboration underscores the ever-increasing, Web-driven impact of cooperative spirit.

As the Web nears its 20th birthday, we would imagine efforts such as the ITS IG continue to be music to the ears of the Web’s inventor and W3C founder, Tim Berners-Lee. This particular interest group is certainly neither the first nor the last of the educational and outreach efforts the W3C has launched since 1994.

It is also neither the first nor the last of the activities from the W3C’s Internationalization (I18n) Activity, known worldwide simply as I18n. The mission? “To ensure that W3C’s formats and protocols are usable worldwide in all languages and in all writing systems.” The goals? Ensure universal access, support the internationalization and localization of documents, and help reduce the time and cost associated with internationalization and localization projects. Consistent and admirable objectives, described eloquently by Richard Ishida, Activity Lead for the I18n Core Working Group, in his article, It’s All About Customer Focus.

I18n’s accomplishments include a treasure trove of information, from specifications and recommendations to educational materials to its newest initiative, hosting the Planet I18n Blog aggregator. Worth checking out; give yourself time to stay a while.

Webinar: Analytics-Driven Web Content

Thursday, May 8, 1:00 pm ET
Customers expect more than a one-size-fits-all web experience. They want “my-size-fits-me” content every time they interact with your company. Or they don’t come back.
In this webinar, marketing managers learn the latest approaches to using knowledge about visitors and behaviors to drive dynamic content delivery. Tony White, Gilbane’s lead analyst for web content management, and Brett Zucker, CTO for Bridgeline Software, discuss emerging technologies for serving up analytics-driven content that attracts customers, engenders loyalty, and improves site ROI. The webinar is sponsored by Bridgeline Software.
Registration is open.

Only Humans Can Ensure the Value of Search in Your Enterprise

While considering what is most important in selecting the search tools for any given enterprise application, I took a few minutes off to look at the New York Times. This article, He Wrote 200,000 Books (but Computers Did Some of the Work), by Noam Cohen, gave me an idea about how to compare Internet search with enterprise search.

A staple of librarians’ reference and research arsenal has been a category of reference material called “bibliographies of bibliographies.” These works, specific to a subject domain, are aimed at a usually scholarly audience to bring a vast amount of content into focus for the researcher. Judging from the article, that is what Mr. Parker’s artificial intelligence is doing for the average person who needs general information about a topic. According to at least one reader, the results are hardly scholarly.

This article points out several things about computerized searching:

  • It does a very good job of finding a lot of information easily.
  • Generalized Internet searching retrieves only publicly accessible, free-for-consumption content.
  • Publicly available content is not universally vetted for accuracy, authoritativeness, trustworthiness, or comprehensiveness, even though it may be all of these things.
  • Vast amounts of accurate, authoritative, trustworthy, and comprehensive content do exist in electronic formats that the search algorithms used by Mr. Parker, or by the rest of us on the Internet, will never see. That is because the content sits behind the firewall or is accessible only through permission (e.g., subscription, need-to-know). None of his published books will serve up that content.

Another concept that librarians and scholars understand is that of primary source material. It is original content, developed (written, recorded) by human beings as a result of thought, new analysis of existing content, bench science, or engineering. It is often judged, vetted, approved or otherwise deemed worthy of the primary source label by peers in the workplace, professional societies or professional publishers of scholarly journals. It is often the substance of what gets republished as secondary and tertiary sources (e.g., review articles, bibliographies, books).

We all need secondary and tertiary sources to do our work, learn new things, and understand our work and our world better. However, advances in technology, business operations, and innovation depend on sharing primary source material in thoughtfully constructed domains within our enterprises, whether businesses, healthcare organizations, or non-profits. A patient’s laboratory results or a mechanical device’s test data that spark the creation of primary source content need surrounding context to be properly understood and assessed for value and relevance.

To be valuable, enterprise search needs to deliver context, relevance, opportunities for analysis and evaluation, and retrieval modes that give the best results for any user seeking valid content. There is a lot that computerized enterprise search can do to facilitate this type of research, but that is not the whole story. There must still be real people who select the most appropriate search product for that enterprise and that defined business case. They must also decide which content should be indexed by the search engine based on its value, what can be secured with proper authentication, how it should be categorized, and so on. To throw a computer search application at any retrieval need without human oversight is a waste of capital. It will result in disappointment, cynicism, and skepticism about the value of automating search, because the resulting output will be no better than Mr. Parker’s books.

Free Globalization Intelligence: Unicode’s CLDR Project

I recently had the pleasure of interviewing Arle Lommel, LISA OSCAR Standards Chair, to discuss the importance of Unicode’s Common Locale Data Repository (CLDR) project, which collects and provides data such as date/time formats, numeric formatting, translated language and country names, and time zone information that is needed to support globalization.

LC: What is the CLDR?
AL: The Common Locale Data Repository is a volunteer-developed and maintained resource coordinated and administered by the Unicode Consortium that is available for free. Its goal is to gather basic linguistic information for various “locales,” essentially combinations of a language and a location, like French in Switzerland.
LC: What does the resource encompass?
AL: CLDR gathers things like lists of language and country names, date formats, time zone names, and so forth. This is critical knowledge when developing projects for the markets represented by specific locales. By drilling down past the language level to the market level, CLDR data is designed to be relevant for a specific area of the world. Think of the difference between U.S. and British English, for example. You would clearly have a problem if British spellings were used in a U.S. project or prices appeared as “£10.54” instead of “$10.54.” Problems like these are very common when product developers don’t think through the implications of their design decisions.
LC: What other issues does CLDR address?
AL: Other problems addressed by CLDR include the numeric form of dates, where something like “04.05.06” could mean “April 5, 2006,” “May 4, 2006,” or even “May 6, 2004,” depending on where you live. Clearly you have to know what people expect.
LC: What is the advantage of using CLDR?
AL: It makes resources available to anyone, at no cost. Without something like the CLDR, one would need to investigate all of these market issues, pay to translate things like country names into each language, and so forth. Activities such as these can add significantly to the cost of a project. The CLDR provides them for free and offers the critical advantage of consistency.
LC: Why should content creators care about the CLDR?
AL: At LISA we have heard time and again that not taking international issues into consideration from a project’s earliest phases doubles the cost of a project and makes it take twice as long. While many issues relate to decisions made by programmers, some of the issues do relate to the job of technical authors and other content creators. While it’s unlikely that a technical writer will need to use a CLDR list of language names in Finnish directly, for instance, the content creator might design an online form in which a user fills out what language he or she would like to be contacted in. If there is insufficient room to display the language name because it is longer in Finnish (a common problem when going from English to Finnish), the end user may have difficulty, something that could have been prevented by the content author if he or she had been given the resources to test the design early on. The CLDR makes the information available that allows authors to prevent basic problems that create issues for users around the world.
LC: How can professionals contribute to the CLDR?
AL: Right now the biggest need of the CLDR is for native (or very good) speakers of non-English languages to (1) supply missing data, and (2) verify that existing data points are correct. Because the CLDR is volunteer driven, people of all levels of competence and ability are able to contribute as much or as little as they want. Unicode welcomes this participation. The real need is for people to know about and use the CLDR. In my experience even the savviest of developers often don’t know about the CLDR and what it contains, so they spend time and money on recreating a resource that they could have for free.
LC: How is LISA supporting CLDR?
AL: We are committed to supporting Unicode and the CLDR, so we have launched an initiative where people who sign up with LISA to contribute to the CLDR and who spend ten or more hours working on the project are eligible to receive individual LISA membership for a year as a token of our appreciation for their contribution. So if any readers have the needed language/locale skills to supply data missing from the CLDR or to review existing data, they can contact me to get started.
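
To make Lommel’s examples concrete, here is a minimal sketch, assuming Python with the third-party Babel library, which packages CLDR data behind a small API. The locale codes and the outputs shown in comments are illustrative and depend on the CLDR version bundled with the installed release.

    from datetime import date

    # Babel ships the kind of CLDR data the interview describes:
    # date patterns, currency symbols, and localized language names.
    from babel import Locale
    from babel.dates import format_date
    from babel.numbers import format_currency

    d = date(2006, 4, 5)

    # The same date renders differently per locale, which is why a numeric
    # form like "04.05.06" is ambiguous without locale information.
    for loc in ("en_US", "en_GB", "de_DE"):
        print(loc, format_date(d, format="short", locale=loc))
    # e.g. en_US -> 4/5/06, en_GB -> 05/04/2006, de_DE -> 05.04.06

    # Currency symbols, decimal separators, and digit grouping also come
    # from CLDR.
    print(format_currency(10.54, "USD", locale="en_US"))  # $10.54
    print(format_currency(10.54, "GBP", locale="en_GB"))  # £10.54

    # Localized language names, e.g. for a "preferred contact language"
    # drop-down: the Finnish locale data shows how long the longest
    # language name is, so a form field can be sized before translation.
    fi = Locale("fi")
    longest = max(fi.languages.values(), key=len)
    print(longest, len(longest))

The point is simply that the locale knowledge Lommel describes is already collected, vetted, and kept consistent in CLDR, so a project can consume it rather than research and translate it from scratch.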

XML In Practice White Papers Now Available

White papers on W3C standards in practice and component content management in practice are now available in the Gilbane white paper library.

Using XML and Databases: W3C Standards in Practice serves as a handy reference guide to the current status of the major XML standards.

Component Content Management in Practice: Meeting the Demands of the Most Complex Content Applications provides an overview of the requirements for technology that manages content at a granular level. To quote the executive summary:

[The paper] compares the requirements of component content management with the capabilities of more general content management technologies, notably web content management and document management. It then looks at the technology behind CCMS in depth, and concludes with example applications where CCMS can have the most impact on an enterprise.

No registration is required to read or download the papers.

Semantic Technologies and our CTO Blog

We host a number of blogs, some more active than others. One of the least active (although it still gets a surprising amount of traffic) has been our CTO blog. However, I am happy to say that Colin Britton started blogging on semantic technologies yesterday. As a co-founder and CTO of Metatomix, he led the development of a commercial product based on RDF – a not very well understood W3C semantic web standard. Colin’s first post on the CTO blog starts a series that will help shed a little more light on semantic technologies and their practical applications.
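
For readers who have not run into RDF before, the core idea is small: facts are expressed as subject-predicate-object triples. Here is a minimal sketch, assuming Python with the rdflib package; the example namespace and resources are invented purely for illustration.

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import FOAF

    # A hypothetical namespace, used here only for illustration.
    EX = Namespace("http://example.org/")

    g = Graph()
    g.bind("foaf", FOAF)
    g.bind("ex", EX)

    # RDF models knowledge as subject-predicate-object triples.
    g.add((EX.colin, FOAF.name, Literal("Colin Britton")))
    g.add((EX.colin, EX.writesAbout, EX.semanticTechnologies))

    # Turtle is one of several serialization syntaxes for the same graph.
    print(g.serialize(format="turtle"))

Because every statement has the same shape, triples from different sources can be merged into one graph and queried together, which is a large part of the practical appeal of semantic technologies.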

Some of you know that I remain skeptical of the new world “Semantic Web” vision, but I do think semantic technologies are important and have a lot to offer, and Colin will help you see why. Check out his first post and let him know what you think about semantic technologies and what you would like to know about.
