The Gilbane Advisor

Curated for content, computing, and digital experience professionals

Webinar: Analytics-Driven Web Content

Thursday, May 8, 1:00 pm ET
Customers expect more than a one-size-fits-all web experience. They want “my-size-fits-me” content every time they interact with your company. Or they don’t come back.
In this webinar, marketing managers learn the latest approaches to using knowledge about visitors and behaviors to drive dynamic content delivery. Tony White, Gilbane’s lead analyst for web content management, and Brett Zucker, CTO for Bridgeline Software, discuss emerging technologies for serving up analytics-driven content that attracts customers, engenders loyalty, and improves site ROI. The webinar is sponsored by Bridgeline Software.
Registration is open.

Only Humans can Ensure the Value of Search in Your Enterprise

While considering what is most important in selecting the search tools for any given enterprise application, I took a few minutes off to look at the New York Times. This article, He Wrote 200,000 Books (but Computers Did Some of the Work), by Noam Cohen, gave me an idea about how to compare Internet search with enterprise search.

A staple of librarians’ reference and research arsenal has been a category of reference material called “bibliographies of bibliographies.” These works, specific to a subject domain, are aimed at a usually scholarly audience to bring a vast amount of content into focus for the researcher. Judging from the article, that is what Mr. Parker’s artificial intelligence is doing for the average person who needs general information about a topic. According to at least one reader, the results are hardly scholarly.

This article points out several things about computerized searching:

  • It does a very good job of finding a lot of information easily.
  • Generalized Internet searching retrieves only publicly accessible, free-for-consumption content.
  • Publicly available content is not universally vetted for accuracy, authoritativeness, trustworthiness, or comprehensiveness, even though it may be all of these things.
  • Vast amounts of accurate, authoritative, trustworthy, and comprehensive content do exist in electronic formats that the search algorithms used by Mr. Parker, or by the rest of us on the Internet, will never see. That is because the content is behind the firewall or accessible only with permission (e.g. subscription, need-to-know). None of his published books will serve up that content.

Another concept that librarians and scholars understand is that of primary source material. It is original content, developed (written, recorded) by human beings as a result of thought, new analysis of existing content, bench science, or engineering. It is often judged, vetted, approved, or otherwise deemed worthy of the primary source label by peers in the workplace, professional societies, or professional publishers of scholarly journals. It is often the substance of what gets republished as secondary and tertiary sources (e.g. review articles, bibliographies, books).

We all need secondary and tertiary sources to do our work, learn new things, and understand our work and our world better. However, advances in technology, business operations, and innovation depend on sharing primary source material in thoughtfully constructed domains within our enterprises, whether businesses, healthcare organizations, or non-profits. Patients' laboratory results or mechanical device test data that spark the creation of primary source content need surrounding context to be properly understood and assessed for value and relevance.

To be valuable, enterprise search needs to deliver context, relevance, opportunities for analysis and evaluation, and retrieval modes that give the best results for any user seeking valid content. There is a lot that computerized enterprise search can do to facilitate this type of research, but that is not the whole story. There must still be real people who select the most appropriate search product for that enterprise and that defined business case. They must also decide which content the search engine should index based on its value, how it can be secured with proper authentication, how it should be categorized, and so on. To throw a computer search application at any retrieval need without human oversight is a waste of capital. It will result in disappointment, cynicism, and skepticism about the value of automating search, because the resulting output will be no better than Mr. Parker's books.

Free Globalization Intelligence: Unicode’s CLDR Project

I recently had the pleasure of interviewing Arle Lommel, LISA OSCAR Standards Chair, to discuss the importance of Unicode’s Common Locale Data Repository (CLDR) project, which collects and provides data such as date/time formats, numeric formatting, translated language and country names, and time zone information that is needed to support globalization.

LC: What is the CLDR?
AL: The Common Locale Data Repository is a volunteer-developed and maintained resource coordinated and administered by the Unicode Consortium that is available for free. Its goal is to gather basic linguistic information for various “locales,” essentially combinations of a language and a location, like French in Switzerland.
LC: What does the resource encompass?
AL: CLDR gathers things like lists of language and country names, date formats, time zone names, and so forth. This is critical knowledge when developing projects for the markets represented by specific locales. Because it drills down past the language level to the market level, CLDR data is designed to be relevant for a specific area of the world. Think of the difference between U.S. and British English, for example. You would clearly have a problem if British spellings were used in a U.S. project or prices appeared as “£10.54” instead of “$10.54.” Problems like these are very common when product developers don’t think through what the implications of their design decisions will be.
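As a small illustration of the market-level differences Lommel describes, here is a sketch using Babel, a Python library built on CLDR data. The library choice and the sample values are my assumptions, not something from the interview.

```python
from babel.numbers import format_currency

# The same amount, formatted with CLDR locale data for two different markets
print(format_currency(10.54, "USD", locale="en_US"))  # e.g. $10.54
print(format_currency(10.54, "GBP", locale="en_GB"))  # e.g. £10.54
```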
LC: What other issues does CLDR address?
AL: Other problems addressed by CLDR include the numeric form of dates, where something like “04.05.06” could mean “April 5, 2006,” “May 4, 2006,” or even “May 6, 2004,” depending on where you live. Clearly you have to know what people expect.
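A sketch of the date ambiguity just described, again assuming the CLDR-backed Babel library; the outputs noted in the comments are approximate short formats for each locale.

```python
from datetime import date
from babel.dates import format_date

d = date(2006, 5, 4)

# The same date, rendered in the "short" style each locale expects
print(format_date(d, format="short", locale="en_US"))  # e.g. 5/4/06
print(format_date(d, format="short", locale="en_GB"))  # e.g. 04/05/2006
print(format_date(d, format="short", locale="de_DE"))  # e.g. 04.05.06
```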
LC: What is the advantage of using CLDR?
AL: It makes resources available to anyone, at no cost. Without something like the CLDR, one would need to investigate all of these market issues, pay to translate things like country names into each language, and so forth. Activities such as this can add significantly to the cost of a project. The CLDR provides them for free and offers the critical advantage of consistency.
LC: Why should content creators care about the CLDR?
AL: At LISA we have heard time and again that not taking international issues into consideration from a project’s earliest phases doubles the cost of a project and makes it take twice as long. While many issues relate to decisions made by programmers, some of the issues do relate to the job of technical authors and other content creators. While it’s unlikely that a technical writer will need to use a CLDR list of language names in Finnish directly, for instance, the content creator might design an online form in which a user fills out what language he or she would like to be contacted in. If there is insufficient room to display the language name because it is longer in Finnish (a common problem when going from English to Finnish), the end user may have difficulty, something that could have been prevented by the content author if he or she had been given the resources to test the design early on. The CLDR makes the information available that allows authors to prevent basic problems that create issues for users around the world.
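To make the form-design concern concrete, here is a hypothetical check of CLDR language display names using Babel; the locales and languages chosen are illustrative assumptions only.

```python
from babel import Locale

# CLDR display names for a few languages, as an English or Finnish speaker would see them
english = Locale.parse("en").languages
finnish = Locale.parse("fi").languages

for code in ("en", "de", "fr", "pt"):
    en_name, fi_name = english[code], finnish[code]
    # A fixed-width form field sized for the English name may clip the Finnish one
    print(f"{code}: {en_name} ({len(en_name)}) vs {fi_name} ({len(fi_name)})")
```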
LC: How can professionals contribute to the CLDR?
AL: Right now the biggest need of the CLDR is for native (or very good) speakers of non-English languages to (1) supply missing data, and (2) verify that existing data points are correct. Because the CLDR is volunteer driven, people of all levels of competence and ability are able to contribute as much or as little as they want. Unicode welcomes this participation. The real need is for people to know about and use the CLDR. In my experience even the savviest of developers often don’t know about the CLDR and what it contains, so they spend time and money on recreating a resource that they could have for free.
LC: How is LISA supporting CLDR?
AL: We are committed to supporting Unicode and the CLDR, so we have launched an initiative where people who sign up with LISA to contribute to the CLDR and who spend ten or more hours working on the project are eligible to receive individual LISA membership for a year as a token of our appreciation for their contribution. So if any readers have the needed language/locale skills to supply data missing from the CLDR or to review existing data, they can contact me to get started.

XML In Practice White Papers Now Available

White papers on W3C standards in practice and component content management in practice are now available in the Gilbane white paper library.

Using XML and Databases: W3C Standards in Practice serves as a handy reference guide to the current status of the major XML standards.

Component Content Management in Practice: Meeting the Demands of the Most Complex Content Applications provides an overview of the requirements for technology that manages content at a granular level. To quote the executive summary:

[The paper] compares the requirements of component content management with the capabilities of more general content management technologies, notably web content management and document management. It then looks at the technology behind CCMS in depth, and concludes with example applications where CCMS can have the most impact on an enterprise.

No registration is required to read or download the papers.

Semantic Technologies and our CTO Blog

We host a number of blogs, some more active than others. One of the least active (although it still gets a surprising amount of traffic) has been our CTO blog. However, I am happy to say that Colin Britton started blogging on semantic technologies yesterday. As a co-founder and CTO of Metatomix he led the development of a commercial product based on RDF – a not very well understood W3C semantic web standard. Colin’s first post on the CTO blog starts a series that will help shed a little more light on semantic technologies and their practical applications.

Some of you know that I remain skeptical of the new world “Semantic Web” vision, but I do think semantic technologies are important and have a lot to offer, and Colin will help you see why. Check out his first post and let him know what you think about semantic technologies and what you would like to know about.

Canto Announces Cumulus 7.5.3 Product Line Tune-Up

Canto announced the immediate availability of Cumulus 7.5.3, a minor update that improves the performance and reliability of the entire Cumulus product line. The company says that Cumulus 7.5.3 runs on the recently released Service Pack 1 for Windows Vista, and that performance and reliability on OS X Leopard have remained stable since the release of version 7.5.2, though a handful of improvements have been made to support Apple's latest OS even better. Canto recommends all customers upgrade to Cumulus 7.5.3, regardless of operating system, to benefit from global fixes and improvements. Customers on active service agreements can download the update free of charge from Canto's Customer Portal. The Cumulus product line was last updated in December 2007 with the release of Cumulus 7.5.2. http://www.canto.com/

Introduction to Semantic Technology

Ten years ago I believed that a metadata approach to managing enterprise information was a valid way to go. The various structures, relationships, and complexities of IT systems led to disjointed information. By relating information elements to each other, rather than trying to synchronize the information itself, we _might_ stand a chance.

At the same time a new set of standards was emerging: standards to describe, relate, and query a new information model based on metadata. These became known as the Semantic Web, outlined in a Scientific American article (http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21) in 2001.

Fast forward to 2008 – where are we with this vision? Part of me is thrilled, another part disappointed. These standards and this approach are now in use in everyday information management situations. Major software companies and startups alike are implementing Semantic Technology in their offerings and products. However, I am disappointed that we still find it hard to communicate what this semantic technology means and how valuable it is. Most technologists I meet glaze over at the mention of the Semantic Web or any of its standards, yet when asked whether they think RSS is significant, they praise its contributions.

Over a series of posts to this blog, I would like to try to explain, share, and show some of the value of Semantic Technology and why one should be looking at it.

Let’s start with what Semantic Technology is and what standards define its openness. To quote Wikipedia: “In software, semantic technology encodes meanings separately from data and content files, and separately from application code.” This abstraction is a core tenet of, and value provided by, a semantic approach to information management. The idea that our database or programming patterns do not restrict the form or boundaries of our information is a large shift from traditional IT solutions. The idea that our business logic should be tied neither to the code that implements it nor to the information it operates on is made possible through this semantic representation. So, first, ABSTRACTION is a key characteristic.

The benefit of this is that systems, machines, solutions (whatever term you wish to use) can interact with each other: share, understand, and reason without having been explicitly programmed to understand each other.

With this you can better manage CHANGE. Your content and systems can evolve or change, with the changes managed through the Semantic Technology layer.

So what makes up Semantic Technology? One sees the word in a number of solutions and technologies, but are they all created equal?

In my view, Semantic Technology can only truly claim to be so if it is based on and implements the standards laid out through the World Wide Web Consortium (W3C) standards process: http://www.w3.org/2001/sw/

The vision of the Semantic Web and the standards required to support it continue to expand, but the anchor standards have been laid out for a while.

RDF – The model and syntax for describing information. It is important to understand that the RDF standards define multiple things: the model (or data model), the syntax (how it is written/serialized), and the formal semantics (the logic described by the use of RDF). In 2004, the original RDF specification was revised and published as six separate documents, each covering an important area of the standard.
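To make the model/syntax distinction concrete, here is a minimal sketch using the Python rdflib library; the namespace, resource, and property names are assumptions for illustration, not part of any particular product.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

# A hypothetical namespace for product information
EX = Namespace("http://example.org/product/")

g = Graph()
g.bind("ex", EX)

# Three triples (subject, predicate, object) describing one resource
g.add((EX.widget42, RDF.type, EX.Product))
g.add((EX.widget42, EX.name, Literal("Widget 42")))
g.add((EX.widget42, EX.listPrice, Literal("10.54")))

# The same abstract graph can be written out in several syntaxes; Turtle is one
print(g.serialize(format="turtle"))
```

The graph of triples is the model; Turtle (shown here) and RDF/XML are simply alternative serializations of the same statements.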

RDF-S – Provides a typing system for RDF and the basic constructs for expressing ontologies and relationships within the metadata structure.

OWL – To quote the W3C paper, this facilitates greater machine interpretability of Web content than that supported by XML, RDF, and RDF-S by providing additional vocabulary along with a formal semantics.
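Continuing the earlier sketch, RDF-S and OWL statements are themselves just additional triples layered over the data. The class hierarchy and the sameAs link below are assumed purely for illustration.

```python
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import OWL, RDF, RDFS

EX = Namespace("http://example.org/product/")
g = Graph()

# RDF-S: a lightweight typing layer over plain RDF
g.add((EX.Product, RDF.type, RDFS.Class))
g.add((EX.Accessory, RDF.type, RDFS.Class))
g.add((EX.Accessory, RDFS.subClassOf, EX.Product))  # every Accessory is a Product
g.add((EX.name, RDFS.domain, EX.Product))           # ex:name describes Products

# OWL: richer vocabulary, e.g. asserting that two identifiers denote the same thing
g.add((EX.widget42, OWL.sameAs, URIRef("http://partner.example.com/sku/42")))
```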

SPARQL – While everyone with a Semantic Technology solution invented their own query language (why was there never one in the first place?), SPARQL, pronounced “sparkle,” is the W3C standardization of one. It is HUGE for Semantic Technology and makes all the effort with the other three standards worthwhile.
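Here is a small, assumed example of that standard query language at work over the same sort of graph, again via rdflib; the data is hypothetical.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/product/")

g = Graph()
g.add((EX.widget42, RDF.type, EX.Product))
g.add((EX.widget42, EX.name, Literal("Widget 42")))

# Find the names of everything typed as a Product
query = """
    PREFIX ex: <http://example.org/product/>
    SELECT ?name
    WHERE {
        ?item a ex:Product ;
              ex:name ?name .
    }
"""
for row in g.query(query):
    print(row.name)  # -> Widget 42
```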

These standards are quite a pile to sift through, and understanding the capabilities embodied in them takes significant effort, but it is the role of technologists in this arena to remove the need for you to understand them. It is our job to provide tools, solutions, and capabilities that leverage these standards, bringing semantic technology to life and delivering the power defined within them.

But that is the subject of another post. So what does this all mean in real life? In my next post I will lay out a concrete example using product information.

 
