Over the past few weeks, since publishing Smart Content in the Enterprise, I’ve had several fascinating lunchtime conversations with colleagues concerned about content technologies. Our exchanges wind up with a familiar refrain that goes something like this. “Geoffrey, you have great insights about smart content but what am I supposed to do with all this information?” Ah, it’s the damning with faint praise gambit that often signals an analysis paralysis conundrum for decision-making.

Let me make one thing perfectly clear -- I do not have an out-of-the-box prescription for a solution. It’s not simply a matter of focusing on your customer experience, optimizing your content for search, investing in a component content management platform, or adopting DITA – although, depending on the situation, I may recommend some combination of these items as part of a smart content strategy.

For me, smart content remains a work in progress. I expect to develop the prescriptive road map in the months ahead. Here’s a quick take on where I am right now.

  • For publishers, it’s all about transforming the publishing paradigm through content enrichment – defining the appropriate level of granularity and then adding the semantic metadata for automated processing.
  • For application developers, it’s all about getting the information architecture right and ensuring that it’s extensible. There needs to be sensible storage, the right editing and management tools, multiple methods for organizing content, as well as a flexible rendering and production environment.
  • For business leaders and decision makers, there needs to be an upfront investment in the right set of content technologies that will increase profits, reduce operating costs, and mitigate risks. No, I am not talking about rocket science. But you do need a technology strategy and a business plan.

As highlighted by the case studies included in the report, I can point to multiple examples where organizations have done the right things to produce notable results. Dale and I will continue the smart content discussions at the Gilbane Boston conference right after Thanksgiving, both through our preconference workshop, and at a conference session “Smart Content in the Real World: Case Studies and Real Results.”

We are also launching a Smart Content Readiness Service, where we will engage with organizations on a consulting basis to identify:

  • The business drivers where smart content will ensure competitive advantage when distributing business information to customers and stakeholders
  • The technologies, tools, and skills required to componentize content, and target distribution to various audiences using multiple devices
  • The operational roles and governance needed to support smart content development and deployment across an organization
  • The implementation planning strategies and challenges to upgrade content creation and delivery environments

Please contact me if you are interested in learning more.

In short, to answer my lunchtime colleagues, I cannot (yet) prescribe a fully baked solution. It’s too early for the recipes and the cookbook. But I do believe that the business opportunities and benefits are readily at hand. At this point, I would invite you to join the discussion by letting me know what you expect, what approaches you’ve tried, where you’ve wound up, what you think needs to come next – and how we might help you.

The Pull of Content Value

Traditionally, publishing is a pushy process. When I have something to say, I write it down. Perhaps I revise it, check with colleagues, and verify my facts with appropriate authorities. Then I publish it, and move on to the next thing – without directly interacting with my audience and stakeholders. Whether I distribute the content electronically or in a hard copy format, I leave it to my readers to determine the value of whatever I publish.

However, as we describe in our recently completed report Smart Content in the Enterprise, XML applications can transform this conventional publishing paradigm. By smart content, we mean content that is granular at the appropriate level, semantically rich, useful across applications, and meaningful for collaborative interaction.

From a business perspective, smart content adds value to published information in new and compelling ways. Let’s consider the experiences of NetApp and Warrior Gateway, two of the organizations featured in our report.

NetApp
As a provider of storage and data management solutions, NetApp has invested a lot of time and effort embracing DITA and restructuring its technical documentation. By systematically tagging and managing content components, and by focusing on the underlying content development processes, writers and editors can keep up with the pace of product releases.

But there is more to this publishing process orientation. Beyond simply producing product information faster and cheaper, NetApp is poised to make publishing better. The company can now easily support its reseller partners by providing them with the DITA tagged content that they can directly incorporate into their own OEM solutions. Resellers' customers get just the information they need, directly from the source. With its XML application, NetApp incorporates its partners and stakeholders into its information value chain.

Warrior Gateway
As a content aggregator, Warrior Gateway collects, organizes, enriches, and redistributes content about a wide range of health, welfare, and veteran-related services to soldiers, veterans, and their families. Rather than simply compiling an online catalog of service providers’ listings, Warrior Gateway restructures the content that government, military, and local organizations produce, and enriches it by adding veteran-related categories and other information. Furthermore, Warrior Gateway adds a social dimension by encouraging contributions from veterans and family members.

Once stored within the XML application powering Warrior Gateway, the content is easily reorganized and reclassified to provide the veterans’ perspective about areas of interest and importance. Volunteers working with Warrior Gateway can add new categories when necessary. Service providers can claim their profile and improve their own data details. Even public users can contribute content to the gateway, a crowdsourcing strategy to efficiently collect feedback from users. With contributions from multiple stakeholders, the published listings can be enriched over time without requiring a large internal staff to add the extra information.

Capturing New Business Value
There’s a lot more detail about how the XML applications work in our case studies – I recommend that you check them out.

What I find intriguing is the range of promising and potentially profitable business models engendered by smart content.  Enterprise publishers have new options and can go beyond simply pushing content through a publishing process. Now they can build on their investments, and capture the pull of content value.

Authoring in a structured text environment has traditionally been done with dedicated structured editors. These tools enable validation and user assisted markup features that help the user create complete and valid content. But these structured editors are somewhat complicated and unusual and require training in their use for the user to become proficient. The learning curve is not very steep but it does exist.

Many organizations have come to see documentation departments as a process bottleneck and try to engage others throughout the enterprise in the content creation and review processes. Engineers and developers can contribute to documentation and have a unique technical perspective. Installation and support personnel are on the front lines and have unique insight into how the product and related documentation are used. Telephone operators not only need the information at their fingertips, but can also augment it with comments and ideas that occur while supporting users. Third-party partners and reviewers may also have a unique perspective and role to play in a distributed, collaborative content creation, management, review, and delivery ecosystem.

Our recently completed research on XML Smart Content in the Enterprise indicates that as we strive to move content creation and management out of the documentation department silo, we will also need to consider how the data is encoded and the usefulness of the data model in meeting our expanded business requirements. Smart content is multipurpose content designed with several uses in mind. Smart content is modular to support being assembled in a variety of forms. And smart content is structured content that has been enriched with semantic information to better identify its topic and role to aid processing and searching. For these reasons, smart content also improves distributed collaboration. Let me elaborate.

One of the challenges for distributed collaboration is the infrequency of user participation and, therefore, unfamiliarity with structured editing tools. It makes sense to simplify the editing process and tools for infrequent users. They can't always take a refresher course in the editor and its features. They may be working remotely, even on a customer site installing equipment or software. These infrequent users need structured editing tools that are designed for them. These collaboration tools need to be intuitive and easy to figure out, easily accessible from just about anywhere, and affordable, with flexible licensing that allows a larger number of users to participate in the management of the content. This usually means one of two things: either the editor will be a plug-in to a popular word processing system (e.g., MS Word), or it will be accessed through a thin-client browser, like a wiki editor. In some environments, both may be needed in addition to traditional structured editing tools. Smart content modularity and enrichment allow a variety of editing tools and flexibility in process design, thereby expanding who can collaborate from throughout the enterprise.

Also, infrequent contributors may not be able to master navigating and operating within a complex repository and workflow environment, for the same reasons of unfamiliarity. Serving up information to a remote collaborator might be enhanced with keywords and other metadata designed to optimize searching and access to the content. Even a little metadata can provide a lot of simplicity to an infrequent user. Product codes, version information, and a couple of dates would allow a user to home in on the likely content topics and select content to edit from a well-targeted list of search results. Relationships between content modules that are indicated in metadata can alert a user that when one object is updated, other related objects may need to be reviewed for potential update as well.
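To make the idea concrete, here is a minimal sketch in Python of how a handful of metadata fields could narrow a search and flag related modules for review. Everything in it is an assumption for illustration: the `Module` fields, the sample records, and the function names are hypothetical and do not describe any particular repository or product.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Module:
    """A content module with a few illustrative metadata fields (hypothetical schema)."""
    module_id: str
    product_code: str
    version: str
    updated: date
    related: list = field(default_factory=list)  # ids of modules this one points to


# Hypothetical sample records, invented for this sketch.
MODULES = [
    Module("install-guide", "PX100", "2.0", date(2010, 9, 1), ["release-notes"]),
    Module("release-notes", "PX100", "2.0", date(2010, 10, 15)),
    Module("admin-guide", "PX100", "1.5", date(2009, 5, 1), ["install-guide"]),
    Module("quick-start", "QZ200", "1.0", date(2010, 8, 1)),
]


def find_modules(modules, product_code, version=None, updated_since=None):
    """Return modules matching the metadata filters, most recently updated first."""
    hits = []
    for m in modules:
        if m.product_code != product_code:
            continue
        if version is not None and m.version != version:
            continue
        if updated_since is not None and m.updated < updated_since:
            continue
        hits.append(m)
    return sorted(hits, key=lambda m: m.updated, reverse=True)


def review_candidates(modules, changed_id):
    """Return ids of modules linked to or from the changed module, sorted by id."""
    by_id = {m.module_id: m for m in modules}
    linked = set(by_id[changed_id].related)
    linked |= {m.module_id for m in modules if changed_id in m.related}
    linked.discard(changed_id)
    return sorted(linked)
```

With the sample records, filtering on product code "PX100" and version "2.0" leaves two modules to choose from, and updating "install-guide" flags both "admin-guide" and "release-notes" for review. The point is the small size of the metadata set, not the specific fields chosen here.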

It is becoming increasingly clear that there is no one model for XML or smart content creation and editing. Just as a carpenter may have several saws, each designed for a particular type of cut, a robust structured environment for smart content may have more than one editor in use. It behooves us to design our systems and tools to meet the desired business processes and user functionality, rather than limit our processes to the features of one tool.

One of the conclusions of our report Smart Content in the Enterprise (forthcoming next week) is that a little bit of enrichment goes a long way. It’s important to build on your XML infrastructure, enrich your content a little bit (to the extent that your business environment is able to support), and expect to iterate over time.

Consider what happened at Citrix, reported in our case study Optimizing the Customer Experience at Citrix: Restructuring Documentation and Training for Web Delivery. The company had adopted DITA for structured publishing several years ago. Yet just repurposing the content in product manuals for print and electronic distribution, and publishing the same information as HTML and PDF documents, did not change the customer experience.

A few years ago, Citrix information specialists had a key insight: customers expected to find support information by googling the web. To be sure, there was a lot of content about various Citrix products out in cyberspace, but very little of it came directly from Citrix. Consequently the most popular solutions available via web-wide searching were not always reliable, and the detailed information from Citrix (buried in their own manuals) was rarely found.

What did Citrix do? Despite limited resources, the documentation group began to add search metadata to the product manuals. With DITA, there was already a predefined structure for topics, used to define sections, chapters, and manuals. Authors and editors could simply include additional tagged metadata that identified and classified the contents – and thus expose the information to Google and other web-wide search engines.
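As a rough illustration of what adding tagged search metadata to a topic can look like, the Python sketch below inserts keyword entries into a DITA topic's prolog using only the standard library. It is not Citrix's actual process or tooling: the sample topic, the stop-word list, and the function names are hypothetical, and the starting keywords are simply derived from the topic title.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample topic, invented for this sketch.
SAMPLE_TOPIC = """<topic id="install-agent">
  <title>Installing the Desktop Agent</title>
  <body><p>Run the installer and follow the prompts.</p></body>
</topic>"""

STOP_WORDS = {"the", "a", "an", "of", "and", "for", "to"}


def keywords_from_title(title):
    """Derive starter keywords from a heading by dropping common stop words."""
    words = [w.strip(".,:;").lower() for w in title.split()]
    return [w for w in words if w and w not in STOP_WORDS]


def add_keywords(topic_xml, extra=()):
    """Return the topic with keywords added under prolog/metadata/keywords."""
    topic = ET.fromstring(topic_xml)
    terms = keywords_from_title(topic.findtext("title", default="")) + list(extra)

    prolog = topic.find("prolog")
    if prolog is None:
        # Place the new prolog just before the body (or last if there is no body).
        prolog = ET.Element("prolog")
        children = list(topic)
        body = topic.find("body")
        index = children.index(body) if body is not None else len(children)
        topic.insert(index, prolog)

    metadata = prolog.find("metadata")
    if metadata is None:
        metadata = ET.SubElement(prolog, "metadata")
    keywords = metadata.find("keywords")
    if keywords is None:
        keywords = ET.SubElement(metadata, "keywords")

    existing = {k.text for k in keywords.findall("keyword")}
    for term in terms:
        if term not in existing:  # skip duplicates so reruns are harmless
            ET.SubElement(keywords, "keyword").text = term
            existing.add(term)
    return ET.tostring(topic, encoding="unicode")
```

Calling `add_keywords(SAMPLE_TOPIC, extra=("exampleproduct",))` yields a topic whose prolog carries the title-derived keywords plus the extra term. How those keywords reach a web search engine depends on the publishing pipeline that renders the topic to HTML, which this sketch leaves out.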

Nor was there a lot of time or many resources for up-front design and detailed analysis. To paraphrase a perceptive information architect we interviewed, “Getting started was a lot like throwing the stuff against a wall to see what sticks.” At first tags simply summarized existing chapter and section headings. Significantly, this was a good enough place to start.

Specifically, once Citrix was able to join the online conversation with its customers, it was also able to begin tracking popular search terms. Then over time and with successive product releases, the documentation group was able to add additional tagged metadata and provide ever more focused (and granular) content components.
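One way to picture that feedback loop is to compare the terms people search for against the keywords already tagged. The Python sketch below is illustrative only, since the report does not describe how Citrix analyzed its search data; the function name and the sample queries are hypothetical.

```python
from collections import Counter


def untagged_popular_terms(queries, tagged_keywords, top_n=3):
    """Return the most frequent query terms that are not yet used as keywords.

    Results are (term, count) pairs, ordered by count and then alphabetically.
    """
    counts = Counter()
    for query in queries:
        counts.update(query.lower().split())
    tagged = {k.lower() for k in tagged_keywords}
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(term, n) for term, n in ranked if term not in tagged][:top_n]


# Hypothetical search log and existing tags, invented for this sketch.
queries = ["printer mapping", "printer driver error", "license server", "printer license"]
candidates = untagged_popular_terms(queries, ["license"], top_n=2)
```

Here "printer" surfaces as the most searched term with no matching tag, which would make it a reasonable candidate for the next round of enrichment.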

What does this mean for developing smart content and leveraging the benefits of XML tagging? Certainly the more precise your content enrichment, the more findable your information is going to be. When considering the business benefits of search engine optimization, the quality of your tagging can always improve over time. But as a simple value proposition, getting started is the critical first step.

If you've been reading our recent posts on Gilbane's new research on XML adoption, you might be wondering how to get the report in advance of its availability from Gilbane later this month.

Smart Content in the Enterprise: How Next Generation XML Applications Deliver New Value to Multiple Stakeholders is currently offered by several of the study sponsors: IBM, JustSystems, MarkLogic, MindTouch, Ovitas, Quark, and SDL.

We'll also be discussing our research in real time during a webinar hosted by SDL on November 4. Look for details within the next few weeks.

If you're only reading this XML blog, be sure to check out my recent blog post Focusing on Smart Content, which I published in the main Gilbane blog.

We've published a new paper on addressing large-scale integration, storage, and access of complex information. As Dale mentions in his entry over on our main blog, the paper frames the discussion in terms of challenges to Open Government initiatives. We note, though, that the exploration of obstacles to effective, efficient processing of high volumes of data and content is relevant across many industries.

We're cross-posting here on the XML blog because the paper deals with XML content and the XML family of standards, including XQuery and XPath.

The Gilbane Beacon is available as a free download from Gilbane and from Mark Logic, sponsor of the paper.

The growth in web-centric communication has created a major focus on content management, web content management, component content management, and so on. This interest is driven primarily by increasing demand for rich, interactive, accessible information products delivered via the Web. The focus is not misplaced but may be missing part of the point. To be specific, in our focus on the "management" part of CM, we may be missing the first word in the phrase... "Content."

It's true that the application of increasing amounts of computer and brain power to the processes associated with preparing and delivering the kind of information demanded by today's users can improve those products. But it does so within limits set by, and at costs generated by, the content "raw material" it gets from the content providers. In many cases, the content available to web product development processes is so structurally crude that it requires major clean-up and enhancement in order to adequately participate in the classification and delivery process. As the focus on elegant Web delivery increases, barring real changes in the condition of this raw content, the cost of enhancement is likely to grow proportionally, straining the involved organizations' ability to support it.

The answer may be in an increased focus on the processes and tools used to create the original content. We know that the original creator of most content knows the most about how it should be logically structured and the most about the best way to classify it for search and retrieval. Trouble is, in most cases, we provide no means of capturing what the creator knows about his or her intellectual product. Moreover, because many creators have never been able to fully populate the metadata needed to classify and deliver their content, in past eras, professional catalogers were employed to complete this final step. In today's world, however, we have virtually eliminated the cataloger, assuming instead that the prodigious computer power available to us could develop the needed classification and structure from the content itself. That approach can and does work, but it will require better raw material if it is to achieve the level of effectiveness needed to keep the Web from becoming a virtual haystack in which finding the needle is more good luck than good measure. Native XML editors in place of today's visually oriented word processors; spreadsheets, graphics, and other media forms with content-specific XML under them; increased use of native XML databases; and a host of rich content-centric resources are all part of this content evolution.

Most important, however, may be promulgation of the realization across society that creating content includes more than just making it look good on the screen, and that the creator shares in that responsibility. This won't be an easy or quick process, requiring more likely generations than years, but if we don't begin soon, we may end up with a Web 3 or 4 or 5.0 trying to deliver content that isn't even yet 1.0.

As the world of search becomes more and more sophisticated (and that process has been underway for decades), we may be approaching the limits of software's capacity to improve its ability to find what a searcher wants. If that is true, and I suspect that it is, we will finally be forced to follow the trail of crumbs up the content life cycle... to its source. Indeed, most of the challenges inherent in today's search strategy and products appear to grow from the fact that while we continually increase our demands for intelligence on the back end, we have done little if anything to address the chaos that exists on the front end. You name it: different word processing formats, spreadsheets, HTML tagged text, database delimited files, and so on are all dumped into what we think of as a coherent, easily searchable body of intellectual property. It isn't, and it isn't likely to become so any time soon unless we address the source. Having spent some time in the library automation world, I can remember the sometimes bitter controversies over having just two major foundations for cataloging source material (Dewey and LC; add a third if you include the NICEM A/V scheme). Had we known back then that the process of finding intellectual property would devolve into the chaos we now confront, with every search engine and database product essentially rolling its own approach to rational search, we would have considered ourselves blessed. In the end, it seems, we must begin to see the source material, its physical formats, its logical organization and its inclusion of rational cataloging and taxonomy elements as the conceptual raw material for its own location.

As long as the word processing world teaches that anyone creating anything can make it look like it should in a dozen different ways, ignoring any semblance of finding-aid inclusion, we probably won't have a truly workable ability to find what we want without reworking the content or wading through a haystack of misses to find our desired hits. Unfortunately, the solutions of yesteryear, including after-creation cataloging by a professional cataloger, probably won't work now either, for cost if for no other reason. We will be forced to approach the creators of valuable content, asking them for a minimum of preparation for searching their product, and providing the necessary software tools to make that possible. We can't act too soon because, despite the growth of software elegance and raw computer power, this situation will likely get worse as the sheer volume of valuable content grows.

Regards, Barry

Read more: Enterprise Search Practice Blog: http://gilbane.com/search_blog/

In a world that seems increasingly about technology itself, it has become tempting to assume that the questions and challenges of new and better information products are about the technology. While it is true that technology is the key enabler of the new information world we are building, it is also true that the decision making and judgment involved in how that technology is to be organized and deployed is of equal--and not decreasing--importance. Indeed, as the products move toward increasing sophistication and flexibility--smart content you might say--the human and organizational parts of the information life cycle become even more important.

It is a truism that you cannot deliver information products you can't create and manage, and with the circle of participants in that creation and management ever widening, we must be sensitive to the limits of the creators. Moreover, while just "getting it up on the web" used to be at least sufficient to justify deployment of information products, today's information consumer has a much more extensive and demanding list of features required before he will accept web-based information. The publisher who forgets or ignores that list is in for trouble.

In a half-day session preceding the Gilbane conference next week, the Gilbane consulting team will tackle some of the real world challenges inherent in this rapidly changing information world, providing both sign posts for issues likely to come up and "in the trenches" suggestions for how to deal with them. The goal of the session, scheduled for the afternoon of December 1, is that the attendees leave with a better handle on how to proceed in the quest for better information products and the role "smart content" should play.

The presenters, in addition to their expertise in the technology and tools of information, bring a unique resource to their efforts: years of design, implementation and evaluation of real organizations facing real challenges.
