Recently in DITA - Darwin Information Typing Architecture Category

Over the past few weeks, since publishing Smart Content in the Enterprise, I’ve had several fascinating lunchtime conversations with colleagues concerned about content technologies. Our exchanges wind up with a familiar refrain that goes something like this. “Geoffrey, you have great insights about smart content but what am I supposed to do with all this information?” Ah, it’s the damning with faint praise gambit that often signals an analysis paralysis conundrum for decision-making.

Let me make one thing perfectly clear -- I do not have an out-of-the-box prescription for a solution. It’s not simply a matter of focusing on your customer experience, optimizing your content for search, investing in a component content management platform, or adopting DITA – although, depending on the situation, I may recommend some combination of these items as part of a smart content strategy.

For me, smart content remains a work in progress. I expect to develop the prescriptive road map in the months ahead. Here’s a quick take on where I am right now.

  • For publishers, it’s all about transforming the publishing paradigm through content enrichment – defining the appropriate level of granularity and then adding the semantic metadata for automated processing.
  • For application developers, it’s all about getting the information architecture right and ensuring that it’s extensible. There needs to be sensible storage, the right editing and management tools, multiple methods for organizing content, as well as a flexible rendering and production environment.
  • For business leaders and decision makers, there needs to be an upfront investment in the right set of content technologies that will increase profits, reduce operating costs, and mitigate risks. No, I am not talking about rocket science. But you do need a technology strategy and a business plan.

As highlighted by the case studies included in the report, I can point to multiple examples where organizations have done the right things to produce notable results. Dale and I will continue the smart content discussions at the Gilbane Boston conference right after Thanksgiving, both through our preconference workshop, and at a conference session “Smart Content in the Real World: Case Studies and Real Results.”

We are also launching a Smart Content Readiness Service, where we will engage with organizations on a consulting basis to identify:

  • The business drivers where smart content will ensure competitive advantage when distributing business information to customers and stakeholders
  • The technologies, tools, and skills required to componentize content and target distribution to various audiences using multiple devices
  • The operational roles and governance needed to support smart content development and deployment across an organization
  • The implementation planning strategies and challenges to upgrade content creation and delivery environments

Please contact me if you are interested in learning more.

In short, to answer my lunchtime colleagues, I cannot (yet) prescribe a fully baked solution. It’s too early for the recipes and the cookbook. But I do believe that the business opportunities and benefits are readily at hand. At this point, I would invite you to join the discussion by letting me know what you expect, what approaches you’ve tried, where you’ve wound up, what you think needs to come next – and how we might help you.

The Pull of Content Value

Traditionally, publishing is a pushy process. When I have something to say, I write it down. Perhaps I revise it, check with colleagues, and verify my facts with appropriate authorities. Then I publish it, and move on to the next thing – without directly interacting with my audience and stakeholders. Whether I distribute the content electronically or in a hard copy format, I leave it to my readers to determine the value of whatever I publish.

However, as we describe in our recently completed report Smart Content in the Enterprise, XML applications can transform this conventional publishing paradigm. By smart content, we mean content that is granular at the appropriate level, semantically rich, useful across applications, and meaningful for collaborative interaction.

From a business perspective, smart content adds value to published information in new and compelling ways. Let’s consider the experiences of NetApp and Warrior Gateway, two of the organizations featured in our report.

NetApp
As a provider of storage and data management solutions, NetApp has invested a lot of time and effort embracing DITA and restructuring its technical documentation. By systematically tagging and managing content components, and by focusing on the underlying content development processes, writers and editors can keep up with the pace of product releases.

But there is more to this publishing process orientation. Beyond simply producing product information faster and cheaper, NetApp is poised to make publishing better. The company can now easily support its reseller partners by providing them with the DITA tagged content that they can directly incorporate into their own OEM solutions. Resellers' customers get just the information they need, directly from the source. With its XML application, NetApp incorporates its partners and stakeholders into its information value chain.

Warrior Gateway
As a content aggregator, Warrior Gateway collects, organizes, enriches, and redistributes content about a wide range of health, welfare, and veteran-related services to soldiers, veterans, and their families. Rather than simply compiling an online catalog of service providers’ listings, Warrior Gateway restructures the content that government, military, and local organizations produce, and enriches it by adding veteran-related categories and other information. Furthermore, Warrior Gateway adds a social dimension by encouraging contributions from veterans and family members.

Once stored within the XML application powering Warrior Gateway, the content is easily reorganized and reclassified to provide the veterans’ perspective about areas of interest and importance. Volunteers working with Warrior Gateway can add new categories when necessary. Service providers can claim their profiles and improve their own data details. Even public users can contribute content to the gateway, a crowdsourcing strategy for efficiently collecting feedback. With contributions from multiple stakeholders, the published listings can be enriched over time without requiring a large internal staff to add the extra information.

Capturing New Business Value
There’s a lot more detail about how the XML applications work in our case studies – I recommend that you check them out.

What I find intriguing is the range of promising and potentially profitable business models engendered by smart content.  Enterprise publishers have new options and can go beyond simply pushing content through a publishing process. Now they can build on their investments, and capture the pull of content value.

One of the conclusions of our report Smart Content in the Enterprise (forthcoming next week) is how a little bit of enrichment goes a long way. It’s important to build on your XML infrastructure, enrich your content a little bit (to the extent that your business environment is able to support), and expect to iterate over time.

Consider what happened at Citrix, reported in our case study Optimizing the Customer Experience at Citrix: Restructuring Documentation and Training for Web Delivery. The company had adopted DITA for structured publishing several years ago. Yet just repurposing the content in product manuals for print and electronic distribution, and publishing the same information as HTML and PDF documents, did not change the customer experience.

A few years ago, Citrix information specialists had a key insight: customers expected to find support information by googling the web. To be sure, there was a lot of content about various Citrix products out in cyberspace, but very little of it came directly from Citrix. Consequently the most popular solutions available via web-wide searching were not always reliable, and the detailed information from Citrix (buried in their own manuals) was rarely found.

What did Citrix do? Despite limited resources, the documentation group began to add search metadata to the product manuals. With DITA, there was already a predefined structure for topics, used to define sections, chapters, and manuals. Authors and editors could simply include additional tagged metadata that identified and classified the contents – and thus expose the information to Google and other web-wide search engines.
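To make the idea concrete, here is a minimal sketch of what adding search keywords to a DITA topic might look like. It is not Citrix's actual tooling, which our case study does not describe at this level. The topic id, title, and keyword terms are hypothetical, and the helper name `add_search_keywords` is my own. The element names (`prolog`, `metadata`, `keywords`, `keyword`) come from the DITA standard; whether those keywords reach a search engine depends on how your publishing pipeline renders them into the delivered pages.

```python
import xml.etree.ElementTree as ET


def add_search_keywords(topic_xml, keywords):
    """Return the topic XML with each term recorded as a <keyword> in the prolog."""
    topic = ET.fromstring(topic_xml)

    prolog = topic.find("prolog")
    if prolog is None:
        prolog = ET.Element("prolog")
        # In a DITA topic the prolog precedes the body, so insert it there.
        children = list(topic)
        body = topic.find("body")
        index = children.index(body) if body is not None else len(children)
        topic.insert(index, prolog)

    metadata = prolog.find("metadata")
    if metadata is None:
        metadata = ET.SubElement(prolog, "metadata")

    container = metadata.find("keywords")
    if container is None:
        container = ET.SubElement(metadata, "keywords")

    # Skip terms that are already present so repeated runs stay idempotent.
    existing = {kw.text for kw in container.findall("keyword")}
    for term in keywords:
        if term not in existing:
            ET.SubElement(container, "keyword").text = term
            existing.add(term)

    return ET.tostring(topic, encoding="unicode")


# Hypothetical topic, standing in for a section of a product manual.
sample_topic = (
    '<topic id="install-overview">'
    "<title>Installing the product</title>"
    "<body><p>Installation steps go here.</p></body>"
    "</topic>"
)

enriched = add_search_keywords(sample_topic, ["installation", "setup"])
print(enriched)
```

Because the function skips terms that are already present, it can be rerun with each release as new popular search terms emerge, which fits the iterate-over-time approach described here.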

Nor was there a lot of time or many resources for up-front design and detailed analysis. To paraphrase a perceptive information architect we interviewed, “Getting started was a lot like throwing the stuff against a wall to see what sticks.” At first tags simply summarized existing chapter and section headings. Significantly, this was a good enough place to start.

Specifically, once Citrix was able to join the online conversation with its customers, it was also able to begin tracking popular search terms. Then over time and with successive product releases, the documentation group was able to add additional tagged metadata and provide ever more focused (and granular) content components.

What does this mean for developing smart content and leveraging the benefits of XML tagging? Certainly the more precise your content enrichment, the more findable your information is going to be. When considering the business benefits of search engine optimization, the quality of your tagging can always improve over time. But as a simple value proposition, getting started is the critical first step.

If you're only reading this XML blog, be sure to check out my recent blog post Focusing on Smart Content, which I published in the main Gilbane blog.

As part of next week's Gilbane Boston Conference, the XML practice will be delivering a pre-conference workshop, "Managing Smart Content: How to Deploy XML Technologies across Your Organization." The instructors will be Geoff Bock, Dale Waldt, Bill Trippe, Barry Schaeffer and Neal Hannon--a group of experts that represents decades of technical and management experience on XML initiatives.

A tip of the virtual hat to Senior Analyst Geoff Bock for organizing this.

I recently wrote a short Gilbane Spotlight article for the EMC XML community site about the state of Iowa going paperless (article can be found here) with regard to its Administrative Code publication. It got me thinking, "When is a book no longer a book?"

Originally the admin code was produced as a 10,000-page loose-leaf publication service containing all the regulations of the state. For the last 10 years it has also appeared on the Web as PDFs of pages and, more recently, as independent data chunks in HTML. Now the state has discontinued the commercial printing of the loose-leaf version and relies only on the electronic versions to inform the public. It still produces PDF pages that resemble the printed volumes; these are intended for local printing of select sections by public users of the information. But the electronic HTML version is being enhanced to improve reusability of the content, present it in alternative forms, integrate it with related materials, and so on. Think mashups and improved search capabilities. The content is managed in an XML-based Single Source Publishing system that produces all output forms.

I have migrated many, many printed publications to XML SSP platforms. Most follow the same evolutionary path regarding how the information is delivered to consumers. First they are printed. Then a second electronic copy is produced simultaneously with the print using separate production processes. Then the data is organized in a single database and reformatted to allow editing that can produce both print and electronic output. Eventually the data gets enhanced and possibly broken into chunks to better enable reusing the content, but print is still a viable output format. Later, the print is discontinued as the subscription list falls and the print product is no longer feasible. Or the electronic version is so much better that people stop buying the print version.

So back to the original question: when is it no longer a book? Is it when you stop printing pages? Or when you stop producing the content in page-oriented PDFs? Or does it have to do with how you manage and store the information?

Other changes take place in how the information is edited, formatted, and stored that might influence the answer to the question. For instance, if the content is still managed as a series of flat files, like chapters, and assembled for print, it seems to me that it is still a book, especially if it still contains content that is very book oriented, like tables of contents and other front matter, indexes, and even page numbers. Eventually, the content may be reorganized as logical chunks stored in a database, extracted for one or more output formats and organized appropriately for each delivery version, as in SSP systems. Print artifacts like TOCs may be completely generated and not stored as persistent objects, or they can be created and managed as build lists or maps (like with DITA). As long as one version is still book-like, IMHO it is still a book.
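As a rough illustration of a generated rather than stored TOC, here is a minimal sketch that derives a table of contents from a DITA-style map at build time. The map content, file names, and the `build_toc` helper are hypothetical; `map`, `topicref`, `href`, and `navtitle` are names from the DITA standard. A real SSP system would do far more, but the point is that the TOC exists only as output.

```python
import xml.etree.ElementTree as ET

# Hypothetical map; the hrefs and titles are made up for illustration.
SAMPLE_MAP = """
<map>
  <title>Administrative Code</title>
  <topicref href="chapter1.dita" navtitle="General Provisions">
    <topicref href="chapter1-definitions.dita" navtitle="Definitions"/>
  </topicref>
  <topicref href="chapter2.dita" navtitle="Licensing"/>
</map>
"""


def build_toc(node, depth=0):
    """Walk nested topicrefs and return indented TOC lines, one per topic."""
    lines = []
    for ref in node.findall("topicref"):
        # Fall back to the file reference when no navigation title is given.
        label = ref.get("navtitle") or ref.get("href", "")
        lines.append("  " * depth + label)
        lines.extend(build_toc(ref, depth + 1))
    return lines


toc = build_toc(ET.fromstring(SAMPLE_MAP))
print("\n".join(toc))
```

Reordering or removing a `topicref` changes the generated TOC on the next build, with no separate front-matter file to keep in sync.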

I would posit that once the printed versions are discontinued, and all electronic versions no longer contain print-specific artifacts, then maybe this is no longer a book, but simply content.

Holiday weeks can be sleepy weeks in enterprise software news, but this week has seen one significant press release each day in the XML content management market, or component content management (CCM) market if you prefer.

First, the necessary disclosures and caveats. Of the six companies mentioned, we've worked with all of them, I believe, and I actually worked for XyEnterprise back in the 1980s and early 1990s. That said, each of these announcements is significant.

SDL, through both organic growth and acquisition, has grown into a substantial business that spans globalization technology, globalization services, CCM technology, and WCM technology. My colleagues Mary Laplante and Leonor Ciarlone know them much better as a company, but I believe it is safe to say that SDL is in a unique position spanning essentially four markets, but four markets that make a great deal of sense under a single umbrella. The product support content managed in a CCM technology is the best point of integration for globalization/translation tools. A CCM technology is also an excellent underpinning for a global company's web presence or web presences (the latter more likely, especially when one considers the need for localized web sites). And services are an essential piece of this puzzle. It's the rare company that staffs heavily for localization, and even when they do, very few would staff full time to cover all of their language needs. Is SDL in a position to represent one-stop shopping for large companies with complex product content that needs to be localized into many languages? Again, my colleagues could answer that question more precisely, but it's not a crazy question to ask.

Mary has more on SDL XySoft over in the globalization blog.

The acquisition also breathes new life into XyEnterprise, a company with highly functional, mature technology and excellent executive leadership. We take it as a very positive sign that XyEnterprise CEO Kevin Duffy will become the CEO of the newly combined business unit, reporting to Mark Lancaster, Chairman and CEO of SDL.

The Really Strategies acquisition of DocZone is on a smaller scale of course, but it is significant in that these two companies represent two leading trends in the CCM marketplace--management of component content in native XML repositories (MarkLogic Server for RSuite and Documentum Content Store for one version of DocZone) and Software as a Service (SaaS). Count me among those who have been skeptical at times about SaaS for CCM, but DocZone, under Dan Dube's leadership, has made it work. Really Strategies, in the meantime, has developed an impressive CCM offering on top of MarkLogic Server, and they have quietly built up a strong customer list. We think the combined companies complement each other, and the new management team is excellent, with Barry Bealer as CEO, co-founder Lisa Bos as CTO, Ann Michael in charge of services, and Dan Dube as VP Sales and Marketing.

Which brings us to Quark and EMC. Both companies have been developing more CCM capabilities. EMC acquired X-Hive, and a lot of XML expertise along with it. They have since added more XML expertise on both the product management and engineering side. As they have integrated X-Hive into the Documentum platform, they have logically looked to build out more capabilities and applications for vertical markets. The integration with Quark XML Author makes perfect sense for them, giving their customers and prospects a ready mechanism for XML authoring in a familiar editorial tool.

For Quark's part, the move is a logical and very positive next step. They had previously announced this kind of integration with IBM Content Manager, which has a strong presence in the manufacturing space. With EMC, Quark now has a strong partner in the pharma space. Documentum has long dominated pharma, and Quark XML Author, under Michael Boses and previous owner In.Vision, had built up a long list of pharma customers. Boses and his team know the pharma data structures inside and out, and it will be interesting to see the details of how Quark XML Author will integrate with Documentum and its storage mechanisms. (I am sure both EMC and Quark see the potential as more than just the pharma market--government is also a good target here--but the pharma angle will be fruitful I am sure.)

So, what news is on tap for tomorrow?

As part of our Gilbane Onsite Technology Strategy Workshop Series, we are happy to announce a new workshop, Implementing DITA.

Course Description

DITA, the Darwin Information Typing Architecture, is an emerging standard for content creation, management, and distribution. How does DITA differ from other XML applications? Will it work for my vertical industry’s content, whether that is technical documentation, training manuals, scientific papers, or statutory publishing? DITA addresses one of the most challenging aspects of XML implementation: developing a data model that can be used and shared with information partners. Even so, DITA implementation requires effective process, software, and content management strategies to achieve the benefits promised by the DITA business case: cost-effective, reusable content. This seminar will familiarize you with DITA concepts and terminology and describe the business benefits, implementation challenges, and best practices for adopting DITA. It will also explore how DITA enables key business processes, including content management, formatting & publishing, multi-lingual localization, and reusable open content. Attendees will be able to participate in developing an effective DITA content management strategy.

Audience

This is an introductory course suitable for anyone looking to better understand the DITA standard, terminology, processes, benefits, and best practices. A basic understanding of computer processing applications and production processes is helpful. Familiarity with XML concepts and publishing is helpful, but not required. No programming experience is required.

Topics Covered

  • The Business Drivers for DITA Adoption

  • DITA Concepts and Terminology

  • The DITA Content Model

  • Organizing Content with DITA Maps

  • Processing, Storing & Publishing DITA Content

  • DITA Creation, Management & Processing Tools

  • Multi-lingual Publishing with DITA

  • Extending DITA to work with Other Data Standards

  • Best Practices & Pitfalls for DITA Implementation

For more information and to customize a workshop just for your organization, please contact Ralph Marto by email or at +617.497.9443 x117

Traditionally, the idea of structured content has always been associated with product documentation, but this is beginning to change. Featuring Bill Trippe, Lead Analyst at The Gilbane Group, and Bruce Sharpe, XMetaL Founding Technologist at JustSystems, a brand new podcast on The Business Value of Structured Content takes a look into why many companies are beginning to realize that structured content is more than just a technology for product documentation - it's a means to add business value to information across the whole enterprise. 

Applied to departmental assets such as marketing website content, sales training materials, or technical support documents, structured content can be used to grow revenue, reduce costs, and mitigate risks, ultimately leading to an improved customer experience.

Listen to the podcast and gain important insight on how structured content can

  • break through the boundaries of product documentation
  • help organizations meet high user expectations for when and where they can access content
  • prove to be especially valuable in our rough economic times
  • ...and more!

Boston-Area DITA Users Group

Robert D Anderson from IBM, Chief Architect of the DITA Open Toolkit, writes:

Hello,

This note is to announce that after some time off, the Boston area DITA
Users Group will be starting up again in 2009. To get things started, we
have created a new Yahoo group, so that we will be in sync with and
searchable by users of the many other Yahoo DITA lists. If you are
interested in joining the DITA Boston Users Group, please visit this page
for sign-up info.

We will soon be sending a survey to that list with proposed meeting topics,
so please sign up in order to help us decide what to feature. We will also
be looking for companies willing to host a meeting; if you already know you
are interested in hosting, please join the group and send a note to
ditabug-owner (which will go to me as well as to Liz Augustine and Lee Anne
Kowalski).
