The Gilbane Report: Volume 9, Number 7

Editorial Interfaces & Enterprise-enabled Content

September 2001

We have defined enterprise content management as “content management that goes beyond Web publishing to manage all enterprise content for all enterprise applications.” Enterprise content management cannot be accomplished by a content management system alone. Enterprise content management requires content with rich metadata and flexible business rules that can be accessed and used by many different types of users and applications. The applications and processes used for creating such enterprise-enabled content are critical, and deserve careful consideration.

As organizations implement enterprise content management they recognize quickly that authoring and editorial processes that rely primarily on HTML collaboration and presentation tools are inadequate. Tools that may work well for web publishing or single-purpose applications quickly break down when faced with application and information integration requirements. How do you create and maintain enterprise-enabled content? XML editors are certainly one option, but many users need more traditional word processing access or highly customized interfaces to specific content components and metadata. Companies are unlikely to find a single approach that will meet all their user requirements. This month Bill and David help you think through the various approaches to creating enterprise-enabled content.

Editorial Interfaces & Enterprise-enabled Content

Today, there’s little question that structured content is an essential requirement for content management. As our last report discussed, content management and XML are part and parcel of the new enterprise infrastructure. Content is drawn from a variety of internal and external sources, must be kept up to date, often needs to address various audiences effectively, and must be displayed and distributed through a growing number of formats and devices. The integration of content management systems into enterprise information portals, personalization servers, localization and translation applications, integrated business applications, and content security platforms (to name just some applications) demands that the content being managed have intelligence applied to it. We refer to such content as enterprise-enabled content; that is, content that can be accessed and utilized by multiple types of users and applications. Enterprise-enabled content has business rules and other metadata associated with it.

But if structured content is a requirement, does it follow that “structured editing” must go with it? And does “structured editing” mean XML editors for every user? So far, the answer from the marketplace seems to be no. HTML forms-based interfaces dominate, and if content management vendors see a “next phase” for the editorial interface, they seem to be leaning toward tighter integration of Microsoft Word. Some hope, of course, that the market will turn, en masse, to XML editing tools. But our analysis suggests that likely won’t happen anytime soon. The state of the art in editorial interfaces is a mixed bag, and likely will remain so—with some XML editing, some forms editing, some integration of tools like Microsoft Word, and an assortment of specialized tools for conversion, pre-tagging, and post-tagging of content as it goes in and out of content management repositories.

Underneath this mix of approaches, a useful question to ask is: “Who adds the structure and intelligence to the content, and how should it be added?” Not surprisingly, the answer is neither simple nor easy. But answering this question will force a company to understand its own culture and its business processes in important ways and help expose the technical requirements and the tools the enterprise might use.

Editorial Tools & Content Management Systems

For many enterprises, today’s Web content comes to IT producers already formed, from such diverse sources as technical documentation groups, HR staff, outside content providers and syndicators, and marketing departments. Up until fairly recently, the “Web team” has been narrowly focused on “getting the content up” and, within some organizations, enhancing or customizing the look and feel of the content. This narrow focus runs in opposition to the increasing trend within enterprises to create and interact directly with mission-critical Web content, such as with enterprise information portals. As enterprise content moves closer to the Web interface, enterprises need a tighter integration of editorial tools within their Web content management systems. Enterprises must balance their current main emphasis on delivery of Web-focused content with tools for content creation, collaboration, and shaping that allow more intelligence to be captured from the start.

In the early days of the Web, everyone who contributed to a site needed to be a savvy HTML user, or hand off material to someone who was. As more and more people in an organization contribute and update content, the need increases for simple-to-learn forms and tools that allow contributors to concentrate on the material they are creating, and not on the intricacies of how that content should be tagged for Web presentation. It's also worth noting that direct entry of content in HTML has other significant drawbacks, not the least of which are inconsistent formats and the replication and propagation of differing versions of content and data. In short, HTML-based content can become a data management nightmare.

Table 1. Characteristics of web content types

             | Static Pages     | Dynamically Served Pages               | Enterprise Enabled Content
Content      | Undifferentiated | Chunking                               | Processable
Metadata     | <META> tag       | <META> with application interpretation | Multipurpose
Presentation | HTML             | HTML                                   | HTML, XHTML, XML
Behavior     | HTML+Scripting   | Templates                              | Servlets
Data Model   | None             | Relational (typically)                 | Schema

What Kind of Content? What Kind of Contributor? What Kind of Editorial Interface?

When it comes to content management systems and editorial interfaces, there are several fundamental issues to consider, such as content volume and complexity and the circumstances in which content is created, imported, and edited. For example, some enterprises may have only relatively small amounts of simple and slowly changing content that gets presented to a monolithic audience, while other enterprises must provide complex, voluminous, and frequently changing content to a variety of audiences, including the localization of the content across languages and countries. If an enterprise requires personalization, security, or digital rights management, it needs sufficiently detailed tagging to associate business rules and other metadata with the content.

Intelligent tagging of content is more effective and efficient when done earlier in the editorial process, closer to the knowledge worker who knows the elemental structure of the content matter and its business value and audiences. This immediately raises basic cultural and technical questions for an enterprise:

  • Do you want your knowledge workers tagging content at a fine-grain level (read: XML)?
  • If you do want your knowledge workers doing fine-grain tagging, how far are you willing to go to both enforce structure and support the knowledge worker in the task?

In other words, do we all need to learn XML tagging, or are there other methods and tools that can be brought to bear that will result in the content being sufficiently structured and tagged?

Range of Content Creators, Range of Content Actions

Even in the instances where the content being managed for an enterprise is relatively simple, there are many individuals within the enterprise who may interact with the content management system, and these individuals have different roles and bring different tasks to the content. Consider the varying needs of the original contributor, the manager reviewing the content, the executive approving the final version, and the person encoding the content in HTML and posting it on the site. Indeed, as the content of the enterprise becomes more complex, and the volume of content increases, so too do the roles and potential tasks associated with the content.

Activities that may affect the same content over its lifecycle and require adding intelligence include:

  • Content importing
  • Content authoring/creating
  • Content editing
  • XML tagging
  • Review/approval
  • Workflow
  • Version control
  • Localization
  • Updating
  • Multiple platform transformation
  • Personalization
  • Digital rights management
  • Syndication

HTML tools are adequate only for managing the simplest, most static, and lowest-volume Web content, and even then they require other workflow applications for tracking revisions and updates. FrontPage, Dreamweaver, and HomeSite remain useful and popular for HTML editing and for designing display templates, but they serve only a limited range of delivery channels. They are not appropriate for managing enterprise-enabled content.

The following table suggests the range of authoring options that should be provided to users. It's based on how much they contribute to the enterprise content, and how much flexibility is typically appropriate for authoring.

Table 2. User types and tool requirements

Type of User | Preferred CM Authoring Tools
Occasional Contributor | Easy-to-use Web forms with embedded workflow.
Knowledge Worker/Business Line Manager | Word processor or other office application integrated with content management application; structured editing interface to content management; Web forms as above.
Power User | All of the above, plus HTML and XML authoring.
IT Developer/Administrator | All of the above, plus HTML and XML authoring, and access to content management and authoring tool APIs for further customization and integration.

Different Editorial Interfaces, Different Content Management Solutions

While developments in editorial interfaces for content management systems lag behind most other aspects of these systems, it is already clear that there are several kinds of strategies for dealing with editorial issues. The strategies best for your situation depend upon the nature of both your content and your business needs.

Most content management tools focus on metadata capture through form- and template-based strategies in the hope of providing accessible and workable interfaces for creation of content. Others try to simplify the complex editorial processes of implementing structured content by using common office applications, either by building XML tagging capabilities onto Word, such as HyperVision’s WorX (www.hvltd.com), or as a tightly integrated postprocessor, such as Inera’s eXtyles (www.inera.com). Most (if not all) content management systems manage Word documents to some degree, with some doing a better job than others in terms of “chunking” and inferring the structure of the Word content. Others rely on at least being able to index the Word document with some relevant metadata. Microsoft Word also can serve as the front end behind which another program enables XML or other object-level tagging. J.D. Edwards’ (www.jdedwards.com) Enterprise Content Manager, for example, uses Word as its front end, managing componentized Word files in its single-source repository and using the components to build output for multiple uses. And then there are the well-established SGML/XML editors, such as Arbortext’s Epic Editor (www.arbortext.com), for those who recognize that more pain up front yields better and more useful results on the back end.

Breaking content into components that are managed within a content management system is effective not only for single-source publishing, but can also provide measurable ROI for applications such as translating and managing multilingual content. Adding structural tagging and business rules tagging to the content can maximize its usefulness through improved search and retrieval, personalization, and support for business requirements such as compliance and security. As content management becomes more central to the operation of the enterprise, these capabilities become critical.
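As a rough illustration of what componentized, tagged content makes possible, the following Python sketch shows two different applications selecting from the same source by metadata alone. The element names, attribute names, and values are invented for this example and are not drawn from any particular product or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical component markup: every element and attribute name here
# is an assumption made for illustration.
SOURCE = """
<document id="benefits-2001">
  <component id="c1" audience="all" lang="en" security="public">
    <title>Health plan overview</title>
  </component>
  <component id="c2" audience="manager" lang="en" security="internal">
    <title>Approving leave requests</title>
  </component>
  <component id="c3" audience="all" lang="fr" security="public">
    <title>Plan d'assurance</title>
  </component>
</document>
""".strip()


def select_components(root, **rules):
    """Return the ids of components whose attributes match every rule."""
    return [
        component.get("id")
        for component in root.iter("component")
        if all(component.get(name) == value for name, value in rules.items())
    ]


root = ET.fromstring(SOURCE)

# A public Web site and a translation workflow query the same source.
public_english = select_components(root, lang="en", security="public")
french_components = select_components(root, lang="fr")
```

Neither query needs to know how the components will be presented; the metadata carries enough information for each application to make its own selection.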

When it comes to considering editorial interfaces to or for content management systems, organizations have a number of options:

  1. Forms-based interfaces have become de rigueur for content management systems. This is mainly because the CMS applications are often based on relational repositories, and the shortest distance between a thin client and a relational database is an HTML form. The bulk of CMS applications are reliant on this kind of interface as the primary means of entering and updating content and assigning metadata at a coarse level.
  2. Word processing applications such as Microsoft Word can be used to create content to be stored in its native form, or saved out as HTML or other neutral format. The challenge, of course, is integrating the proprietary structures in Word with the more neutral structures the content management system stores—either relational databases, some sort of object storage, or XML. The level of granularity in tagging with this option can vary greatly; while Word itself only applies styles at a paragraph level, some editorial interfaces based on Word can impose XML- or object-level granularity.
  3. HTML editors can be used to create and edit content that is stored as HTML, or smaller, relatively discrete “chunks” of content that can be mapped from the HTML to the underlying data structures. This approach is inefficient where content needs to be finely grained or must meet a high standard of quality.
  4. Full-blown XML editing tools can be used to create and edit all or some of the content, and can also be used to interact with metadata. This approach provides the strongest capability for finely grained tagging of content, but also can be more expensive in terms of software seats, training, and support.
  5. Pre-processing and post-processing tools can be used. For example, proprietary formats such as Microsoft Word and QuarkXPress can be “debinarized” and then run through filters on their way into the CMS, and then reverse processes can be performed on the way out. The level of tag granularity depends on the quality and capabilities of the filters and varies greatly.
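The filter idea in options 2 and 5 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the word processor file has already been reduced to (paragraph style, text) pairs, and the style-to-element mapping below is hypothetical. Real filters must also handle character-level formatting, tables, and embedded objects.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from paragraph style names to neutral element names.
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Body Text": "para",
    "Note": "note",
}


def paragraphs_to_xml(paragraphs):
    """Inbound filter: turn (style, text) pairs into an XML string."""
    root = ET.Element("article")
    for style, text in paragraphs:
        element = ET.SubElement(root, STYLE_TO_ELEMENT.get(style, "para"))
        if style not in STYLE_TO_ELEMENT:
            # Keep the name of an unmapped style so it can be restored later.
            element.set("style", style)
        element.text = text
    return ET.tostring(root, encoding="unicode")


def xml_to_paragraphs(xml_text):
    """Outbound filter: rebuild the (style, text) pairs from the XML."""
    element_to_style = {name: style for style, name in STYLE_TO_ELEMENT.items()}
    return [
        (element.get("style") or element_to_style[element.tag], element.text)
        for element in ET.fromstring(xml_text)
    ]


paragraphs = [
    ("Heading 1", "Expense policy"),
    ("Body Text", "Submit receipts within 30 days."),
    ("Sidebar", "Ask your manager first."),
]
xml_text = paragraphs_to_xml(paragraphs)
restored = xml_to_paragraphs(xml_text)
```

Because Word applies styles only at the paragraph level, a mapping like this can be no finer than the paragraph. Anything more granular depends on what the filter can infer or on what the author tags explicitly.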

Organizations that have implemented CMS applications are typically using a combination of such approaches, with forms-based interfaces leading the way. But each of the content creation/editing approaches has its strengths and weaknesses.

In fact, some editorial interfaces are more appropriate for some types of business situations than others. Some content management tools focus on metadata-capture through form- and template-based strategies in the hope of providing accessible and workable interfaces for creation of content. Others try to simplify the complex editorial processes of implementing structured content by using common office applications.

Content Editing is an Application

The editorial interface to a content repository is, in fact, an application, or even a series of applications. This is, in effect, what the forms-based interface is, but it is often a haphazard application, a loose coupling at best.

The installed CM systems of today are replete with complex and multifaceted forms for content entry and updating. Indeed, first- and second-generation CMS projects were plagued with difficulties, first in establishing such interfaces and later in maintaining them. The problem has been that if a CM customer implemented an editorial interface and later needed to change the underlying data structures, the enterprise was likely left having to heavily modify or completely rewrite the editorial interface. This is especially true of CM applications that use relational databases as the underlying data store. This is not an approach that supports the goals of enterprise-enabled content.

One of the advantages of XML is that it makes such modifications easier. If the underlying data store is a relational database, and the interface is a heavily programmed form, a change to the underlying data structure is a complex, programming-heavy undertaking. If the underlying data structure is XML, a change to the underlying data structure typically means modifying the DTD and running some simple process so that the XML editor parses the text according to the revised DTD (which is usually a menu selection in a commercial XML editor).
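The contrast can be illustrated with a small Python sketch. The Python standard library does not validate against a DTD, so a plain dictionary stands in for one here; the document type and element names are hypothetical. The point is only that the structure lives in data, not in the code of the editing interface.

```python
import xml.etree.ElementTree as ET

# Stand-in for a DTD: each document type lists its required child
# elements, in order. All names are assumptions made for illustration.
CONTENT_MODEL = {
    "press-release": ["headline", "dateline", "body"],
}


def validate(xml_text, model):
    """Return a list of problems; an empty list means the document conforms."""
    root = ET.fromstring(xml_text)
    expected = model.get(root.tag)
    if expected is None:
        return [f"unknown document type: {root.tag}"]
    actual = [child.tag for child in root]
    if actual != expected:
        return [f"expected children {expected}, found {actual}"]
    return []


doc = (
    "<press-release><headline>New product</headline>"
    "<dateline>Boston</dateline><body>Text</body></press-release>"
)
problems_before = validate(doc, CONTENT_MODEL)

# Changing the structure means changing the model, not the validator.
revised_model = {"press-release": ["headline", "dateline", "contact", "body"]}
problems_after = validate(doc, revised_model)
```

After the revision, the existing document is reported as missing the new element, and the validation code itself is unchanged. A heavily programmed form over relational tables would typically need its own code edited to achieve the same result.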

This is not to say that all editorial interfaces must be XML editors. Indeed, the nature of enterprise content management is that the underlying data structures will be a combination of relational, XML, and other types. There will also be all manner of content in terms of length, value, and shelf life. The editorial interface(s) should then be appropriate to:

  1. The type of content—both the data type and the length.
  2. The type of user—from occasional contributor to IT administrator.
  3. The point in the lifecycle of the content—initial creation through editing and updating.
  4. The shelf life of the content.
  5. The level of the tagging granularity required.

Such considerations would encourage implementers of CMS systems to consider and support a variety of potential interfaces. Several examples follow:

  • A regular contributor to an XML database of complex and lengthy documents would be well served with a tightly integrated XML editor. But an ad hoc contributor to that same database should be given a simple-to-use and foolproof tool that exposes precisely the content this contributor needs to edit, and validates the content before returning it to the repository. This ad hoc user could also be supported by a workflow process that forwards his or her revised content to a more skilled editorial user to ensure the content was entered or updated correctly.
  • Corporate users of an Intranet that supports a small number of simple document types could use a set of Microsoft Word templates, with the Word files then being processed through a tool that normalizes the files into the format required by the CM repository. When the documents need to be modified, reverse processes could reconstruct the Word files for further updating and editing.
  • Metadata is likely a combination of XML, relational, and other data types. By its nature, metadata is often structured, discrete, and relatively short in length. It also may include fixed values, choice groups (“yes” or “no,” or “X,” “Y,” or “Z”), and other data types that lend themselves to structured interfaces and enforced validation. On the low end of the spectrum, an author contributing a Microsoft Word file could be required to fill out a simple form or even be required to fill out the “Property Sheet” embedded within Word (File Menu… Properties). In a more complex environment, a knowledge worker could be provided with an XML editing tool as an interface to the required metadata. The GUI of a commercial XML editor such as Arbortext Epic can be configured to behave like a forms-based interface, while in fact capturing and storing XML data.
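The enforced validation described in the last example might look something like the following Python sketch. The field names and permitted values are assumptions chosen to mirror the choice groups mentioned above; an actual implementation would take its rules from the repository's schema.

```python
# Hypothetical metadata rules: required fields and choice groups.
METADATA_RULES = {
    "title": {"required": True},
    "confidential": {"required": True, "choices": {"yes", "no"}},
    "region": {"required": False, "choices": {"X", "Y", "Z"}},
}


def check_metadata(record, rules=METADATA_RULES):
    """Return a list of error messages; an empty list means the record passes."""
    errors = []
    for field, rule in rules.items():
        value = record.get(field, "").strip()
        if not value:
            if rule.get("required"):
                errors.append(f"{field}: required")
            continue
        choices = rule.get("choices")
        if choices and value not in choices:
            errors.append(f"{field}: must be one of {sorted(choices)}")
    # Reject fields the rules do not know about, so typos are caught early.
    for field in record:
        if field not in rules:
            errors.append(f"{field}: not a recognized field")
    return errors


accepted = check_metadata({"title": "Q3 plan", "confidential": "no"})
rejected = check_metadata({"title": "", "confidential": "maybe", "owner": "hr"})
```

The same rules could sit behind a Web form, a Word property sheet import, or an XML editor configured to look like a form, so every interface enforces the same constraints before content returns to the repository.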

This analysis suggests, and we think realistically, that not all content will be XML or will have to be XML. However, enterprise-enabled content needs to be componentized and structured, and needs to be accessible through a variety of editorial interfaces. These interfaces are best thought of as applications. The reality is that a mix of data types will continue to exist, and the editorial interfaces to such data must deal with this reality. The requirements analysis and design undertaken as you roll out enterprise content solutions are becoming more complex but also more critical. The analysis should be clear-eyed, realistic, and thorough. The resulting interfaces should be sturdy, useful, and self-revealing. The goals should include data quality and integrity, but also productivity. The business of content management is critical to many organizations; putting the right tools in the hands of the content creators may be the best money an organization can spend in the months ahead.

-- Bill Trippe, David R. Guenette

The Gilbane Report is published by Bluebill Advisors, Inc. © 1993 - 2005 The Gilbane Report. All Rights Reserved.