The Gilbane Report: Volume 9, Number 7
Editorial Interfaces & Enterprise-enabled Content
September 2001
We have defined enterprise
content management as content management that goes beyond Web publishing
to manage all enterprise content for all enterprise applications. Enterprise
content management cannot be accomplished by a content management system alone.
Enterprise content management requires content with rich metadata and flexible
business rules that can be accessed and used by many different types of users
and applications. The applications and processes used for creating such enterprise-enabled
content are critical, and deserve careful consideration.
As organizations implement
enterprise content management they recognize quickly that authoring and editorial
processes that rely primarily on HTML collaboration and presentation tools are
inadequate. Tools that may work well for web publishing or single-purpose applications
quickly break down when faced with application and information integration requirements.
How do you create and maintain enterprise-enabled content? XML editors are certainly
one option, but many users need more traditional word processing access or highly
customized interfaces to specific content components and metadata. Companies
are unlikely to find a single approach that will meet all their user requirements.
This month Bill and David help you think through the various approaches to creating
enterprise-enabled content.
Editorial Interfaces &
Enterprise-enabled Content
Today, there's little
question about structured content being an essential requirement for content
management. As our last report discussed, content management and XML are part
and parcel of the new enterprise infrastructure. Content is drawn from a variety
of internal and external sources, must be kept up to date, often needs to effectively
address various audiences, and be displayed and distributed through a growing
number of formats and devices. The integration of content management systems
into enterprise information portals, personalization servers, localization and
translation applications, integrated business applications, and content security
platforms (to name just some applications) demands that the content
being managed has intelligence applied to it. We refer to such content as enterprise-enabled
content; that is, content that can be accessed and utilized by multiple
types of users and applications. Enterprise-enabled content has business rules
and other metadata associated with it.
But if structured content
is a requirement, does it follow that structured editing must go
with it? And does structured editing mean XML editors for every
user? So far, the answer from the marketplace seems to be no. HTML forms-based
interfaces dominate, and if content management vendors see a next phase
for the editorial interface, they seem to be leaning toward tighter integration
of Microsoft Word. Some hope, of course, that the market will turn, en masse,
to XML editing tools. But our analysis suggests that likely won't happen
anytime soon. The state of the art in editorial interfaces is a mixed bag, and
likely will remain so, with some XML editing, some forms editing, some integration
of tools like Microsoft Word, and an assortment of specialized tools for conversion,
pre-tagging, and post-tagging of content as it goes in and out of content management
repositories.
Underneath this mix of approaches,
a useful question to ask is: Who adds the structure and intelligence to
the content, and how should it be added? Not surprisingly, the answer
is neither simple nor easy. But answering this question will force a company
to understand its own culture and its business processes in important ways and
help expose the technical requirements and the tools the enterprise might use.
Editorial Tools & Content
Management Systems
For many enterprises, today's
Web content comes to IT producers already formed, from such diverse sources
as technical documentation groups, HR staff, outside content providers and syndicators,
and marketing departments. Up until fairly recently, the Web team
has been narrowly focused on getting the content up and, within
some organizations, enhancing or customizing the look and feel of the content.
This narrow focus runs in opposition to the increasing trend within enterprises
to create and interact directly with mission-critical Web content, such as with
enterprise information portals. As enterprise content moves closer to the Web
interface, enterprises need a tighter integration of editorial tools within
their Web content management systems. Enterprises must balance their current
main emphasis on delivery of Web-focused content with tools for content creation,
collaboration, and shaping that allow more intelligence to be captured from
the start.
In the early days of the
Web, everyone who contributed to a site needed to be a savvy HTML user, or hand
off material to someone who was. As more and more people in an organization
contribute and update content, the need increases for simple-to-learn forms
and tools that allow contributors to concentrate on the material they are creating,
and not on the intricacies of how that content should be tagged for Web presentation.
It's also worth noting that direct entry of content in HTML has other significant
drawbacks, not the least of which are inconsistent formats and the replication
and propagation of differing versions of content and data. In short, HTML-based
content can become a data management nightmare.
Table 1. Characteristics of web content types

| | Static Pages | Dynamically Served Pages | Enterprise Enabled Content |
|---|---|---|---|
| Content | Undifferentiated | Chunking | Processable |
| Metadata | <META> tag | <META> with application interpretation | Multipurpose |
| Presentation | HTML | HTML | HTML, XHTML, XML |
| Behavior | HTML+Scripting | Templates | Servlets |
| Data Model | None | Relational (typically) | Schema |
What Kind of Content?
What Kind of Contributor? What Kind of Editorial Interface?
For content management systems and their editorial interfaces,
there are several fundamental issues to consider, such as content volume and
complexity and the circumstances in which content is created, imported, and
edited. For example, some enterprises may have only relatively small amounts
of simple and slowly changing content that gets presented to a monolithic
audience, while other enterprises must provide complex, voluminous, and frequently
changing content to a variety of audiences, including the localization of
the content across languages and countries. If an enterprise requires personalization,
security, or digital rights management, it needs sufficiently detailed tagging
to associate business rules and other metadata with the content.
Intelligent tagging of content is more effective and efficient when done earlier
in the editorial process, closer to the knowledge worker who knows the elemental
structure of the content matter and its business value and audiences. This
immediately raises basic cultural and technical questions for an enterprise:
- Do you want your knowledge
workers tagging content at a fine-grain level (read: XML)?
- If you do want your
knowledge workers doing fine-grain tagging, how far are you willing to go
to both enforce structure and support the knowledge worker in the task?
In other words, do we
all need to learn XML tagging, or are there other methods and tools that can
be brought to bear that will result in the content being sufficiently structured
and tagged?
Range of Content Creators,
Range of Content Actions
Even in the instances
where the content being managed for an enterprise is relatively simple, there
are many individuals within the enterprise who may interact with the content
management system, and these individuals have different roles and bring different
tasks to the content. Consider the varying needs of the original contributor,
the manager reviewing the content, the executive approving the final version,
and the person encoding the content in HTML and posting it on the site. Indeed,
as the content of the enterprise becomes more complex, and the volume of content
increases, so too do the roles and potential tasks associated with the content.
Activities that may affect
the same content over its lifecycle and require adding intelligence include:
- Content importing
- Content authoring/creating
- Content editing
- XML tagging
- Review/approval
- Workflow
- Version control
- Localization
- Updating
- Multiple platform transformation
- Personalization
- Digital rights management
- Syndication
HTML tools don't
do the trick for managing anything but the simplest, static,
limited-volume Web content, and even then they require other workflow applications
for tracking revisions and updating. Tools such as FrontPage, Dreamweaver,
and HomeSite remain useful and popular for HTML editing and for designing
display templates, but they are useful only for limited delivery channels.
They are not appropriate for managing enterprise-enabled content.
The following table suggests
the range of authoring options that should be provided to users. It's based
on how much they contribute to the enterprise content, and how much flexibility
is typically appropriate for authoring.
Table 2. User types and tool requirements

| Type of User | Preferred CM Authoring Tools |
|---|---|
| Occasional Contributor | Easy to use Web forms with embedded workflow. |
| Knowledge Worker/Business Line Manager | Word processor or other office application integrated with content management application; structured editing interface to content management; Web forms as above. |
| Power User | All of the above, plus HTML and XML authoring. |
| IT Developer/Administrator | All of the above, plus HTML and XML authoring, access to content management and authoring tool APIs for further customization and integration. |
Different Editorial Interfaces,
Different Content Management Solutions
While developments in
editorial interfaces for content management systems lag behind most other
aspects of these systems, it is already clear that there are several kinds
of strategies for dealing with editorial issues. The strategies best for your
situation depend upon the nature of both your content and your business needs.
Most content management
tools focus on metadata capture through form- and template-based strategies
in the hope of providing accessible and workable interfaces for creation of
content. Others try to simplify the complex editorial processes of implementing
structured content by using common office applications, either by building
XML tagging capabilities onto Word, such as HyperVision's WorX (www.hvltd.com),
or by adding a tightly integrated postprocessor, such as Inera's eXtyles (www.inera.com).
Most (if not all) content management systems manage Word documents to some
degree, with some doing a better job than others in terms of chunking
and inferring the structure of the Word content. Others rely on at least being
able to index the Word document with some relevant metadata. Microsoft Word
also can serve as the front-end behind which another program enables XML or
other object-level tagging. J.D. Edwards' (www.jdedwards.com)
Enterprise Content Manager, for example, uses Word as its front-end, managing
componentized Word files in its single-source repository and using the components
to build output for multiple uses. And then there are the well-established
SGML/XML editors, such as Arbortext's Epic Editor (www.arbortext.com), for those
who recognize that more pain up front yields better and more useful results
on the back end.
Breaking content into
components that are managed within a content management system is effective
not only for single-source publishing; it can also provide measurable
ROI for applications such as translating and managing multilingual content.
Adding structural tagging and business rules tagging to the content
can maximize the content's usefulness for improved search and
retrieval, personalization, and business-related issues such as compliance
and security. As content management becomes more central to the operation
of the enterprise, these capabilities become critical.
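
To make the idea of componentized, tagged content concrete, here is a minimal sketch in Python. It is an illustration under our own assumptions, not the format of any product named in this report: the element name `component`, the attributes `audience`, `security`, and `lang`, and the function `select_components` are all hypothetical. The point is only that once business rules travel with the content as metadata, personalization, localization, and security filtering can be done by any application that reads the tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical content: element and attribute names are illustrative only.
DOC = """
<article id="benefits-2001">
  <component id="c1" audience="all" security="public" lang="en">
    <title>Benefits overview</title>
    <body>Open enrollment begins in November.</body>
  </component>
  <component id="c2" audience="managers" security="internal" lang="en">
    <title>Budget impact</title>
    <body>Premium costs rise four percent.</body>
  </component>
  <component id="c3" audience="all" security="public" lang="fr">
    <title>Avantages sociaux</title>
    <body>Les inscriptions commencent en novembre.</body>
  </component>
</article>
"""


def select_components(xml_text, audience, lang, clearance):
    """Return the ids of components a given reader may see, in document order."""
    root = ET.fromstring(xml_text)
    allowed = []
    for comp in root.iter("component"):
        if comp.get("lang") != lang:
            continue  # localization: wrong language for this reader
        if comp.get("audience") not in ("all", audience):
            continue  # personalization: written for a different audience
        if comp.get("security") not in clearance:
            continue  # security: reader lacks the required clearance
        allowed.append(comp.get("id"))
    return allowed
```

A general staff reader with only public clearance would call `select_components(DOC, "staff", "en", {"public"})`, while a manager would pass `"managers"` and `{"public", "internal"}`. The same source serves both without being copied or re-edited.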
When it comes to considering
editorial interfaces to or for content management systems, organizations have
a number of options:
- Forms-based interfaces
have become de rigueur for content management systems. This is mainly
because the CMS applications are often based on relational repositories,
and the shortest distance between a thin client and a relational database
is an HTML form. The bulk of CMS applications are reliant on this kind of
interface as the primary means of entering and updating content and assigning
metadata at a coarse level.
- Word processing applications
such as Microsoft Word can be used to create content to be stored in its
native form, or saved out as HTML or another neutral format. The challenge,
of course, is integrating the proprietary structures in Word with the more
neutral structures the content management system stores: relational
databases, some sort of object storage, or XML. The level of granularity
in tagging with this option can vary greatly; while Word itself only applies
styles at a paragraph level, some editorial interfaces based on Word can
impose XML- or object-level granularity.
- HTML editors can be
used to create and edit content that is stored as HTML, or smaller, relatively
discrete chunks of content that can be mapped from the HTML
to the underlying data structures. This approach is inefficient in
situations where content needs to be finely grained and/or is of
great quality.
- Full-blown XML editing
tools can be used to create and edit all or some of the content, and can
also be used to interact with metadata. This approach provides the strongest
capability for finely grained tagging of content, but also can be more expensive
in terms of software seats, training, and support.
- Pre-processing and
post-processing tools can be used. For example, proprietary formats such
as Microsoft Word and QuarkXPress can be debinarized and then
run through filters on their way into the CMS, and then reverse processes
can be performed on the way out. The level of tag granularity is dependent
on the quality and capabilities of the filters and varies greatly.
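
As a rough sketch of what such a pre-processing filter does, the Python below maps paragraph styles to XML elements. It assumes the word-processor file has already been reduced to a list of (style name, text) pairs; that input shape, the `STYLE_TO_ELEMENT` table, and the `source-style` attribute are our own illustrative choices, not the behavior of any tool mentioned here. Paragraphs with an unrecognized style are kept as generic paragraphs and reported, which is where the quality of the filter shows.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from word-processor paragraph styles to XML element names.
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Body Text": "para",
    "Note": "note",
}


def paragraphs_to_xml(paragraphs, root_name="section"):
    """Convert (style, text) pairs to an XML string.

    Returns the XML plus a list of styles that had no mapping, so an
    editor can review what the filter could not tag with confidence.
    """
    root = ET.Element(root_name)
    unmapped = []
    for style, text in paragraphs:
        name = STYLE_TO_ELEMENT.get(style)
        if name is None:
            # Fall back to a generic paragraph but remember the original style.
            unmapped.append(style)
            child = ET.SubElement(root, "para", {"source-style": style})
        else:
            child = ET.SubElement(root, name)
        child.text = text
    return ET.tostring(root, encoding="unicode"), unmapped


xml_out, needs_review = paragraphs_to_xml([
    ("Heading 1", "Travel Policy"),
    ("Body Text", "Book early."),
    ("Sidebar", "Ask HR."),
])
```

Because the filter can only tag what the style sheet distinguishes, granularity is capped at the paragraph level here; finer tagging would need either richer templates or a human pass afterward.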
Organizations that have
implemented CMS applications are typically using a combination of such approaches,
with forms-based interfaces leading the way. But each of the content creation/editing
approaches has its strengths and weaknesses.
In fact, some editorial
interfaces are more appropriate for some types of business situations than
others. Some content management tools focus on metadata-capture through form-
and template-based strategies in the hope of providing accessible and workable
interfaces for creation of content. Others try to simplify the complex editorial
processes of implementing structured content by using common office applications.
Content Editing is an
Application
The editorial interface
to a content repository is, in fact, an application, or even a series of applications.
This is, in effect, what the forms-based interface is, but it is often a haphazard
application, a loose coupling at best.
The installed CM systems
of today are replete with complex and multifaceted forms for content entry
and updating. Indeed, first- and second-generation CMS projects were plagued
with difficulties in first establishing such interfaces and later with maintaining
them. The problem has been that if a CM customer implemented an editorial
interface and later needed to change the underlying data structures,
the enterprise was likely left having to heavily modify or completely
rewrite the editorial interface. This is especially true with CM applications
that use relational databases as the underlying data store. This is not an
approach that supports the goals of enterprise-enabled content.
One of the advantages
of XML is that it makes such modifications easier. If the underlying data
store is a relational database, and the interface is a heavily programmed
form, a change to the underlying data structure is a complex, programming-heavy
undertaking. If the underlying data structure is XML, a change to the underlying
data structure typically means modifying the DTD and running some simple process
so that the XML editor parses the text according to the revised DTD (which
is usually a menu selection in a commercial XML editor).
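
The difference can be sketched in a few lines of Python. This is an illustration only: Python's standard library does not validate against DTDs, so the dictionary `CONTENT_MODEL` below is a simplified stand-in for a DTD, and the `validate` function and element names are hypothetical. What it shows is that when the structure is described as data, revising the structure means editing that description, while the code that checks content is untouched.

```python
import xml.etree.ElementTree as ET

# A simplified stand-in for a DTD: each element lists its required and
# optional child elements. (A real DTD also constrains order and repetition.)
CONTENT_MODEL = {
    "article": {"required": ["title", "body"], "optional": ["author"]},
}


def validate(xml_text, model):
    """Return a list of problems found in the document's top-level structure."""
    root = ET.fromstring(xml_text)
    rules = model.get(root.tag)
    if rules is None:
        return [f"unknown root element <{root.tag}>"]
    errors = []
    present = [child.tag for child in root]
    for name in rules["required"]:
        if name not in present:
            errors.append(f"missing required <{name}>")
    allowed = set(rules["required"]) | set(rules["optional"])
    for name in present:
        if name not in allowed:
            errors.append(f"unexpected <{name}>")
    return errors


doc = ("<article><title>Q3 results</title>"
       "<summary>Revenue rose.</summary><body>Details follow.</body></article>")

# Under the original model, the new <summary> element is rejected.
before = validate(doc, CONTENT_MODEL)

# Revising the model is a data change; validate() itself is not modified.
REVISED_MODEL = {
    "article": {"required": ["title", "body"], "optional": ["author", "summary"]},
}
after = validate(doc, REVISED_MODEL)
```

With a hand-coded form over relational tables, the equivalent change would typically touch the table definitions, the queries, and the form itself.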
This is not to say that
all editorial interfaces must be XML editors. Indeed, the nature of enterprise
content management is that the underlying data structures will be a combination
of relational, XML, and other types. There will also be all manner of content
in terms of length, value, and shelf life. The editorial interface(s) should
then be appropriate to:
- The type of content: both
the data type and the length.
- The type of user, from
occasional contributor to IT administrator.
- The point in the lifecycle
of the content, from initial creation through editing and updating.
- The shelf life of the
content.
- The level of the tagging
granularity required.
Such considerations would
encourage implementers of CMS systems to consider and support a variety of
potential interfaces. Several examples follow:
- A regular contributor
to an XML database of complex and lengthy documents would be well served
with a tightly integrated XML editor. But an ad hoc contributor to
that same database should be given a simple-to-use and foolproof tool that
exposes precisely the content this contributor needs to edit, and validates
the content before returning it to the repository. This ad hoc user could
also be supported by a workflow process that forwards his or her revised
content to a more skilled editorial user to ensure the content was entered
or updated correctly.
- Corporate users of
an Intranet that supports a small number of simple document types could
use a set of Microsoft Word templates, with the Word files then being processed
through a tool that normalizes the files into the format required by the
CM repository. When the documents need to be modified, reverse processes
could reconstruct the Word files for further updating and editing.
- Metadata is likely
a combination of XML, relational and other data types. By its nature, metadata
is often structured, discrete, and relatively short in length. It also may
include fixed values, choice groups ("yes or no,"
or "X, Y, or Z"), and other data types
that lend themselves to structured interfaces and enforced validation. On
the low end of the spectrum, an author contributing a Microsoft Word file
could be required to fill out a simple form or even be required to fill
out the Property Sheet embedded within Word (File menu, then
Properties). In a more complex environment, a knowledge worker could be
provided with an XML editing tool as an interface to the required metadata.
The GUI of a commercial XML editor such as Arbortext Epic can be configured
to behave like a forms-based interface, while in fact capturing and storing
XML data.
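
The kind of enforced validation that suits metadata can be sketched briefly. The Python below is a minimal illustration under our own assumptions: the field names, the allowed values, and the `check_metadata` function are hypothetical, and every field other than a fixed one is treated as required. Whether the values arrive from a Web form, a word-processor property sheet, or an XML editor configured to look like a form, the same rules can be applied before the content is accepted into the repository.

```python
# Hypothetical metadata rules; field names and allowed values are illustrative.
METADATA_RULES = {
    "doc_type": {"choices": {"policy", "procedure", "memo"}},  # X, Y, or Z
    "confidential": {"choices": {"yes", "no"}},                # yes or no
    "owner": {},                                               # free text, required
    "language": {"fixed": "en"},                               # fixed value
}


def check_metadata(values, rules=METADATA_RULES):
    """Return a dict mapping each failing field to a short explanation."""
    problems = {}
    for field, rule in rules.items():
        value = values.get(field, "").strip()
        if "fixed" in rule:
            # Fixed fields may be omitted, but cannot carry a different value.
            if value and value != rule["fixed"]:
                problems[field] = f"must be {rule['fixed']!r}"
            continue
        if not value:
            problems[field] = "is required"
        elif "choices" in rule and value not in rule["choices"]:
            allowed = ", ".join(sorted(rule["choices"]))
            problems[field] = f"must be one of {allowed}"
    return problems


rejected = check_metadata(
    {"doc_type": "memo", "confidential": "maybe", "owner": "  "}
)
accepted = check_metadata(
    {"doc_type": "policy", "confidential": "no", "owner": "HR"}
)
```

An occasional contributor never needs to see rules like these; they only see a form that will not submit until the choices are valid.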
This analysis suggests,
and we think realistically, that not all content will be XML or will have
to be XML. However, enterprise-enabled content needs to be componentized
and structured and needs to be accessible by a variety of editorial interfaces.
These interfaces are best thought of as applications. The reality is that
a mix of data types will continue to exist, and the editorial interfaces to
such data must deal with this reality. The requirements analysis and design undertaken
as you roll out enterprise content solutions are becoming more complex but
also more critical. The analysis should be clear-eyed, realistic, and thorough.
The resulting interfaces should be sturdy, useful, and self-revealing. The
goals should include data quality and integrity, but also productivity. The
business of content management is critical to many organizations; putting
the right tools in the hands of the content creators may be the best money
an organization can spend in the months ahead.
-- Bill Trippe, David
R. Guenette