Recently in Collaboration Category

If you have been following recent XML Technologies blog entries, you will notice we have been talking a lot lately about XML Smart Content, what it is and the benefits it can bring to an organization. These include flexible, dynamic assembly for delivery to different audiences, search optimization to improve customer experience, and improvements for distributed collaboration. These are great targets to aim for, but you may ask: are we ready to pursue these opportunities? It might help to better understand the technology landscape involved in creating and delivering smart content.

The figure below illustrates the technology landscape for smart content. At the center are fundamental XML technologies for creating modular content, managing it as discrete chunks (with or without a formal content management system), and publishing it in an organized fashion. These are the basic technologies for "one source, one output" applications, sometimes referred to as Single Source Publishing (SSP) systems.

[Figure: SCLandscape.jpg, the technology landscape for smart content]

The innermost ring contains capabilities that are needed even when using a dedicated word processor or layout tool, including editing, rendering, and some limited content storage capabilities. In the middle ring are the technologies that enable single-sourcing content components for reuse in multiple outputs. They include a more robust content management environment, often with workflow management tools, as well as multi-channel formatting and delivery capabilities and structured editing tools. The outermost ring includes the technologies for smart content applications, which are described below in more detail.

It is good to note that smart content solutions rely on structured editing, component management, and multi-channel delivery as foundational capabilities, augmented with content enrichment, topic component assembly, and social publishing capabilities across a distributed network. Descriptions of the additional capabilities needed for smart content applications follow.

Content Enrichment / Metadata Management: Once a descriptive metadata taxonomy is created or adopted, its use for content enrichment will depend on tools for analyzing and/or applying the metadata. These can be manual dialogs, automated scripts and crawlers, or a combination of approaches. Automated scripts can be created to interrogate the content to determine what it is about and to extract key information for use as metadata. Automated tools are efficient and scalable, but generally do not apply metadata with the same accuracy as manual processes. Manual processes, while ensuring better enrichment, are labor intensive and not scalable for large volumes of content. A combination of manual and automated processes and tools is the most likely approach in a smart content environment. Taxonomies may be extensible over time and can require administrative tools for editorial control and term management.
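To make the combined manual and automated approach concrete, here is a minimal sketch in Python. The taxonomy, the trigger words, and the `min_hits` threshold are all assumptions invented for illustration, not part of any particular product. The script suggests taxonomy terms for a piece of content, applies the ones it finds strong evidence for, and queues the weaker matches for a person to review.

```python
import re

# Hypothetical taxonomy: each term maps to trigger words that suggest it.
TAXONOMY = {
    "installation": ["install", "setup", "mounting"],
    "troubleshooting": ["error", "fault", "reset"],
    "safety": ["warning", "hazard", "protective"],
}


def suggest_terms(text, taxonomy=TAXONOMY):
    """Return {term: hit_count} for taxonomy terms whose triggers appear in text."""
    words = re.findall(r"[a-z]+", text.lower())
    hits = {}
    for term, triggers in taxonomy.items():
        # Count words that begin with any trigger (a crude stand-in for stemming).
        count = sum(1 for w in words if any(w.startswith(t) for t in triggers))
        if count:
            hits[term] = count
    return hits


def enrich(text, min_hits=2):
    """Auto-apply well-supported terms; queue weakly supported ones for manual review."""
    hits = suggest_terms(text)
    applied = sorted(t for t, n in hits.items() if n >= min_hits)
    review = sorted(t for t, n in hits.items() if n < min_hits)
    return {"applied": applied, "needs_review": review}
```

Raising `min_hits` sends more suggestions to manual review, which trades scalability for accuracy in the way described above.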

Component Discovery / Assembly: Once data has been enriched, tools for searching and selecting content based on the enrichment criteria will enable more precise discovery and access. Search mechanisms can use metadata to improve search results compared to full text searching. Information architects and organizers of content can use smart searching to discover what content exists, and what still needs to be developed to proactively manage and curate the content. These same discovery and searching capabilities can be used to automatically create delivery maps and dynamically assemble content organized using them.
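As a rough sketch of how discovery can feed assembly, the Python below selects components by their metadata, reports required topics that nothing covers yet, builds a simple delivery map, and assembles the output from it. The `Component` class and the metadata keys (`audience`, `topic`, `sequence`) are hypothetical names chosen for the example. A real system would more likely query a repository and use a map format such as a DITA map.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    id: str
    title: str
    body: str
    metadata: dict = field(default_factory=dict)


def discover(components, **criteria):
    """Select components whose metadata matches every given key/value pair."""
    return [c for c in components
            if all(c.metadata.get(k) == v for k, v in criteria.items())]


def find_gaps(components, required_topics):
    """List required topics that no selected component covers yet."""
    covered = {c.metadata.get("topic") for c in components}
    return sorted(set(required_topics) - covered)


def build_map(components, order_key="sequence"):
    """Create a delivery map: component IDs in their intended order."""
    ordered = sorted(components, key=lambda c: c.metadata.get(order_key, 0))
    return [c.id for c in ordered]


def assemble(delivery_map, components):
    """Assemble a deliverable by resolving each map entry to its component."""
    by_id = {c.id: c for c in components}
    return "\n\n".join(f"{by_id[i].title}\n{by_id[i].body}" for i in delivery_map)
```

The same `discover` call serves both purposes mentioned above: curators use it (with `find_gaps`) to see what exists and what is missing, and the publishing step uses it to decide what goes into the map.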

Distributed Collaboration / Social Publishing: Componentized information lends itself to a more granular update and maintenance process, enabling several users to work simultaneously on topics that appear in a single deliverable, which shortens schedules. Subject matter experts, both remote and local, may be included in review and content creation processes at key steps. Users of the information may want to "self-organize" the content of greatest interest to them, and even augment or comment upon specific topics. A distributed social publishing capability will enable a broader range of contributors to participate in the creation, review, and updating of content in new ways.

Federated Content Management / Access: Smart content solutions can integrate content without duplicating it in multiple places, instead accessing it across the network in its original storage repository. This federated content approach requires the repositories to have integration capabilities to access content stored in other systems, platforms, and environments. A federated system architecture will rely on interoperability standards (such as CMIS), system-agnostic expressions of data models (such as XML Schemas), and a robust network infrastructure (such as the Internet).
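The idea of referencing content in place rather than copying it can be sketched in a few lines of Python. Everything here is hypothetical: the in-memory repositories stand in for real systems, and the `repository:object-id` reference format is an assumption made for the example, not CMIS syntax. In practice each repository would be reached through an interoperability interface such as CMIS.

```python
class InMemoryRepository:
    """Stand-in for a real content repository (hypothetical, for illustration)."""

    def __init__(self, name, items):
        self.name = name
        self._items = dict(items)

    def get(self, object_id):
        return self._items[object_id]


class FederatedCatalog:
    """Resolves references against the repository that owns the content."""

    def __init__(self):
        self._repos = {}

    def register(self, repo):
        self._repos[repo.name] = repo

    def resolve(self, reference):
        # A reference looks like "repository:object-id"; the content stays
        # in its original repository and is fetched only when needed.
        repo_name, _, object_id = reference.partition(":")
        return self._repos[repo_name].get(object_id)


def assemble(catalog, references):
    """Build a deliverable from references that span several repositories."""
    return "\n".join(catalog.resolve(ref) for ref in references)
```

Because the deliverable holds only references, an update made in the owning repository shows up the next time the content is assembled, with no copies to synchronize.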

These capabilities address a broader range of business activity and, therefore, fulfill more business requirements than single-source content solutions. Assessing your ability to implement these capabilities is essential in evaluating your organization's readiness for a smart content solution.

Over the past few weeks, since publishing Smart Content in the Enterprise, I’ve had several fascinating lunchtime conversations with colleagues concerned about content technologies. Our exchanges wind up with a familiar refrain that goes something like this. “Geoffrey, you have great insights about smart content but what am I supposed to do with all this information?” Ah, it’s the damning with faint praise gambit that often signals an analysis paralysis conundrum for decision-making.

Let me make one thing perfectly clear -- I do not have an out-of-the-box prescription for a solution. It’s not simply a matter of focusing on your customer experience, optimizing your content for search, investing in a component content management platform, or adopting DITA – although, depending on the situation, I may recommend some combination of these items as part of a smart content strategy.

For me, smart content remains a work in progress. I expect to develop the prescriptive road map in the months ahead. Here’s a quick take on where I am right now.

  • For publishers, it’s all about transforming the publishing paradigm through content enrichment – defining the appropriate level of granularity and then adding the semantic metadata for automated processing.
  • For application developers, it’s all about getting the information architecture right and ensuring that it’s extensible. There needs to be sensible storage, the right editing and management tools, multiple methods for organizing content, as well as a flexible rendering and production environment.
  • For business leaders and decision makers, there needs to be an upfront investment in the right set of content technologies that will increase profits, reduce operating costs, and mitigate risks. No, I am not talking about rocket science. But you do need a technology strategy and a business plan.

As highlighted by the case studies included in the report, I can point to multiple examples where organizations have done the right things to produce notable results. Dale and I will continue the smart content discussions at the Gilbane Boston conference right after Thanksgiving, both through our preconference workshop, and at a conference session “Smart Content in the Real World: Case Studies and Real Results.”

We are also launching a Smart Content Readiness Service, where we will engage with organizations on a consulting basis to identify:

  • The business drivers where smart content will ensure competitive advantage when distributing business information to customers and stakeholders
  • The technologies, tools, and skills required to componentize content and target distribution to various audiences using multiple devices
  • The operational roles and governance needed to support smart content development and deployment across an organization
  • The implementation planning strategies and challenges to upgrade content creation and delivery environments

Please contact me if you are interested in learning more.

In short, to answer my lunchtime colleagues, I cannot (yet) prescribe a fully baked solution. It’s too early for the recipes and the cookbook. But I do believe that the business opportunities and benefits are readily at hand. At this point, I would invite you to join the discussion by letting me know what you expect, what approaches you’ve tried, where you’ve wound up, what you think needs to come next – and how we might help you.

Authoring in a structured text environment has traditionally been done with dedicated structured editors. These tools enable validation and user assisted markup features that help the user create complete and valid content. But these structured editors are somewhat complicated and unusual and require training in their use for the user to become proficient. The learning curve is not very steep but it does exist.

Many organizations have come to see documentation departments as a process bottleneck and try to engage others throughout the enterprise in the content creation and review processes. Engineers and developers can contribute to documentation and have a unique technical perspective. Installation and support personnel are on the front lines and have unique insight into how the product and related documentation are used. Telephone operators not only need the information at their fingertips, but can also augment it with comments and ideas that occur while supporting users. Third-party partners and reviewers may also have a unique perspective and role to play in a distributed, collaborative content creation, management, review, and delivery ecosystem.

Our recently completed research on XML Smart Content in the Enterprise indicates that as we strive to move content creation and management out of the documentation department silo, we will also need to consider how the data is encoded and the usefulness of the data model in meeting our expanded business requirements. Smart content is multipurpose content designed with several uses in mind. Smart content is modular to support being assembled in a variety of forms. And smart content is structured content that has been enriched with semantic information to better identify its topic and role to aid processing and searching. For these reasons, smart content also improves distributed collaboration. Let me elaborate.

One of the challenges for distributed collaboration is the infrequency of user participation and, therefore, unfamiliarity with structured editing tools. It makes sense to simplify the editing process and tools for infrequent users. They can't always take a refresher course in the editor and its features. They may be working remotely, even on a customer site installing equipment or software. These infrequent users need structured editing tools that are designed for them. These collaboration tools need to be intuitive and easy to figure out, easily accessible from just about anywhere, and affordable, with flexible licensing to allow a larger number of users to participate in the management of the content. This usually means one of two things: either the editor will be a plug-in to a popular word processing system (e.g., MS Word), or it will be accessed through a thin-client browser, like a Wiki editor. In some environments, both may be needed in addition to traditional structured editing tools. Smart content modularity and enrichment allow the use of a variety of editing tools and flexibility in process design, thereby expanding who can collaborate from throughout the enterprise.

Also, infrequent contributors may not be able to master navigating and operating within a complex repository and workflow environment, for the same reasons of unfamiliarity. Serving up information to a remote collaborator can be enhanced with keywords and other metadata designed to optimize searching and access to the content. Even a little metadata can provide a lot of simplicity to an infrequent user. Product codes, version information, and a couple of dates would allow a user to home in on the likely content topics and select content to edit from a well-targeted list of search results. Relationships between content modules that are indicated in metadata can alert a user that when one object is updated, other related objects may need to be reviewed for potential update as well.
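Here is a small Python sketch of what "a little metadata" can do for an infrequent user. The module records, field names, and product codes are invented for the example. One function narrows the list by product code, version, and date, and the other uses relationship metadata to list the modules that may need review after an update.

```python
from datetime import date

# Hypothetical module records; the field names are illustrative only.
MODULES = [
    {"id": "install-01", "product": "PX-100", "version": "2.0",
     "updated": date(2010, 3, 1), "related": ["config-01"]},
    {"id": "config-01", "product": "PX-100", "version": "2.0",
     "updated": date(2009, 11, 15), "related": []},
    {"id": "install-02", "product": "PX-200", "version": "1.0",
     "updated": date(2010, 5, 20), "related": ["config-01"]},
]


def find_modules(modules, product, version=None, updated_since=None):
    """Return IDs of modules matching a product code and optional filters."""
    results = []
    for m in modules:
        if m["product"] != product:
            continue
        if version is not None and m["version"] != version:
            continue
        if updated_since is not None and m["updated"] < updated_since:
            continue
        results.append(m["id"])
    return results


def modules_to_review(modules, updated_id):
    """Return IDs of modules linked to updated_id, in either direction."""
    by_id = {m["id"]: m for m in modules}
    linked = set(by_id[updated_id]["related"])
    linked.update(m["id"] for m in modules if updated_id in m["related"])
    linked.discard(updated_id)
    return sorted(linked)
```

A remote contributor who knows only the product code and a rough date gets a short, targeted list to choose from, and after saving a change sees which related modules to check.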

It is becoming increasingly clear that there is no one model for XML or smart content creation and editing. Just as a carpenter may have several saws, each designed for a particular type of cut, a robust structured environment for smart content may have more than one editor in use. It behooves us to design our systems and tools to meet the desired business processes and user functionality, rather than limit our processes to the features of one tool.

Structured Editing & Wikis

If you know me you will realize that I tend to revisit XML authoring tools and processes frequently. It is one of my favorite topics. The intersection of structured tools and messy human thinking and behavior is an area fraught with usability issues, development challenges, and careful business case thinking. And therefore, a topic ripe for discussion.

I had an interesting conversation with a friend about word processors and XML editors the other day. His argument was that the word processing product model may not be the best, and certainly isn't the only, way to prepare and manage structured content.

A word processor is software that has evolved to support the creation of documents. The word processing software model was developed when people needed to create documents, and then later added formatting and other features. This model is more than 25 years old (I remember using a word processor for the first time in college in 1980).

Of course it was logical to emulate how typewriters worked since the vast majority of information at the time was destined for paper documents. Now word processors include features for writing, editing, reviewing, formatting, and limited structural elements like links, indexes, etc. Again, all very document oriented. The content produced may be reused for other purposes if transformed in a post process (e.g., it could output HTML & PDF for Web, breaking into chunks for a repository or secondary use, etc.), but there are limits and other constraints, especially if your information is primarily designed to be consumed in print or document form.

It is easy to think of XML-structured editors, and the word processor software model they are based upon, as the most likely way to create structured content. But in my opinion, structured editors pay too much homage to word processing features and processes. I also think too many project teams assume that the only way to edit XML content is in an XML document editor. Don't get me wrong: many people have successfully deployed XML editors and achieved targeted business goals, myself included, but I can point out many instances where an alternative approach to editing content might be more efficient.

Database tools that organize the information logically and efficiently are not likely to store that data as documents. For instance, you may have a financial system with a lot of info in relational fields that is extracted to produce printable documents like monthly statements, invoices, etc.

Or software manuals that are customized for specific configurations using reusable data objects and related document maps instead of maintaining the information as static, hierarchically organized documents.

Or aircraft information that needs to match the configuration of a specific plane or tail number, selected from a complete library of data objects stored centrally.

Or statutes that start formatted as bills, then later appear as enacted laws, then later yet again as published, codified statutes, each with their own formatting and structural peccadilloes.

Or consider a travel guide publisher that collects information on thousands of hotels, restaurants, attractions, and services in dozens of countries and cities. Sure, the content is prepared with the intent of publishing it in a book, but it is easy to see how it can be useful for other uses, including providing hotel data to travel-related Web sites, or building specialized, custom booklets for special needs (e.g., a local guide for a conference, guides to historical neighborhoods, etc.). 

In these examples of what some might call database publishing, system designers need to ask themselves what would be the best tool for creating and maintaining the information. They are great candidates for a database, some application dialogs and wizards, and some extraction and transformation applications to feed Web and other platforms for consumption by users. They may not even involve an editor per se, but might rely entirely on a Wiki or other dialog for content creation and editing.

Word processors require a mix of skills, including domain expertise in the subject being written about, grammar and editing, some formatting and design, and use of the software itself. While I personally believe everyone, not just teachers and writers, should be skilled in writing well and making documents look legible and appealing, I realize many folks are best suited for other roles. That is why we divide labor into roles. Domain experts (e.g., lawyers, aircraft engineers, scientists, and doctors) are usually responsible for accuracy and quality of the ideas and information, while editorial and product support people clean up the writing and formatting and make it presentable. So, for domain experts, it may be more efficient to provide a tool that only manages the content creation, structuring, linking, organization, etc., with limited word processing capabilities, and leave the formatting and organization to the system, another department, or automated style sheets.

In my mind, a Wiki is a combination of text functionality and database management features that allow content to be created and managed in a broader Web content platform (which also may include static pages, search interfaces, pictures, PDFs, etc.). In this model, the Web is the primary use and printing is secondary. Domain experts are not bothered with concepts like page layout, running heads, table of contents generation, justification and hyphenation, etc., much to the delight of the domain experts!

I am bullish on Wikis as content creation and management tools, even when the content is destined for print. I have seen some that hide much of the structure and technical "connective tissue" from the author, but produce well-formatted, integrated information. The blogging tool I am using to create this article is one example of a Wiki-like interface that has a few bells and whistles for adding structure (e.g., keywords) dedicated to a specific content creation purpose. It only emulates word processing slightly with limited formatting tools, but is loaded with other features designed to improve my blog entries. For instance, I can pick a keyword from a controlled taxonomy from a pull-down list. And all within a Web browser, not a fat client editor package. This tool is optimized for making blog content, but not for, let's say, scientific papers or repair manuals. It is targeted for a specific class of users, bloggers. Similarly, XML editors, as we have come to know them, are more adept at creating documents and document chunks than other interfaces.

Honestly, on more than one occasion I have pounded a nail with a wrench, or tightened a bolt with the wrong kind of pliers. Usually I get the same results, but sometimes it takes longer or has a less desirable result than if I had used a more appropriate tool. The same is true for editing tools.

On a final note, forgive me if I make a gratuitous plug, but authoring approaches and tools will be the subject of a panel I am chairing at the Gilbane San Francisco conference in early June, if you want to hear more.

Tweet, Tweet

Should I Click It?

[Image: Open Office Icon.jpg]

I am tempted...

Acrobat.com...

... was announced yesterday, and is available now as a public beta. By all means, check it out. I have been playing with Buzzword, and like it. I did manage to break it trying an Export to Word 2003 XML, but it is a beta after all.

I do wonder about the export choices, which, apart from Acrobat, zipped XML, and plain text, are all Microsoft--Word 2003, Word 2007, and Word 2003 XML. This makes perfect sense if Adobe sees Buzzword as the Web interface in a Microsoft-centric document workflow. But I can see other use cases, especially ones where the content is destined for a Web CMS (or is already in a Web CMS and is being updated). In these cases, the Web CMS would likely not want the overhead of the complex Microsoft file structures.

I think we are getting a briefing on Acrobat.com shortly. I will see what Adobe has in mind.
