January, 1999

The New Gilbane Report

Well, not that new. The content of our report remains the same. Our previous subtitle “On Open Information & Document Systems” sounds a bit old-fashioned these days, but is still an accurate portrayal of our coverage; it is the terminology that has changed. “Open” referred to the use of content encoding standards and was largely a code word for SGML when we started. We believed then, as we believe now, that for information technology to truly advance it was more important to focus on making information inherently easier to share than to spend all our efforts on integrating products. Product integration provides very little benefit without information integration. Today XML is actually helping us with both kinds of integration more than we would have predicted. “Information & Document Systems” was our way of making the point that it didn’t make sense to limit information management strategies to either structured (data) or unstructured (document) information. Useful information for the vast majority of corporate applications requires some combination of both. Today “knowledge” and “content” have evolved as politically correct terms largely because they don’t discriminate between structured and unstructured information.

The big change for our report is that we have moved to a monthly format and will be adding Web access so that we can better help you keep up with the rapid changes in our industry. We start out this year with a discussion of the most important IT driver today: e-commerce.

Frank Gilbane




Dynamic Content, XML & E-commerce

In some of our other publications we’ve been writing a lot about e-commerce lately. The overall message has been that whatever kind of information system you are building you need to understand how it relates to your company’s business-to-business e-commerce effort (existing or planned). Even if your product or project has no obvious connection to e-commerce, e.g., it is a purely internal intranet application in research or HR, you need to understand how developments and expectations of e-commerce solutions will heavily influence new products and technologies.

This message is equally important for vendors and users. While we have been addressing this issue with vendors via other channels, we thought it important to put this message in a historical perspective that would help corporate planners understand what is happening, why it is happening now, and how you should be incorporating it in your strategic planning.

First a definition: dynamic content is the convergence of structured data and unstructured data (documents) combined with the types of transaction processing associated with each. Now the three-part bottom line: all information systems now need to be able to manage dynamic content; XML is arguably the most important enabler of dynamic content; and e-commerce applications will drive other IT systems to deal with dynamic content.

The Evolution of Dynamic Content

Ten years ago the archetypal “document management” application involved scanning, storing, and retrieving images. The content was about as undynamic as you could get. There may have been data in the documents, but by the time a user got to see the document the data had been normalized into the same impenetrable blob of bits as the text and graphic information. Typically these document images were not changed — they were simply deleted or replaced. The transactions were similarly simple, i.e., store and retrieve, and transaction frequency was measured by the hour or even day.

The typical database application involved no textual or graphic content aside from very structured text in, e.g., address fields. Transactions were measured in seconds. Database structured content was often very dynamic, but usually only available to look at in a non-contextual report environment. Data and documents were separate entities processed in very different ways. Combining them was painful and the eventual result was static information in either paper or image form.

Five or six years ago, applications that integrated documents and data in ways that allowed both text and data to be accessed, even after they were combined, began to replace the older imaging-based solutions. There were lots of imaginative approaches that are still found in use today (for a historical peek at how this was done, see The Gilbane Report, Vol. 1, No. 3, 1994). What they all had in common was the use of some kind of database. In certain cases, the document itself was a database that had scripts to fetch updated data from other databases. The most sophisticated of these required SGML and/or intimate knowledge of proprietary file formats because, one way or another, you needed to identify a piece of information, with a beginning and an end, to attach a script to or to associate a piece of data with. Everyone recognized that managing (document) components, whether text or data, was going to be required to truly manage information without data/document bias. But it was very hard and very costly.

What made these “document systems” interesting was that the content, whether text, graphic, or data, was interactive. This was not interactivity in the way we would expect today with a web interface; the document interface may have been updated only occasionally. The point was that the data was automatically integrated or assembled. You didn’t need to know SQL or buy lunch for an IS person in order to see data wrapped in textual/graphical context. Although the term “interactive” was often used to describe these systems, the process was usually serial and one-way: you had to go back to the database to change the data and some other process would eventually update the document. “Dynamic” was more than a stretch. Transactions remained segregated. There were glacial check-in/check-out transactions for documents, and relatively lightning-fast transactions for data. The archetypal system of this stage in the evolution of dynamic content is a technical documentation system that is integrated with some combination of engineering, supply, and customer databases.

These kinds of systems are, in general, no longer sufficient to meet our expectations for information or content management, at least not for new systems. Our needs have changed dramatically. If a web page contains content that is the least bit out of date we feel justified in complaining. An electronic catalog with a price out of date is totally unacceptable. We now expect text, graphic, and data content to be up-to-date, almost instantly reconfigurable, and truly interactive. Our content is dynamic and we expect to be able to work with it dynamically. This implies more than just integrated content types; it also implies integrated transactions.

Is this expectation reasonable? Or do we need to wait yet another few years? I think it is time to think of new information systems in these terms. While we may still have to make compromises in the face of the hard reality of performance, security, or just plain experience with new business models, we should be basing our IT strategies on a dynamic content information model. There are two reasons it is time: XML makes it feasible, and e-commerce is driving it.

XML

There are three reasons XML makes managing dynamic content feasible today, none of which would do the trick by itself. First, it allows for separate but equal encoding of multiple content types, transactions, and associated metadata. Second, the encoding allows for binding processes to this content in a way that is both application neutral and client/server neutral. And third, XML has succeeded in capturing the interest and imagination of a new generation of developers who aren’t tainted with old-fashioned views on the irreconcilable differences between structured and unstructured data.
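To make the first two points concrete, here is a minimal sketch in Python using only the standard library's xml.etree.ElementTree module. The vocabulary (notice, text, data, transaction and their attributes) is invented purely for illustration and does not come from any real schema; the handler names are likewise hypothetical. It shows one small document carrying text, data, a transaction, and metadata side by side, with processes bound to each content type by element name rather than by application.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: every element and attribute name here is
# illustrative only and is not taken from any real DTD or schema.
DOC = """\
<notice id="n-17" author="service-desk">
  <text>The replacement part ships within <days>3</days> days.</text>
  <data name="part-number">TX-200</data>
  <transaction type="subscribe" topic="part-updates"/>
</notice>
"""


def render_text(elem):
    # Flatten mixed content (prose plus inline data) into a readable string.
    return "".join(elem.itertext()).strip()


def read_data(elem):
    # Return the structured value as a (name, value) pair.
    return (elem.get("name"), elem.text)


def read_transaction(elem):
    # A transaction is described entirely by its attributes in this sketch.
    return dict(elem.attrib)


# Processes are bound to content by element name, so any client or server
# that can parse XML could supply its own handlers for the same document.
HANDLERS = {
    "text": render_text,
    "data": read_data,
    "transaction": read_transaction,
}


def process(xml_source):
    root = ET.fromstring(xml_source)
    results = {"metadata": dict(root.attrib)}
    for child in root:
        handler = HANDLERS.get(child.tag)
        if handler is not None:
            results[child.tag] = handler(child)
    return results


result = process(DOC)
```

Nothing in the document itself says which program should consume it; a different application could register a different table of handlers against the same markup, which is the sense of "application neutral" intended here.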

For years we have argued that the basic elements of information technology content needed to evolve further before our systems could make the next leap forward in utility. The evolution seemed to have halted after progressing from bits to bytes to ASCII. It looks like XML is finally taking us to the next step. As Lauren Wood has pointed out (Gilbane Report, Vol. 6, No. 4: The W3C DOM – A Programmer’s View of Documents), XML is best viewed as the new ASCII of the internet. The success of XML won’t be measured by vendor product revenues, but by the amount of content in XML (for more on this see Tim Bray’s article in the Gilbane Report, Vol. 6, No. 6: 1998 – A lot of Extensible Markup).

E-commerce

XML doesn’t equal e-commerce, but it provides the lubrication necessary for the wheels to turn. To use a popular term among analysts, it provides for “frictionless” content and transaction communication. Content and transactions can be understood and acted on by different and multiple chunks of code on clients or servers.

E-commerce will obviously drive technology development because of its direct relationship with revenue generation; it would be pretty difficult to come up with a more visible IT project in your organization. Obtaining funding for e-commerce efforts that are well thought out should be easy, that is, if there isn’t already a directive from above to get cracking. What is more interesting and slightly subtler is how e-commerce will affect other areas of information technology. The breadth of its influence is hard to overestimate.

The most basic requirement of an e-commerce application is to be able to present configurable data and content to a customer that is complete enough for them to find out what they need to know to make a purchase, and then to make the purchase on the spot. Whether your product is hard (e.g., a toaster) or soft (e.g., information) there is at least textual/graphical content (e.g., product description), data (e.g., price), and an interactive transaction (e.g., filling out the order form) involved. All of these are likely to have metadata associated with them (e.g., toaster color, price discount, order history).
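As a rough illustration of these pieces working together, the following Python sketch (standard library only) encodes the toaster example as a catalog entry and turns it into an order. The element names, attribute names, SKU, and prices are all made up for the example, and build_order is a hypothetical helper rather than part of any product or standard.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical catalog entry: description is textual content, price is data,
# and the attributes (sku, color, discount-percent) are metadata.
CATALOG_ENTRY = """\
<product sku="T-100" color="chrome">
  <description>Two-slice toaster with a removable crumb tray.</description>
  <price currency="USD" discount-percent="10">40.00</price>
</product>
"""


def build_order(entry_xml, quantity):
    """Create an order element (the transaction) from a catalog entry."""
    product = ET.fromstring(entry_xml)
    price = product.find("price")
    unit = Decimal(price.text)
    # The discount is metadata on the price; assume none if it is absent.
    discount = Decimal(price.get("discount-percent", "0")) / Decimal(100)
    total = (unit * (1 - discount) * quantity).quantize(Decimal("0.01"))

    order = ET.Element("order", sku=product.get("sku"), quantity=str(quantity))
    total_elem = ET.SubElement(order, "total", currency=price.get("currency"))
    total_elem.text = str(total)
    return order


order = build_order(CATALOG_ENTRY, 2)
order_xml = ET.tostring(order, encoding="unicode")
```

Because the price is read from the catalog entry each time an order is built, a price change in the source flows straight into the next transaction, which is the behavior we are calling dynamic.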

We also need additional kinds of transactions for e-commerce. We need content-oriented two-way transactions: not so a customer can change a price, but so a customer can specify a category of product they want updates on, or send a message to customer service or accounting.


E-commerce systems also have to be able to talk to your other IT systems in manufacturing, accounting, marketing, etc. They need to be able to share the various and changing content types in each of these systems and to be able to further process the different types of transactions. The bar is already getting pretty high in terms of dynamic content and interaction (e.g., Dell, Amazon, and Federal Express). The only significant difference between these advanced e-commerce applications and a more constrained business-to-business application with a supplier is scale; there is no difference in the need to manage multiple types of dynamic content. To integrate effectively with an e-commerce application, these other systems will have to have more or less the same ability to recognize and process varying content types automatically. This is why ERP vendors are interested in XML.

Building advanced e-commerce systems is still very difficult today, which is not surprising when you think about what is involved, but there is no stopping the demand or the development resources targeted at e-commerce. It will get easier as more products come to market. This is not entirely new territory. Business-to-business e-commerce is a direct descendant of the large-scale technical documentation/PDM/EDI/supply chain efforts in defense and aerospace, such as the ambitious CALS program, and to some degree in automotive, telecommunications, and electronics.

E-commerce will be the archetypal application of the next few years. The earlier archetypal systems we discussed were “document” oriented, but that qualifier is no longer relevant. Use e-commerce functionality as a yardstick. You don’t have to do everything at once, but if you want your IT strategy to be in sync with where IT development is headed, don’t design systems incapable of managing dynamic content, even if they are not part of an e-commerce solution.

— Frank Gilbane