Recently in Authoring Category

If you have been following recent XML Technologies blog entries, you will notice we have been talking a lot lately about XML Smart Content, what it is and the benefits it can bring to an organization. These include flexible, dynamic assembly for delivery to different audiences, search optimization to improve customer experience, and improvements for distributed collaboration. Great targets to aim for, but you may ask are we ready to pursue these opportunities? It might help to better understand the technology landscape involved in creating and delivering smart content.

The figure below illustrates the technology landscape for smart content. At the center are fundamental XML technologies for creating modular content, managing it as discrete chunks (with or without a formal content management system), and publishing it in an organized fashion. These are the basic technologies for "one source, one output" applications, sometimes referred to as Single Source Publishing (SSP) systems.

SCLandscape.jpg

The innermost ring contains capabilities that are needed even when using a dedicated word processor or layout tool, including editing, rendering, and some limited content storage capabilities. In the middle ring are the technologies that enable single-sourcing content components for reuse in multiple outputs. They include a more robust content management environment, often with workflow management tools, as well as multi-channel formatting and delivery capabilities and structured editing tools. The outermost ring includes the technologies for smart content applications, which are described below in more detail.

It is good to note that smart content solutions rely on structured editing, component management, and multi-channel delivery as foundational capabilities, augmented with content enrichment, topic component assembly, and social publishing capabilities across a distributed network. Descriptions of the additional capabilities needed for smart content applications follow.

Content Enrichment / Metadata Management: Once a descriptive metadata taxonomy is created or adopted, its use for content enrichment will depend on tools for analyzing and/or applying the metadata. These can be manual dialogs, automated scripts and crawlers, or a combination of approaches. Automated scripts can be created to interrogate the content to determine what it is about and to extract key information for use as metadata. Automated tools are efficient and scalable, but generally do not apply metadata with the same accuracy as manual processes. Manual processes, while ensuring better enrichment, are labor intensive and not scalable for large volumes of content. A combination of manual and automated processes and tools is the most likely approach in a smart content environment. Taxonomies may be extensible over time and can require administrative tools for editorial control and term management.
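To make the idea of an automated enrichment script concrete, here is a minimal sketch in Python. It assumes a tiny, hypothetical taxonomy (the terms and cue words in TAXONOMY are invented for illustration) and a generic topic structure, not any particular schema or product. It interrogates the text of a topic, proposes taxonomy terms, and marks them as machine-applied so that a person can review them later, in keeping with the combined manual and automated approach described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled taxonomy: term -> cue words that suggest it.
# A real taxonomy would be far larger and maintained with editorial tools.
TAXONOMY = {
    "installation": ["install", "setup"],
    "troubleshooting": ["error", "fails", "fix"],
}


def suggest_terms(topic_xml: str) -> list[str]:
    """Return the taxonomy terms whose cue words appear in the topic text."""
    root = ET.fromstring(topic_xml)
    text = " ".join(root.itertext()).lower()
    return sorted(
        term for term, cues in TAXONOMY.items()
        if any(cue in text for cue in cues)
    )


def enrich(topic_xml: str) -> str:
    """Append a <metadata> element holding the suggested keywords.

    Each keyword is flagged source="auto" so a reviewer can confirm or
    correct it by hand (the element and attribute names are assumptions).
    """
    root = ET.fromstring(topic_xml)
    meta = ET.SubElement(root, "metadata")
    for term in suggest_terms(topic_xml):
        ET.SubElement(meta, "keyword", {"source": "auto"}).text = term
    return ET.tostring(root, encoding="unicode")


SAMPLE = (
    "<topic id='t1'><title>Install the pump</title>"
    "<body>If setup fails, check the cable.</body></topic>"
)

if __name__ == "__main__":
    print(suggest_terms(SAMPLE))
    print(enrich(SAMPLE))
```

Simple substring matching like this is exactly the kind of automation that scales well but misfires often, which is why the sketch records where each keyword came from rather than treating it as final.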

Component Discovery / Assembly: Once data has been enriched, tools for searching and selecting content based on the enrichment criteria will enable more precise discovery and access. Search mechanisms can use metadata to improve search results compared to full text searching. Information architects and organizers of content can use smart searching to discover what content exists, and what still needs to be developed to proactively manage and curate the content. These same discovery and searching capabilities can be used to automatically create delivery maps and dynamically assemble content organized using them.
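As a rough illustration of metadata-driven discovery and assembly, the following Python sketch selects topics by their metadata and then generates a delivery map from the results. The Topic class, the metadata keys (product, audience), and the sample library are all hypothetical; the output is only loosely modeled on a DITA map (map, title, and topicref elements), not a complete or validated one.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A content component plus its descriptive metadata (illustrative only)."""
    topic_id: str
    title: str
    metadata: dict[str, str] = field(default_factory=dict)


def find_topics(topics, **criteria):
    """Return topics whose metadata matches every given key/value pair."""
    return [
        t for t in topics
        if all(t.metadata.get(key) == value for key, value in criteria.items())
    ]


def build_map(title, topics):
    """Assemble the selected topics into a simple map, in the order given."""
    root = ET.Element("map")
    ET.SubElement(root, "title").text = title
    for t in topics:
        ET.SubElement(root, "topicref", {"href": f"{t.topic_id}.dita"})
    return ET.tostring(root, encoding="unicode")


# Hypothetical component library.
LIBRARY = [
    Topic("t1", "Install the pump", {"product": "P100", "audience": "installer"}),
    Topic("t2", "Pump overview", {"product": "P100", "audience": "customer"}),
    Topic("t3", "Install the valve", {"product": "V200", "audience": "installer"}),
]

if __name__ == "__main__":
    selected = find_topics(LIBRARY, product="P100", audience="installer")
    print(build_map("P100 installer guide", selected))
```

The same query that helps an information architect see what exists (and what is missing) for a product and audience also drives the assembly step, which is the point made above.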

Distributed Collaboration / Social Publishing: Componentized information lends itself to a more granular update and maintenance process, enabling several users to simultaneously access topics that may appear in a single deliverable form to reduce schedules. Subject matter experts, both remote and local, may be included in review and content creation processes at key steps. Users of the information may want to "self-organize" the content of greatest interest to them, and even augment or comment upon specific topics. A distributed social publishing capability will enable a broader range of contributors to participate in the creation, review and updating of content in new ways.

Federated Content Management / Access: Smart content solutions can integrate content without duplicating it in multiple places, rather accessing it across the network in the original storage repository. This federated content approach requires the repositories to have integration capabilities to access content stored in other systems, platforms, and environments. A federated system architecture will rely on interoperability standards (such as CMIS), system agnostic expressions of data models (such as XML Schemas), and a robust network infrastructure (such as the Internet).

These capabilities address a broader range of business activity and, therefore, fulfill more business requirements than single-source content solutions. Assessing your ability to implement these capabilities is essential in evaluating your organization's readiness for a smart content solution.

What is hot in XML these days? I have been to a few conferences and meetings, talked with many clients, participated in various research projects, and developed case studies on emerging approaches to XML adoption. DITA (Darwin Information Typing Architecture) is hot. Semantically enriched XML is hot. Both enable some interesting functionality for content delivered via print, on the web, and through mobile delivery channels. These include dynamic assembly of content organized into a variety of forms for custom uses, improved search and discovery of content, content interoperability across platforms, and distributed collaboration in creating and managing content.

On November 30, prior to the Gilbane Conference in Boston, Geoff Bock and I will be holding our third workshop on Smart Content, which is how we refer to semantically enriched, modular content (it's easier to say). In the seminar we will discuss what makes content smart and how it is being developed and deployed in several organizations, and we will dive into some technical details on DITA and semantic enrichment. This highly interactive seminar has been well received in prior sessions, and will be updated with our recently completed research findings. More information on the seminar is available at http://gilbaneboston.com/workshops.html.

By the way, the research report, entitled Smart Content in the Enterprise, is now available at the research section at Gilbane.com. It includes several interesting case studies from a variety of organizations, and a lot of good information for those considering taking their content to the next level. We encourage you to download it (it is free). I also hope to see you in Boston at the workshop.

Authoring in a structured text environment has traditionally been done with dedicated structured editors. These tools enable validation and user assisted markup features that help the user create complete and valid content. But these structured editors are somewhat complicated and unusual and require training in their use for the user to become proficient. The learning curve is not very steep but it does exist.

Many organizations have come to see documentation departments as a process bottleneck and try to engage others throughout the enterprise in the content creation and review processes. Engineers and developers can contribute to documentation and have a unique technical perspective. Installation and support personnel are on the front lines and have unique insight into how the product and related documentation is used. Telephone operators not only need the information at their fingertips, but can also augment it with comments and ideas that occur while supporting users. Third-party partners and reviewers may also have a unique perspective and role to play in a distributed, collaborative content creation, management, review, and delivery ecosystem.

Our recently completed research on XML Smart Content in the Enterprise indicates that as we strive to move content creation and management out of the documentation department silo, we will also need to consider how the data is encoded and the usefulness of the data model in meeting our expanded business requirements. Smart content is multipurpose content designed with several uses in mind. Smart content is modular to support being assembled in a variety of forms. And smart content is structured content that has been enriched with semantic information to better identify its topic and role to aid processing and searching. For these reasons, smart content also improves distributed collaboration. Let me elaborate.

One of the challenges for distributed collaboration is the infrequency of user participation and, therefore, unfamiliarity with structured editing tools. It makes sense to simplify the editing process and tools for infrequent users. They can't always take a refresher course in the editor and its features. They may be working remotely, even on a customer site installing equipment or software. These infrequent users need structured editing tools that are designed for them. These collaboration tools need to be intuitive and easy to figure out, easily accessible from just about anywhere, and affordable, with flexible licensing that allows a larger number of users to participate in the management of the content. This usually means one of two things: either the editor will be a plug-in to a popular word processing system (e.g., MS Word), or it will be accessed through a thin-client browser, like a Wiki editor. In some environments, both may be needed in addition to traditional structured editing tools. Smart content modularity and enrichment allow the use of a variety of editing tools and flexibility in process design, thereby expanding who can collaborate from throughout the enterprise.

Also, infrequent contributors may not be able to master navigating and operating within a complex repository and workflow environment, for the same reasons of unfamiliarity. Serving up information to a remote collaborator might be enhanced with keywords and other metadata designed to optimize searching and access to the content. Even a little metadata can provide a lot of simplicity to an infrequent user. Product codes, version information, and a couple of dates would allow a user to home in on the likely content topics and select content to edit from a well-targeted list of search results. Relationships between content modules that are indicated in metadata can alert a user that when one object is updated, other related objects may need to be reviewed for potential update as well.
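The relationship alert mentioned above can be sketched in a few lines of Python. This assumes, purely for illustration, that each module's metadata lists the identifiers of the modules it relates to; the module names and the RELATIONS table are hypothetical. Given an updated module, the function returns every module linked to it in either direction, so the user sees what else may need review.

```python
def modules_to_review(updated_id, relations):
    """Return the modules related to updated_id, in either direction.

    relations maps a module id to the ids its metadata points at.
    """
    # Modules that the updated module points to.
    linked = set(relations.get(updated_id, ()))
    # Modules whose metadata points back at the updated module.
    linked |= {
        module_id for module_id, targets in relations.items()
        if updated_id in targets
    }
    linked.discard(updated_id)
    return sorted(linked)


# Hypothetical relationship metadata for three content modules.
RELATIONS = {
    "install-pump": ["pump-specs"],
    "pump-specs": [],
    "troubleshoot-pump": ["install-pump"],
}

if __name__ == "__main__":
    for module_id in modules_to_review("install-pump", RELATIONS):
        print(f"Review after update: {module_id}")
```

An infrequent contributor never needs to see the relationship data itself; the editing interface can simply present the resulting list when a change is saved.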

It is becoming increasingly clear that there is no one model for XML or smart content creation and editing. Just as a carpenter may have several saws, each designed for a particular type of cut, a robust structured environment for smart content may have more than one editor in use. It behooves us to design our systems and tools to meet the desired business processes and user functionality, rather than limit our processes to the features of one tool.

As part of next week's Gilbane Boston Conference, the XML practice will be delivering a pre-conference workshop, "Managing Smart Content: How to Deploy XML Technologies across Your Organization." The instructors will be Geoff Bock, Dale Waldt, Bill Trippe, Barry Schaeffer and Neal Hannon--a group of experts that represents decades of technical and management experience on XML initiatives.

A tip of the virtual hat to Senior Analyst Geoff Bock for organizing this.

Once Upon a Time...

... there was SVG. People were excited about it. Adobe and others supported it. Pundits saw a whole new graphical web that would leverage SVG heavily. Heck, I even wrote a book about it. 

Then things got quiet for a long time...

However, there are some signs that SVG might be experiencing a bit of a renaissance, if the quality of presentations at a recent conference is a strong indication. It's notable that Google hosted the conference and even more notable that Google is trying to bigfoot Microsoft into supporting SVG in IE, a move that would substantially boost SVG as an option for Web developers.

So a question for those out there interested in SVG. Where are some big projects out there? Are there organizations creating large bases of illustrations and other graphical content with SVG? I would love to talk to you and learn about your projects. You can email me or comment below.

UPDATE: Brad Neuberg of Google, who is quoted in the InfoWorld article linked above, sent along a link to a project at Google, SVG Web, a JavaScript library that supports SVG on many browsers, including Internet Explorer, Firefox, and Safari. According to the tool's website, using the library plus native SVG support, you can instantly target ~95% of the existing installed web base.

UPDATE: Ruud Steltenpool, the organizer for SVG Open 2009, sent a link to an incredibly useful compendium of links to SVG projects, tools, and other resources though he warns it is a little outdated.

I recently wrote a short Gilbane Spotlight article for the EMC XML community site about the state of Iowa going paperless (article can be found here) with regard to its Administrative Code publication. It got me to thinking, "When is a book no longer a book?"

Originally the admin code was produced as a 10,000 page loose-leaf publication service containing all the regulations of the state. For the last 10 years it has also appeared on the Web as PDFs of pages, and more recently, independent data chunks in HTML. And now they have discontinued the commercial printing of the loose-leaf version and only rely on the electronic versions to inform the public. They still produce PDF pages that resemble the printed volumes that are intended for local printing of select sections by public users of the information. But the electronic HTML version is being enhanced to improve reusability of the content, present it in alternative forms and integrated with related materials, etc. Think mashups and improved search capabilities. The content is managed in an XML-based Single Source Publishing system that produces all output forms.

I have migrated many, many printed publications to XML SSP platforms. Most follow the same evolutionary path regarding how the information is delivered to consumers. First they are printed. Then a second electronic copy is produced simultaneously with the print using separate production processes. Then the data is organized in a single database and reformatted to allow editing that can produce both print and electronic. Eventually the data gets enhanced and possibly broken into chunks to better enable reusing the content, but the print is still a viable output format. Later, the print is discontinued as the subscription list falls and the print product is no longer feasible. Or the electronic version is so much better that people stop buying the print version.

So back to the original question: when is it no longer a book? Is it when you stop printing pages? Or when you stop producing the content in page-oriented PDFs? Or does it have to do with how you manage and store the information?

Other changes take place in how the information is edited, formatted, and stored that might influence the answer to the question. For instance, if the content is still managed as a series of flat files, like chapters, and assembled for print, it seems to me that it is still a book, especially if it still contains content that is very book oriented, like tables of contents and other front matter, indexes, and even page numbers. Eventually, the content may be reorganized as logical chunks stored in a database, extracted for one or more output formats and organized appropriately for each delivery version, as in SSP systems. Print artifacts like TOCs may be completely generated and not stored as persistent objects, or they can be created and managed as build lists or maps (like with DITA). As long as one version is still book-like, IMHO it is still a book.
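To show what it means for a TOC to be generated rather than stored, here is a minimal Python sketch. It assumes the content chunks can be listed in reading order with a heading level and a title (the chunk titles below are invented placeholders). The numbered table of contents is computed at output time and never saved as an object of its own.

```python
def generate_toc(chunks):
    """Build numbered TOC lines from (level, title) pairs in reading order.

    Levels are 1-based; the TOC exists only as the return value.
    """
    counters = []  # Current section number at each heading level.
    lines = []
    for level, title in chunks:
        del counters[level:]              # Leaving a deeper section: reset it.
        while len(counters) < level:      # Entering a deeper section: start it.
            counters.append(0)
        counters[level - 1] += 1
        number = ".".join(str(n) for n in counters)
        lines.append("  " * (level - 1) + f"{number} {title}")
    return lines


# Hypothetical chunks as they might be extracted from a repository.
CHUNKS = [
    (1, "General Provisions"),
    (2, "Definitions"),
    (2, "Scope"),
    (1, "Licensing"),
]

if __name__ == "__main__":
    print("\n".join(generate_toc(CHUNKS)))
```

Reordering or removing a chunk changes the numbering automatically on the next run, which is what distinguishes a generated artifact from a persistent one that must be edited by hand.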

I would posit that once the printed versions are discontinued, and all electronic versions no longer contain print-specific artifacts, then maybe this is no longer a book, but simply content.

Anyone who works with XML has probably had to "sell" the idea of using the standard instead of alternative approaches, whether as an internal evangelist of XML or in a formal sales role. We have developed some pretty convincing arguments, such as automating redundant processes, quality checking and validation of content, reuse of content using a single source publishing approach, and so on. These types of benefits are easily understood by the technical documentation department or developers and administrators in the IT group. And they are easy arguments to make.

Even so, that leaves a lot of people who can benefit from the technology but may never need to know that XML is part of the solution. The rest of the enterprise may not be in tune with the challenges faced by the documentation department, and instead focus on other aspects of running a business, like customer support, manufacturing, fulfillment, or finance. If you tell them the software solution you want to buy has "XML Inside" they may stare off into space and let their eyes glaze over, even fall asleep. But if you tell them you have a way to reduce expensive customer support phone calls by making improvements to their public-facing Web content and capabilities, you might get more of their attention.

I have been around the XML community for a very long time, and we tend to look into our belly buttons for the meaning of XML. This is often done at the expense of looking around us and seeing what problems are out there before we start talking about solutions to apply to them. Everything looks like a nail because we have this really nifty hammer called XML. But when CD-ROMs were introduced, people didn't run around talking about the benefits of ISO 9660 (the standard that dictates how data is written to a CD). Okay, they did at first, to other technologists and executives in big companies adopting the standard, but rarely did the end consumer hear about the standard. Instead, we talked about the massive increase in data storage, and the flexibility of a consistent data storage format across operating systems. So we need to remember that XML is not what we want to accomplish, but rather how we may get things done to meet our goals. Therefore, we need to understand and describe our requirements in terms of these business drivers, not the tools we use to address them.

Part of the problem is that there are several potential audiences for the XML evangelism message, each with their own set of concerns and domain-specific challenges. End users want to get the work out the door in a timely manner, at the right quality level, with tools that are easy to use. Line Managers may add sensitivity to pricing, performance, maintenance and deployment costs, etc. These types of concerns I would classify as tactical departmental concerns focusing on operational efficiency (bottom line).

Meanwhile Product Managers, Sales, Customer Service, Fulfillment, Finance, etc. are more geared toward enterprise goals and strategies such as reducing product support costs, and increasing revenue, in addition to operational efficiency. Even stated goals like synchronizing releases of software and documentation, making data more flexible and robust to enable new Web and mobile delivery options, are really only supporting the efforts to achieve the first two objectives of better customer service and increased sales, which I would classify as strategic enterprise concerns.

The deft XML evangelist, to succeed in the enterprise discussion, needs to know about a lot more than the technology and processes in the documentation department, or he or she will be limited to tactical, incremental improvements. The boss may want, instead, to focus on how the data can be improved to make robust Web content that can be dynamically assembled according to the viewer's profile. Or how critical updates can be delivered electronically and as fast as possible, while the complete collection of information is prepared for more time consuming, but equally valuable printed delivery in a multi-volume set of books. Or how content can be queried, rearranged, reformatted and delivered in a completely new way to increase revenue. Or how a business system can automatically generate financial reporting information in a form accurate and suitable enough for submission to the government, but without the army of documentation labor used previously.

At Gilbane we often talk about the maturity of XML approaches, not unlike the maturity model for software. We haven't finalized a spectrum of maturity levels yet, but I think of XML applications as ad hoc, departmental, and enterprise in nature. Ad hoc is where someone decides to use an XML format for a simple process, maybe configuration files driving printers or other applications. Often XML is adopted with no formal training and little knowledge outside of the domain in which it is being applied.

Departmental applications tend to focus on operational efficiency, especially as it relates to creating and distributing textual content. Departmental applications are governed by a single department head but may interact with other groups and delivery feeds, but can standalone in their own environment.  An enterprise application of XML would need governance from several departments or information partners, and would focus on customer or compliance facing issues and possibly growth of the business. They tend to have to work within a broader framework of applications and standards.

Each of these three application types requires different planning and justification. For ad hoc use of XML it is usually up to the individual developer to decide if XML is the right format, if a schema will be needed, and what the markup and data model are, etc. Very little "selling" is needed here except as friendly debate between developers, architects and line managers. Usually these applications can be tweaked and changed easily with little impact beyond local considerations.

Departmental application of XML usually requires a team representing all stakeholders involved in the process, from users to consumers of the info. There may be some departmental architectural standards, but exceptions to these are easier to accommodate than with enterprise applications. A careful leader of a departmental application will look upstream and downstream in the information flow to include some of their needs. Also, they need to realize that the editing process in their department may become more complex and require additional skills and resources, but that these drawbacks are more than offset by savings in other areas, such as page layout or conversion to Web formats, which can be highly automated. Don't forget to explain these benefits to the users whose work just got a little more complicated!

An Enterprise solution is by definition tied to the business drivers of the enterprise, even if that means some decisions may seem like they come at the expense of one department over another. This is where an evangelist could be useful, but not if they only focus on XML instead of the benefits it provides. Executives need to know how much revenue can be increased, how many problem reports can be avoided in customer service, and whether they can meet regulatory compliance guidelines, etc. This is a much more complicated set of issues with dependencies on and agreement with other departments needed to be successful. If you can't provide these types of answers, you may be stuck in departmental thinking.

XML may be the center of my universe (my belly button so to speak), but it is usually not the center of my project's sponsor's universe. I have to have the right message to convince them to make a significant investment in the way their enterprise operates.

Structured Editing & Wikis

If you know me you will realize that I tend to revisit XML authoring tools and processes frequently. It is one of my favorite topics. The intersection of structured tools and messy human thinking and behavior is an area fraught with usability issues, development challenges, and careful business case thinking. And therefore, a topic ripe for discussion.

I had an interesting conversation with a friend about word processors and XML editors the other day. His argument was that the word processing product model may not be the best, and certainly isn't the only, way to prepare and manage structured content.

A word processor is software that has evolved to support the creation of documents. The word processing software model was developed when people needed to create documents, and then later added formatting and other features. This model is more than 25 years old (I remember using a word processor for the first time in college in 1980).

Of course it was logical to emulate how typewriters worked since the vast majority of information at the time was destined for paper documents. Now word processors include features for writing, editing, reviewing, formatting, and limited structural elements like links, indexes, etc. Again, all very document oriented. The content produced may be reused for other purposes if transformed in a post process (e.g., it could output HTML & PDF for Web, breaking into chunks for a repository or secondary use, etc.), but there are limits and other constraints, especially if your information is primarily designed to be consumed in print or document form.

It is easy to think of XML-structured editors, and the word processor software model they are based upon, as the most likely way to create structured content. But in my opinion, structured editors pay too much homage to word processing features and processes. I also think too many  project teams assume that the only way to edit XML content is in an XML document editor. Don't get me wrong, many people have successfully deployed XML editors and achieved targeted business goals, myself included, but I can point out many instances where an alternative approach to editing content might be more efficient.

Database tools that organize the information logically and efficiently are not likely to store that data as documents. For instance, you may have a financial system with a lot of info in relational fields that is extracted to produce printable documents like monthly statements, invoices, etc.

Or software manuals that are customized for specific configurations using reusable data objects and related document maps instead of maintaining the information as static, hierarchically-organized documents.

Or aircraft information that needs to match the configuration of a specific plane or tail number, selected from a complete library of data objects stored centrally.

Or statutes that start formatted as bills, then later appear as enacted laws, then later yet again as published, codified statutes, each with their own formatting and structural peccadilloes.

Or consider a travel guide publisher that collects information on thousands of hotels, restaurants, attractions, and services in dozens of countries and cities. Sure, the content is prepared with the intent of publishing it in a book, but it is easy to see how it can be useful for other uses, including providing hotel data to travel-related Web sites, or building specialized, custom booklets for special needs (e.g., a local guide for a conference, guides to historical neighborhoods, etc.). 

In these examples of what some might call database publishing, system designers need to ask themselves what would be the best tool for creating and maintaining the information. They are great candidates for a database, some application dialogs and wizards, and some extraction and transformation applications to feed Web and other platforms for consumption by users. They may not even involve an editor per se, but might rely entirely on a Wiki or other dialog for content creation and editing.

Word processors require a mix of skills, including domain expertise on the subject being written about, grammar and editing, and some formatting & design, use of the software itself, etc. While I personally believe everyone, not just teachers and writers, should be skilled in writing well and making documents look legible and appealing, I realize many folks are best suited for other roles. That is why we divide labor into roles. Domain experts (e.g., lawyers, aircraft engineers, scientists and doctors, etc.) are usually responsible for accuracy and quality of the ideas and information, while editorial and product support people clean up the writing and formatting and make it presentable. So, for domain experts, it may be more efficient to provide a tool that only manages the content creation, structuring, linking, organization, etc. with limited word processing capabilities, and leave the formatting and organization to the system or another department or automated style sheets.

In my mind, a Wiki is a combination of text functionality and database management features that allow content to be created and managed in a broader Web content platform (which also may include static pages, search interfaces, pictures, PDFs, etc.). In this model, the Web is the primary use and printing is secondary. Domain experts are not bothered with concepts like page layout, running heads, table of contents generation, justification & hyphenation, etc., much to the delight of the domain experts!

I am bullish on Wikis as content creation and management tools, even when the content is destined for print. I have seen some that hide much of the structure and technical "connective tissue" from the author, but produce well formatted, integrated information. The blogging tool I am using to create this article is one example of a Wiki-like interface that has a few bells and whistles for adding structure (e.g., keywords) dedicated to a specific content creation purpose. It only emulates word processing slightly with limited formatting tools, but is loaded with other features designed to improve my blog entries. For instance, I can pick a keyword from a controlled taxonomy from a pull-down list. And all within a Web browser, not a fat client editor package. This tool is optimized for making blog content, but not for, let's say, scientific papers or repair manuals. It is targeted for a specific class of users, bloggers. Similarly, XML-editors as we have come to know them, are more adept at creating documents and document chunks than other interfaces.

Honestly, on more than one occasion I have pounded a nail with a wrench, or tightened a bolt with the wrong kind of pliers. Usually I get the same results, but sometimes it takes longer or has a less desirable result than if I had used a more appropriate tool. The same is true for editing tools.

On a final note, forgive me if I make a gratuitous plug, but authoring approaches and tools will be the subject of a panel I am chairing at the Gilbane San Francisco conference in early June if you want to hear more.

As part of our Gilbane Onsite Technology Strategy Workshop Series, we are happy to announce a new workshop, Implementing DITA.

Course Description

DITA, the Darwin Information Typing Architecture, is an emerging standard for content creation, management, and distribution. How does DITA differ from other XML applications? Will it work for my vertical industry’s content, from technical documentation to training manuals, from scientific papers to statutory publishing? DITA addresses one of the most challenging aspects of XML implementation: developing a data model that can be used and shared with information partners. Even so, DITA implementation requires effective process, software, and content management strategies to achieve the benefits promised by the DITA business case: cost-effective, reusable content. This seminar will familiarize you with DITA concepts and terminology and describe business benefits, implementation challenges, and best practices for adopting DITA. How DITA enables key business processes will be explored, including content management, formatting & publishing, multi-lingual localization, and reusable open content. Attendees will be able to participate in developing an effective DITA content management strategy.

Audience

This is an introductory course suitable for anyone looking to better understand the DITA standard, terminology, processes, benefits, and best practices. A basic understanding of computer processing applications and production processes is helpful. Familiarity with XML concepts and publishing is helpful, but not required. No programming experience is required.

Topics Covered

  • The Business Drivers for DITA Adoption

  • DITA Concepts and Terminology

  • The DITA Content Model

  • Organizing Content with DITA Maps

  • Processing, Storing & Publishing DITA Content

  • DITA Creation, Management & Processing Tools

  • Multi-lingual Publishing with DITA

  • Extending DITA to work with Other Data Standards

  • Best Practices & Pitfalls for DITA Implementation

For more information and to customize a workshop just for your organization, please contact Ralph Marto by email or at +617.497.9443 x117

An old colleague of mine from more than a dozen years ago found me on LinkedIn today. And within five minutes we got caught up after a gap of several years. I know, reestablishing lost connections happens all the time on social media sites. I just get a kick out of it every time it happens. But this is the XML blog, not the social media one, so...

My colleague works at a company that has been using SGML & XML technology for more than 15 years. Their data is still in SGML. They feel they can always export to XML and do not plan to migrate their content and applications to XML any time soon. The funny thing was that he was slightly embarrassed about still being in SGML.

Wait a minute! There is no reason to think SGML is dead and has to be replaced. Not in general. Maybe for specific applications a business case supports the upgrade, but it doesn't have to every time. Not yet.

I know of several organizations that still manage data in the SGML they developed years ago. Early adopters, including several big publishers, some state and federal government applications, and financial systems, built on SGML when it was the only choice. SGML, like XML, is a structured format. They are very, very similar. One format can be used to create the other very easily. These organizations have already sunk their investment into developing the SGML system and data, as well as training their users in its use. The incremental benefits of moving to XML do not support the costs of the migration. Not yet.

This brings up my main point, that structured data can be managed in many forms. These include XML, SGML, XHTML, databases, and probably other forms. The data may be structured and follow rules for hierarchy, occurrence, data typing, and so on, yet not be managed as XML, only exported as XML when needed. My personal opinion is that XML stored in databases provides one of the best combinations of structured content management features, but different business needs suggest a variety of approaches may be suitable. Flat files stored in folders and formatted in old school SGML might still be enough and not warrant migration. Then again, it depends on the environment and the business objectives.
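As a rough illustration of "managed in one form, exported as XML when needed," here is a minimal Python sketch. The `sections` table, its columns, and the `document`/`section`/`title`/`para` element names are all invented for this example; they are assumptions, not a description of any particular system.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_sections(conn):
    """Build an XML document from rows in a (hypothetical) sections table."""
    root = ET.Element("document")
    rows = conn.execute(
        "SELECT id, title, body FROM sections ORDER BY position"
    )
    for sec_id, title, body in rows:
        # Hierarchy and occurrence rules live in this export code,
        # not in the stored form of the data.
        sec = ET.SubElement(root, "section", id=str(sec_id))
        ET.SubElement(sec, "title").text = title
        ET.SubElement(sec, "para").text = body
    return ET.tostring(root, encoding="unicode")


# In-memory database standing in for a real content store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sections ("
    "id INTEGER PRIMARY KEY, position INTEGER NOT NULL, "
    "title TEXT NOT NULL, body TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO sections (position, title, body) VALUES (?, ?, ?)",
    [(2, "Details", "Second section."), (1, "Overview", "First section.")],
)

xml_text = export_sections(conn)
print(xml_text)
```

The column constraints (`NOT NULL`, the ordering column) play the role of simple occurrence and sequence rules, while the XML only exists at the moment of export.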

When XML first came out, someone coined the phrase that SGML stood for "Sounds Good, Maybe Later" because it was more expensive and difficult to implement. XML is more Web aware and somewhat more clearly defined, and therefore tools operate more consistently. Many organizations that felt SGML could not be justified were later able to justify migrating to XML. Others migrated right away to take advantage of the new tools or related standards. XML also eliminates some features of SGML that never seemed to work right. It demands well-formed data, which reduces ambiguity and simplifies a few things. And tools have come a long way and are much more numerous, as expected.
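To make the well-formedness point concrete, here is a small Python sketch using the standard library parser. The sample markup is made up for illustration: the first string uses SGML-style shortcuts (an unquoted attribute value and omitted end tags) that an SGML parser could accept given a suitable DTD and declaration, while an XML parser must reject them.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# SGML-style shortcuts: unquoted attribute value, omitted end tags.
sgml_style = "<chapter id=c1><title>Intro<para>Some text.</chapter>"

# The same content with every tag closed and the attribute quoted.
xml_style = (
    '<chapter id="c1"><title>Intro</title>'
    "<para>Some text.</para></chapter>"
)

print(is_well_formed(sgml_style))  # False
print(is_well_formed(xml_style))   # True
```

Because an XML parser needs no DTD to decide where each element ends, any tool can read the second form without extra configuration.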

XML is definitely more successful in terms of number and range of applications and XML adoption is an easier case to make today than SGML was back in the day. But many existing SGML applications still have legs. I would not suggest that a new application start off with SGML today, but I might modify the old saying to "Sounds Good, Migrate Later".

So, when is it a good idea to migrate from SGML to XML? There are many tools available that do things with XML data better than they do with other structured forms. Many XML tools support SGML as well, but DBMS systems can now manage content as an XML data type and use XPath expressions in processing. Wikis and other tools can produce XML content and utilize other standards based on XML, but not SGML, as far as I am aware. If you want to take advantage of features of Web or XML tools, you might want to start planning your migration. But if your system is operational and stable, the benefits might not yet justify the investment and disruption from migrating. Not yet!
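For readers who have not seen the kind of XPath processing mentioned above, here is a minimal Python sketch. The `manual` document and its `audience` attribute are invented for the example, and the standard library's ElementTree supports only a limited subset of XPath, so treat this as an illustration rather than a picture of what a full XML database offers.

```python
import xml.etree.ElementTree as ET

# A small, made-up document; element and attribute names are assumptions.
doc = ET.fromstring("""
<manual>
  <section audience="admin"><title>Installing</title></section>
  <section audience="user"><title>Getting Started</title></section>
  <section audience="admin"><title>Backups</title></section>
</manual>
""")

# Select the titles of sections aimed at administrators.
admin_titles = [
    t.text for t in doc.findall("./section[@audience='admin']/title")
]
print(admin_titles)  # ['Installing', 'Backups']
```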
