Curated for content, computing, and digital experience professionals

Should you Migrate from SGML to XML?

An old colleague of mine from more than a dozen years ago found me on LinkedIn today, and within five minutes we were caught up after a gap of several years. I know, reestablishing lost connections happens all the time on social media sites. I just get a kick out of it every time it happens. But this is the XML blog, not the social media one, so…

My colleague works at a company that has been using SGML & XML technology for more than 15 years. Their data is still in SGML. They feel they can always export to XML and do not plan to migrate their content and applications to XML any time soon. The funny thing was that he was slightly embarrassed about still being in SGML.

Wait a minute! There is no reason to think SGML is dead and has to be replaced. Not in general. Maybe for specific applications a business case supports the upgrade, but it doesn’t have to every time. Not yet.

I know of several organizations that still manage data in the SGML they developed years ago. Early adopters, like several big publishers, some state and federal government agencies, and financial firms, built their systems when there was only one choice. SGML, like XML, is a structured format. They are very, very similar. One format can be used to create the other very easily. These organizations have already sunk their investment into developing the SGML system and data, as well as training their users in its use. The incremental benefits of moving to XML do not support the costs of the migration. Not yet.
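To give a feel for how close the two formats are, here is a minimal Python sketch that turns a small SGML-style fragment into XML. It is a toy under stated assumptions, not a real converter: the element names, the `OMISSIBLE_END` rule table, and the `sgml_to_xml` helper are all made up for illustration, and Python's tolerant `html.parser` stands in for an SGML parser. A production migration would use a DTD-aware SGML parser (the `osx` converter in OpenSP is one such tool) rather than anything like this.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Hypothetical rule table: elements whose end tag an imaginary DTD lets authors omit.
OMISSIBLE_END = {"para", "item"}


class SgmlToXml(HTMLParser):
    """Toy normalizer: lowercases names, quotes attributes, restores end tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # pieces of the XML output
        self.open = []   # stack of currently open element names

    def _close_top(self):
        self.out.append(f"</{self.open.pop()}>")

    def handle_starttag(self, tag, attrs):
        # A new <para> implicitly closes a <para> that is still open.
        if self.open and self.open[-1] == tag and tag in OMISSIBLE_END:
            self._close_top()
        rendered = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        self.out.append(f"<{tag}{rendered}>")
        self.open.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open:
            return  # stray end tag: ignored in this sketch
        # Close everything down to and including the named element.
        while self.open:
            top = self.open[-1]
            self._close_top()
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))


def sgml_to_xml(text):
    parser = SgmlToXml()
    parser.feed(text)
    parser.close()
    while parser.open:  # close anything left open at end of input
        parser._close_top()
    return "".join(parser.out)


print(sgml_to_xml("<MEMO STATUS=draft><PARA>First<PARA>Second</MEMO>"))
```

The uppercase names, the unquoted attribute, and the omitted paragraph end tags are all legal in many SGML applications; the output has lowercase names, a quoted attribute, and every element explicitly closed.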

This brings up my main point: structured data can be managed in many forms, including XML, SGML, XHTML, databases, and probably others. The data may be structured and follow rules for hierarchy, occurrence, and data typing without being managed as XML, and be exported as XML only when needed. My personal opinion is that XML stored in databases provides one of the best combinations of structured content management features, but different business needs suggest a variety of approaches may be suitable. Flat files stored in folders and formatted in old-school SGML might still be enough and not warrant migration. Then again, it depends on the environment and the business objectives.
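As a rough illustration of "structured, but only exported as XML when needed", the Python sketch below keeps content as plain records, checks a couple of structural rules, and serializes to XML on demand. The record layout, the element names, and the `validate` and `export_xml` helpers are assumptions invented for this example; a real system would have its own schema and storage.

```python
import xml.etree.ElementTree as ET

# Hypothetical records, as they might sit in a database or other non-XML store.
records = [
    {"id": "s1", "title": "Scope", "paras": ["Applies to all filings."]},
    {"id": "s2", "title": "Definitions", "paras": ["Term A.", "Term B."]},
]


def validate(record):
    """Enforce simple occurrence rules: one id, one title, one or more paragraphs."""
    for key in ("id", "title"):
        if not record.get(key):
            raise ValueError(f"record is missing required field: {key}")
    if not record.get("paras"):
        raise ValueError(f"section {record['id']} needs at least one paragraph")


def export_xml(records):
    """Build the XML hierarchy only at export time."""
    root = ET.Element("document")
    for record in records:
        validate(record)
        section = ET.SubElement(root, "section", id=record["id"])
        ET.SubElement(section, "title").text = record["title"]
        for text in record["paras"]:
            ET.SubElement(section, "para").text = text
    return ET.tostring(root, encoding="unicode")


print(export_xml(records))
```

The rules for hierarchy and occurrence live in the application here, not in the file format, which is the point: the same records could just as easily be written out as SGML.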

When XML first came out, someone coined the phrase that SGML stood for “Sounds Good, Maybe Later” because it was more expensive and difficult to implement. XML is more Web aware and is somewhat more clearly defined, and therefore tools operate more consistently. Many organizations that felt SGML could not be justified were later able to justify migrating to XML. Others migrated right away to take advantage of the new tools or related standards. XML also eliminates some features of SGML that never seemed to work right. It demands well-formed data, which reduces ambiguity and simplifies a few things. And tools have come a long way and are much more numerous, as expected.
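Here is a small Python sketch of what the well-formedness requirement means in practice, using the standard library's `xml.etree.ElementTree` parser. The sample fragments and the `is_well_formed` helper are made up for illustration; the SGML-style ones rely on shortcuts (omitted end tags, unquoted attribute values) that an SGML DTD can permit but XML never does.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if an XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# SGML-style markup: the <para> end tags are omitted.
omitted_end_tags = "<memo><para>First<para>Second</memo>"

# SGML-style markup: the attribute value is not quoted.
unquoted_attribute = "<memo status=draft><para>First</para></memo>"

# The same content with every tag closed and every attribute quoted.
xml_style = '<memo status="draft"><para>First</para><para>Second</para></memo>'

for sample in (omitted_end_tags, unquoted_attribute, xml_style):
    print(is_well_formed(sample), sample)
```

Because an XML parser can decide this without consulting a DTD, tools have far less room to disagree about what a document means.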

XML is definitely more successful in terms of number and range of applications and XML adoption is an easier case to make today than SGML was back in the day. But many existing SGML applications still have legs. I would not suggest that a new application start off with SGML today, but I might modify the old saying to “Sounds Good, Migrate Later”.

So, when is it a good idea to migrate from SGML to XML? Many tools do things with XML data better than they do with other structured forms. Many XML tools support SGML as well, but database systems can now manage content as an XML data type and use XPath to address XML nodes in processing. Wikis and other tools can produce XML content and use other XML-based standards, but not SGML, as far as I am aware. If you want to take advantage of features of Web or XML tools, you might want to start planning your migration. But if your system is operational and stable, the benefits might not yet justify the investment and disruption of migrating. Not yet! </>
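For readers who have not used XPath, the Python sketch below shows the style of node addressing mentioned above. It runs against an in-memory document with the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath; database products expose their own XML types and query functions, which are not shown here. The sample document and its element names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical content; in a database this would live in an XML-typed column.
manual = ET.fromstring(
    "<manual>"
    '<section status="final"><title>Install</title></section>'
    '<section status="draft"><title>Upgrade</title></section>'
    '<section status="draft"><title>Migrate</title></section>'
    "</manual>"
)

# Select the titles of every section still marked as draft.
draft_titles = [
    title.text
    for title in manual.findall(".//section[@status='draft']/title")
]

print(draft_titles)
```

A one-line path expression replaces what would otherwise be custom code to walk the hierarchy, which is the kind of tool benefit that can tip a migration decision.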



  1. Bill Trippe

    Hi Dale,
    Terrific entry. As always, you find the right frame to explain a complex topic quickly and accurately.
    One of my clients has asked this very question again and again. They have a mature, stable, and highly productive system that manages SGML content totaling several hundred thousand pages. They have push-button publishing to print (PDF) and HTML, and they have been developing mechanisms to create different variants of HTML and XML on the fly for licensees and partners. Their vendors are stable and support them very well, and their technical team has documented the systems and processes such that new enhancements can be made efficiently and predictably.
    As a result, when they have thoughtfully asked themselves the question, is the time right to convert to XML, they keep saying no. One possibility for them now is to create an XML layer on top of the SGML management system–either through high performance dynamic rendering of XML or through some regular batch rendering. They may undertake this to speed some development of XML and HTML licensing opportunities they see developing.

  2. Larry Porter

    Nice BLOG.
