Curated for content, computing, and digital experience professionals

Month: September 2006 (Page 2 of 2)

The Future of DITA

DITA (which stands for “Darwin Information Typing Architecture”) is the hottest new technology in the technical publishing market. While still early in its adoption cycle, it has the potential to become the de facto standard not only for technical publishing, but for all serious content management and dynamic publishing applications. Whether this happens, however, will depend on the vision and creativity of the DITA standards committee, DITA vendors, and DITA consultants.

While IBM originally designed DITA for technical documentation, its benefits are potentially transferable to encyclopedias, journal articles, mutual fund prospectuses, insurance policies, retail catalogs, and many, many other applications. But will it really be flexible enough to meet these other needs?

At Flatirons Solutions we’ve been testing the boundaries of DITA’s extensibility, taking DITA out of its comfort zone and thereby creating some interesting proof points for its flexibility. So far, the results are very positive. Four specific applications illustrate this:

  • User personalized documentation – designed to support a variety of enterprise content libraries out of a single set of specializations, this application involved the use of 15 conditional processing attributes to drive dynamic production of personalized documents. An initial DocBook-based prototype was later re-designed for DITA.
  • Scholarly research database – this solution involved marrying DITA with the venerable Text Encoding Initiative (TEI), a nearly 20-year-old scholarly markup standard originally written in SGML. DITA was used to split the historical material into searchable topics; TEI provided the rigorous scholarly markup and annotations.
  • Dynamic web publishing – designed for a large brokerage and business services firm, this application combines a single-source DITA-based authoring environment with an optimized dynamic processing pipeline that produces highly-personalized Web pages.
  • Commercial publishing – we are currently exploring the use of DITA for encyclopedia, journal, and textbook publishing, for clients who have traditionally focused on print, but who are now also moving to increasingly sophisticated electronic products.
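The conditional processing behind the first application works by tagging elements with attributes such as audience or product, then applying a filter file (a “ditaval”) at publishing time to include or exclude them. A minimal sketch, using standard DITA 1.0 constructs; the topic content and attribute values are invented for illustration:

```xml
<!-- backup.dita: one source, personalized for two audiences -->
<topic id="backup">
  <title>Backing up your data</title>
  <body>
    <p>Choose Backup from the File menu.</p>
    <p audience="administrator">Scheduled backups can also be
       configured from the server console.</p>
    <p audience="novice">If you are unsure, accept the default
       settings.</p>
  </body>
</topic>

<!-- admin.ditaval: filter applied at build time for the admin edition -->
<val>
  <prop att="audience" val="administrator" action="include"/>
  <prop att="audience" val="novice" action="exclude"/>
</val>
```

Multiply this by a dozen or more such attributes and a processing pipeline can assemble a different document for every reader from a single source.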

Of course, in pushing the boundaries we’ve also found issues. A classic example is the restriction in DITA’s “task” specialization that each step in a procedure must begin with a simple declarative statement. In the name of readability, a step cannot open with a list, multiple paragraphs, a table, or a note. But what do you do if your content breaks these rules? DITA’s answer is that you rewrite your content.
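Concretely, in the standard “task” model each step must open with a single command statement; supporting material such as notes, lists, or extra paragraphs can only follow it, not lead it. A sketch of a conforming step (the content itself is invented for illustration):

```xml
<task id="install">
  <title>Installing the software</title>
  <taskbody>
    <steps>
      <step>
        <!-- Each step must begin with one simple command statement -->
        <cmd>Run the installer.</cmd>
        <!-- Notes, lists, and extra paragraphs may only follow,
             wrapped in an <info> element -->
        <info>
          <note>Administrator rights are required.</note>
        </info>
      </step>
    </steps>
  </taskbody>
</task>
```

Legacy content that opens a step with, say, a cautionary table simply will not validate against this model without restructuring.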

Rewriting content is not unreasonable if you accept that you’re moving to DITA in order to adopt industry best practices. However, what if you don’t agree that DITA’s built-in “best practices” are the only way to write good content? Or what if you have 500,000 pages of legacy content, all of which need to be rewritten before they can conform to DITA? Would you still consider it practical?

You can solve this by making up your own “task” specialization, bypassing the constraints of the built-in “task” model. That’s an advantage of DITA. But if you do that, you’re taking a risk that you won’t be able to leverage future vendor product features based on the standard “task” specialization. And in other cases, such as limitations in handling print publishing, workarounds can be harder to find.

DITA 1.1 has made great progress toward resolving some of these issues. To be truly extensible, however, I believe that future versions of DITA will need to:

  • Add more “out-of-the-box” specialization types which DITA vendors can build into their tools (for example, generic types for commercial publishing).
  • Further generalize the existing “out-of-the-box” specialization types (for example, allowing more flexibility in procedure steps).
  • Better handle the packaging of content into published books, rather than focusing primarily on Web and Help output and then adapting that model for books.
  • Simplify the means to incorporate reusable content, handle “variables” within text, and link to related content.
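The reuse mechanism referred to in the last bullet is DITA’s conref attribute, which pulls an element into place by reference from another topic. The file and id names below are invented for illustration:

```xml
<!-- shared.dita: a library topic holding reusable elements -->
<topic id="shared">
  <title>Shared content</title>
  <body>
    <note id="warranty">Opening the case voids the warranty.</note>
  </body>
</topic>

<!-- In any other topic, an element of the same type pulls the
     shared note in by reference at processing time -->
<note conref="shared.dita#shared/warranty"/>
```

It works, but the addressing syntax and the requirement that the referencing and referenced elements match in type are exactly the kind of mechanics that could stand to be simplified.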

At conferences I’ve heard it suggested that if people don’t want to obey DITA’s particular set of rules, they should consider using another standard. I’ve even heard people say that DITA doesn’t need to focus on book publishing because print is “old school.” In my opinion, this kind of parochial thinking needs to be seriously reconsidered.

Today, DITA stands at the crossroads. If it can be aggressively generalized and extended to meet the needs of commercial publishers, catalog and promotional content, and financial services and other vertical industry applications, then it has the chance to be “the” standard in XML-based dynamic publishing. If this doesn’t happen, DITA runs the risk of being relegated to a relatively elite technical publishing standard that’s only useful if you meet its particular set of assumptions and rules.

As an industry, which way will we go?


Gilbane Group Launches Content Technology CTO Blog

For Immediate Release:

9/19/06

Analyst firm hosts chief technology officer blog for content management and information technology community 

Contact:
Mary Laplante
617.497.9443 ext 212
mary@gilbane.com

Cambridge MA, September 19, 2006. The Gilbane Group today announced that it has launched a blog for Chief Technology Officers (CTOs) who are involved in enterprise content applications, whether vendor, integrator, or enterprise implementer. The Content Technology CTO Blog is hosted by the Gilbane Group as a service to the content and information technology community. The purpose of the blog is to facilitate ongoing discussion and debate on technologies, approaches, and architectures relevant to enterprise content applications.

CTOs have a wealth of critical information about technologies that is not always accessible to enterprise customers. CTOs also have demanding jobs, and have limited time available to meet with each other, with customers, or with other industry influencers. This blog is intended to encourage communication both among vendor CTOs and between enterprise customer CTOs and vendor CTOs. All CTOs are invited to participate.

“Long gone are the days when analyst firms, marketing departments and VARs had a lock on product technology information channels. But it is still a challenge for many companies to find the in-depth technical expertise they need for strategic IT decisions,” said Frank Gilbane, CEO, Gilbane Group. “Our mission has always been to facilitate dialogue about information technologies between technologists, implementers, customers, investors, analysts and consultants. The CTO blog is a great addition to our open information community approach and complements our conferences, analyst blog, website, and other activities.”

Two CTO Blog charter authors have already contributed posts during the pre-launch testing. John Newton, a Documentum founder and now founder and CTO of Alfresco, provides a provocative take on “content management 2.0”. Vern Imrich, CTO of Percussion Software, shares insights into the apparent contradiction of content management technology moving up and down the technology infrastructure stack at the same time. CTO Blog posts are integrated with the Gilbane Blog, where you can read the full posts, see the list of topic areas, comment, or learn more.

Additional charter authors of the Content Technology CTO Blog include:

  • Bill Cava, Ektron
  • James Gonthier, Refresh
  • Jason Hunter, Mark Logic
  • Vern Imrich, Percussion
  • John Newton, Alfresco
  • Bjørn Olstad, FAST
  • Eric Severson, Flatirons Solutions
  • Carl Sutter, CrownPeak

Vern Imrich, CTO of Percussion Software, said, “This is an ideal time for the Gilbane Group to launch their Content Technology CTO Blog. The content management market is maturing in a variety of different directions, and organizations are looking closely at how they can apply content management to everything from basic storage and retrieval to new content-driven applications used to produce measurable line-of-business returns. Giving CTOs a place to comment on the issues surrounding this dynamic market will provide significant value. I’m happy to participate and eager to hear what my peers have to say.”

About Gilbane Group Inc.
The Gilbane Report serves the content technology community with publications, conferences and consulting services. The Gilbane Report also administers the Content Technology Works program disseminating best practices with partners Software AG (TECdax:SOW), Sun Microsystems (NASDAQ:SUNW), Artesia Digital Media, a Division of Open Text, Astoria Software, ClearStory Systems (OTCBB:INSS), Context Media (Oracle, NASDAQ: ORCL), Convera (NASDAQ:CNVR), IBM (NYSE:IBM), Idiom, Mark Logic, omtool (NASDAQ:OMTL), Open Text Corporation (NASDAQ:OTEX), SDL International (London Stock Exchange:SDL), Vasont Systems, Vignette (NASDAQ:VGN), and WebSideStory (NASDAQ:WSSI). https://gilbane.com

###

New CTO blog

Over the summer we came up with the idea of hosting a blog for CTOs from all parts of the content and information industry to debate technologies and architectures. We finally got around to launching the Content Technology CTO Blog today. Here is the press release, and more info on how it works and how to contribute. John Newton, CTO of Alfresco, and Vern Imrich, CTO of Percussion, already have posts up. Stop by and comment!

Acrobat Still Suffering from Schizophrenia

On Monday, in the wee hours of the night (my email was sent at 12:27 a.m.), Adobe emitted three short press releases announcing Acrobat 8. I’m a fan of Acrobat and PDF, so I always look forward to new versions of this ungainly but hugely popular product. Sadly, release #8, at first look-see, leaves me thoroughly unmoved.

The main press release captures the excitement behind the announcement: “The Acrobat 8 product line introduces several major innovations in the areas of document collaboration, PDF content reuse, PDF forms, packaging of multiple documents, and controlling sensitive information. For example, shared reviews put collaboration within the reach of virtually anyone with access to a shared network folder and Adobe Reader. A participant in a shared review can see comments posted by others, track the status of the review, and work even when not connected – reducing duplicated work and enabling large groups to collaborate more efficiently. Acrobat 8 also enables PDF content to be exported into popular formats to enable reuse and repurposing of content.”

Most of these “innovations” are just “new and improved” old features.

If you’re looking for news, press release #2 is where to turn. Macromedia Breeze is now called Acrobat Connect, and will be available at a lower price-point and to smaller groups of users than the old not-so-gentle Breeze. This represents the first fruit of Adobe’s $3.4 billion acquisition of Macromedia. How do ya like them apples?

Press release #3 reveals that Adobe will continue to nurse Acrobat and PDF through its severe case of schizophrenia. Acrobat 8 ($449 by itself) will also be bundled into the awkwardly named Adobe Creative Suite 2.3 Premium edition ($1199). This is “to enable creative and print professionals to efficiently create, collaborate with, and automate output of Adobe PDF files.”

A few years back, Adobe came close to abandoning this group of PDF enthusiasts (and major buyers of Creative Suite). It realized the gross error of its ways at the 11th hour, and now makes sure to invite them for tea each time there’s something new happening with Acrobat. The features that appeal to “creative and print professionals” bear little resemblance to the features that appeal to the “knowledge workers,” who remain the big buyers of Acrobat itself. So while each group is told a slightly different story, Acrobat’s schizophrenia has not blocked its ever-growing popularity.

Also in press release #3 we find the second instance of the fruit-bearing acquisition. Not surprisingly, Adobe has decided to jettison the never-very-successful GoLive from Creative Suite in favor of the incredibly successful Dreamweaver. In a somewhat unconvincing face-saving gesture, we’re informed that “Adobe will continue to develop GoLive as a standalone product.” Right. That’s until Adobe finishes getting a little cash off the GoLive orphans as they make the switch (“upgrade”) to Dreamweaver.

The word on the street is that Creative Suite itself will be upgraded to V3 by early next year. Perhaps then the flaccid features of Acrobat 8 will start to make more sense. Or maybe some knowledge workers will acquire some knowledge, enough to tell us why this upgrade was released.

Index Data Releases Zebra 2.0

Index Data has released Zebra 2.0, a major upgrade of its open source database server and indexing engine. The upgrade makes index profiling much easier, supports finer tuning of search results, incorporates XML technology into core functionality, and increases performance. Highlights of Zebra 2.0 over the 1.3 version include:

  • A 64-bit index structure, eliminating the 2 GB limit on register file size
  • A new on-disk format providing increased stability and faster indexing and retrieval
  • A new record filter using XSLT transformations to drive both indexing and retrieval
  • Improved logging and analysis of external traffic, with log messages now printed in a style similar to Apache server logs
  • Revised and expanded documentation

Zebra 2.0 replaces the previous versions’ tight coupling to the Z39.50 BIB-1 attribute set with a new XML friendliness, making Zebra easy to use with XML-based formats such as Dublin Core, MODS, METS, MARCXML, OAI-PMH, and RSS. A new plug-in architecture allows skilled users to write their own record indexing and retrieval filters as loadable modules. The performance enhancements mean that Zebra can now index and search even faster than version 1.3: in one test, Zebra 2.0 built a 31 GB database of very large records in four elapsed hours on a 1.8 GHz dual AMD box, processing an average of 2.2 MB of data per second. http://www.indexdata.com/zebra

 


© 2024 The Gilbane Advisor
