The Gilbane Report: Volume 11, Number 10
XSL-FO: Ready for Prime Time?
January 2004
It's hard to believe it has been close to 20 years since the first attempts started to build a standard that would support the application of formatting characteristics and rules to descriptive markup. The ISO standard, DSSSL (Document Style Semantics and Specification Language) was a mammoth 10-year effort, and even the dramatically less ambitious US DoD Output Specification (known more widely but inaccurately as the FOSI for Formatting Output Specification Instance) was a multi-year endeavor by a large number of developers and subject matter experts. Back then it was difficult to convince either IT or programmers that this was a difficult or even interesting problem. It is still difficult, but happily there is now plenty of interest fueled by the years of exponential growth of XML content we can confidently predict.
We haven't been directly involved in these efforts for quite some time, but the industry is fortunate that many experts who were involved have continued to play an active role and have been joined by a diverse, talented, and generous group to work on this problem as part of the W3C's XSL-FO (Formatting Objects) committee and elsewhere. It is time for us all to pay more attention to how far we've come and what we still have left to do.
This month we are happy to welcome publishing technology industry veteran Thad McIlroy as a contributor. Thad has spent a lot of time looking at XSL-FO to see how he should advise both his publishing and software vendor customers, and takes a frank, refreshing, and illuminating look at XSL-FO from the perspective of the user.
XSL-FO: Ready for Prime Time?
I don't know why it is that every time I try to write about anything to do with the Web I find myself thinking of that old chestnut about the six learned blind men trying to describe an elephant. Each feels only a part of the elephant and describes it variously as a wall, a spear, a snake, a fan, a tree and a rope, none seeing the totality.
XSL-FO is just such an elephant. It is many different things, depending on which blind man you speak to. There is of course some truth in each of the disparate perspectives on XSL-FO, but I don't think that the true nature of this beast has been defined.
This article seeks to describe XSL-FO, its genesis and current status. More importantly, it tries to answer the questions: How important is XSL-FO, and what should I be doing about it now, or in the future? What are my alternatives?
Without giving away the punch line, I'll offer two early observations to guide you while reading further. XSL-FO is a complex and multi-faceted specification, and promises publishers a great many benefits. But I think we're still in the very early days.
What is XSL-FO?
XSL-FO is an attempt to add formatting capabilities to a data tagging structure (XML) that was not necessarily intended to concern itself directly with format, an attempt to add print formatting capabilities to data that was more likely intended primarily for electronic distribution, Web or otherwise. Whether the authors of the FO specification acknowledge it or not, it's also a stab at a universal page layout language, one that moves the publishing world far beyond the proprietary days of PageMaker versus QuarkXPress versus Adobe InDesign. It's the beginning of a standard database and variable data publishing tool. It could form the basis for the ultimate cross-media publishing tool. It's... it's... an elephant.
Just getting a clear definition of XSL-FO can be tricky. There's no ultimate controversy here, but there are significant differences of perspective and of language when XSL-FO is discussed.
What Does the W3C say?
First off, there really is not a separate standard called XSL-FO. It's really just XSL, or the eXtensible Stylesheet Language. FO stands for Formatting Objects, and formatting is really what XSL is mostly about. But adding confusion to the conceptualization of this beast is that XSL is really three different recommendations. As the W3C Web site declares:
The Extensible Stylesheet Language Family (XSL): XSL is a family of recommendations for defining XML document transformation and presentation. It consists of three parts:
- XSL Transformations (XSLT), a language for transforming XML
- The XML Path Language (XPath), an expression language used by XSLT to access or refer to parts of an XML document. (XPath is also used by the XML Linking specification)
- XSL Formatting Objects (XSL-FO), an XML vocabulary for specifying formatting semantics
For this article I'll use XSL to refer to the full specification, and XSL-FO (or just FO) to reference the specific subsection of the specification that deals with formatting objects.
OK, so far?
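For readers who have never seen the formatting vocabulary itself, here is a minimal sketch of what an FO document looks like and how path expressions pick parts out of it. The element names and the namespace come from the XSL 1.0 Recommendation; the page size, the master name "letter" and the text are arbitrary illustrations. Python's standard ElementTree module implements only a limited subset of XPath, so this stands in for, and is not, a full XPath processor.

```python
import xml.etree.ElementTree as ET

# A very small XSL-FO document: one page master, one page sequence, one block.
# The master name "letter" and the block text are illustrative choices.
FO_DOCUMENT = """\
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="letter"
                           page-height="11in" page-width="8.5in">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="letter">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="12pt">Hello, XSL-FO</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
"""

# Prefix-to-namespace mapping used by the path expressions below.
NS = {"fo": "http://www.w3.org/1999/XSL/Format"}

root = ET.fromstring(FO_DOCUMENT)

# Path expressions in ElementTree's XPath subset: every block anywhere in
# the tree, and every page master declared in the layout master set.
block_texts = [b.text for b in root.findall(".//fo:block", NS)]
master_names = [
    m.get("master-name")
    for m in root.findall("fo:layout-master-set/fo:simple-page-master", NS)
]

print(block_texts, master_names)
```

Nothing here formats a page. The document only describes the intended layout; turning it into pages is the job of a separate formatter.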
Schizophrenic Standards: What is XML Really For?
Imagine how tough it would be if your father was SGML, and your mother was the anarchy of the Internet! What a difficult time you could have trying to find your real purpose in life.
Someone asked me the other day: Does anyone really think about XML in the context of SGML anymore? Well they should, because that's clearly where it came from. XML is a pared-down version of SGML. According to the official 1.0 XML specification, "The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML."
As a clear descendant of SGML, XML might logically be thought to be primarily involved with documents and their expression. XSL was one of the original three XML standards. Where did things go wrong? With all of the energy of the Internet and the Web, and all of the mad greed of the late 1990s, before we knew it XML suddenly became primarily an enabler of commercial data transactions. It hardly seemed worth the trouble of expressing this data visually.
Eventually the limitations of HTML began to nag, and Cascading Style Sheets were dropped into the pot to improve graphic expression (on the Web) including that of XML-tagged data.
But still nothing to do with print. I remember clearly at Seybold Seminars in the late 90s I would ask Web CMS and system vendors if they had any print options. The quizzical look I got back said it clearly: Why would you want to do that?
Where Does XSL-FO Come From?
Fortunately some concerned participants in the W3C thought that it might be a good idea to make it possible to create professional-level print from XML-tagged data.
Perhaps the best source of information for the thinking behind XSL-FO comes from a fine article written by Stephen Deach for The Seybold Reports (What is XSL-FO and When Should I Use It, Vol. 2, No. 17, December 9, 2002). He points to three problems that the XSL Working Group faced:
- There was no language to describe the pagination of complex documents on the Web.
- There was no way to deal with long documents and complex layouts.
- The typography in CSS had been designed for browsers, not for print.
Deach then outlines five goals that motivated the group during XSL development, including maintaining a pure XML syntax; that the language be declarative, rather than procedural; a need to build on CSS2; to support cross-media publishing; and to match or exceed the typographic and layout features of existing page formatters.
The 1.0 XSL specification reveals more of XSL's ancestry: "XSL builds on the prior work on Cascading Style Sheets (CSS2) and the Document Style Semantics and Specification Language (DSSSL). While many of XSL's formatting objects and properties correspond to the common set of properties, this would not be sufficient by itself to accomplish all the goals of XSL. In particular, XSL introduces a model for pagination and layout that extends what is currently available and that can in turn be extended, in a straightforward way, to page structures beyond the simple page models described in this specification."
Some of you will have forgotten DSSSL (pronounced Dissle). It's an international standard: ISO/IEC 10179:1996(E). This is the SGML antecedent of XSL-FO, and it's very much about print. But the authors of DSSSL foresaw the electronic future and wrote that the specification is intended for use in a wide variety of SGML application environments, including both electronic publishing and conventional printing.
The authors of DSSSL also introduced the distinction between transformation and formatting that is essential to XSL. As they wrote: "The DSSSL conceptual model has two distinct processes: (1) a transformation process and (2) a formatting process. The two processes may be used in conjunction with each other, or each may be used alone."
Data Conversion Laboratory, in its Website glossary, writes: "XSL is a stylesheet language that gives us the ability to specify how data coded with XML will format on screen (emphasis added). This language was developed based on the ISO companion standard for SGML known as DSSSL."
On screen? What could they possibly mean, on screen? That's not what XSL is about. Or is it? As Deach describes in the cross-media objectives: "XSL should cover the basic presentation requirements for a wide range of display devices, including reflow or repagination for palmtop devices, and for the accessibility requirements that are now mandated by many governments."
Therein lies another example of this schizophrenia involving all things XML. Is the prime purpose print, or is it electronic presentation? OK, it's both. So can one standardized approach really address the cross-media challenge? Or will it meet the same fate as every other product or system that claims to handle cross-media? Failure. Adobe itself in the latest version of InDesign essentially admits that the cross-media dream has not worked out as previously expected. The cross-media feature of InDesign CS is to bundle up all the print text and graphics and ship them over to GoLive, a Web publishing application.
The Complexity Problem
XSL-FO is nothing if not complex. As Ken Holman puts it charitably in his very good tutorial "What is XSL-FO?" (available on XML.com), "The Recommendation itself is a rigorous, lengthy and involved technical specification... the document remains out of reach for many people who just want to write stylesheets and print their information." The more enthusiastic Rodolfo Raya, in an article "Using XSL-FO to Create Printable Documents" (on IBM's DeveloperWorks XML zone), suggests that "If you plan to master FO, you should learn on your own how to use the 56 different objects that comprise XSL-FO." Thanks for the suggestion, Rodolfo, but I'm a little too busy right now to study 56 new objects!
I've now read four or five XSL-FO tutorials, and my head is filled with fo namespaces, block area, reference areas, and tree structures, and I still don't know a thing.
To give you a little more meat than the above, let me add a couple of quotations from the 1.1 specification that provide context:
XSL is a language for expressing stylesheets. Given a class of arbitrarily structured XML documents or data files, designers use an XSL stylesheet to express their intentions about how that structured content should be presented; that is, how the source content should be styled, laid out, and paginated onto some presentation medium, such as a window in a Web browser or a hand-held device, or a set of physical pages in a catalog, report, pamphlet, or book.
An XSL stylesheet processor accepts a document or data in XML and an XSL stylesheet and produces the presentation of that XML source content that was intended by the designer of that stylesheet. There are two aspects of this presentation process: first, constructing a result tree from the XML source tree and second, interpreting the result tree to produce formatted results suitable for presentation on a display, on paper, in speech, or onto other media. The first aspect is called tree transformation and the second is called formatting. The process of formatting is performed by the formatter. This formatter may simply be a rendering engine inside a browser
XSL was developed to give designers control over the features needed when documents are paginated as well as to provide an equivalent "frame" based structure for browsing on the Web. To achieve this control, XSL has extended the set of formatting objects and formatting properties. In addition, the selection of XML source components that can be styled (elements, attributes, text nodes, comments, and processing instructions) is based on XSLT and XPath, thus providing the user with an extremely powerful selection mechanism.
The design of the formatting objects and properties extensions was first inspired by DSSSL. The actual extensions, however, do not always look like the DSSSL constructs on which they were based. To either conform more closely with the CSS2 specification or to handle cases more simply than in DSSSL, some extensions have diverged from DSSSL.
Both the accepted 1.0 specification (416 pages) and the 1.1 recommendation are available on the W3C site. Read 'em and weep.
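The two aspects the specification describes, tree transformation followed by formatting, are easier to picture with a small example. What follows is only a sketch of the first stage, written in plain Python as a stand-in for an XSLT stylesheet. The source element names (article, title, para), the master name and the styling values are hypothetical; the FO element and property names are from the XSL 1.0 Recommendation. The second stage is not shown, because it is whatever a separate formatter does with the resulting FO tree.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)  # serialize with the conventional fo: prefix


def fo(tag):
    """Return the namespace-qualified name of an FO element."""
    return f"{{{FO_NS}}}{tag}"


def transform(source_root):
    """Stage 1, tree transformation: build an FO result tree from a source tree.

    A real project would normally express these rules in XSLT; this function
    is a hand-written stand-in for illustration only.
    """
    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"),
                         {"master-name": "page"})
    ET.SubElement(page, fo("region-body"))

    sequence = ET.SubElement(root, fo("page-sequence"),
                             {"master-reference": "page"})
    flow = ET.SubElement(sequence, fo("flow"),
                         {"flow-name": "xsl-region-body"})

    # One fo:block per source child; titles get heavier styling.
    for node in source_root:
        block = ET.SubElement(flow, fo("block"))
        if node.tag == "title":
            block.set("font-size", "18pt")
            block.set("font-weight", "bold")
        block.text = node.text
    return root


# Hypothetical descriptive markup: it says what things are, not how they look.
source = ET.fromstring(
    "<article>"
    "<title>Ready for Prime Time?</title>"
    "<para>First paragraph.</para>"
    "</article>"
)

fo_tree = transform(source)
fo_xml = ET.tostring(fo_tree, encoding="unicode")
blocks = fo_tree.findall(f".//{fo('block')}")

# Stage 2, formatting, would hand fo_xml to an FO formatter to produce pages.
print(fo_xml)
```

The point of the split is that the source markup never mentions fonts or pages, and the FO tree never mentions titles or paragraphs. Everything that connects the two lives in the transformation.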
The Pagination Problem
As noted above, one of the objectives of the FO Working Group was to match or exceed the typographic and layout features of existing page formatters.
Existing page formatters are a large and diverse group. First there are the interactive applications, ranging from consumer-oriented software like Microsoft Publisher, through higher-end layout applications like QuarkXPress and Adobe InDesign. A second class of interactive applications has specialized market foci, for example toward technical documentation, like Adobe FrameMaker, or perhaps toward packaging applications, like the Artwork Systems software.
A more relevant third class of software is batch pagination systems. There are numerous players in this field, but most significant are the public domain TeX system, and the high-end proprietary systems XyEnterprise XPP, and Advent 3B2. Combined they have perhaps 50 years in the market, and hundreds of man-years of development. Their typographic and batch layout features are considered state of the art. When taken in tandem with the widely-lauded page layout and typographic sophistication of Adobe InDesign, it would be difficult to imagine that XSL-FO matches or exceeds the typographic and layout features of existing page formatters. But a Web search finds no tests that confirm or deny. It's a subject ripe for study. What features represent the state of the art? And what features are required for each pagination marketspace? Is there any objective data available?
Beyond today's state of the art, efforts continue to improve both the quality of typography and of pagination in software. For example Hermann Zapf's "About micro-typography and the hz-program" (Electronic Publishing, Vol. 6 (3), 283-288, September 1993) suggests a new algorithm to improve typographic quality. There have been enormous efforts over the years to optimize typographic appearance by adjusting letterspacing and kerning. For the first time, Zapf seeks to combine type scaling (minor adjustments to the width of letter forms) with kerning to create optimally spaced typographic output. As Zapf writes, "the hz-program works... partly based on a typographically acceptable expansion or condensing of letters, called scaling. Connected with this is a kerning program which calculates kerning values at 100 pairs per second. The kerning is not limited only to negative changes of space between two critical characters, but also allows in some cases positive kerning, which means the addition of space." As far as I know the hz-program has never been implemented in a commercial system, but if it were, the results could be dramatic.
At the same time a 2003 paper, "On the Pagination of Complex Documents" by Anne Bruggemann-Klein, Rolf Klein and Stefan Wohlfeil (R. Klein et al., Eds.: Computer Science in Perspective, LNCS 2598, pp. 49-68, Springer-Verlag, 2003) argues that "The pagination problem of complex documents is in placing text and floating objects on pages in such a way that each object appears close to, but not before, its text reference. Current electronic formatting systems do not offer the pagination quality provided by human experts in traditional book printing. One reason is that a good placement of text and floating objects cannot be achieved in a single pass over the input. We show that this approach works only in a very restricted document model; but in a realistic setting no online algorithm can approximate optimal pagination quality... We propose to use the total number of page turns necessary for reading the document and for looking up all referenced objects. This objective function can be optimized by dynamic programming, in time proportional to the number of text blocks times the number of floating objects."
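To make the dynamic programming idea concrete, here is a toy sketch. It is not the authors' algorithm or their document model; it is a deliberately simplified model of my own, and every name in it is hypothetical. Text is a run of unit-height lines, each figure has a height and the index of the line that references it (in nondecreasing order), every page has a fixed capacity, and a figure may share a page with its reference or come later, never earlier. The cost charged is one page turn for every page boundary that falls between a reference and its figure.

```python
from functools import lru_cache


def min_page_turns(n_lines, fig_heights, fig_refs, capacity):
    """Minimum total page turns between references and their figures.

    Simplifying assumptions (mine, for illustration only):
      * text consists of n_lines lines of height 1
      * fig_refs[k] is the line referencing figure k, in nondecreasing order
      * figures are placed in order, never on a page before their reference
      * returns float("inf") if no layout exists (e.g. a figure is too tall)
    """
    m = len(fig_heights)

    def pending(i, j):
        # Figures already referenced (line < i) but not yet placed (index >= j).
        # Each one costs the reader one more page turn at this page boundary.
        return sum(1 for k in range(j, m) if fig_refs[k] < i)

    @lru_cache(maxsize=None)
    def best(i, j):
        # State: the first i lines and first j figures are on earlier pages.
        if i == n_lines and j == m:
            return 0
        result = float("inf")
        used_by_figures = 0
        j2 = j
        while True:
            # Try every amount of text that fits next to figures j..j2-1.
            max_lines = min(n_lines - i, capacity - used_by_figures)
            for lines in range(max_lines + 1):
                i2 = i + lines
                if (i2, j2) == (i, j):
                    continue  # an empty page makes no progress
                if j2 > j and fig_refs[j2 - 1] >= i2:
                    continue  # a figure would precede its reference
                result = min(result, pending(i2, j2) + best(i2, j2))
            if j2 == m:
                break
            used_by_figures += fig_heights[j2]
            if used_by_figures > capacity:
                break
            j2 += 1
        return result

    return best(0, 0)


# Six lines, one 2-line figure referenced from line 1, 4-line pages:
# the figure fits on the same page as its reference.
same_page = min_page_turns(6, [2], [1], 4)

# Four lines and a full-page figure referenced from line 0:
# the figure has to go on a later page than its reference.
next_page = min_page_turns(4, [4], [0], 4)

print(same_page, next_page)
```

Even in this toy form the search looks at whole layouts rather than deciding page by page, which is the contrast with single-pass formatting that the paper draws.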
These are but two examples of ongoing attempts to improve both the typography and pagination of print documents. There is still much work to be done. It remains to be seen whether we can ever develop a fully-automated pagination system that will achieve optimal results without operator intervention for the vast majority of complex documents. And so one of the questions that surround XSL-FO is whether there will be value in an optional WYSIWYG formatter that would permit interactive tweaking as a final pagination phase.
The Cross-Media Challenge
I've read a couple of enthusiastic reports about XSL (such as those quoted above), and find myself ultimately thinking so what? There's no significant technological breakthrough in XSL, except perhaps the degree of innate multi-language support. The breakthrough is more commercial than technological: creating what was previously available only in proprietary systems in a system based on open (royalty-free) standards. I'm all for that, but forgive me if I don't offer a standing ovation. I need more.
The killer app for XSL is the opportunity to create the underpinnings for the broad cross-media delivery of content. But XSL just isn't there yet.
As Jacco van Ossenbruggen, Joost Geurts, Lynda Hardman and Lloyd Rutledge point out in their article, "Towards a Multimedia Formatting Vocabulary" (ACM 1-58113-680-3/03/0005), "Multimedia content providers need to publish their content for a wide variety of Web devices and to facilitate the creation of on-line presentations from content stored in structured XML documents or multimedia databases. To do this effectively, the well-known advantages of document engineering techniques need to be made applicable to multimedia content."
While they recognize the value of the W3C standard SMIL 2.0 for multimedia output, they argue that "it is difficult, however, to fully integrate (standards such as SMIL) in a complete document transformation processing chain. In order to achieve the desired processing of data-driven, time-based, media-centric presentations, the text-flow based formatting vocabularies used by style languages such as XSL, CSS and DSSSL need to be extended."
Beyond print and the Web, the concept of cross-media is growing as new applications come into focus. On February 3, 2004 the W3C announced the advancement of the Voice Extensible Markup Language (VoiceXML) Version 2.0 to Proposed Recommendation. VoiceXML uses XML to bring speech, touch-tone input, digitized audio, recording, telephony, and computer-human conversations to the Web. At the same time, structures proposed outside of the W3C, XUL (XML User Interface Language, pronounced Zool) and Microsoft's XAML (eXtensible Application Markup Language, pronounced Zammel), use XML encodings to simplify interface design.
The possibilities for extensive cross-media integration with XSL are huge; the realization is as yet extremely limited.
Limitations
By all accounts XSL-FO can be considered a robust system, at least for technical documents. There's very little information out there yet on what works best, and what doesn't really work.
Probably the most detailed paper around is Eliot Kimber of ISOGEN International's presentation "Using XSL Formatting Objects for Production-Quality Document Printing", offered at XML 2002 in Baltimore. As Kimber points out: "XSL Formatting Objects has unavoidable limitations from two principal causes: missing layout features and the limitations inherent in the two-step XML-to-FO-to-pages processing model." He says also that FO is not a full solution for index generation.
Kimber, while generally very much on XSL's side, points also to a range of specific limitations, including an inability to deal with:
- Text that flows around arbitrary curved areas (but text flowing around rectangular areas is possible using side floats). There are no extensions that satisfy this requirement.
- Page-location sensitive inclusion or exclusion of content. For example, there is no direct way to condition the text of a cross reference based on whether or not the target of the reference occurs on the same page as the reference itself. There are no extensions that satisfy this requirement.
- Any other presentation tuning semantics that require feedback.
Ken Holman echoes Kimber's theme when he writes, "Unfortunately there are many 'common' requirements that just couldn't be met with XSL-FO 1.0 that will be addressed in future versions. I understand that had the committee tried to add everything in the first version, it would never have been released due to feature creep. The first version was necessary to understand how it was going to be used." (The recommendations for version 1.1 were published in mid-December, but appear to be more of a bug fix for 1.0 than a new version.)
Advantages
It has always been a challenge to produce high quality print output from SGML (and then so too from XML). Hence the gargantuan effort with DSSSL. Specialized typesetting tools like Advent 3B2, Datalogics Composer, Arbortext Publisher and XyVision XPP provide (or provided) expensive SGML solutions, purely for the specialist. (Adobe FrameMaker+SGML was a much less expensive interactive offering.) By creating the XSL-FO standard, can we get away from the degree of expense and complexity demanded by SGML publishing solutions? I'm not certain. At the same time, where is the encouragement to move to lesser-cost software if the underlying system complexity only makes the user long for a professional vendor, willing to help make things work, cost be damned?
Eliot Kimber of ISOGEN is a supporter, and says that ISOGEN's experience is that creating an XSLT- and FO-based style sheet requires about one half the effort of creating the equivalent style sheet in a proprietary system. In addition, the incremental cost of adding new document types or new layouts to an existing family of document types or layouts goes down over time as you refine your XSLT code to be more modular, making it easier to add new functionality or new input or output choices. No other SGML- or XML-based composition system has this characteristic.
Adobe's Stephen Deach suggests that XSL-FO could be best for documents such as financial-planning guides, owner and maintenance manuals and legal agreements and contracts. It's difficult to see why anyone would embrace the complexity of FO for these technically straightforward applications, much less abandon a current system (of which there are many) in favor of FO.
Kimber, on the other hand, points out that one important and distinguishing aspect of the FO design is its support for internationalized documents. FO is designed explicitly to not be biased in favor of any particular writing order, writing direction, page orientation or other culture-specific aspect of text presentation. Thus FO has been designed from the start to support, for example, right-to-left writing systems like Hebrew and Arabic and top-to-bottom writing directions like Traditional Chinese, as well as Western writing systems. It has also been designed to accommodate complex glyph layout requirements, such as those of Thai. Is this enough to justify the effort?
Vendors Supporting XSL
There are primarily three classes of vendors actively supporting XSL-FO. The first is small or relatively small vendors developing tools to aid in stylesheet development or FO rendering. The second is the two largest commercial structured batch software systems, XyEnterprise XPP and Advent 3B2. Both are proposing solutions for encompassing XSL-FO data within their current products, and are thereby implicitly both endorsing and deriding the standard. Each seeks to communicate full compliance to the market, but in the meantime highlights FO's current shortcomings, as well as its own relative strengths against these shortcomings.
A third class of vendors includes Adobe and Microsoft. Each has limited support for FO at this time. (Oddly Adobe FrameMaker, previously a leader in SGML support, does not support FO.) I expect we'll be hearing much more about FO from vendors in the next year or so.
We have tracked 22 different vendors with meaningful FO implementations (and would love to hear from any more). Listed in alphabetical order, they are:
| Vendor | Product | Price | Short Description |
|---|---|---|---|
| 3B2 (www.3b2.com) | 3B2-FO | $100 | "3B2-FO is a high speed, reliable, feature rich XSL-FO rendering tool developed by Advent 3B2." |
| Adobe Systems (www.adobe.com) | Adobe Document Server | | "Adobe Document Server is our first product to combine XSL-FO with Adobe formatting technologies." |
| Antenna House (www.antennahouse.com) | XSL Formatter | $5,000 for server license | "V2 is a professional formatting solution that conforms to XSL-FO V1.0 W3C Recommendation and supports over 50 languages." |
| Apache project (http://xml.apache.org) | FOP | Open Source | "FOP (Formatting Objects Processor) is the world's first print formatter driven by XSL formatting objects (XSL-FO) and the world's first output independent formatter." |
| Arbortext (www.arbortext.com) | Epic Editor | $695 | "Arbortext intends to continue to offer innovative, high-quality support for XSL-FO to satisfy its customers' most demanding requirements." |
| Chive Products (www.chive.com) | Apoc XSL-FO | $1,339 | "Apoc XSL-FO is a tool for rendering PDF documents from a formatting tree. Apoc XSL-FO is compliant with a subset of the XSL-FO 1.0 specification and can be easily integrated into any .NET application." |
| Digital Dreams Software Solutions (www.dig-dreams.de) | jFO | 30 € | "jFO is a java tool for generating formatting objects (XSL-FO). It offers a XSL-FO java API, an RTF (Rich Text Format) to XSL-FO converter and a report engine based on RTF importer." |
| Hewlett-Packard (www-uk.hpl.hp.com/people/fabgia/foa/foa.html) | FOA | Open Source | "FOA is the world's first XSL-FO Authoring tool. It is a Java application that gives users a graphical interface to author XSL-FO stylesheets." |
| IBM (http://www.alphaworks.ibm.com/tech/xfc) | XSL Formatting Objects Composer | | XSL Formatting Objects Composer (XFC) is a typesetting and display engine that implements a substantial portion of XSL Formatting Objects (FO). XFC produces either an interactive onscreen display using Java2D or an output file using PDF. A single formatting engine drives both Java2D and PDF output through a common interface. Other outputs are possible, and some are being developed. |
| InDelv Software (www.indelv.com) | InDelv XF | Trial beta version | "InDelv XF is an XML formatting and generation tool. It combines three different applications." |
| Infonyte GmbH (www.infonyte.com) | XML Workbench | | "a graphical XML authoring environment for large documents and document collections." |
| Inventive Designers (www.inventivedesigners.com) | Scriptura Engine Enterprise Edition | € 2.995,00 per processor | "Scriptura is a graphical XSL-FO designer, supporting static objects and dynamic data (from XML and JDBC), for generating XSLT, XSL-FO, XHTML, PDF and PCL." |
| jCatalog Software AG (www.xslfast.com) | XSL-Fast | 890 € | "XSLfast is the world's first graphical editor for XSLFO documents." |
| Microsoft (www.microsoft.com) | Microsoft Office and FrontPage | Various | Support for XML in Office; FrontPage will render the XML using XSLT. |
| Novosoft (www.novosoft-us.com) | RTF to XML | $20 | "RTF TO XML converts RTF files to XML according to the W3C Formatting Object specification and generates a pair of an XSL template and an XML textual data file." |
| Pixware (www.xmlmind.com) | XMLmind FO Converter | $550 | "XMLmind FO Converter is a Java component which converts XSL Formatting Objects (FO) to RTF." |
| RenderX (www.xep.xattic.com) | XEP Rendering Engine | $299.95 for client edition | "The XEP Rendering Engine converts XML documents into a printable form (PDF or PostScript) by applying XSL Formatting Objects styling." |
| ReportLab (www.reportlab.com) | Enterprise Publishing and Reporting Server | $25,000 per server | "ReportLab PDF - A Practical Alternative to XSL-FO" |
| Sebastian Rahtz (www.tei-c.org) | PassiveTeX | Free | "PassiveTeX provides a rapid development environment for experimenting with XSL FO, using a reliable pre-existing formatter." |
| Visual Programming Limited (www.xmlpdf.com) | Ibex XSL-FO Formatter | $675 | "Ibex is a XSL-FO Formatting Engine which takes XML in the XSL-FO format defined by the W3C XSL Recommendation and produces PDF files." |
| Web Systems (www.webxsystems.com) | UltraXML | | "High-end WYSIWYG XML publishing system with real time ActiveXSL and Visual DTD editing integrated into one of the most high end, yet easy to use publishing system, UltraXML. Now you can see how your XML document will look as you create it, not after you |
| XyEnterprise (www.xyenterprise.com) | XPP | | Support for XSL-FO is being integrated into XPP. |
Table 1. Vendors Supporting XSL-FO
What is XSL-FO Being Used For Today?
As far as I can determine, the use of XSL-FO today is limited in the extreme. The sense I get from the several FO mail lists (including XSL-List@lists.mulberrytech.com, http://groups.yahoo.com/group/XSL-FO/, http://forum.java.sun.com/forum.jsp?forum=34 and www-xsl-fo@w3.org) is that there are few users, and that those few users are either in early testing mode, or undertaking compositionally simple documents, such as forms. I know of a few publishers experimenting with FO pagination. I've seen little mention of cross-media applications.
There's little general knowledge to be gained from these mail lists, and not much sense that delaying an FO implementation will leave you very far behind the crowd.
Arguably the biggest potential for FO today is just creating better print output from Web browsers. As G. Ken Holman points out in his XSL-FO tutorial, "We often take the printed form of information for granted, yet how many of us are satisfied with the print-screen functionality from a web browser? How many times have you printed a lengthy web document and found the paginated result to be as easily navigated as the electronic original?... When we want to produce a paginated presentation of our XML information, we necessarily must offer a different set of navigation tools to the consumers of our documents. These navigational aids have been honed since bound books have been used: headers, footers, page numbers and page number citations are some of the characteristics of printed pages we use to find our way around a collection of fixed-sized folios of information."
XSL-FO and PDF
Nearly all of the XSL-FO renderers offer print output via PDF. It seems odd at first: why PDF? But what alternative? QuarkXPress native format? OEB (Open e-Book)? PostScript? No, PDF is the logical format. It's well-structured (much more so than PostScript), and well-documented (the 1,172-page PDF Reference for PDF 1.5 can be downloaded without charge from Adobe's Web site). Though controlled by Adobe, no one is prevented from using it (nor required to pay a royalty for doing so). It's the ultimate page-oriented print format, and a completely natural output file format for XSL-FO documents.
Adobe has embraced this FO-PDF workflow, and broadly endorses it for third-parties. I don't know whether to read this as a win for PDF or as Custer's Last Stand. In my view Adobe continues to struggle to find a clear role for PDF in an XML world. Encompassing XML within PDF seems natural until you question the bottom-line benefits. Is the XML document provider more fortunate to have PDF to represent document appearance, or is the PDF user more fortunate to have the granular markup provided through XML?
There has been a multi-year movement within both the XML and PDF communities to support the proposition of PDF and XML, rather than PDF or XML. I remain unconvinced.
Conclusions
After living with pagination systems for 20 years the main criteria I use for judging a new technology or software are that ease-of-use must underlie any successful system, while increased functional sophistication will be demanded over time. XSL-FO currently satisfies neither of these requirements. It's a bear to use, and the functionality does not break new ground against existing batch pagination systems.
So in the here and now, it's hard to present a compelling case for the switch to XSL-FO.
But there's much more to the equation than this.
Instead of manually creating ads, newspaper inserts, direct-mail pieces and brochures, companies will increasingly hook up template-driven layout engines to larger systems that streamline the document-creation process. The systems will take customer and order information, use that to select appropriate content and feed the results to the layout engine, which will in turn route the resulting digital file to the next step of the process.
Mark Walter, The Seybold Report, Volume 3, Number 19
Within the stream of thought Walter proposes, XSL-FO is clearly a winner. FO's approach is clearly consistent with this changing dynamic in document production.
However I think that the question of the importance of XSL is perhaps more related to the question of the ultimate importance of XML.
The key value of XSL is that it's contained within the family of XML specifications, and adheres to the XML syntax. As such, it is potentially able to offer two advantages that were never available to SGML. The first is the innate ability to tie the appearance aspects of the publishing process with the workflow and commercial aspects of the processes, in a single data stream. Standards like JDF, AdML and NewsML arose during the XML era, not the SGML era, and promise enormous workflow and business benefits.
Another great advantage is that elusive Holy Grail, a process to automate cross-media publishing. There is certainly a lot of work to be done, but I have no doubt that it's well within the capacity of XML semantics and XML engineering to build a basis for that workflow. The cross-media promise of XSL is real, if nowhere near realization.
But most significantly, XSL-FO will catch on because the adoption of XML (and more importantly, XSLT) has become so widely entrenched across all industries, and has the unequivocal support of all the largest and most important vendors across the business process landscape. Working with XSLT moves a developer a big step closer to being able to implement FO, and that's a significant undercurrent of experience and energy propelling the standard forward.
The publishing industry has demonstrated repeatedly that it will favor standards over proprietary approaches, provided the software functionality related to the standard meets its business needs. As the XSL specification continues to mature, and as the software supporting it becomes more robust and user-friendly, I think we'll have a winner on our hands.
Thad McIlroy, thad@arcadiahouse.com