Archive for XML

What’s Hot in XML? Workshop on Smart Content Describes Leading-Edge Content Applications

What is hot in XML these days? I have been to a few conferences and meetings, talked with many clients, participated in various research projects, and developed case studies on emerging approaches to XML adoption. DITA (Darwin Information Typing Architecture) is hot. Semantically enriched XML is hot. Both enable some interesting functionality for content delivered via print, on the web, and through mobile delivery channels. These include dynamic assembly of content organized into a variety of forms for custom uses, improved search and discovery of content, content interoperability across platforms, and distributed collaboration in creating and managing content.

On November 30, prior to the Gilbane Conference in Boston, Geoff Bock and I will be holding our third workshop on Smart Content, which is how we refer to semantically enriched, modular content (it’s easier to say). In the seminar we will discuss what makes content smart and how it is being developed and deployed in several organizations, and we will dive into some technical details on DITA and semantic enrichment. This highly interactive seminar has been well received in prior sessions, and will be updated with our recently completed research findings. More information on the seminar is available at http://gilbaneboston.com/10/workshops.html.

By the way, the research report, entitled Smart Content in the Enterprise, is now available in the research section at Gilbane.com. The report (now available from Outsell Inc) includes several interesting case studies from a variety of organizations, and a lot of good information for those considering taking their content to the next level. We encourage you to download it (it is free). I also hope to see you in Boston at the workshop.

Why Aren’t Publishers Moving to XML Repositories More Quickly?

As we start to delve into some of the interim results of our survey of book publishing professionals, there is a great deal of good data to mull over. While the results are preliminary (and we welcome your participation here), some trends are emerging.

One interesting set of data points concerns how publishers view XML, how extensively they work with it, and what technologies they use to support the management of the XML. Among those using XML, it’s significant that only about half have invested in some kind of storage mechanism specifically for XML, including both relational databases and dedicated XML repositories such as Mark Logic server.

While that overall number might or might not be so striking, I am struck by what some publishers feel is a barrier to adopting an XML repository, namely, the “Challenge of building XML knowledge, skills, or awareness.” This trumped more traditional barriers to technology adoption, such as cost and the maturity of the technology, and would seem, on balance, to be a solvable problem.

What is Smart Content?

At Gilbane we talk of “Smart Content,” “Structured Content,” and “Unstructured Content.” We will be discussing these ideas in a seminar entitled “Managing Smart Content” at the Gilbane Conference next week in Boston. Below I share some ideas about these types of content and what they enable and require in terms of processes and systems.

When you add meaning to content you make it “smart” enough for computers to do some interesting things. Organizing, searching, processing, and discovery are greatly improved, which also increases the value of the data. Structured content allows some, but fewer, processes to be automated or simplified, and unstructured content enables very little to be streamlined and requires the most ongoing human intervention.

Most content is not very smart. In fact, most content is unstructured and usually more difficult to process automatically. Think flat text files, HTML without all the end tags, etc. Unstructured content is more difficult for computers to interpret and understand than structured content due to incompleteness and ambiguity inherent in the content. Unstructured content usually requires humans to decipher the structure and the meaning, or even to apply formatting for display rendering.

The next level up toward smart content is structured content. This includes well-formed XML documents, content that conforms to a schema, or even RDBMS databases. Some of the intelligence is included in the content, such as the boundaries of each element (or field) being clearly demarcated, and element names that mean something to the users and systems that consume the information. Automatic processing of structured content includes reorganizing, breaking into components, rendering for print or display, and other processes streamlined by the structured content data models in use.
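To make the difference between unstructured and structured content concrete, here is a minimal Python sketch using only the standard library. The sample strings and the function names (is_well_formed, element_names) are made up for illustration; the point is simply that a parser can only find element boundaries and names when the markup is well formed.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def element_names(text: str) -> list:
    """List element names in document order (requires well-formed input)."""
    return [el.tag for el in ET.fromstring(text).iter()]


# Hypothetical samples: one structured, one HTML-style with no end tags.
structured = "<article><title>Smart Content</title><para>Hello.</para></article>"
unstructured = "<p>Smart Content<p>Hello."

print(is_well_formed(structured))    # the parser accepts this
print(is_well_formed(unstructured))  # the parser rejects this
print(element_names(structured))
```

Once the boundaries are reliable, steps such as breaking a document into components or reorganizing it become simple tree operations rather than guesswork.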

Smart Content diagram

Finally, smart content is structured content that also includes the semantic meaning of the information. The semantics can be in a variety of forms, such as RDFa attributes applied to structured elements, or even semantically named elements. However it is done, the meaning is available to both humans and computers to process.
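As a rough illustration of what RDFa-style enrichment makes possible, the Python sketch below pulls property/value pairs out of a small sample. The sample markup, the property names, and the extract_properties function are all assumptions made for this example, and the code is a simplified attribute lookup, not a conforming RDFa processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: structured elements carrying RDFa-style attributes.
SAMPLE = """
<section vocab="http://schema.org/" typeof="Article">
  <title property="headline">What is Smart Content?</title>
  <para>Written by <span property="author">Example Author</span> for
  <span property="publisher">Example Publisher</span>.</para>
</section>
"""


def extract_properties(xml_text: str) -> dict:
    """Map each 'property' attribute value to the text of its element."""
    root = ET.fromstring(xml_text)
    return {
        el.get("property"): (el.text or "").strip()
        for el in root.iter()
        if el.get("property") is not None
    }


for name, value in extract_properties(SAMPLE).items():
    print(name, "=", value)
```

The same text is still readable by a person, but a program can now tell which string is the headline and which is the author without guessing from position or formatting.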

Smart content enables highly reusable content components and powerful automated dynamic document assembly. Searching can be enhanced with the inclusion of metadata and buried semantics in the content, providing more clues as to what the data is about, where it came from, and how it is related to other content. Smart content enables very robust, valuable content ecosystems.
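Dynamic assembly can be sketched in a few lines of Python. The component markup, the "audience" attribute, and the assemble function below are hypothetical and chosen only for illustration; a real system would typically draw components from a repository and use a richer metadata model.

```python
import xml.etree.ElementTree as ET

# Hypothetical reusable components, each tagged with an audience.
COMPONENTS = [
    '<topic id="install" audience="admin"><title>Installing</title></topic>',
    '<topic id="start" audience="user"><title>Getting Started</title></topic>',
    '<topic id="backup" audience="admin"><title>Backing Up</title></topic>',
]


def assemble(components, audience):
    """Build one document from the components that match an audience."""
    doc = ET.Element("document", {"audience": audience})
    for text in components:
        topic = ET.fromstring(text)
        if topic.get("audience") == audience:
            doc.append(topic)
    return doc


admin_guide = assemble(COMPONENTS, "admin")
print(ET.tostring(admin_guide, encoding="unicode"))
```

The selection here relies entirely on metadata carried by the components themselves, which is what lets the same pool of content serve several custom deliverables.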

Deciding which level of rigor is needed for a specific set of content requires understanding the business drivers intended to be met. The more structure and intelligence you add to content, the more complicated and expensive the system development and content creation and management processes may become. More intelligence requires more investment, but may be justified through benefits achieved.

I think it is useful if the XML and CMS communities use consistent terms when talking about the rigor of their data models and the benefits they hope to achieve with them. Hopefully, these three terms, smart content, structured content, and unstructured content, ring true and can be used productively to differentiate content and application types.

Webinar Series: Structured Content Throughout the Enterprise

Updated September 18
JustSystems has launched a comprehensive educational campaign intended to help technical communicators, LOB managers, and information managers extend the value of structured content outside of its established beachhead in techdoc applications. The campaign, titled “Developing a Strategic Roadmap for Structured Content,” comprises webinars, white papers, and an ROI Blueprint, a tool for identifying the business benefits of structured content throughout the enterprise. Gilbane Group is supporting the campaign with research, content, and webinar participation.
The three webinars look at how companies are leveraging structured content today, or planning to do so in the future. The first event is scheduled for September 11 and focuses on current practice and benchmarking your adoption against leading organizations. Guest speaker is Eric Severson, co-founder and CTO of Flatirons Solutions, the well-regarded professional services firm with deep expertise in content management and XML strategies and applications. Jake Sorofman from JustSystems rounds out the panel.
Register for one or all of the webinars in the series. Attendees will have access to the ROI Blueprint for Structured Content and will receive a Gilbane-authored state-of-the-market commentary after each event.
Update: The recording is now available.

Sun & Microsoft on Open Document Formats & XML Strategy

It wasn’t too long ago that all document formats were proprietary, and vendors that sold authoring and publishing software had a really unfair advantage over their customers because it was so difficult and costly for organizations to convert their content from one proprietary system to another. It was the granddaddy of descriptive markup, SGML, that led the way to the infinitely improved situation we have today with seemingly universal support for XML, and tools like XSL, XQuery, etc. So, if most major software applications support reading/writing of XML, including the 800-pound gorilla of office documents, Microsoft Office, hasn’t the issue of proprietary formats gone away?
If you are in charge of protecting your organization’s content/document assets, you had better not be thinking your problems are over. If you are involved in sharing content with other organizations or among applications, you already know how difficult it is to share information without loss. If it is that difficult to share, how easy will it be to migrate to future applications?
Our keynote debate in San Francisco next week is all about helping you understand how to best protect and share your content. While there are some differences between the Microsoft and Sun positions represented by Jean Paoli and Tim Bray, I think they agree more than they disagree on the critical issues you need to consider. We’ll be looking at different aspects of the issue including technology, licensing, cost, and complexity vs. flexibility. For some background see Jon Udell’s posts here and here, and the Cover Pages here. Both contain links to additional info.
I almost forgot… What does this have to do with my earlier posts on the future of content management and Longhorn? Well, Office applications, like all content applications, should benefit from an operating system that can manage content elements and attributes that could be described in XML. Would this make document interchange easier? I don’t know, but it might be fun to explore this question in the session.
If you have a specific question you would like us to cover on the panel, send me an email or add a comment to this post and we’ll summarize what happens.
UPDATE: Jon says he is in Jean’s camp on custom schemas and Tim’s on XHTML. At our Boston panel I think all of us agreed, although of course neither Tim nor Jean was there. Jon is tagging his posts on the conference with gilbaneSF2005.
We are using the category and (more wordy) tag Gilbane Conference San Francisco 2005 for all our SF conference postings.

Excalibur Announces XML-Based Stand-Alone Video Logger

Excalibur Technologies announced an XML-based video logger, Screening Room Capture. The scaleable, standards-based video logger will be available as a stand-alone product or as part of Excalibur’s Screening Room product for end-to-end capture, encoding, indexing, management and re-purposing of video content. Screening Room Capture extracts visual and textual metadata from analog or digital video by controlling multiple subsystems for closed-captioned text extraction, voice-to-text servers, video analysis, manual annotation, device control, and timecode management. The product can control multiple video encoders no matter where they reside, enabling parallel, multiple format, simultaneous encoding. Due to its distributable nature, video logging capabilities and subsystems can now be scaled across as many computers as necessary. By encapsulating all metadata into XML, Screening Room Capture allows the easy integration of video logging into many other systems, such as an existing digital asset management or media asset management solution, or third party database. Screening Room Capture, when combined with Screening Room 2.2, provides a complete end-to-end system for video content management. Excalibur Screening Room is a fully integrated, modular system that gives any enterprise power to intelligently capture, manage, re-use and publish video content. www.excalib.com.

SoftQuad & Vignette Announce Integrated XML Solution

SoftQuad Software, Ltd. and Vignette Corporation announced that they have partnered to provide e-businesses with a comprehensive platform for implementing and deploying XML-based content solutions. Through the integration of SoftQuad’s XMetaL content-creation platform, and the Vignette V/5 eBusiness Platform, businesses can now leverage the use of XML technology. This partnership allows for the deployment of Web-based solutions that capture content from a wide variety of sources and integrate it with both internal and external systems. SoftQuad Software has become a Vignette Technology Partner and will also become a Vignette V2B Services partner. Vignette V2B Services streamline the process of purchasing, implementing and using e-business applications. Vignette V2B Services are provided via the Vignette V2B MarketPlace and Vignette V2B Communities. The integration of the two products gives e-businesses the ability to efficiently create and work with XML documents within a productive workflow environment, enabling the delivery of personalized content that attracts customers and builds successful e-relationships more quickly than ever before. With XMetaL’s word processor-like interface, more people within an organization can quickly create content directly in XML, or convert documents from other formats including Microsoft Word and Microsoft Excel, into XML. Content creators can then save XML documents directly within the Vignette V/5 eBusiness Platform production workflow. To further streamline the approval and revision process, XML content can be retrieved and submitted to Vignette’s Web-based workflow management interface or directly from within XMetaL. The transparency of the integration dramatically decreases the costs associated with providing an XML content solution and accelerates the speed with which e-businesses can deliver customized content to their customers. www.vignette.com, www.softquad.com

SoftQuad Announces MarketAgility, an XML Solution for B2B E-Commerce

SoftQuad Software, Ltd. announced MarketAgility, an XML-based content solution that gives businesses more power and control over the creation, management and real-time delivery of product information to e-marketplaces and e-procurement systems. MarketAgility provides suppliers with an efficient and cost-effective way to move product information from their enterprises to multiple electronic distribution channels. MarketAgility 1.0, formerly code named Global OnRamp, is scheduled for release in September, 2000. Suppliers until now have had trouble collecting their product information, which has generally been located in disparate sources throughout their enterprises. It’s also been difficult to translate that information into the various distinct formats required by different e-marketplaces. Even after these initial barriers have been overcome, suppliers have still found it hard to maintain up-to-date product and pricing information and to target and differentiate their products across multiple distribution channels. MarketAgility lets suppliers quickly leverage their existing infrastructure and business processes to collect product information from wherever it resides in the enterprise, whether in content management systems, electronic resource planning systems, enterprise databases, or Microsoft Word or Excel files. It then automatically delivers this information in a format that is fully customized for different e-markets in their specific dialect of XML. In addition, MarketAgility allows suppliers to maintain their competitive advantage by rapidly and incrementally updating product and pricing information across all channels. MarketAgility is comprised of three major components: the MarketAgility XML Connector, the MarketAgility XML Server, and the MarketAgility XML Transporter. MarketAgility also incorporates SoftQuad’s XMetaL technology, an enabler for XML content applications. MarketAgility allows content revisions to be made directly within XML. 
Suppliers can now easily supplement their product data with rich content and better differentiate themselves in e-markets. MarketAgility supports common industry standards such as BizTalk, W3C Schemas, xCBL and cXML. www.softquad.com