Archive for XML

What’s Hot in XML? Workshop on Smart Content Describes Leading-Edge Content Applications

What is hot in XML these days? I have been to a few conferences and meetings, talked with many clients, participated in various research projects, and developed case studies on emerging approaches to XML adoption. DITA (Darwin Information Typing Architecture) is hot. Semantically enriched XML is hot. Both enable some interesting functionality for content delivered via print, on the web, and through mobile delivery channels. These include dynamic assembly of content organized into a variety of forms for custom uses, improved search and discovery of content, content interoperability across platforms, and distributed collaboration in creating and managing content.

On November 30, prior to the Gilbane Conference in Boston, Geoff Bock and I will be holding our third workshop on Smart Content, which is how we refer to semantically enriched, modular content (it’s easier to say). In the seminar we will discuss what makes content smart and how it is being developed and deployed in several organizations, and we will dive into some technical details on DITA and semantic enrichment. This highly interactive seminar has been well received in prior sessions, and will be updated with our recently completed research findings. More information on the seminar is available at http://gilbaneboston.com/10/workshops.html.

By the way, the research report, entitled Smart Content in the Enterprise, is now available in the research section at Gilbane.com. The report (now available from Outsell Inc) includes several interesting case studies from a variety of organizations, and a lot of good information for those considering taking their content to the next level. We encourage you to download it (it is free). I also hope to see you in Boston at the workshop.

Why Aren’t Publishers Moving to XML Repositories More Quickly?

As we start to delve into some of the interim results of our survey of book publishing professionals, there is a great deal of good data to mull over. While the results are preliminary (and we welcome your participation here), some trends are emerging.

One interesting set of data points surrounds how publishers view XML, how extensively they work with it, and what technologies they use to manage it. Among those using XML, it’s significant that only about half have invested in some kind of storage mechanism specifically for XML, including both relational databases and dedicated XML repositories such as MarkLogic Server.

While that overall number might or might not be so striking, I am struck by what some publishers feel is a barrier to adopting an XML repository, namely, the “Challenge of building XML knowledge, skills, or awareness.” This trumped more traditional barriers to technology adoption, such as cost and the maturity of the technology, and would seem, on balance, to be a solvable problem.

 

What is Smart Content?

At Gilbane we talk of “Smart Content,” “Structured Content,” and “Unstructured Content.” We will be discussing these ideas in a seminar entitled “Managing Smart Content” at the Gilbane Conference next week in Boston. Below I share some ideas about these types of content and what they enable and require in terms of processes and systems.

When you add meaning to content you make it “smart” enough for computers to do some interesting things. Organizing, searching, processing, and discovery are greatly improved, which also increases the value of the data. Structured content allows some, but fewer, processes to be automated or simplified, and unstructured content enables very little to be streamlined and requires the most ongoing human intervention.

Most content is not very smart. In fact, most content is unstructured and usually more difficult to process automatically. Think flat text files, HTML without all the end tags, etc. Unstructured content is more difficult for computers to interpret and understand than structured content due to incompleteness and ambiguity inherent in the content. Unstructured content usually requires humans to decipher the structure and the meaning, or even to apply formatting for display rendering.

The next level up toward smart content is structured content. This includes well-formed XML documents, content compliant with a schema, or even relational (RDBMS) databases. Some of the intelligence is included in the content, such as the boundaries of each element (or field) being clearly demarcated, and element names that mean something to the users and systems that consume the information. Automatic processing of structured content includes reorganizing, breaking into components, rendering for print or display, and other processes streamlined by the structured content data models in use.
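To make the difference between the first two levels concrete, here is a minimal sketch in Python using only the standard library. The sample strings and the function names (is_well_formed, split_components) are my own illustrations, not part of any product or standard. It shows that a parser rejects HTML-style text with missing end tags, while a well-formed document can be broken into components automatically.

```python
import xml.etree.ElementTree as ET

# Hypothetical samples: HTML-style text with missing end tags versus
# a small well-formed XML document with meaningful element names.
UNSTRUCTURED = "<p>First paragraph<p>Second paragraph"
STRUCTURED = (
    "<article>"
    "<title>Smart Content</title>"
    "<para>First paragraph</para>"
    "</article>"
)


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def split_components(text: str) -> list:
    """Break a well-formed document into (element name, text) components."""
    root = ET.fromstring(text)
    return [(child.tag, child.text) for child in root]


if __name__ == "__main__":
    print(is_well_formed(UNSTRUCTURED))
    print(is_well_formed(STRUCTURED))
    print(split_components(STRUCTURED))
```

The point is only that clearly demarcated element boundaries let software do the splitting; a schema would add further checks that this sketch does not attempt.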

Smart Content diagram

Finally, smart content is structured content that also includes the semantic meaning of the information. The semantics can take a variety of forms, such as RDFa attributes applied to structured elements, or even semantically named elements. However it is done, the meaning is available to both humans and computers to process.
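As a rough illustration of both forms of semantics, the following Python sketch reads a small document that carries RDFa-style property attributes as well as a semantically named element. The document, the property values, and the function name extract_properties are all invented for this example, and real RDFa would normally tie the property names to a declared vocabulary, which is omitted here.

```python
import xml.etree.ElementTree as ET

# Hypothetical smart content: "property" attributes label what the text
# means, and <price> is a semantically named element.
SMART = """<article>
  <para>Written by <span property="author">Jane Doe</span> for
  <span property="publisher">Example Press</span>.</para>
  <price currency="USD">19.95</price>
</article>"""


def extract_properties(text: str) -> dict:
    """Collect every element that carries a property attribute."""
    root = ET.fromstring(text)
    return {
        el.get("property"): el.text
        for el in root.iter()
        if el.get("property") is not None
    }


if __name__ == "__main__":
    root = ET.fromstring(SMART)
    print(extract_properties(SMART))
    # The element name itself tells a program that this value is a price.
    print(root.findtext("price"), root.find("price").get("currency"))
```

A search or assembly process could use these labels directly instead of guessing at meaning from the surrounding words.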

Smart content enables highly reusable content components and powerful automated dynamic document assembly. Searching can be enhanced by including metadata and embedded semantics in the content, which provide more clues as to what the data is about, where it came from, and how it is related to other content. Smart content enables very robust, valuable content ecosystems.

Deciding which level of rigor is needed for a specific set of content requires understanding the business drivers intended to be met. The more structure and intelligence you add to content, the more complicated and expensive the system development and content creation and management processes may become. More intelligence requires more investment, but may be justified through benefits achieved.

I think it is useful if the XML and CMS communities use consistent terms when talking about the rigor of their data models and the benefits they hope to achieve with them. Hopefully, these three terms (smart content, structured content, and unstructured content) ring true and can be used productively to differentiate content and application types.

JustSystems Announces XMetaL Author Enterprise and XMetaL Reviewer 6.0

JustSystems announced the availability of XMetaL Author Enterprise 6.0 and XMetaL Reviewer 6.0, the latest versions of the company’s collaborative XML structured authoring and document reviewing software tools. New in this release is an integration between the two products that unifies the XML authoring process with real-time, distributed web-based reviewing to accelerate the documentation cycle. The XMetaL Author Enterprise 6.0 and XMetaL Reviewer 6.0 integration is designed for unified authoring and reviewing, so that authors have tools to initiate and manage reviews as well as a set of specialized editing commands that help them act directly upon suggestions. This integration works with the Darwin Information Typing Architecture (DITA) standard as well as other industry standards. Other key features of the new release include the following: an unlimited number of documents can now be managed within a single project; a rendition can be associated with the project and used for direct navigation from a place in the final document’s layout to the originating topic that is under review; and any number of arbitrary attachments can be associated with projects, project cycles, and drafts. http://www.justsystems.com

Mark Logic Releases MarkLogic Toolkit for Excel

Mark Logic Corporation released the MarkLogic Toolkit for Excel. This new offering provides users a free way to integrate Microsoft Office Excel 2007 with MarkLogic Server. Earlier this year, Mark Logic delivered a Toolkit for Word and a Connector for SharePoint. Together, these offerings allow users to extend the functionality of Microsoft Office products and build applications leveraging the native document format, Office Open XML (OOXML). Distributed under an open source model, MarkLogic Toolkit for Excel comes with an Excel add-in that allows users to deploy information applications into Excel, comprehensive libraries for managing and manipulating Excel data, and a sample application that leverages best practices. The MarkLogic Toolkit for Excel offers greater search functionality, allowing organizations to search across their Excel files for worksheets, cells, and formulas. Search results can be imported directly into the workbooks that users are actively authoring. Workbooks, worksheets, formulas, and cells can be exported directly from active Excel documents to MarkLogic Server for immediate use by queries and applications. The Toolkit for Excel allows customers to easily create new Excel workbooks from existing XML documents. Users can now manipulate and re-use workbooks stored in the repository with a built-in XQuery library. For instance, a financial services firm can replace the manual process of cutting and pasting information from XBRL documents to create reports in Excel with an automated system. Utilizing Toolkit for Excel, this streamlined process extracts relevant sections of XBRL reports, combines them, and saves them as an Excel file. The Toolkit also allows users to add and edit multiple custom metadata documents across workbooks. This improves the ability of users to discover and reuse information contained in Excel spreadsheets.
To download MarkLogic Toolkit for Excel, visit the Mark Logic Developer Workshop located at http://developer.marklogic.com/code/. http://www.marklogic.com
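For readers wondering why the Office Open XML format makes this kind of cell and formula search practical, the sketch below may help. It does not use the MarkLogic Toolkit or its XQuery library in any way; it is a plain Python illustration of the fact that an OOXML workbook is a Zip package whose worksheets are XML parts. To stay self-contained it builds a tiny in-memory package holding a single worksheet part (not a complete workbook that Excel would open), and the function names build_package and find_formulas are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# A hand-written worksheet part: two values and one formula cell.
SHEET_XML = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<sheetData>"
    '<row r="1">'
    '<c r="A1"><v>2</v></c>'
    '<c r="B1"><v>3</v></c>'
    '<c r="C1"><f>SUM(A1:B1)</f><v>5</v></c>'
    "</row>"
    "</sheetData>"
    "</worksheet>"
)


def build_package() -> bytes:
    """Build an illustrative Zip package containing only one worksheet part."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("xl/worksheets/sheet1.xml", SHEET_XML)
    return buf.getvalue()


def find_formulas(package: bytes) -> dict:
    """Map cell references to formula text for every worksheet part found."""
    formulas = {}
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        for name in zf.namelist():
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                root = ET.fromstring(zf.read(name))
                for cell in root.iterfind(".//m:c", NS):
                    formula = cell.find("m:f", NS)
                    if formula is not None:
                        formulas[cell.get("r")] = formula.text
    return formulas


if __name__ == "__main__":
    print(find_formulas(build_package()))
```

Because the cells and formulas are ordinary XML elements, an XML repository can index and query them like any other content.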

DataDirect Announces New Release of XML Data Integration Suite

DataDirect Technologies, an operating company of Progress Software Corporation (NASDAQ: PRGS), announced the latest release of the DataDirect Data Integration Suite featuring new versions of its XML-based component technologies for data integration in traditional and service-oriented environments. Designed to meet the data transformation and aggregation needs of developers, the DataDirect Data Integration Suite contains the latest product releases of DataDirect XQuery, DataDirect XML Converters (Java and .NET) and Stylus Studio in one installation. DataDirect XQuery is an XQuery processor that enables developers to access and query XML, relational data, Web services, EDI, legacy, or a combination of data sources. New to version 4.0 is full support for the XQuery Update Facility (XUF), an extension of the XQuery language that allows changes to be made to the data manipulated inside a query. Now developers can more easily update individual XML documents, XML streams, and file collections from within their XQuery applications. The product also includes the ability to update and create Zip files, therefore supporting the OpenOffice XML format. The latest release of the DataDirect XML Converters is compatible with Microsoft BizTalk Server 2006 and is integrated in the Microsoft BizTalk development environment. For healthcare organizations needing to comply with the X12 electronic data interchange (EDI) standards and the latest Health Insurance Portability and Accountability Act (HIPAA) 5010 transaction definitions, the DataDirect XML Converters now include support for the HIPAA EDI dialects, including 004010A1, 005010 and 005010A1 messages. Stylus Studio 2009 has a new EDI to XML module that works interactively with DataDirect XML Converters. Users can now load EDI documents to view contents, test conversions, create customizations and preview XML. http://www.datadirect.com

Webinar Series: Structured Content Throughout the Enterprise

Updated September 18
JustSystems has launched a comprehensive educational campaign intended to help technical communicators, LOB managers, and information managers extend the value of structured content outside of its established beachhead in techdoc applications. The campaign, titled “Developing a Strategic Roadmap for Structured Content,” comprises webinars, white papers, and an ROI Blueprint, a tool for identifying the business benefits of structured content throughout the enterprise. Gilbane Group is supporting the campaign with research, content, and webinar participation.

The three webinars look at how companies are leveraging structured content today, or planning to do so in the future. The first event is scheduled for September 11 and focuses on current practice and benchmarking your adoption against leading organizations. Guest speaker is Eric Severson, co-founder and CTO of Flatirons Solutions, the well-regarded professional services firm with deep expertise in content management and XML strategies and applications. Jake Sorofman from JustSystems rounds out the panel.

Register for one or all of the webinars in the series. Attendees will have access to the ROI Blueprint for Structured Content and will receive a Gilbane-authored state-of-the-market commentary after each event.

Update: The recording is now available.

MadCap Software Unveils Roadmap for Native XML Family of Documentation and Content Authoring Products

MadCap Software unveiled its roadmap for a complete, native XML software family designed to solve all of a company’s documentation and authoring demands. The MadCap family will include five new products: MadCap Blaze, MadCap Press, MadCap Team Server, MadCap X-Edit, and MadCap X-Edit Express, as well as enhanced versions of MadCap Analyzer, MadCap Flare, MadCap Lingo and MadCap Mimic. The integrated MadCap family will provide companies with a solution for developing and delivering content in print, online and on the Web in their language of choice. The entire MadCap product family is based on a common native XML architecture to provide a complete workflow solution, from authoring and multimedia creation, to collaboration, reporting and analysis, to translation and localization. The MadCap family features twelve integrated products for content development and delivery, collaboration, and localization. The solutions are based on the same XML architecture with Unicode support that drives MadCap’s main product, Flare, a native XML multi-channel, single-source content authoring solution. All products also utilize MadCap’s XML user interface, which enables users to take advantage of XML without writing code. The beta version of Blaze is now available as a free 30-day trial release, which can be downloaded at http://www.madcapsoftware.com/