Archive for XML

NuMobile’s Stonewall Networks Releases Xidget Toolset for XML Developers

NuMobile, Inc. announced that its subsidiary, Stonewall Networks, Inc., has released the Xidget toolset. Stonewall is working to have Xidget become a World Wide Web Consortium (W3C) standard. Xidget is a fragment of the eXtensible Markup Language, commonly referred to as XML. Stonewall developed Xidget, a toolset that utilizes a subset of XML, during the development phase of its core product, Cornerstone. The Xidget toolset is used to handle graphical calls at the user level to and from the internet. Other enhancements will include reducing the time spent in the design and development life cycle. www.xidget.com

W3C Call for Review: XML Entity Definitions for Characters Proposed Recommendation

The W3C (World Wide Web Consortium) Math Working Group has published a Proposed Recommendation of “XML Entity Definitions for Characters.” This document presents a completed listing harmonizing the known uses in math and science of character entity names that appear throughout the XML world and Unicode. This document is the result of years of employing entity names on the Web. There were always a few named entities used for special characters in HTML, but a flood of new names came with the symbols of mathematics. Comments are welcome through 11 March. Learn more about the Math Activity. http://www.w3.org/Math/ http://www.w3.org/TR/2010/PR-xml-entity-names-20100211/

What’s Hot in XML? Workshop on Smart Content Describes Leading-Edge Content Applications

What is hot in XML these days? I have been to a few conferences and meetings, talked with many clients, participated in various research projects, and developed case studies on emerging approaches to XML adoption. DITA (Darwin Information Typing Architecture) is hot. Semantically enriched XML is hot. Both enable some interesting functionality for content delivered via print, on the web, and through mobile delivery channels. These include dynamic assembly of content organized into a variety of forms for custom uses, improved search and discovery of content, content interoperability across platforms, and distributed collaboration in creating and managing content.

On November 30, prior to the Gilbane Conference in Boston, Geoff Bock and I will be holding our 3rd workshop on Smart Content, which is how we refer to semantically enriched, modular content (it’s easier to say). In the seminar we will discuss what makes content smart and how it is being developed and deployed in several organizations, and we will dive into some technical details on DITA and semantic enrichment. This highly interactive seminar has been well received in prior sessions, and will be updated with our recently completed research findings. More information on the seminar is available at http://gilbaneboston.com/10/workshops.html.

The research report, entitled Smart Content in the Enterprise (now available from Outsell Inc), includes several interesting case studies from a variety of organizations, and a lot of good information for those considering taking their content to the next level. We encourage you to download it (it is free). I also hope to see you in Boston at the workshop.

Why Aren’t Publishers Moving to XML Repositories More Quickly?

As we start to delve into some of the interim results of our survey of book publishing professionals, there is a great deal of good data to mull over. While the results are preliminary (and we welcome your participation here), some trends are emerging.

One interesting set of data points surrounds how publishers are viewing XML, how extensively they work with it, and what technologies they are using to support the management of the XML. Among those using XML, it’s significant that only about half have invested in some kind of storage mechanism specifically for XML, including both relational databases and dedicated XML repositories such as MarkLogic Server.

While that overall number might or might not be so striking, I am struck by what some publishers feel is a barrier to adopting an XML repository, namely, the “Challenge of building XML knowledge, skills, or awareness.”  This trumped more traditional barriers to technology adoption such as cost and the maturity of the technology and would seem, on balance, to be a solvable problem.

 

What is Smart Content?

At Gilbane we talk of “Smart Content,” “Structured Content,” and “Unstructured Content.” We will be discussing these ideas in a seminar entitled “Managing Smart Content” at the Gilbane Conference next week in Boston. Below I share some ideas about these types of content and what they enable and require in terms of processes and systems.

When you add meaning to content you make it “smart” enough for computers to do some interesting things. Organizing, searching, processing, and discovery are greatly improved, which also increases the value of the data. Structured content allows some, but fewer, processes to be automated or simplified, and unstructured content enables very little to be streamlined and requires the most ongoing human intervention.

Most content is not very smart. In fact, most content is unstructured and usually more difficult to process automatically. Think flat text files, HTML without all the end tags, etc. Unstructured content is more difficult for computers to interpret and understand than structured content due to incompleteness and ambiguity inherent in the content. Unstructured content usually requires humans to decipher the structure and the meaning, or even to apply formatting for display rendering.

The next level up toward smart content is structured content. This includes well-formed XML documents, content compliant with a schema, or even relational (RDBMS) databases. Some of the intelligence is included in the content, such as the boundaries of each element (or field) being clearly demarcated, and element names that mean something to users and systems that consume the information. Automatic processing of structured content includes reorganizing, breaking into components, rendering for print or display, and other processes streamlined by the structured content data models in use.
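To make the difference between unstructured and structured content concrete, here is a minimal sketch in Python using the standard library's XML parser. The sample strings and the helper name `is_well_formed` are invented for illustration. It shows only the lowest bar for structured content (well-formedness), not schema validation.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Illustrative samples (not from any real system).
# "Tag soup": HTML-like markup with missing end tags and no single root.
tag_soup = "<p>First paragraph<p>Second paragraph"

# The same text with every element boundary clearly demarcated.
structured = "<doc><p>First paragraph</p><p>Second paragraph</p></doc>"

print(is_well_formed(tag_soup))    # False
print(is_well_formed(structured))  # True
```

A program can split, reorder, or render the second sample without guessing where one paragraph ends and the next begins. With the first sample it would have to infer those boundaries.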

Smart Content diagram

Finally, smart content is structured content that also includes the semantic meaning of the information. The semantics can be in a variety of forms, such as RDFa attributes applied to structured elements, or even semantically named elements. However it is done, the meaning is available to both humans and computers to process.
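As a rough illustration of what the added semantics buy you, the sketch below parses a small XML fragment whose elements carry RDFa-style `property` attributes and then looks content up by meaning rather than by element name. The fragment, the `dc:` property values, and the helper `values_by_property` are assumptions made up for this example. A real RDFa processor would also resolve prefixes and vocabularies, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment: generic element names, with the meaning carried
# by RDFa-style "property" attributes (values here are assumed examples).
smart = ET.fromstring(
    '<section typeof="dc:Text">'
    '<para property="dc:title">Quarterly Outlook</para>'
    '<para property="dc:creator">Jane Doe</para>'
    "<para>Body text with no semantic annotation.</para>"
    "</section>"
)


def values_by_property(root, prop):
    """Return the text of every element annotated with the given property."""
    return [el.text for el in root.iter() if el.get("property") == prop]


print(values_by_property(smart, "dc:creator"))  # ['Jane Doe']
print(values_by_property(smart, "dc:title"))    # ['Quarterly Outlook']
```

Because all three elements are named `para`, structure alone cannot tell a program which one holds the author. The annotation is what makes that distinction available to software.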

Smart content enables highly reusable content components and powerful automated dynamic document assembly. Searching can be enhanced with the inclusion of metadata and embedded semantics in the content, providing more clues as to what the data is about, where it came from, and how it is related to other content. Smart content enables very robust, valuable content ecosystems.

Deciding which level of rigor is needed for a specific set of content requires understanding the business drivers intended to be met. The more structure and intelligence you add to content, the more complicated and expensive the system development and content creation and management processes may become. More intelligence requires more investment, but may be justified through benefits achieved.

I think it is useful if the XML and CMS communities use consistent terms when talking about the rigor of their data models and the benefits they hope to achieve with them. Hopefully, these three terms (smart content, structured content, and unstructured content) ring true and can be used productively to differentiate content and application types.

JustSystems Announces XMetaL Author Enterprise and XMetaL Reviewer 6.0

JustSystems announced the availability of XMetaL Author Enterprise 6.0 and XMetaL Reviewer 6.0, the latest versions of the company’s collaborative XML structured authoring and document reviewing software tools. New in this release is an integration between the two products that unifies the XML authoring process with real-time, distributed web-based reviewing to accelerate the documentation cycle. The XMetaL Author Enterprise 6.0 and XMetaL Reviewer 6.0 integration is designed for unified authoring and reviewing, so that authors have tools to initiate and manage reviews as well as a set of specialized editing commands that help them directly act upon suggestions. This integration works with the Darwin Information Typing Architecture (DITA) standard as well as other industry standards. Other key features of the new release include the following: an unlimited number of documents can now be managed within a single project; a rendition can be associated with the project and used for direct navigation from a place in the final document’s layout to the originating topic that is under review; and any number of arbitrary attachments can be associated with projects, project cycles, and drafts. http://www.justsystems.com

Mark Logic Releases MarkLogic Toolkit for Excel

Mark Logic Corporation released the MarkLogic Toolkit for Excel. This new offering provides users a free way to integrate Microsoft Office Excel 2007 with MarkLogic Server. Earlier this year, Mark Logic  delivered a Toolkit for Word and a Connector for SharePoint. Together, these offerings allow users to extend the functionality of Microsoft Office products and build applications leveraging the native document format, Office Open XML (OOXML). Distributed under an open source model, MarkLogic Toolkit for Excel comes with an Excel add-in that allows users to deploy information applications into Excel, comprehensive libraries for managing and manipulating Excel data, and a sample application that leverages best practices. The MarkLogic Toolkit for Excel offers greater search functionality, allowing organizations to search across their Excel files for worksheets, cells, and formulas. Search results can be imported directly into the workbooks that users are actively authoring. Workbooks, worksheets, formulas, and cells can be exported directly from active Excel documents to MarkLogic Server for immediate use by queries and applications. The Toolkit for Excel allows customers to easily create new Excel workbooks from existing XML documents. Users can now manipulate and re-use workbooks stored in the repository with a built-in XQuery library. For instance, a financial services firm can replace the manual process of cutting-and-pasting information from XBRL documents to create reports in Excel with an automated system. Utilizing Toolkit for Excel, this streamlined process extracts relevant sections of XBRL reports, combines them, and saves them as an Excel file. The Toolkit also allows users to add and edit multiple custom metadata documents across workbooks. This improves the ability for users to discover and reuse information contained in Excel spreadsheets. 
To download MarkLogic Toolkit for Excel, visit the Mark Logic Developer Workshop located at http://developer.marklogic.com/code/. http://www.marklogic.com

W3C Announces Update to CSS 2.1 Candidate Recommendation

The World Wide Web Consortium (W3C) Cascading Style Sheets (CSS) Working Group updated the Candidate Recommendation of “Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification.” CSS 2.1 is a style sheet language that allows authors and users to attach style (e.g., fonts and spacing) to structured documents (e.g., HTML documents and XML applications). CSS 2.1 corrects a few errors in CSS2 (the most important being a new definition of the height/width of absolutely positioned elements, more influence for HTML’s “style” attribute and a new calculation of the ‘clip’ property), and adds a few highly requested features which have already been widely implemented. But most of all CSS 2.1 represents a “snapshot” of CSS usage: it consists of all CSS features that are implemented interoperably. This draft incorporates errata resulting from implementation experience since the previous publication. http://www.w3.org/Style/