The Gilbane Advisor

Curated for content, computing, and digital experience professionals


Book Publishers: Stick to Your Knitting

A Blueprint for Book Publishing Transformation: Seven Essential Processes to Re-Invent Publishing, the latest study from The Gilbane Group’s Publishing Practice, is due out any day now. One thing that sets the study apart from other ebook-oriented efforts is that Blueprint describes technologies, processes, markets, and other strategic considerations from the book publisher’s perspective. From the Executive Summary of the upcoming study:

For publishers and their technology and service partners, the challenge of the next few years will be to invest wisely in technology and process improvement while simultaneously being aggressive about pursuing new business models.
 

The message here is that book publishers really need to “stick to their knitting,” or, as we put it in the study:

The book publisher should be what it has always best been about—discovering, improving, and making public good and even great books.  But what has changed for book publishers is the radically different world in which they interact today, and that is the world of bits and bytes: digital content, digital communication, digital commerce.

If done right, today’s efforts toward digital publishing processes will “future proof” the publisher, because they are aimed at adding value to the content in media-neutral, forward-compatible forms.

A central part of the “If done right” message is that book publishers should still focus on what publishers do best with content, but that XML workflow has become essential to both print and digital publishing success. Here’s an interesting finding from Blueprint:

Nearly 48% of respondents say they use either an “XML-First” or an “XML-Early” workflow. We define an XML-First workflow as one where XML is used from the start of the manuscript through production, and an “XML-Early” workflow as one where authors work in a word processor and the manuscript is then converted to XML.

Tomorrow, Aptara and The Gilbane Group are presenting a webinar, eBooks, Apps and Print? How to Effectively Produce It All Together, featuring me and Bret Freeman, Digital Publishing Strategist at Aptara. The webinar takes place on Tuesday, September 28, 2010, at 11 a.m. EST, and you can register here.
 

Sophia Launches Sophia Search for Intelligent Enterprise Search and Contextual Discovery

Sophia, a provider of contextually aware enterprise search solutions, has announced Sophia Search, a new search solution that uses a Semiotics-based linguistic model to identify intrinsic terms, phrases, and relationships within unstructured content so that the content can be recovered, consolidated, and leveraged. Sophia Search is designed to minimize compliance risk and reduce the cost of storing and managing enterprise information. It delivers a “three-dimensional” solution to discover, consolidate, and optimize enterprise data, regardless of data type or domain. Sophia Search helps organizations manage and analyze critical information by discovering the themes and intrinsic relationships behind their information, without taxonomies or ontologies, so that more relevant information may be discovered. By identifying both exact and near duplicates, Sophia Search allows organizations to consolidate information effectively and minimize storage and management costs. The product features a patented Contextual Discovery Engine (CDE) based on the linguistic model of Semiotics, the science of how humans understand the meaning of information in context. Sophia Search is available now to both customers and partners; pricing starts at $30,000. http://www.sophiasearch.com/

Smart Content and the Pull of Search Engine Optimization

One of the conclusions of our report Smart Content in the Enterprise (forthcoming next week) is that a little enrichment goes a long way. It’s important to build on your XML infrastructure, enrich your content incrementally (to the extent that your business environment can support), and expect to iterate over time.

Consider what happened at Citrix, reported in our case study Optimizing the Customer Experience at Citrix: Restructuring Documentation and Training for Web Delivery. The company had adopted DITA for structured publishing several years ago. Yet just repurposing the content in product manuals for print and electronic distribution, and publishing the same information as HTML and PDF documents, did not change the customer experience.

A few years ago, Citrix information specialists had a key insight: customers expected to find support information by googling the web. To be sure, there was a lot of content about various Citrix products out in cyberspace, but very little of it came directly from Citrix. Consequently the most popular solutions available via web-wide searching were not always reliable, and the detailed information from Citrix (buried in their own manuals) was rarely found.

What did Citrix do? Despite limited resources, the documentation group began to add search metadata to the product manuals. With DITA, there was already a predefined structure for topics, used to define sections, chapters, and manuals. Authors and editors could simply include additional tagged metadata that identified and classified the contents – and thus expose the information to Google and other web-wide search engines.
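As a rough sketch of what such tagging can look like, here is a hypothetical DITA topic prolog. The element names (`prolog`, `metadata`, `category`, `keywords`, `keyword`, `indexterm`) are standard DITA; the topic subject and keyword values are invented for illustration and are not drawn from the Citrix case study:

```xml
<topic id="configuring-session-persistence">
  <title>Configuring Session Persistence</title>
  <prolog>
    <metadata>
      <category>Administration</category>
      <keywords>
        <!-- Search-facing terms; at first these can simply echo
             existing chapter and section headings -->
        <keyword>session persistence</keyword>
        <keyword>sticky sessions</keyword>
        <indexterm>load balancing</indexterm>
      </keywords>
    </metadata>
  </prolog>
  <body>
    <p>Topic content appears here.</p>
  </body>
</topic>
```

When such topics are published as HTML, this metadata typically surfaces as meta tags and index entries that web-wide search engines can crawl.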

There was not a lot of time, or many resources, for up-front design and detailed analysis. To paraphrase a perceptive information architect we interviewed, “Getting started was a lot like throwing the stuff against a wall to see what sticks.” At first, tags simply summarized existing chapter and section headings. Significantly, this was a good enough place to start.

Once Citrix was able to join the online conversation with its customers, it could also begin tracking popular search terms. Over time, and with successive product releases, the documentation group added more tagged metadata and provided ever more focused (and granular) content components.

What does this mean for developing smart content and leveraging the benefits of XML tagging? Certainly the more precise your content enrichment, the more findable your information is going to be. When considering the business benefits of search engine optimization, the quality of your tagging can always improve over time. But as a simple value proposition, getting started is the critical first step.

Revenge of the ECM nerds


For those of you who aren’t familiar with who I am, I am the Marketing Specialist for Gilbane; more specifically, the man behind the various social media curtains. One of my favorite parts of social media is memes, defined as “a unit of cultural ideas, symbols or practices, which can be transmitted from one mind to another through writing, speech, gestures, rituals or other imitable phenomena.” The most famous example of a meme, almost synonymous with the internet now, is Lolcatz. One of the great pleasures of managing the Gilbane accounts is the unique community. Defying the stereotypes of computer geeks, the online CMS community has proven to be composed of a plethora of creative, witty, clever, and simply funny individuals spanning timezones, continents, and native languages. Earlier this year, we were treated to CMSHaikus, which I was happy to preserve in an ebook (the .pdf originally had YouTube videos embedded in it, but these have since been blocked due to a security patch). This time around, @Adriaanbloem took another meme and spun it with his own angle.


The tweets that followed were a mixture of angst, disappointment, frustration, and front-line experience, but most importantly humor! The sarcasm runs rampant, and the jabs are taken at brands, vendors, scripting languages, developers, each other, and consulting agencies (although the “Godfather” and the agency in his name still seem to command respect as of this writing).

The engine seems to have plenty of meme steam left in it, but when it’s gone you can read the #CMSRetraction Archive, or better yet follow the participants and become part of the quirky CMS Twitterrati. If I missed you on the list, drop me a line (@gilbane or @tallbonez) and I will be sure to add you!

Repurposing Content vs. Creating Multipurpose Content

In our recently completed research on Smart Content in the Enterprise we explored how organizations are taking advantage of XML throughout the enterprise, not just in the documentation department. Our findings include several key issues that leading-edge XML implementers are addressing, including new delivery requirements, new ways of creating and managing content, and the use of standards to create rich, interoperable content. In our case studies we examined how some are breaking out of the documentation-department silo and enabling others inside or even outside the organization to contribute and collaborate on content. Some are even using crowdsourcing and social publishing to allow consumers of the information to annotate it and participate in its development.

We found that expectations for content creation and management have changed significantly, and that we need to think about how we organize and manage our data to support these new requirements. One key finding of the research is that organizations are taking a different approach to repurposing their content: a more proactive approach that might better be called “multipurposing”.

In the XML world we have been talking about repurposing content for decades. Repurposing content usually means content that is created for one type of use is reorganized, converted, transformed, etc. for another use. Many organizations have successfully deployed XML systems that optimize delivery in multiple formats using what is often referred to as a Single Source Publishing (SSP) process where a single source of content is created and transformed into all desired deliverable formats (e.g., HTML, PDF, etc.).
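As a minimal sketch of the transformation side of SSP, one stylesheet among several might render a source topic as HTML, while a sibling stylesheet targets XSL-FO for PDF from the same source. The `topic`/`title`/`body`/`p` vocabulary below is invented for illustration; the XSLT elements are standard:

```xml
<?xml version="1.0"?>
<!-- One of several transforms applied to the same single-source XML -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/topic">
    <html>
      <head><title><xsl:value-of select="title"/></title></head>
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="body"/>
      </body>
    </html>
  </xsl:template>
  <!-- Map each source paragraph to an HTML paragraph -->
  <xsl:template match="p">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

The value of the pattern is that delivery formats multiply without touching the source: adding a new output is a new stylesheet, not a new copy of the content.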

Traditional delivery of content in the form of documents, whether in HTML or PDF, can be very limiting to users who want to search across multiple documents, reorganize document content into a form that is useful to the particular task at hand, or share portions with collaborators. As the functionality on Web sites and mobile devices becomes more sophisticated, new ways of delivering content are needed to take advantage of these capabilities. Dynamic assembly of content into custom views can be optimized with delivery of content components instead of whole documents. Powerful search features can be enhanced with metadata and other forms of content enrichment.

SSP and content repurposing traditionally focus on the content creation, authoring, management, and workflow steps up to delivery. For organizations to keep up with the potential of delivery systems and the emerging expectations of users, it behooves us to take a broader view of the requirements for content systems and the underlying data model. Developers need to expand the scope of activities they evaluate and plan for when designing the system and the underlying data model. They should consider, for instance, what metadata might improve faceted searching or dynamic assembly. In doing so they can identify the multiple purposes the content is destined for throughout the ecosystem in which it is created, managed, and consumed.

Multipurpose content is designed with additional functionality in mind including faceted search, distributed collaboration and annotation, localization and translation, indexing, and even provisioning and other supply chain transactions. In short, multipurposing content focuses on the bigger picture to meet a broader set of business drivers throughout the enterprise, and even beyond to the needs of the information consumers.

It is easy to get carried away with data modeling, and an overly complex data model usually requires more development, maintenance, and training than would otherwise be needed to meet a set of business needs. You definitely want to avoid processing-specific terminology when naming elements (e.g., specific formatting, or element names that describe processing actions instead of defining the role of the content). You can still create data models that address the broader range of activities without using specific commands or actions. Knowing a chunk of text is a “definition” instead of an “error message” is useful, and far easier to reinterpret for other uses than an “h2” element name or an attribute of display=’yes’. Breaking chapters into individual topics eases custom, dynamic assembly. Adding keywords and other enrichment can improve search results and the active management of the content. In short, multipurpose data models can and should be comprehensive and remain device agnostic to meet enterprise requirements for the content.
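To make the contrast concrete, compare a presentational fragment with a semantic one. Both fragments are invented for illustration; the point is that only the second records what the text is, so it can be repurposed as a glossary entry, a tooltip, or a search facet without guesswork:

```xml
<!-- Presentational: records only how the text should look -->
<h2>license key</h2>
<p display="yes">A string that unlocks product features.</p>

<!-- Semantic: records the role of the content; rendering is
     decided separately for each output -->
<definition>
  <term>license key</term>
  <meaning>A string that unlocks product features.</meaning>
</definition>
```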

The difference between repurposing content and multipurpose content is a matter of degree and scope, and requires generic, agnostic components and element names. Most of all, multipurposing requires understanding the requirements of all processes in the desired enterprise environment up front, when designing the system, to make sure the model is sufficient to deliver the intended outcomes and capabilities. Otherwise repurposing will continue to be an afterthought, possibly limiting the usefulness of the content for some applications.

Early Access to Gilbane’s XML Report

If you’ve been reading our recent posts on Gilbane’s new research on XML adoption, you might be wondering how to get the report in advance of its availability from Gilbane later this month.

Smart Content in the Enterprise: How Next Generation XML Applications Deliver New Value to Multiple Stakeholders is currently offered by several of the study sponsors: IBM, JustSystems, MarkLogic, MindTouch, Ovitas, Quark, and SDL.

We’ll also be discussing our research in real time during a webinar hosted by SDL on November 4. Look for details within the next few weeks.

Leveraging Two Decades of Computational Linguistics for Semantic Search

Over the past three months I have had the pleasure of speaking with Kathleen Dahlgren, founder of Cognition, several times. I first learned about Cognition at the Boston Infonortics Search Engines meeting in 2009. That introduction led me to a closer look several months later when researching auto-categorization software. I was impressed with the comprehensive English language semantic net they had doggedly built over a 20+ year period.

A semantic net is a map of language that explicitly defines the many relationships among words and phrases. It might be as simple as a map of a small geographical locale and all the named entities within it, or as complex as the entire base language of English, with every concept mapped to illustrate all the ways that any one term is related to other terms, as illustrated in this tiny subset. Dr. Dahlgren and her team are among the few to have created a comprehensive semantic net for English.

In 2003, Dr. Dahlgren established Cognition as a software company to commercialize the semantic net, designing software to apply it to semantic search applications. As the Gilbane Group launched its new research on Semantic Software Technologies, Cognition signed on as a study co-sponsor, and we engaged in several discussions with them that rounded out their history in this new marketplace. The story is illustrative of pioneering in any new software domain.

Early adopters are key contributors to any software development effort. It is notable that Cognition has attracted experts in fields as diverse as medical research, legal e-discovery, and Web semantic search, giving the company valuable feedback for its commercial development. In any highly technical discipline, it is challenging and exciting to find subject experts knowledgeable enough to contribute to product evolution, and Cognition is learning from client experts where the best opportunities for growth lie.

Recent interviews with Cognition executives, and those of other sponsors, gave me the opportunity to get their reactions to my conclusions about this industry. These were the more interesting thoughts that came from Cognition after they had reviewed the Gilbane report:

  • Feedback from current clients and attendees at 2010 conferences, where Dr. Dahlgren was a featured speaker, confirms escalating awareness of the field; she feels that “This is the year of Semantics.” It is catching the imagination of IT folks who understand the diverse and important business problems to which semantic technology can be applied.
  • In addition to a significant upswing in semantics applied in life sciences, publishing, law, and energy, Cognition sees specific opportunities for growth in risk assessment and risk management. Using semantics to detect signals, content salience, and measures of relevance is critical where the quantity of data and textual content is too voluminous for human filtering. There is not much evidence that financial services, banking, and insurance are embracing semantic technologies yet, but semantics could dramatically improve their business intelligence, and Cognition is well positioned to support them with its already tested tools.
  • Enterprise semantic search will begin to overcome the poor reputation that traditional “string search” has suffered. There is growing recognition among IT professionals that in the enterprise 80% of the queries are unique; these cannot be interpreted based on popularity or social commentary. Determining relevance or accuracy of retrieved results depends on the types of software algorithms that apply computational linguistics, not pattern matching or statistical models.

In Dr. Dahlgren’s view, there is no question that a team approach to deploying semantic enterprise search is required. This means that IT professionals will work side-by-side with subject matter experts, search experts and vocabulary specialists to gain the best advantage from semantic search engines.

The unique language aspects of an enterprise content domain are as important as the software a company employs. The Cognition baseline semantic net, out-of-the-box, will always give reliable and better results than traditional string search engines. However, it gives top performance when enhanced with enterprise language, embedding all the ways that subject experts talk about their topical domain, jargon, acronyms, code phrases, etc.

With elements of its software already embedded in some notable commercial applications like Bing, Cognition is positioned for delivering excellent semantic search for an enterprise. They are taking on opportunities in areas like risk management that have been slow to adopt semantic tools. They will deliver software to these customers together with services and expertise to coach their clients through the implementation, deployment and maintenance essential to successful use. The enthusiasm expressed to me by Kathleen Dahlgren about semantics confirms what I also heard from Cognition clients. They are confident that the technology coupled with thoughtful guidance from their support services will be the true value-added for any enterprise semantic search application using Cognition.

The free download of the Gilbane study and deep-dive on Cognition was announced on their Web site at this page.

New Paper – Looking at Website Governance

I am delighted that I’ve just completed my first solo paper here as an analyst: Looking Outside the CMS Box for Enterprise Website Governance. I say solo, but I ought to start by saying I’m grateful for having had a great deal of support from Mary Laplante as my reform from vendor to analyst continues.

This paper has allowed me to pick at a subject that I’ve long had in the back of my mind, both in terms of CMS product strategy and in terms of what we, as content management professionals, need to be cognizant of as we get swept up in engaging web experiences: corporate content governance.

When I write and talk about web engagement or the web experience, I often refer to the first impression: your website is what greets all of your audience, prospects, customers, or citizens. They don’t all see your shiny headquarters building, meet the friendly receptionist, or see that you have today’s copy of The Times on the coffee table, but they do see your website.

Mistakes such as a misspelling, an outdated page, or a brand inconsistency all reflect badly on your attention to detail. They tarnish the professionalism of your services, the reliability of your products, and the attention you will pay to meeting consumer needs.

Of course, when those lapses are related to compliance issues (such as regulatory requirements and accessibility standards), they can be even more damaging, often resulting in financial penalties and a serious impact on your reputation.

I see this governance as the foundation for any content driven business application, but in this paper we focus on website governance and aim to answer the following questions:

  • What are the critical content governance risks and issues facing the organization? 
  • Is your CMS implementation meeting these challenges? 
  • What solutions are available to address governance needs that are not addressed by CMS? 

The paper is now available for download from our Beacon library page and from Magus, who sponsored it.

Magus are also presenting business seminars on website governance and compliance on October 12 in Washington, DC, and October 14 in New York. My colleague Scott Liewehr will be presenting at those events, drawing on the analysis in the Beacon as part of the seminar program. You can learn more about those events and register on the Magus website.

 


© 2024 The Gilbane Advisor
