Curated for content, computing, and digital experience professionals

Month: November 2009 (Page 1 of 4)

Layering Technologies to Support the Enterprise with Semantic Search

Semantic search is a composite beast like many enterprise software applications. Most packages are made up of multiple technology components and often from multiple vendors. This raises some interesting thoughts as we prepare for Gilbane Boston 2009 to be held this week.

As part of a panel on semantic search, moderated by Hadley Reynolds of IDC, with Jeff Fried of Microsoft and Chris Lamb of the OpenCalais Initiative at Thomson Reuters, I wanted to give a high-level view of semantic technologies currently in the marketplace. I contacted about a dozen vendors and selected six to highlight for the variety of semantic search offerings and business models.

One case study involves three vendors, each with a piece of the ultimate customer-facing product. My research took me to one company that I had reviewed a couple of years ago, and they sent me to their “customer” and to the customer’s customer. It took me a couple of conversations and emails to sort out the connections; in the end the relationships made perfect sense.

On one hand we have conglomerate software companies offering “solutions” to every imaginable enterprise business need. On the other, we see highly specialized point solutions to universal business problems with multiple dimensions and twists. Teams of vendors, each with a solution to one dimension of a need, create compound product offerings that are adding up to a very large semantic search marketplace.

Consider an example of data gathering by a professional services firm. Let’s assume that my company has tens of thousands of documents collected in the course of research for many clients over many years. Researchers may move on to greater responsibility or other firms, leaving content unorganized except around confidential work for individual clients. We now want to exploit this corpus of content to create new products or services for various vertical markets. To understand what we have, we need to mine the content for themes and concepts.

The product of the mining exercise may have multiple uses: helping us create a taxonomy of controlled terms, preparing a navigation scheme for a content portal, or providing a feed to business or text analytics tools that will help us create visual objects reflecting various configurations of content. A text mining vendor may be great at the mining aspect while other firms have better tools for analyzing, organizing and re-shaping the output.

Doing business with two or three vendors, experts in their own niches, may help us reach a conclusion about what to do with our information-rich pile of documents much faster. A multi-faceted approach can be a good way to bring a product or service to market more quickly than if we struggle with generic products from just one company.

When partners each have something of value to contribute, together they offer the benefits of the best of all options. This results in a new problem for businesses looking for the best in each area, namely, vendor relationship management. But it also saves organizations from dealing with huge firms offering many acquired products that have to be managed through a single point of contact, a generalist in everything and a specialist in nothing. Either way, you have to manage the players and how the components are going to work for you.

I really like what I see: semantic technology companies partnering with each other to give good-to-great solutions for all kinds of innovative applications. By the way, at the conference I am doing a quick snapshot of each of the six: Cogito, Connotate (with Cormine and WorldTech), Lexalytics, Linguamatics, Sinequa and TEMIS.

Trade (eBook) Wars: Tiger Versus Grizzly Bear

The Gilbane Publishing Practice is diving deep into the transformation of publishing as more and more publishers realize that the digital domain cannot be ignored. Not that there aren’t plenty of publishers, especially in STM and other professional publishing efforts, already very active in digital publishing. Still, trade publishing, for example, is seeing the very real opportunities in eBook markets, and we’re wrestling with what makes for best practices for them.

Not that anyone’s strategy makes for a “one-size-fits-all” approach. There are some trade publishers that have started in on or already have well-established single repository XML-based content management systems, the benefits of which are tremendous not just for eBooks, but for content re-use, custom publishing, localization and translation, and even to varying extents, integration with other line of publishing business systems. In trade publishing, however, there are plenty of publishers that have diverse collections of editorial and production platforms—often the result of the long history of mergers and acquisitions in this industry—and the level of integration within these editorial and production systems is ad hoc, at best, never mind effective tie-ins with marketing or sales systems, or royalties, or rights, etc.  You know who you are.

So, what is the trade publisher supposed to do?  While the ideal solution might be to create content chunks rich with meta-data that feed workflows across not just departments like production, but in and out of all of the other business systems as needed, there is a lot of time and money that goes into such a set up. For trade publishers with publishing systems that work—and maybe it doesn’t really matter if it’s taken a lot of gum and baling wire—what really is needed to add eBooks to the mix?

Companies like Aptara and newcomer Tizra, along with well-established composition and conversion services, will tell you that if you can output PDF, they can make an eBook for you. And depending on the vendor, the eBook production may be very inexpensive, or have very sophisticated features, or be ready to market and sell, or some combination. SaaS is becoming more common for such processes, so investment, too, is relatively painless. Let’s think of this class of eBook production as “tigers.” This class offers impressively quick solutions and a good range of capabilities across a growing number of vendors, and represents a strong competitive argument.

XML-based repository digital asset and content management platforms, with their ability to embed rich metadata that may even make content actionable for other publishing systems (including sales and distribution), stand as a class we can think of as “grizzly bears.” There is no doubt that this class of digital publishing solutions is a competitive strategy choice itself. One example is Wave Corporation, another is Mark Logic. Some solutions work better with publishing business-specific platforms (e.g., Klopotek, Firebrand, MEI).

Of course it may not be an either/or question.  Recent news from codeMantra, about partnering with Mark Logic, points to the combining of the tiger and the grizzly. A “tizzly,” anyone?

Keep an eye open for our efforts to answer such questions, and if you are a vendor in this space, please be sure to contact Bill Trippe or Ralph Marto about participating in our multi-client reports. To read more about our Gilbane Publishing Practice consulting services, click here.

What is Smart Content?

At Gilbane we talk of “Smart Content,” “Structured Content,” and “Unstructured Content.” We will be discussing these ideas in a seminar entitled “Managing Smart Content” at the Gilbane Conference next week in Boston. Below I share some ideas about these types of content and what they enable and require in terms of processes and systems.

When you add meaning to content you make it “smart” enough for computers to do some interesting things. Organizing, searching, processing, and discovery are greatly improved, which also increases the value of the data. Structured content allows some, but fewer, processes to be automated or simplified, and unstructured content enables very little to be streamlined and requires the most ongoing human intervention.

Most content is not very smart. In fact, most content is unstructured and usually more difficult to process automatically. Think flat text files, HTML without all the end tags, etc. Unstructured content is more difficult for computers to interpret and understand than structured content due to incompleteness and ambiguity inherent in the content. Unstructured content usually requires humans to decipher the structure and the meaning, or even to apply formatting for display rendering.

The next level up toward smart content is structured content. This includes well-formed XML documents, content compliant with a schema, or even relational (RDBMS) databases. Some of the intelligence is included in the content, such as the boundaries of each element (or field) being clearly demarcated, and element names that mean something to the users and systems that consume the information. Automatic processing of structured content includes reorganizing, breaking into components, rendering for print or display, and other processes streamlined by the structured content data models in use.
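To make the idea of automatic processing concrete, here is a minimal Python sketch that breaks a small well-formed XML document into components using nothing but element boundaries and names. The sample document, its element names (`article`, `section`, `title`, `para`), and the helper `split_into_components` are invented for illustration and are not taken from any particular schema or product.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured document; the element names are illustrative only.
DOC = """
<article>
  <title>Quarterly Outlook</title>
  <section id="s1">
    <title>Summary</title>
    <para>Revenue grew in the third quarter.</para>
  </section>
  <section id="s2">
    <title>Risks</title>
    <para>Currency swings remain a concern.</para>
    <para>Supply costs are rising.</para>
  </section>
</article>
"""


def split_into_components(xml_text):
    """Return one dict per <section>, relying only on element boundaries."""
    root = ET.fromstring(xml_text)
    components = []
    for section in root.findall("section"):
        components.append({
            "id": section.get("id"),
            "title": section.findtext("title"),
            "paragraphs": [p.text for p in section.findall("para")],
        })
    return components


components = split_into_components(DOC)
for c in components:
    # Each component can now be reorganized, reused, or rendered on its own.
    print(c["id"], c["title"], len(c["paragraphs"]))
```

The same job on a flat text file would need a person, or fragile guesswork, to decide where each section starts and ends.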

Smart Content diagram

Finally, smart content is structured content that also includes the semantic meaning of the information. The semantics can be in a variety of forms such as RDFa attributes applied to structured elements, or even semantically named elements. However it is done, the meaning is available to both humans and computers to process.
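As a rough sketch of what "meaning available to computers" can look like, the Python below reads RDFa-style `about`, `property`, and `content` attributes from a small fragment. The fragment, the `#report-42` identifier, and the helper `extract_properties` are assumptions made up for this example. The Dublin Core prefix `dc` is a real vocabulary, but this is not a full RDFa processor: it does not resolve prefixes or handle nested subjects.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment carrying RDFa-style attributes (illustrative only).
FRAGMENT = """
<div xmlns:dc="http://purl.org/dc/elements/1.1/" about="#report-42">
  <h1 property="dc:title">Quarterly Outlook</h1>
  <p>Written by <span property="dc:creator">A. Analyst</span>
     on <span property="dc:date" content="2009-11-30">30 November</span>.</p>
</div>
"""


def extract_properties(xml_text):
    """Collect property/value pairs stated about the root element's subject."""
    root = ET.fromstring(xml_text)
    subject = root.get("about")
    props = {}
    for el in root.iter():
        name = el.get("property")
        if name is None:
            continue
        # A machine-readable `content` attribute takes precedence over the
        # human-readable text inside the element.
        value = el.get("content")
        if value is None:
            value = "".join(el.itertext()).strip()
        props[name] = value
    return subject, props


subject, props = extract_properties(FRAGMENT)
print(subject, props)
```

A reader sees "30 November" while a program gets an unambiguous date, so the same content serves both audiences.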

Smart content enables highly reusable content components and powerful automated dynamic document assembly. Searching can be enhanced with the inclusion of metadata and buried semantics in the content providing more clues as to what the data is about, where it came from, and how it is related to other content. Smart content enables very robust, valuable content ecosystems.

Deciding which level of rigor is needed for a specific set of content requires understanding the business drivers intended to be met. The more structure and intelligence you add to content, the more complicated and expensive the system development and content creation and management processes may become. More intelligence requires more investment, but may be justified through benefits achieved.

I think it is useful if the XML and content management (CMS) communities use consistent terms when talking about the rigor of their data models and the benefits they hope to achieve with them. Hopefully, these three terms (smart content, structured content, and unstructured content) ring true and can be used productively to differentiate content and application types.

New Gilbane Beacon Targets Digital Marketers

We’re pleased to announce the publication of a new Gilbane Beacon entitled Lessons for Digital Marketers: What Marketing Professionals Can Learn from the World’s Leading Publishers.

From the introduction:

". . .  Internet marketing will increase at the expense of traditional advertising, which is predicted to decline. This means that digital marketers will clearly be challenged to bring in the lion’s share of new customers and revenues. . . . 

"Gilbane believes that digital marketing managers can learn a great deal about leveraging content assets by drawing on the experiences of other content-rich organizations. One of the best candidate industries for lessons learned is the publishing industry. Challenges faced by CMOs and publishers are very similar: content closely tied to revenue streams, large volumes of diverse content types, rapidly evolving expectations regarding personalized content and interactivity, and requirement for frictionless publishing in order to meet the need for content immediacy. "

The paper is available for download now, along with a recording of the companion webinar. The paper will also be distributed at next week’s Gilbane Boston conference.

CrownPeak Launches New Online Marketing Tools

CrownPeak announced the launch of its Online Marketing Management Suite, with content management and marketing tools designed to enable online marketers to more easily and effectively engage target audiences. The completely new Suite of tools empowers business managers to test, target and measure content relevance in Web sites, landing pages, banner ads, mobile devices, social media and other online channels. Users can create “playlists” of persona segments based on implicit data such as referring URLs, external marketing campaigns, paid vs. organic search, geography or even specific IP ranges. Additionally, CrownPeak enables the creation of explicit segments based on what is “known” about each visitor from online registration or other forms (e.g. Webinar or white paper sign ups, polls and/or survey results). Also introduced within the new Suite are new form building tools to make it easier for CrownPeak customers to create any type of data collection form, and use that data for content targeting purposes. CrownPeak’s new tools can be integrated into other online marketing solutions and social media programs. CrownPeak provides pre-integrated solutions for CRM products such as Salesforce.com, email products such as ExactTarget, and Web analytics products such as Omniture’s SiteCatalyst and Google Analytics and Website Optimizer. The new CrownPeak capabilities are immediately available to users. http://www.crownpeak.com

Alfresco Releases OASIS CMIS 1.0 Public Review Implementation

Alfresco Software announced that it has included the OASIS Content Management Interoperability Services (CMIS) Version 1.0 in Alfresco Community 3.2 to enable developers and organizations to participate in the public review process. The OASIS CMIS Technical Committee (TC) has recently approved CMIS Version 1.0 as a Committee Draft and announced the start of a two-month public review period. The objective of the CMIS specification is to deliver a common REST or Web Services API that can be used to develop write-once, run-anywhere, next generation content and social applications. The CMIS specification is backed by vendors including Alfresco, Adobe Systems, EMC, IBM, Microsoft, OpenText, Oracle and SAP. As an OASIS TC member, Alfresco is able to offer an implementation of CMIS for developers who wish to participate in the public review process. The public review ends December 22, 2009. The OASIS TC has issued an open invitation to comment and strongly encourages feedback from potential users and developers. CMIS 1.0 Public Review can be downloaded with Alfresco Community 3.2 at: http://wiki.alfresco.com/wiki/Download_Community_Edition.

Nuts and Bolts Tutorials at The Gilbane Conference

In a world that seems increasingly about technology itself, it has become tempting to assume that the questions and challenges of new and better information products are about the technology. While it is true that technology is the key enabler of the new information world we are building, it is also true that the decision making and judgment involved in how that technology is to be organized and deployed are of equal, and not decreasing, importance. Indeed, as the products move toward increasing sophistication and flexibility (smart content, you might say), the human and organizational parts of the information life cycle become even more important.

It is a truism that you cannot deliver information products you can’t create and manage, and with the circle of participants in that creation and management ever widening, we must be sensitive to the limits of the creators. Moreover, while just "getting it up on the web" used to be at least sufficient to justify deployment of information products, today’s information consumer has a much more extensive and demanding list of features required before he will accept web-based information. The publisher who forgets or ignores that list is in for trouble.

In a half-day session preceding the Gilbane conference next week, the Gilbane consulting team will tackle some of the real world challenges inherent in this rapidly changing information world, providing both sign posts for issues likely to come up and "in the trenches" suggestions for how to deal with them. The goal of the session, scheduled for the afternoon of December 1, is that the attendees leave with a better handle on how to proceed in the quest for better information products and the role "smart content" should play.

The presenters, in addition to their expertise in the technology and tools of information, bring a unique resource to their efforts: years of design, implementation and evaluation work with real organizations facing real challenges.

Upcoming Workshop: Managing Smart Content: How to Deploy XML Technologies across Your Organization

As part of next week’s Gilbane Boston Conference, the XML practice will be delivering a pre-conference workshop, “Managing Smart Content: How to Deploy XML Technologies across Your Organization.” The instructors will be Geoff Bock, Dale Waldt, Bill Trippe, Barry Schaeffer and Neal Hannon–a group of experts that represents decades of technical and management experience on XML initiatives.

A tip of the virtual hat to Senior Analyst Geoff Bock for organizing this.

Smart content holds great promise. First with SGML and now with XML, we are marking up content with both formatting and semantic tags, and adding intelligence to electronic information. Using richly tagged XML documents that exploit predefined taxonomies, we are developing innovative applications for single source publishing, pharmaceutical labeling, and financial reporting. By managing content snippets in a granular yet coherent fashion, these applications are revolutionizing our capabilities to meet business needs and customers’ expectations.

What’s working and why? What are the lessons learned from these innovative applications? Does the rapid growth of web-based collaborative environments, together with the wide array of smart content editors, provide the keys to developing other business solutions? There are many promising approaches to tagging content while doing work. Yet we still face an uphill battle to smarten up our content and develop useful applications.

In this workshop, we, the five members of the Gilbane practice on XML technologies, will share our experiences and provide you with practical strategies for the future. We will address a range of topics, including:

  • The business drivers for smart content
  • Some innovative content management techniques that make authors and editors more productive
  • The migration paths from ‘conventional’ documents to smart content
  • How to apply industry-specific taxonomies to tag content for meaning
  • The prospects for mash-ups to integrate content from disparate application communities

We will discuss both the rapidly developing technologies available for creating, capturing, organizing, storing, and distributing smart content, and the organizational environment required to manage content as business processes. We will identify some of the IT challenges associated with managing information as smart content rather than as structured data, and map strategies to address them. We invite you to join the conversation about how best to exploit the power of XML as the foundation for managing smart content across your organization.
