Curated for content, computing, and digital experience professionals

Month: November 2009 (Page 1 of 6)

Layering Technologies to Support the Enterprise with Semantic Search

Semantic search is a composite beast, like many enterprise software applications. Most packages are made up of multiple technology components, often from multiple vendors. This raises some interesting thoughts as we prepare for Gilbane Boston 2009, to be held this week.

As part of a panel on semantic search, moderated by Hadley Reynolds of IDC, with Jeff Fried of Microsoft and Chris Lamb of the OpenCalais Initiative at Thomson Reuters, I wanted to give a high-level view of semantic technologies currently in the marketplace. I contacted about a dozen vendors and selected six to highlight for the variety of semantic search offerings and business models.

One case study involves three vendors, each with a piece of the ultimate customer-facing product. My research took me to one company that I had reviewed a couple of years ago, and they sent me to their “customer” and to the customer’s customer. It took me a couple of conversations and emails to sort out the connections; in the end the relationships made perfect sense.

On one hand, we have conglomerate software companies offering “solutions” to every imaginable enterprise business need. On the other, we see unique, specialized point solutions to universal business problems with multiple dimensions and twists. Teaming by vendors, each with a solution to one dimension of a need, creates compound product offerings that are adding up to a very large semantic search marketplace.

Consider an example of data gathering by a professional services firm. Let’s assume that my company has tens of thousands of documents collected in the course of research for many clients over many years. Researchers may move on to greater responsibility or other firms, leaving content unorganized except around confidential work for individual clients. We now want to exploit this corpus of content to create new products or services for various vertical markets. To understand what we have, we need to mine the content for themes and concepts.

The product of the mining exercise may have multiple uses: helping us create a taxonomy of controlled terms, preparing a navigation scheme for a content portal, and providing a feed to business or text analytics tools that will help us create visual objects reflecting various configurations of content. A text mining vendor may be great at the mining aspect while other firms have better tools for analyzing, organizing and re-shaping the output.

Doing business with two or three vendors, experts in their own niches, may help us reach a conclusion about what to do with our information-rich pile of documents much faster. A multi-faceted approach can be a good way to bring a product or service to market more quickly than if we struggle with generic products from just one company.

When partners each have something of value to contribute, together they offer the benefits of the best of all options. This results in a new problem for businesses looking for the best in each area, namely, vendor relationship management. But it also saves organizations from dealing with huge firms offering many acquired products that have to be managed through a single point of contact, a generalist in everything and a specialist in nothing. Either way, you have to manage the players and how the components are going to work for you.

I really like what I see: semantic technology companies partnering with each other to give good-to-great solutions for all kinds of innovative applications. By the way, at the conference I am doing a quick snapshot of each: Cogito, Connotate (with Cormine and WorldTech), Lexalytics, Linguamatics, Sinequa and TEMIS.

Trade (eBook) Wars: Tiger Versus Grizzly Bear

The Gilbane Publishing Practice is diving deep into the transformation of publishing as more and more publishers realize that the digital domain cannot be ignored. Not that there aren’t plenty of publishers, especially in STM and other professional publishing efforts, already very active in digital publishing. Still, trade publishing, for example, is seeing the very real opportunities in eBook markets, and we’re wrestling with what makes for best practices for them.

Not that anyone’s strategy makes for a “one-size-fits-all” approach. Some trade publishers have started on, or already have, well-established single-repository XML-based content management systems, the benefits of which are tremendous not just for eBooks, but for content re-use, custom publishing, localization and translation, and even, to varying extents, integration with other line-of-business publishing systems. In trade publishing, however, there are plenty of publishers that have diverse collections of editorial and production platforms, often the result of the long history of mergers and acquisitions in this industry, and the level of integration within these editorial and production systems is ad hoc at best, never mind effective tie-ins with marketing or sales systems, or royalties, or rights, etc. You know who you are.

So, what is the trade publisher supposed to do? While the ideal solution might be to create content chunks rich with metadata that feed workflows across not just departments like production, but in and out of all of the other business systems as needed, a lot of time and money goes into such a setup. For trade publishers with publishing systems that work (and maybe it doesn’t really matter if it’s taken a lot of gum and baling wire), what really is needed to add eBooks to the mix?

Companies like Aptara and even newcomer Tizra, along with well-established composition and conversion services, will tell you that if you can output in PDF, they can make an eBook for you. And depending on the vendor, the eBook production may be very inexpensive, or have very sophisticated features, or be ready to market and sell, or some combination. SaaS is becoming more common for such processes, so investment, too, is relatively painless. Let’s think of this class of eBook production as “tigers.” This class offers impressively quick solutions and a good range of capabilities across a growing number of vendors, and represents a strong competitive argument.

XML-based repository digital asset and content management platforms, with their ability to embed rich metadata that may even enable actionable content for other publishing systems (including sales and distribution), stand as a class we can think of as “grizzly bears.” There is no doubt that this class of digital publishing solutions is a competitive strategy choice itself. One example is Wave Corporation, another is Mark Logic. Some solutions work better with publishing business-specific platforms (e.g., Klopotek, Firebrand, MEI).

Of course it may not be an either/or question.  Recent news from codeMantra, about partnering with Mark Logic, points to the combining of the tiger and the grizzly. A “tizzly,” anyone?

Keep an eye open for our efforts to answer such questions, and if you are a vendor in this space, please be sure to contact Bill Trippe or Ralph Marto about participating in our multi-client reports. To read more about our Gilbane Publishing Practice consulting services, click here.

What is Smart Content?

At Gilbane we talk of “Smart Content,” “Structured Content,” and “Unstructured Content.” We will be discussing these ideas in a seminar entitled “Managing Smart Content” at the Gilbane Conference next week in Boston. Below I share some ideas about these types of content and what they enable and require in terms of processes and systems.

When you add meaning to content you make it “smart” enough for computers to do some interesting things. Organizing, searching, processing, and discovery are greatly improved, which also increases the value of the data. Structured content allows some, but fewer, processes to be automated or simplified, and unstructured content enables very little to be streamlined and requires the most ongoing human intervention.

Most content is not very smart. In fact, most content is unstructured and usually more difficult to process automatically. Think flat text files, HTML without all the end tags, etc. Unstructured content is more difficult for computers to interpret and understand than structured content due to incompleteness and ambiguity inherent in the content. Unstructured content usually requires humans to decipher the structure and the meaning, or even to apply formatting for display rendering.

The next level up toward smart content is structured content. This includes well-formed XML documents, content compliant with a schema, or even relational (RDBMS) databases. Some of the intelligence is included in the content, such as the boundaries of each element (or field) being clearly demarcated, and element names that mean something to the users and systems that consume the information. Automatic processing of structured content includes reorganizing, breaking into components, rendering for print or display, and other processes streamlined by the structured content data models in use.
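
As a rough illustration of that difference, here is a minimal Python sketch using only the standard library's xml.etree.ElementTree module. The sample markup and the element names (article, section, title, para) are invented for this example and do not come from any particular schema. It shows that well-formed XML can be broken into components automatically, while HTML-style text with missing end tags is rejected by the same parser and would need human or heuristic repair first.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names are illustrative only.
STRUCTURED = """
<article>
  <title>Quarterly Review</title>
  <section id="s1"><title>Summary</title><para>Sales rose.</para></section>
  <section id="s2"><title>Outlook</title><para>Growth expected.</para></section>
</article>
"""

# HTML-style markup with missing end tags: not well-formed XML.
UNSTRUCTURED = "<p>Sales rose.<p>Growth expected."


def split_into_components(xml_text):
    """Return {section id: section title} for every section element."""
    root = ET.fromstring(xml_text)
    return {s.get("id"): s.findtext("title") for s in root.iter("section")}


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


components = split_into_components(STRUCTURED)
print(components)
print(is_well_formed(UNSTRUCTURED))
```

Because the element boundaries are explicit, the split needs no guessing about where one component ends and the next begins.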

[Figure: Smart Content diagram]

Finally, smart content is structured content that also includes the semantic meaning of the information. The semantics can take a variety of forms, such as RDFa attributes applied to structured elements, or even semantically named elements. However it is done, the meaning is available for both humans and computers to process.
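
To give a feel for what "meaning available to computers" can look like, here is a simplified Python sketch. It is not a conformant RDFa processor: it only reads RDFa-style property attributes from a small well-formed fragment using the standard library. The fragment, the dc: vocabulary choice, and the sample values (title, author name, date) are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: structured markup plus RDFa-style "property"
# attributes that say what each piece of text means.
SMART = """
<div prefix="dc: http://purl.org/dc/terms/">
  <h1 property="dc:title">Quarterly Review</h1>
  <p>Written by <span property="dc:creator">A. Author</span> on
     <span property="dc:date">2009-11-30</span>.</p>
</div>
"""


def extract_properties(xml_text):
    """Return {property name: text} for every element carrying a property attribute."""
    root = ET.fromstring(xml_text)
    return {
        el.get("property"): "".join(el.itertext()).strip()
        for el in root.iter()
        if el.get("property") is not None
    }


facts = extract_properties(SMART)
print(facts)
```

The same text without the property attributes would still be structured, but a program would have no reliable way to tell which span is the author and which is the date.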

Smart content enables highly reusable content components and powerful automated dynamic document assembly. Searching can be enhanced with the inclusion of metadata and buried semantics in the content providing more clues as to what the data is about, where it came from, and how it is related to other content. Smart content enables very robust, valuable content ecosystems.

Deciding which level of rigor is needed for a specific set of content requires understanding the business drivers intended to be met. The more structure and intelligence you add to content, the more complicated and expensive the system development and content creation and management processes may become. More intelligence requires more investment, but may be justified through benefits achieved.

I think it is useful if the XML and content management (CMS) communities use consistent terms when talking about the rigor of their data models and the benefits they hope to achieve with them. Hopefully these three terms (smart content, structured content, and unstructured content) ring true and can be used productively to differentiate content and application types.

New Gilbane Beacon Targets Digital Marketers

We’re pleased to announce the publication of a new Gilbane Beacon entitled Lessons for Digital Marketers: What Marketing Professionals Can Learn from the World’s Leading Publishers.

From the introduction:

". . .  Internet marketing will increase at the expense of traditional advertising, which is predicted to decline. This means that digital marketers will clearly be challenged to bring in the lion’s share of new customers and revenues. . . . 

"Gilbane believes that digital marketing managers can learn a great deal about leveraging content assets by drawing on the experiences of other content-rich organizations. One of the best candidate industries for lessons learned is the publishing industry. Challenges faced by CMOs and publishers are very similar: content closely tied to revenue streams, large volumes of diverse content types, rapidly evolving expectations regarding personalized content and interactivity, and requirement for frictionless publishing in order to meet the need for content immediacy."

The paper is available for download now, along with a recording of the companion webinar. The paper will also be distributed at next week’s Gilbane Boston conference.

CrownPeak Launches New Online Marketing Tools

CrownPeak announced the launch of its Online Marketing Management Suite, with content management and marketing tools designed to enable online marketers to more easily and effectively engage target audiences. The completely new Suite of tools empowers business managers to test, target and measure content relevance in Web sites, landing pages, banner ads, mobile devices, social media and other online channels. Users can create “playlists” of persona segments based on implicit data such as referring URLs, external marketing campaigns, paid vs. organic search, geography or even specific IP ranges. Additionally, CrownPeak enables the creation of explicit segments based on what is “known” about each visitor from online registration or other forms (e.g., Webinar or white paper sign-ups, polls and/or survey results). Also introduced within the new Suite are form-building tools that make it easier for CrownPeak customers to create any type of data collection form and use that data for content targeting purposes. CrownPeak’s new tools can be integrated into other online marketing solutions and social media programs. CrownPeak provides pre-integrated solutions for CRM products such as Salesforce.com, email products such as ExactTarget, and Web analytics products such as Omniture’s Site Catalyst and Google Analytics and Website Optimizer. The new CrownPeak capabilities are immediately available to users. http://www.crownpeak.com


© 2020 The Gilbane Advisor
