Curated content for content, computing, and digital experience professionals

Tag: Enterprise semantic search

Embedded Search in the Enterprise

We need to make a distinction between “search in the enterprise” and “enterprise-wide search.” The former is any search that exists persistently in view as we go about our primary work activities. The latter commonly assumes aggregation of all enterprise content via a single platform, or enterprise content to which everyone in the organization will have access. So many attempts at enterprise-wide search are reported to be compromised or frustrated before achieving successful outcomes that it is time to pay attention to point-of-need solutions. This is search that will smoothly satisfy routine retrieval requirements as we work.

Most of us work in a small number of applications all day. A writer will be wedded to a content creation application plus research sources, both on the web and internal to the enterprise in which the writing is being done. Whether the piece is a press release, a marketing brochure or technical documentation to accompany a technical product, finding information to support it requires access to content appropriate for the audience. That audience may be a business analyst, a customer’s buyer or a product user with advanced technical expertise. During any one work assignment, the writer will usually be focused on one audience and will only need a limited view of content specific to that task.

When a search takes us on a merry chase through multiple resource repositories or in a single repository with heaps of irrelevant content and no good results, we are being forced into a mental traffic nightmare, not of our own making. As this blog post by Tony Schwartz reminds us, we need time to focus and concentrate. It enables us to work smarter and more calmly; for employers seeking to support workers with the best tools, search that works well at the point of doing an assignment is the ultimate perk. I know how frantic and fractionated my mental state becomes as I follow one fruitless web of links after another that I believe will lead me to the piece of information I need. Truthfully, I often become so absorbed in the search and ancillary information I “discover” along the way that sight of the target becomes secondary.

New wisdom from a host of analysts and writers suggests that embedded search is more than a trend, as is search with a specific focus or purposeful business goal. The fact that FAST is now embedded with and for SharePoint, and that its use is growing principally in that arena, illustrates the trend. But readers should also consider a large array of newer search solutions that are strong on semantic features, APIs, integration options, and connectors to a huge variety of content that exists in other application repositories. This article by James Martin in CIO, How to Evaluate Enterprise Search, has helpful comments from Leslie Owens of Forrester Research, and the rise of connectors is highlighted by Alan Pelz-Sharpe in this post.

Right now two rather new search engines are on my radar screen because of their timely entrance to the marketplace. One is Q-Sensei, which has just released their version 2.0. It is an ontology-based solution very much focused on efficiently processing big data, quick deployment, and integration with content applications. The second is Cambridge Semantics with its Anzo semantic solutions for analyzing and retrieving business data. Finally, I am very excited that ISYS was the object of an acquisition by Lexmark. It was an unexpected move but they deserved to be recognized for having solid connector/filter technology and a large, satisfied customer base. It will be interesting to see how a hardware vendor, noted for print technology, will integrate ISYS search software into its product offerings. Information retrieval belongs where work is being done.

These are just three vendors poised to change the expectations of searchers by fulfilling search needs, embedded or integrated efficiently in select business application areas. Martin White’s most recent enumeration of search vendors puts the list at about 70; they are primarily vendors with standalone search products, products that support standalone search or search engines that complement other content applications. You will see many viable options there that are unfamiliar but be sure to dig down to understand where each might fill a unique need in your enterprise.

When seeking solutions for search problems you need to really understand the purpose before seeking candidate vendors. Then focus on products that have the same clarity of applicability you want. They may be embedded with a product such as Lexmark’s, or a CAD system. The first step is to decide where and for whom you need search to be present.

Leveraging Two Decades of Computational Linguistics for Semantic Search

Over the past three months I have had the pleasure of speaking with Kathleen Dahlgren, founder of Cognition, several times. I first learned about Cognition at the Boston Infonortics Search Engines meeting in 2009. That introduction led me to a closer look several months later when researching auto-categorization software. I was impressed with the comprehensive English language semantic net they had doggedly built over a 20+ year period.

A semantic net is a map of language that explicitly defines the many relationships among words and phrases. A net might be very simple, illustrating something as fundamental as a small geographical locale and all named entities within it, or as complex as the entire base language of English, with every concept mapped to show all the ways that any one term is related to other terms, as illustrated in this tiny subset. Dr. Dahlgren and her team are among the few who have created a comprehensive semantic net for English.
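To make the idea concrete, here is a minimal Python sketch of the simple end of that range: a tiny geographic net stored as labeled links between terms. The class name, the relation labels ("is_a", "located_in") and the sample entries are all assumptions made for illustration; this is not how Cognition's semantic net is built or stored.

```python
from collections import defaultdict


class SemanticNet:
    """A toy semantic net: terms joined by labeled relationships (hypothetical design)."""

    def __init__(self):
        # term -> set of (relation, target) pairs
        self._edges = defaultdict(set)

    def relate(self, source, relation, target):
        """Record that `source` stands in `relation` to `target`."""
        self._edges[source].add((relation, target))

    def related(self, term, relation=None):
        """Return targets linked from `term`, optionally filtered by relation label."""
        links = self._edges.get(term, ())
        return sorted(t for r, t in links if relation is None or r == relation)

    def ancestors(self, term):
        """Follow 'is_a' links transitively to collect broader concepts."""
        seen = []
        frontier = [term]
        while frontier:
            current = frontier.pop()
            for parent in self.related(current, "is_a"):
                if parent not in seen:
                    seen.append(parent)
                    frontier.append(parent)
        return seen


# Illustrative entries for a small geographical locale.
net = SemanticNet()
net.relate("Boston", "is_a", "city")
net.relate("city", "is_a", "place")
net.relate("Boston", "located_in", "Massachusetts")
net.relate("Fenway Park", "located_in", "Boston")

print(net.related("Boston"))
print(net.ancestors("Boston"))
```

A comprehensive net for English differs from this toy mainly in scale and in the variety of relationship types it encodes, which is why building one takes years of sustained effort.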

In 2003, Dr. Dahlgren established Cognition as a software company to commercialize its semantic net, designing software to apply it to semantic search applications. As the Gilbane Group launched its new research on Semantic Software Technologies, Cognition signed on as a study co-sponsor and we engaged in several discussions with them that rounded out their history in this new marketplace. It was illustrative of pioneering in any new software domain.

Early adopters are key contributors to any software development. It is notable that Cognition has attracted experts in fields as diverse as medical research, legal e-discovery and Web semantic search. This gives the company valuable feedback for its commercial development. In any highly technical discipline, it is challenging and exciting to find subject experts knowledgeable enough to contribute to product evolution, and Cognition is learning from client experts where the best opportunities for growth lie.

Recent interviews with Cognition executives, and those of other sponsors, gave me the opportunity to get their reactions to my conclusions about this industry. These were the more interesting thoughts that came from Cognition after they had reviewed the Gilbane report:

  • Feedback from current clients and attendees at 2010 conferences, where Dr. Dahlgren was a featured speaker, confirms escalating awareness of the field; she feels that “This is the year of Semantics.” It is catching the imagination of IT folks who understand the diverse and important business problems to which semantic technology can be applied.
  • In addition to a significant upswing in semantics applied in life sciences, publishing, law and energy, Cognition sees specific opportunities for growth in risk assessment and risk management. Using semantics to detect signals, content salience, and measures of relevance is critical where the quantity of data and textual content is too voluminous for human filtering. There is not much evidence that financial services, banking and insurance are embracing semantic technologies yet, but these technologies could dramatically improve their business intelligence, and Cognition is well positioned to support them with its already tested tools.
  • Enterprise semantic search will begin to overcome the poor reputation that traditional “string search” has suffered. There is growing recognition among IT professionals that in the enterprise 80% of the queries are unique; these cannot be interpreted based on popularity or social commentary. Determining relevance or accuracy of retrieved results depends on the types of software algorithms that apply computational linguistics, not pattern matching or statistical models.

In Dr. Dahlgren’s view, there is no question that a team approach to deploying semantic enterprise search is required. This means that IT professionals will work side-by-side with subject matter experts, search experts and vocabulary specialists to gain the best advantage from semantic search engines.

The unique language aspects of an enterprise content domain are as important as the software a company employs. The Cognition baseline semantic net, out-of-the-box, will always give reliable and better results than traditional string search engines. However, it gives top performance when enhanced with enterprise language, embedding all the ways that subject experts talk about their topical domain, jargon, acronyms, code phrases, etc.
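As a rough illustration of what enhancing a baseline with enterprise language can mean in practice, the Python sketch below layers a company-specific vocabulary over a general one and uses the result to expand a query. The vocabularies, the function names and the word-by-word expansion strategy are all hypothetical; they are not taken from Cognition's product.

```python
# General-language equivalents (hypothetical sample data).
BASELINE = {
    "car": {"automobile", "vehicle"},
}

# Jargon and acronyms supplied by subject experts (hypothetical sample data).
ENTERPRISE = {
    "sow": {"statement of work"},
    "car": {"corrective action request"},
}


def merge_vocabularies(baseline, enterprise):
    """Combine both vocabularies without modifying either input."""
    merged = {term: set(equivalents) for term, equivalents in baseline.items()}
    for term, equivalents in enterprise.items():
        merged.setdefault(term, set()).update(equivalents)
    return merged


def expand_query(query, vocabulary):
    """Return each query word followed by its known equivalents."""
    expanded = []
    for word in query.lower().split():
        expanded.append(word)
        expanded.extend(sorted(vocabulary.get(word, ())))
    return expanded


vocabulary = merge_vocabularies(BASELINE, ENTERPRISE)
print(expand_query("SOW template", vocabulary))
print(expand_query("car", vocabulary))
```

With only the baseline, "SOW" would pass through unexpanded and "car" would never reach documents about corrective action requests; the enterprise layer is what lets the query reflect how people in the organization actually talk.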

With elements of its software already embedded in some notable commercial applications like Bing, Cognition is positioned for delivering excellent semantic search for an enterprise. They are taking on opportunities in areas like risk management that have been slow to adopt semantic tools. They will deliver software to these customers together with services and expertise to coach their clients through the implementation, deployment and maintenance essential to successful use. The enthusiasm expressed to me by Kathleen Dahlgren about semantics confirms what I also heard from Cognition clients. They are confident that the technology coupled with thoughtful guidance from their support services will be the true value-added for any enterprise semantic search application using Cognition.

The free download of the Gilbane study and deep-dive on Cognition was announced on their Web site at this page.