TEMIS announced the launch of Luxid Content Pipeline, a new content collection module integrated within the latest version of its content discovery solution, Luxid 5.1. This platform collects content from a range of information sources and feeds it into Luxid. After annotating content with relevant metadata, Luxid then applies search, discovery and sharing tools to the enriched content and provides users with content analytics and knowledge discovery. Luxid Content Pipeline accesses content by three different methods: Structured Access connects and automates the collection of documents from structured content sources such as Dialog, DataStar, ISI Web of Knowledge, Ovid, STN, Questel, EBSCOhost, Factiva, LexisNexis, MicroPatent, Scopus, ScienceDirect, Minesoft, Esp@cenet, and PubMed. Enterprise Content Management Access connects to corporate knowledge repositories such as EMC Documentum, EMC Documentum CenterStage and Microsoft Office SharePoint Server. To be as compatible as possible with a wide variety of document sources, Luxid Content Pipeline also supports the integration of UIMA (Unstructured Information Management Architecture) collection readers, enabling the connection to these sources using UIMA standard protocol and format conversion. http://www.temis.com/
Category: Enterprise search & search technology
Research, analysis, and news about enterprise search and search markets, technologies, practices, and strategies, such as semantic search, intranet collaboration and workplace, ecommerce and other applications.
Before we consolidated our blogs, industry veteran Lynda Moulton authored our popular enterprise search blog. This category includes all her posts and other enterprise search news and analysis. Lynda’s loyal readers can find all of Lynda’s posts collected here.
For older, long form reports, papers, and research on these topics see our Resources page.
Semantic search is a composite beast like many enterprise software applications. Most packages are made up of multiple technology components and often from multiple vendors. This raises some interesting thoughts as we prepare for Gilbane Boston 2009 to be held this week.
As part of a panel on semantic search, moderated by Hadley Reynolds of IDC, with Jeff Fried of Microsoft and Chris Lamb of the OpenCalais Initiative at Thomson Reuters, I wanted to give a high level view of semantic technologies currently in the marketplace. I contacted about a dozen vendors and selected six to highlight for the variety of semantic search offerings and business models.
One case study involves three vendors, each with a piece of the ultimate, customer-facing, product. My research took me to one company that I had reviewed a couple of years ago, and they sent me to their “customer” and to the customer’s customer. It took me a couple of conversations and emails to sort out the connections; in the end the relationships made perfect sense.
On one hand we have conglomerate software companies offering “solutions” to every imaginable enterprise business need. On the other, we see highly specialized point solutions to universal business problems with multiple dimensions and twists. Teaming by vendors, each with a solution to one dimension of a need, creates compound product offerings that are adding up to a very large semantic search marketplace.
Consider an example of data gathering by a professional services firm. Let’s assume that my company has tens of thousands of documents collected in the course of research for many clients over many years. Researchers may move on to greater responsibility or other firms, leaving content unorganized except around confidential work for individual clients. We now want to exploit this corpus of content to create new products or services for various vertical markets. To understand what we have, we need to mine the content for themes and concepts.
The product of the mining exercise may have multiple uses: helping us create a taxonomy of controlled terms, preparing a navigation scheme for a content portal, or providing a feed to business or text analytics tools that will help us create visual objects reflecting various configurations of content. A text mining vendor may be great at the mining aspect while other firms have better tools for analyzing, organizing and re-shaping the output.
Doing business with two or three vendors, experts in their own niches, may help us reach a conclusion about what to do with our information-rich pile of documents much faster. A multi-faceted approach can be a good way to bring a product or service to market more quickly than if we struggle with generic products from just one company.
When partners each have something of value to contribute, together they offer the benefits of the best of all options. This results in a new problem for businesses looking for the best in each area, namely, vendor relationship management. But it also saves organizations from dealing with huge firms offering many acquired products that have to be managed through a single point of contact, a generalist in everything and a specialist in nothing. Either way, you have to manage the players and how the components are going to work for you.
I really like what I see, semantic technology companies partnering with each other to give good-to-great solutions for all kinds of innovative applications. By the way, at the conference I am doing a quick snapshot on each: Cogito, Connotate (with Cormine and WorldTech), Lexalytics, Linguamatics, Sinequa and TEMIS.
Nstein Technologies Inc. announced the release of a new product, Semantic Site Search (3S). 3S leverages Nstein’s text-mining technology to power a faceted site search which returns results that are organized categorically. 3S can ingest content from many different indices from many different web publishing platforms, meaning it indexes material across multiple properties. 3S’ embedded Text Mining Engine (TME) identifies concepts, categories, proper names, places, organizations, sentiment and topics in particular content pieces and then annotates those documents, creating a semantic fingerprint that exposes underlying nuances and meaning in content. 3S also boasts a visual interface that is designed to allow administrators to tweak search sensitivity algorithms without having to modify hard code. 3S comes bundled with front-end widgets which could be used to point users to “similar content”, “most recent content”, or other identifying characteristics of content that one wants to promote. http://www.nstein.com
Clarabridge announced the general availability of Clarabridge Enterprise 4. Clarabridge Enterprise 4 includes the addition of an Ad-Hoc Uploader, upgrades to the Natural Language Processing (NLP) and Sentiment Engines, new collaboration tools in the Classification Suite and built-in Early Warnings and Alerts. Sentiment Engine Upgrades: clause-based sentiment and classification, along with a multitude of core engine enhancements; as well as added support for classifying data in foreign languages. Classification Templates to provide quick-start templates for analysts developing category models. Collaboration changes such as locking of models to prevent changes, rule history and roll back functionality, color-coding as a visual aid for maintaining models, and a preview feature. Early Warnings & Alerts: statistical warning and alert engines aimed at helping users proactively address customer experience issues by alerting them to anything that exceeds defined thresholds. Ad-Hoc Uploader: The Ad-Hoc Uploader is designed to upload feedback sources for analysis directly from browsers. http://www.clarabridge.com/
Designing an enterprise search interface that employees will use on their intranet is challenging in any circumstance. But starting from nothing more than verbal comments or even a written specification is really hard. However, conversations about what is needed and wanted are informative because they can be aggregated to form the basis for the overarching design.
Frequently, enterprise stakeholders will reference a commercial web site they like or even search tools within social sites. These are a great starting point for a designer to explore. It makes a lot of sense to visit scores of sites that are publicly accessible or sites where you have an account and navigate around to see how they handle various design elements.
To start, look at:
- How easy is it to find a search box?
- Is there an option to do advanced searches (Boolean or parametric searching)?
- Is there a navigation option to traverse a taxonomy of terms?
- Is there a “help” option with relevant examples for doing different kinds of searches?
- What happens when you search for a word that has several spellings or synonyms, a phrase (with or without quotes), a phrase with the word “and” in it, a numeral, or a date?
- How are results displayed: what information is included, what is the order of the results and can you change them? Can you manipulate results or search within the set?
- Is the interface uncluttered and easily understood?
The point of this list of questions is that you can use it to build a set of criteria for designing what your enterprise will use and adopt, enthusiastically. But this is only a beginning. By actually visiting many sites outside your enterprise, you will find features that you never thought to include or aggravations that you will surely want to avoid. From these experiences on external sites, you can build up a good list of what is important to include or banish from your design.
When you find sites that you think are exemplary, ask key stakeholders to visit them and give you their feedback, preferences and dislikes. In particular, note what confuses them and any enthusiastic comments about what excites them.
This post originated because several press notices in the past month brought to my attention Web applications that have sophisticated and very specialized search applications. I think they can provide terrific ideas for the enterprise search design team and also be used to demonstrate to your internal users just what is possible.
Check out these applications and articles: on KNovel, particularly this KNovel page; ThomasNet; EBSCOHost mentioned in this article about the “deep Web.” All these applications reveal superior search capabilities, have long track records, and are already used by enterprises every day. Because they are already successful in the enterprise, some by subscription, they are worth a second look as examples of how to approach your enterprise’s search interface design.
A recent article about how Google Internet search does not use meta tags to find relevant content got me thinking about a couple of things.
First it explains why none of the articles I write for this blog about enterprise search appear in Google alerts for “enterprise search.” Besides being a personal annoyance, easily resolved if I invested in some Internet search optimization, it may explain why meta tagging is a hard sell behind the firewall.
I do know something about getting relevant content to show up in enterprise search systems and it does depend on a layer of what I call “value-added metadata” by someone who knows the subject matter in target content and the audience. Working with the language of the enterprise audience that relies on finding critical content to do their jobs, a meta tagger will bring out topical language known to be the lingua franca of the dominant searchers as well as the language that will be used by novice employee searchers. The key here is to recognize that in any specific piece of content its “aboutness” may never be explicitly spelled out in terminology by the author.
In one example, let’s consider some fundamental HR information about “holiday pay” or “compensation for holidays” or “compensation for time-off.” The strings in quotes were used throughout documents on the intranet of one organization where I consulted. When some complained about not being able to find this information using the company search system, my review of search logs showed a very large number of searches for “vacation pay” and almost no searches for “compensation” or “holidays” or “time off.” Thus, there was no way that employees using the search engine would stumble upon the useful information they were seeking, unless meta tags made “vacation pay” a retrievable index pointer to these documents. A tagger who had analyzed the search logs would have seen the high number of searches for that phrase and realized that it was needed as a meta tag.
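The log review described above can be roughly sketched in code. This is only an illustration, not the method used at that organization: the function name `candidate_meta_tags`, the `min_count` threshold, and the sample log and documents are all made up for the example, and matching is a naive substring test against the document text.

```python
from collections import Counter


def candidate_meta_tags(search_log, documents, min_count=2):
    """Return (query, count) pairs for frequent queries that match no document.

    Hypothetical helper: a query "matches" if it appears verbatim
    (case-insensitively) in at least one document's text.
    """
    # Normalize queries and count how often each one was searched.
    counts = Counter(q.strip().lower() for q in search_log if q.strip())
    corpus = [d.lower() for d in documents]

    gaps = []
    for query, n in counts.most_common():
        if n < min_count:
            continue  # too rare to be worth a tag
        if not any(query in text for text in corpus):
            gaps.append((query, n))  # popular query, but no document uses it
    return gaps


# Made-up sample data echoing the HR example.
search_log = ["vacation pay", "Vacation Pay", "vacation pay", "holiday pay", "time off"]
documents = [
    "Policy on holiday pay and compensation for holidays.",
    "Compensation for time-off requests.",
]

print(candidate_meta_tags(search_log, documents))
```

Queries that surface this way are only candidates. Someone who knows the subject matter and the audience still has to decide which documents each phrase should point to.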
Now, back to Google’s position on ignoring meta tags because writers and marketing managers were “gaming the system.” They were adding tags they thought would be popular to get people to look at content not related but for which they were seeking a huge audience.
I have heard the concern that people within enterprises might also hijack the usefulness of content they were posting in blogs or wikis to get more “eyeballs” in the organization. This is a foolish concern, in my opinion. First I have never seen evidence that this happens and don’t believe that any productive enterprise has people engaging in this obvious foolishness.
More importantly, professional growth and success depend on the perceptions of others, their belief in you and your work, and the value of your ideas. If an employee is so foolish as to misdirect fellow employees to useless or irrelevant content, he is not likely to gain or keep the respect of his peers and superiors. In the long run, persistent misleading or mischievous meta tagging will have just the opposite effect, creating a pathway to the door.
Conversely, the super meta tagger with astute insights into what people are looking for and how they are most likely to look for it, will be the valued expert we all need to care for and spoon feed us our daily content. Trusted resources rise to the top when they are appropriately tagged and become bedrock content when revealed through enterprise search on well-managed intranets.
Is there any real competition when it comes to enterprise search? Articles like this one in ComputerWorld make good points but also foster the idea that this could be a differentiator for buyers: Yahoo deal puts IBM, Microsoft in enterprise search pickle, by Juan Carlos Perez, August 4, 2009.
I wrote about the IBM launch of the OmniFind suite of search products a couple of years ago with positive comments. The reality ended up being quite different as I noted later. Among the negatives were three that stand out in my mind. First, free (as in the IBM OmniFind Yahoo no-charge edition) is rarely attractive to serious enterprises looking for a well-supported product. Second, the computing overhead for the free product was substantial enough that some SMBs I know of were turned off; the costs associated with the hardware and support it would require offset “free.” Third, my understanding that the search architecture for the free product would provide seamless upgrades to IBM’s other OmniFind products was wrong. Each subsequent product adoption would require the same “rip and replace” that Steve Arnold describes in his report, Beyond Search. It is hard to believe that IBM got much traction out of this offering from the enterprise search market at large. Does anyone know if there was really any head-to-head competition between IBM and other search vendors over this product?
On the other hand, does the Microsoft Express Search offering appeal to enterprises other than the traditional Microsoft shop? If Microsoft Express Search went away, it would probably be replaced by some other Microsoft search variation, with inconvenience to the customer, who would need to rip and replace and would be left on his own to grumble and gripe. What else is new? The same thing would happen with IBM Yahoo OmniFind users and they would adapt.
I’ve noticed that free and cheap products may become heavily entrenched in the marketplace but not among organizations likely to upgrade any time soon. Once enterprises get immersed in a complex implementation (and search done well does require that) they won’t budge for a long, long time, even if the solution is less than optimal. By the time they are compelled to upgrade they are usually so wedded to their vendor that they will accept any reasonable offer to upgrade that the vendor offers. Seeking competitive options is really difficult for most enterprises to pursue without an overwhelmingly compelling reason.
This additional news item indicates that Microsoft is still trying to get their search strategy straightened out with another new acquisition, Applied Discovery Selects Microsoft FAST for Advanced E-Discovery Document Search. E-discovery is a hot market in legal, life sciences and financial verticals but firms like ISYS, Recommind, Temis, and ZyLab are already doing well in that arena. It will take a lot of effort to displace those leaders, even if Microsoft is the contender. Enterprises are looking for point solutions to business problems, not just large vendors with a boatload of poorly differentiated products. There is plenty of opportunity for specialized vendors without going toe-to-toe with the big folks.
Ecordia has announced the availability of its new predictive content analysis application, the Ecordia Content Optimizer. Designed for copywriters, journalists, and SEO practitioners, this content analysis application provides automated intelligence and recommendations for improving the structure of content prior to publishing. Available for free, this turn-key web application provides a number of features to aid writers in the creation and validation of content including: advanced keyword research during authoring; detailed scoring of your content based on 15 proven SEO techniques; automated recommendations on how you can improve your content for search engines; intelligent keyword extraction that compares your content to popular search terms; sophisticated Keyword Analysis that scores your keyword usage based on 5 statistical formulas. The Ecordia Content Optimizer has been in beta development for over a year and is currently in use by a number of SEO practitioners. The Ecordia Content Optimizer provides content analysis capabilities ideally suited for: web publishers who wish to improve their quality score for landing pages used in PPC campaigns; SEO professionals who want to validate and review content prior to publishing; blog sites that wish to improve the quality of their ads from contextual ad networks; and PR practitioners who want to optimize their press releases prior to publishing. The Ecordia Content Optimizer is licensed on a per user monthly subscription. http://www.ecordia.com/