Category: Enterprise search & search technology

Research, analysis, and news about enterprise search and search markets, technologies, practices, and strategies, such as semantic search, intranet and workplace collaboration, ecommerce, and other applications.

Before we consolidated our blogs, industry veteran Lynda Moulton authored our popular enterprise search blog. This category collects all of her posts along with other enterprise search news and analysis.

For older, long-form reports, papers, and research on these topics, see our Resources page.

Competition among Search Vendors

Is there any real competition when it comes to enterprise search? Articles like this one in Computerworld make good points but also foster the idea that vendor choice could be a differentiator for buyers: Yahoo deal puts IBM, Microsoft in enterprise search pickle, by Juan Carlos Perez, August 4, 2009.

I wrote about the IBM launch of the OmniFind suite of search products a couple of years ago with positive comments. The reality ended up being quite different, as I noted later. Three negatives stand out in my mind. First, free (as in the IBM OmniFind Yahoo no-charge edition) is rarely attractive to serious enterprises looking for a well-supported product. Second, the computing overhead for the free product was significant enough to turn off some SMBs I know of; the hardware and support costs it required offset “free.” Third, my understanding that the search architecture for the free product would provide seamless upgrades to IBM’s other OmniFind products was wrong. Each subsequent product adoption would require the same “rip and replace” that Steve Arnold describes in his report, Beyond Search. It is hard to believe that IBM got much traction out of this offering in the enterprise search market at large. Does anyone know if there was really any head-to-head competition between IBM and other search vendors over this product?

On the other hand, does the Microsoft Express Search offering appeal to enterprises other than the traditional Microsoft shop? If Microsoft Express Search went away, it would probably be replaced by some other Microsoft search variation, inconveniencing the customer, who would need to rip and replace and be left on his own to grumble and gripe. What else is new? The same thing would happen to IBM OmniFind Yahoo users, and they would adapt.

I’ve noticed that free and cheap products may become heavily entrenched in the marketplace, but not among organizations likely to upgrade any time soon. Once enterprises get immersed in a complex implementation (and search done well does require that), they won’t budge for a long, long time, even if the solution is less than optimal. By the time they are compelled to upgrade, they are usually so wedded to their vendor that they will accept any reasonable upgrade offer the vendor makes. Seeking competitive options is really difficult for most enterprises to pursue without an overwhelmingly compelling reason.

Another news item indicates that Microsoft is still trying to get its search strategy straightened out with its FAST acquisition: Applied Discovery Selects Microsoft FAST for Advanced E-Discovery Document Search. E-discovery is a hot market in the legal, life sciences, and financial verticals, but firms like ISYS, Recommind, Temis, and ZyLab are already doing well in that arena. It will take a lot of effort to displace those leaders, even if Microsoft is the contender. Enterprises are looking for point solutions to business problems, not just large vendors with a boatload of poorly differentiated products. There is plenty of opportunity for specialized vendors without going toe-to-toe with the big folks.

Ecordia Releases Content Analysis Tool for Search Engine Optimization

Ecordia has announced the availability of its new predictive content analysis application, the Ecordia Content Optimizer. Designed for copywriters, journalists, and SEO practitioners, this content analysis application provides automated intelligence and recommendations for improving the structure of content prior to publishing. Available for free, this turn-key web application provides a number of features to aid writers in creating and validating content, including:

  • advanced keyword research during authoring;
  • detailed scoring of your content based on 15 proven SEO techniques;
  • automated recommendations on how to improve your content for search engines;
  • intelligent keyword extraction that compares your content to popular search terms;
  • sophisticated keyword analysis that scores your keyword usage based on 5 statistical formulas.

The Ecordia Content Optimizer has been in beta development for over a year and is currently in use by a number of SEO practitioners. Its content analysis capabilities are ideally suited for web publishers who wish to improve their quality score for landing pages used in PPC campaigns, SEO professionals who want to validate and review content prior to publishing, blog sites that wish to improve the quality of the ads served by contextual ad networks, and PR practitioners who want to optimize press releases prior to publishing. The Ecordia Content Optimizer is licensed on a per-user monthly subscription. http://www.ecordia.com/
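
The scoring side of such tools is conceptually simple. Below is a minimal sketch of term-frequency keyword scoring in Python; the function names and the single density formula are illustrative stand-ins, since Ecordia’s five statistical formulas are not published.

    # Minimal keyword-scoring sketch; a stand-in, not Ecordia's formulas.
    import re
    from collections import Counter

    STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "so"}

    def keyword_density(text, keyword):
        """Fraction of words in text matching the keyword (case-insensitive)."""
        words = re.findall(r"[a-z0-9']+", text.lower())
        return words.count(keyword.lower()) / len(words) if words else 0.0

    def top_terms(text, n=5):
        """Naive keyword extraction: most frequent non-stopword terms."""
        words = [w for w in re.findall(r"[a-z0-9']+", text.lower())
                 if w not in STOPWORDS]
        return Counter(words).most_common(n)

    draft = "Enterprise search tools index enterprise content so search works."
    print(keyword_density(draft, "search"))  # 2 of 9 words -> about 0.22
    print(top_terms(draft, 3))

A real optimizer would compare these counts against popular query logs and weight placement (titles, headings) as well as raw frequency.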

Google Search Appliance Gets New Connectors

Google has announced an upgraded suite of GSA Connectors for the Google Search Appliance (GSA), including connectors to integrate offline company data with information stored in the cloud. GSA Connectors link the GSA with content management systems and other repositories so that users can find the information they are looking for in unified search results. With the upgrade, the connector framework has been simplified so that the GSA can search content stored across various databases. One of the featured GSA Connectors is for Salesforce, enabling the GSA to search content in Salesforce and giving sales, marketing, and customer support personnel access to the information they seek regularly. In addition, new updates and features have been added to the connectors for SharePoint, Livelink, FileNet, and Documentum. Specifically, the SharePoint connector supports batch authorization and multiple site collections, and has added 64-bit Windows support. Additionally, the Google Search Box can be implemented within SharePoint, powered by the GSA, to deliver results from databases outside of the SharePoint system. Multiple connectors now support more recent versions of content systems, such as Documentum v6.5 and FileNet v4. www.google.com/enterprise/search/gsa.html
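
The connector pattern itself is straightforward: walk a repository, normalize each item, and feed it to the appliance’s index. Here is a hypothetical sketch in Python; it is not Google’s actual connector framework, and the Salesforce client, field names, and appliance.feed() call are invented for illustration.

    # Hypothetical connector sketch (not the real GSA framework): traverse
    # a repository and yield normalized documents for an appliance to index.
    from dataclasses import dataclass, field
    from typing import Iterator

    @dataclass
    class Document:
        doc_id: str
        content: str
        metadata: dict = field(default_factory=dict)
        allowed_users: list = field(default_factory=list)  # supports authorization

    class SalesforceConnector:                 # name is illustrative
        def __init__(self, client):
            self.client = client               # assumed repository API client

        def traverse(self) -> Iterator[Document]:
            """Yield one Document per record changed since the last checkpoint."""
            for rec in self.client.fetch_changed_records():
                yield Document(
                    doc_id=rec["id"],
                    content=rec["description"],
                    metadata={"type": rec["type"], "owner": rec["owner"]},
                    allowed_users=rec["visible_to"],
                )

    def index_all(connector, appliance):
        for doc in connector.traverse():
            appliance.feed(doc)                # stand-in for the feed API

Whatever the repository, the value is that every source ends up in one index with consistent metadata and per-user access control, which is what makes unified results possible.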

Kentico CMS for ASP.NET Gets New Enterprise Search Capabilities

Kentico Software has released version 4.1 of Kentico CMS for ASP.NET. The new version comes with an enterprise-class search engine as well as user productivity enhancements. The search engine makes web content searchable to assist visitors in finding information, and provides search results with ranking, previews, thumbnail images, and customizable filters. Site owners can dictate which parts of the site, which content types, and which content fields are searchable. The search engine uses the Lucene search framework. The new version also enhances productivity by changing the way images are inserted into text. Uploaded images can be part of the page life cycle: when a page is removed from the site, its related images and attachments are also removed, which helps organizations avoid invalid or expired content on their servers. Other improvements were made to the management of multilingual web sites. Kentico CMS for ASP.NET now supports workflow configuration based on the content language, and it allows administrators to grant editors permissions for chosen language versions. Content editors can see which documents are untranslated or have out-of-date translations. http://www.kentico.com/
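
Since Kentico builds on Lucene, the mechanics underneath are standard inverted indexing. As a concept sketch (in Python rather than Kentico’s ASP.NET API, and not its actual configuration), letting site owners dictate searchable fields amounts to filtering what gets indexed:

    # Concept sketch: index only fields marked searchable per content type,
    # the way a Lucene-backed CMS search would. Not Kentico's actual API.
    from collections import defaultdict

    SEARCHABLE = {"news": {"title", "body"}, "product": {"name", "summary"}}

    index = defaultdict(set)  # term -> set of page ids

    def index_page(page_id, content_type, fields):
        for name, value in fields.items():
            if name in SEARCHABLE.get(content_type, set()):
                for term in value.lower().split():
                    index[term].add(page_id)

    def search(query):
        hits = [index.get(t, set()) for t in query.lower().split()]
        return set.intersection(*hits) if hits else set()

    index_page("p1", "news", {"title": "Kentico 4.1 release",
                              "body": "search engine update",
                              "internal_notes": "draft"})  # notes stay unindexed
    print(search("search update"))  # {'p1'}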

Convergence of Enterprise Search and Text Analytics is Not New

The news item about IBM’s bid for SPSS, and similar acquisitions by Oracle, SAP, and Microsoft, made me think about the predictions of more business intelligence (BI) capabilities being conjoined with enterprise search. But why now, and what is new about pairing search and BI? They have always been complementary, not only for numeric applications but also for text analysis. Another article, by John Harney in KMWorld, referred to the “relatively new technology of text analytics” for analyzing unstructured text. The article is a good summary of some newer tools, but the technology itself has had a long shelf life, too long for reasons I’ll explore later.

Like other topics in this blog, this one requires a readjustment in thinking by technology users. One of the great things about digitizing text was the promise of ways in which it could be parsed, sorted, and analyzed. With the heavy adoption of databases that specialized in textual as well as numeric and date data fields for business applications in the 1960s and 70s, it became much easier for non-technical workers to look at all kinds of data in new ways. Early database applications leveraged their data stores using command languages; the better ones featured statistical analysis and publication-quality report builders. Three that I was familiar with were DRS from ADM, Inc., BASIS from Battelle Columbus Labs, and INQUIRE from IBM.

Tools that accompanied database back-ends could extract, slice, and dice the database content, including very large text fields, to report on word counts, phrase counts (breaking on any delimiter), transaction counts, and relationships among data elements across associated record types; they could also create relationships on the fly, report expert activity and working documents, and describe the distribution of resources. These are just a few examples of how new content assets could be created for export in minutes. In particular, a sort command in DRS had histogram controls that were invaluable to my clients managing corporate document and records collections, news clippings files, photographs, patents, etc. They could evaluate their collections by topic, date range, distribution, source, and so on, at any time.
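
None of this requires exotic machinery even today. A few lines of code reproduce the flavor of those reports, assuming records with a date and a free-text field (the record layout here is invented for illustration):

    # Sketch of slice-and-dice reporting over text records: word counts
    # plus a histogram of records by year. The record layout is invented.
    from collections import Counter

    records = [
        {"date": "1998-03-12", "text": "patent filing for search interface"},
        {"date": "1999-07-02", "text": "news clipping on patent dispute"},
        {"date": "1999-11-20", "text": "photograph archive accession notes"},
    ]

    word_counts = Counter(w for r in records for w in r["text"].split())
    by_year = Counter(r["date"][:4] for r in records)

    print(word_counts.most_common(3))         # top terms in the collection
    for year, n in sorted(by_year.items()):   # crude histogram control
        print(year, "#" * n)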

So, the ability existed years ago to connect data structures and use a command language to formulate new data models that informed and elucidated how information was being used in the organization, or to illustrate where there were holes in topics related to business initiatives. What were the barriers to widespread adoption? Upon reflection, I came to realize that extracting meaningful content from a database in new and innovative formats requires a level of abstract thinking for which most employees are not well trained. Putting descriptive data into a database via a screen form, then performing a transaction on the object of that data on another form, and then adding more data about another similar but different object are isolated steps in the database user’s experience and memory. The typical user is not trained to think about how the pieces of data might be connected in the database, and therefore is not likely to form new ideas about how it can all be extracted in a report with new information about the content. There is a level of abstraction that eludes most workers whose jobs consist of a lot of compartmentalized tasks.

It was exciting to encounter prospects who really grasped the power of these tools and were eager to push the limits of the command language and reporting applications, but they were scarce. It turned out that our greatest use came in applying text analytics to the extraction of valuable information from our customer support database. A rigorously disciplined staff populated it after every support call, recording not only demographic information about the nature of the call, linked to a customer record created at first contact during the sales process (with appropriate updates along the way in the procurement process), but also a textual description of the entire transaction. Over time this database was linked to a “wish list” database and another “fixes” database, and the entire networked structure provided extremely valuable reports that guided both development work and documentation production. We also issued weekly summary reports to the entire staff so everyone was kept informed about product conditions and customer relationships. The reporting tools provided transparency to all staff about company activity and enabled an early version of “social search collaboration.”

Current text analytics products have significantly more algorithmic horsepower than the old command languages. But making the most of their potential and transforming them into utilities that any knowledge worker can leverage will remain a challenge for vendors in the face of poor abstract reasoning among much of the work force. The tools have improved, but maybe not in all the ways they need to for widespread adoption. Workers should not have to depend on IT folks to create that unique analysis report that reveals a pattern or uncovers product flaws described by multiple customers. We expect workers to multitask, have many aptitudes and skills, and be self-servicing in so many aspects of their work, but the tools too often fall short of what they need to flourish. I’m putting in a big plug for text analytics for the masses, soon, so that enterprise search begins to deliver more than personalized lists of results for one person at a time. Give more reporting power to the user.

Searching Email in the Enterprise

Last week I wrote about “personalized search,” and then a chance encounter at a meeting triggered a new awareness of business behavior that makes my own personalized search a lot different from what might work for others. A fellow introduced himself to me as the founder of a start-up with a product for searching email. He explained that countless nuggets of valuable information reside in email and will never be found without a product like the one his company had developed. I asked if it only retrieved emails resident in an email application like Outlook; he looked confused and said “yes.” I commented that I leave very little content in my email application; instead, I save anything with information of value in the appropriate file folders, with other documents of different formats on the same topic. If an attachment is substantive, I may create a record with more metadata in my content management database so that I can use the application’s search engine to find information germane to projects I work on. He walked away with no comment, so I have no idea what he was thinking.

It did start me thinking about the realities of how individuals dispose of, store, categorize, and manage their work-related documents. My own process goes like this. My work content falls into four broad categories: products and vendors, client organizations and business contacts, topics of interest, and local infrastructure-related materials. When material is not purposed for a particular project or client but may be useful for a future activity, it gets a metadata record in the database and is hyperlinked to the full text. The same goes for useful content out on the Web.

When it comes to email, I discipline myself to dispose of all email into its appropriate folder as soon as I can. Sometimes this involves two emails, the original and my response. When the format is important, I save it in the *.mht format (it used to be *.htm until I switched to Office 2007 and realized that doing so created a folder for every file saved); otherwise, I save content in *.txt format. I rename every email to include a meaningful description, including topic, sender, and date, so that I can identify the appropriate email when viewing a folder. If there is an attachment, it also gets an appropriate title and date and is stored in its native format, and the associated email gets “cover” in its file name; this helps associate the email and attachment. The only email saved in Outlook personal folders is current activity where lots of back and forth is likely to occur until a project is concluded. Then it gets disposed of, either by deletion or by filing with the project folders as described above. This is personal governance that takes work. Sometimes I hit a wall and fall behind on the filtering and disposing, but I keep at it because it pays off in the long term.
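
For what it is worth, a convention this regular is easy to automate. A hypothetical sketch, assuming messages have already been exported from Outlook with topic, sender, and date in hand; the folder layout and the “cover” flag follow the description above:

    # Hypothetical sketch of the renaming convention described above:
    # topic, sender, and date in every saved email's file name.
    import re
    from datetime import date
    from pathlib import Path

    def save_email(folder, topic, sender, sent, body, has_attachment=False):
        safe = lambda s: re.sub(r"[^A-Za-z0-9]+", "-", s).strip("-")
        stem = f"{safe(topic)}_{safe(sender)}_{sent.isoformat()}"
        if has_attachment:
            stem += "_cover"  # marks the email that covers a native-format attachment
        target = Path(folder)
        target.mkdir(parents=True, exist_ok=True)
        path = target / f"{stem}.txt"
        path.write_text(body, encoding="utf-8")
        return path

    print(save_email("vendors", "OmniFind pricing", "J. Smith",
                     date(2009, 3, 5), "Quoted terms attached.",
                     has_attachment=True))
    # vendors/OmniFind-pricing_J-Smith_2009-03-05_cover.txt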

So, why not relax, leave it all in Outlook, and let a search engine do the retrieval? Experience has revealed that most emails are labeled so poorly by senders, and the content is so cryptic, that to expect a search engine to retrieve them in a particular context or with the correct relevance would be impossible. I know this from having to preview dozens of emails stored in folders for active projects. I have decided to give myself the peace of mind that when the crunch is on, and I really need to go to that vendor file and retrieve what they sent me in March of last year, I can get it quickly in a way that no search engine ever could. Do you realize how much correspondence you receive from business contacts using their “gmail” accounts, with no contact information revealing their organization in the body, signed with a nickname like “Bob,” containing messages like “we’re releasing the new version in four weeks,” or just a link to an important article on the web with “thought this would interest you”?

I did not have a chance to learn whether my new business acquaintance had any sense of the amount of competition he has out there for email search, what his differentiator is that makes a compelling case for a search product that only searches email, or what happens to his product when Microsoft finally gets FAST search bundled to work with all Office products. Or perhaps the rest of the world is storing all content in Outlook. Is this true? If so, he may have a winner.

Personalized Search in the Enterprise

This is an interesting topic for two reasons: there is enormous diversity in the ways we all think and go about finding content, and personalizing a search interface without being intrusive is extremely difficult. Any technology that requires us to do activities according to someone else’s design, bending our natural inclinations, is by definition not going to be personal.

This topic comes to mind because of two unrelated pieces of content I read in the past 24 hours. The first was an email asking me about personal information management and automated tagging, and the second was an interview I read with Mike Moran, a thought leader in search and speaker at one of our Gilbane Conferences. In the interview, Mike talks about personalized search. Then Information Week referenced search personalization in an article about a patent suit against Google.

Here is my take on the many personalized search themes that have recently emerged. Dashboards, customized results, options to focus on particular topics or types of content, socialized search that supports interacting with and sharing results, and retrieval of content we personally created, received (email), used, or were named in: all might be referred to as search personalization. Getting each to work well will enhance enterprise search but….

Knowing how transient and transformative our thoughts and behaviors really are, we should focus realistically on the complexity of producing software tools and services that satisfy and enhance personal findability. We are ambiguous beings, seeking structured equilibrium in many of our activities to create efficiency and reduce anxiety, while desiring new, better, quicker, and smarter devices to excite and engage us. Once we achieve a level of comfort with a method or mechanism, whether quickly or over time, we evolve and seek change. But when change is imposed on an unprepared mind, our emotions probably override any real benefit in productivity, and we tend to sabotage the potential usefulness of an uncomfortable process that intrudes. Mental unpreparedness undermines our work when a new design demands a behavioral shift with no connection to our current state or past experiences. How often are we just not in a frame of mind to take on something totally alien, especially with deadlines looming?

Look at the single most successful aspect of Google: minimalism in its interface. When Google first appeared, one did not need to wade through massively dense graphics scrambled with text in disordered layouts to figure out what to do. The focus was immediately obvious.

To vendors, I present this challenge: satisfy a huge array of personal preferences while introducing a minimal amount of change in any one release. Easy adoption requires that new products be simple. Usefulness must be quickly obvious to multiple audiences.

To technology users, I present this challenge: focus your appetite. Decide before shopping for or adopting new tools what would bring the most immediate productivity gain and personal adoptability for maximum efficiency. Think about how defeated you feel when approaching a new release of an upgraded product that has added so many new “bells and whistles” that you are consumed with trying to rediscover all the old functions and features that gave your workflow a comfortable structure. Think carefully about how much learning and readjusting will be needed if you decide on technology that promises to do everything, with unlimited personalization. It may be possible, but does it really feel personally acceptable?

Semantic Search has Its Best Chance for Successes in the Enterprise

I am expecting significant growth in the semantic search market over the next five years with most of it focused on enterprise search. The reasons are pretty straightforward:

  • Semantic search is very hard, and scaling it to the Web compounds the complexity.
  • Because the semantic Web remains elusive and results have been spotty, with not much traction, it will be some time before it can be easily monetized.
  • Like many things that are highly complex, semantic search is best tackled by breaking it into smaller, targeted business problems, where the focus is on a particular audience seeking content from a narrower domain.

I base this prediction on my observation of the ongoing struggle for organizations to get a strong framework in place to manage content effectively. By effectively I mean establishing solid metadata, governance, and publishing protocols that ensure that the best information knowledge workers produce is placed in range for indexing and retrieval. Sustained discipline, and the people to exercise it, just aren’t being employed in many enterprises to make this happen in a cohesive and comprehensive fashion. I have been discouraged by the number of well-intentioned projects I have seen flounder because organizations just can’t commit long-term or permanent human resources to the activity of content governance. Sometimes it is just on-again, off-again. What enterprises need are people with deep knowledge of the organization and how its content fits together in a logical framework for all types of knowledge workers. Instead, organizations tend to assign this job to external consultants or low-level staffers who are not well grounded in the work of the particular enterprise. The results are predictably disappointing.

Enter semantic search technologies, where multiple algorithmic tools are available to index and retrieve content for complex and multi-faceted queries. Specialized semantic technologies are often well suited to shorter-term projects for which domain-specific vocabularies can be built quickly with good results. Maintaining targeted vocabulary ontologies for a focused topic can be done with fewer human resources, and a carefully bounded ontology can become an intelligent feed to a semantic search engine, helping it index with better precision and relevance.
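
As a minimal sketch of what such an “intelligent feed” can mean in practice, consider ontology-driven query expansion; the vocabulary fragment below is invented, and a real system would also exploit hierarchy and relations rather than flat synonym lists:

    # Sketch of ontology-driven query expansion: a bounded domain
    # vocabulary broadens a query to preferred terms, synonyms, and
    # narrower concepts. The vocabulary fragment is invented.
    ONTOLOGY = {
        "anticoagulant": {"synonyms": ["blood thinner"],
                          "narrower": ["warfarin", "heparin"]},
        "adverse event": {"synonyms": ["side effect"], "narrower": []},
    }

    def expand(term):
        entry = ONTOLOGY.get(term.lower())
        if entry is None:
            return {term}
        return {term, *entry["synonyms"], *entry["narrower"]}

    print(sorted(expand("anticoagulant")))
    # ['anticoagulant', 'blood thinner', 'heparin', 'warfarin']

The same mapping can run at indexing time instead, tagging documents with preferred terms so precision does not depend on each author’s word choice.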

This scenario is proposed with one caveat: enterprises must commit to having very smart people with enterprise expertise build the ontology. Having a consultant coach the subject matter expert in method, process, and maintenance guidelines is not a bad idea, but the consultant has to prepare the enterprise for sustainability after exiting the scene.

The wager here is that enterprises can ramp up semantic search with a series of short, targeted projects, each of which solves one business problem at a time and commits to efficient and accurate content retrieval as part of the solution. As organizations learn what works well in each situation, intranet retrieval will improve systematically and thoughtfully. The ramp to a better semantic Web will be paved with these interlocking pieces.

Keep an eye on these companies to provide technologies for point solutions in business critical applications: Basis Technology, Cognition Technology, Connotate, Expert Systems, Lexalytics, Linguamatics, Metatomix, Semantra, Sinequa and Temis.
