Recently in Search Problems/Solved Search Problems Category
Search in the enterprise suffers from a lack of expert attention to tuning, care and feeding, governance and a fundamental understanding of what functionality comes with any one of the 100+ products now on the market. This is just as true for search appliances as for open source search tools (Lucene) and applications (Solr). But while companies licensing out-of-the-box search solutions or heavily customized search engines have service, support and upgrades built into their deliverables, the same level of support cannot be assumed when getting started with open source search or even appliances.
Search appliances are sold with licenses that imply some high level of performance without a lot of support, while open source search tools are downloadable for free. As speakers about both open source and appliances made perfectly clear at our recent Gilbane Conference, both come with requirements for human support. When any enterprise search product or tool is selected and procured, there is a presumed business case for acquisition. What acquirers need to understand above all else is the cost of ownership to achieve the expected value. This means people, and people with expertise, on an ongoing basis.
Particularly when budgets are tight and organizations lay off workers, we discover that those with specialized skills and expertise are often the first to go. The jacks-of-all-trades, or those with competencies in maintaining ubiquitous applications, are retained to be "plugged in" wherever needed. So, where does this leave you for support of the search appliance that was presumed to be 100% self-maintaining, or the open source code that still needs bug fixes, API development and interface design work?
This is the time to look to system integrators and service companies with specialists in tools you use. They are immersed in the working innards of these products and will give you better support through service contracts, subscriptions or labor-based hourly or project charges than you would have received from your in-house generalists, anyway.
You may not see specialized system houses or service companies listed by financial publications as a growth business, but I am going to put my confidence in the industry to spawn a whole new category of search service organizations in the short term. Just-in-time development for you and lower overhead for your enterprise will be a growing swell in 2009. This is how outsourcing can really bring benefits to your organization.
Post-post note - Here is a related review on the state-of-open source in the enterprise: The Open Source Enterprise; its time has come, by Charles Babcock in Information Week, Nov. 17, 2008. Be sure to read the comments, too.
This blog has not focused on non-profit institutions (e.g. museums, historical societies) as enterprises but they are repositories of an extraordinary wealth of information. The past few weeks I've been trying, with mixed results, to get a feel for the accessibility of this content through the public Web sites of these organizations. My queries leave me with a keen sense of why search on company intranets also fails.
Most sizable non-profits want their collections of content and other information assets exposed to the public. But each department manages its own content collections with software that is unique to its specific professional methods and practices. In the corporate world the mix will include human resources (HR), enterprise resource planning (ERP) systems, customer relationship management (CRM), R & D document management systems and collaboration tools. Many corporations have or "had" library systems that reflected a mix of internally published reports and scholarly collections that support R & D and special areas such as competitive intelligence. Corporations struggle constantly with federating all this content in a single search system.
Non-profit organizations have similar disparate systems constructed for their special domain, museums or research institutions. One area that is similar between the corporate and non-profit sector is libraries, operating with software whose interfaces hearken back to designs of the late 1980s or 90s. Another by-product of that era was the catalog record in a format devised by the Library of Congress for the electronic exchange of records between library systems. It was never intended to be the format for retrieval. It is similar to the metadata in content management systems but is an order of magnitude more complex and arcane to the typical person doing searching. Only librarians and scholars really understand the most effective ways to search most library systems; therein lies the "public access" problem. In a corporation a librarian often does the searching.
However, a visitor to a museum Web site would expect to quickly find a topic for which the museum has exhibit materials, printed literature and other media, all together. This calls for nomenclature that is "public friendly" and reflects the basic "aboutness" of all the materials in museum departments and collections. It is a problem when each library and curatorial department uses a different method of categorizing. Libraries typically use Library of Congress Subject Headings. What makes this problematic is that topics are so numerous. The number of possible subject headings is designed for the entire population of all Library of Congress holdings, not a special collection of a few tens of thousands of materials. Almost no library systems search for words "contained in" the subject headings if you try to browse just the Subject index. If I am searching Subjects for all power generation materials and a heading such as electric power generation is used, it will not be found because the look-up mechanism only looks for headings that "begin with" power generation.
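The difference between a "begins with" look-up and a "contained in" search can be shown in a few lines. This is a minimal Python sketch, not the behavior of any particular library system; the function names and the sample list of headings are assumptions made for illustration (only electric power generation comes from the example above).

```python
# Hypothetical subject index; only the first heading comes from the example above.
HEADINGS = [
    "Electric power generation",
    "Power generation economics",
    "Steam engines",
]

def browse_begins_with(headings, query):
    """Mimic a Subject index that only matches headings starting with the query."""
    q = query.casefold()
    return [h for h in headings if h.casefold().startswith(q)]

def browse_contains(headings, query):
    """Match headings that contain the query anywhere in the heading."""
    q = query.casefold()
    return [h for h in headings if q in h.casefold()]

print(browse_begins_with(HEADINGS, "power generation"))
print(browse_contains(HEADINGS, "power generation"))
```

With this sample list, the "begins with" look-up misses the electric power generation heading, while the "contained in" search returns it along with the heading that starts with the query.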
Let's cut to the chase: mountains of metadata in the form of library cataloging are locked inside library systems within non-profit institutions. This metadata is not being searched at the search box when you go to a museum Web site because it is not accessible to most "enterprise" or "web site" search engines. Therefore, a separate search must be done in the library system using a more complex approach to be truly thorough.
We have a big problem if we are to somehow elevate library collections to the same level of importance as the rest of a museum's collections and integrate the two. Bigger still is the challenge of getting everything indexed with a normalized vocabulary for the comfort of all audiences. This is something that takes thought and coordination among professionals of diverse competencies. It will not be solved easily but it must be done for institutions to thrive and satisfy all their constituents. Here we have yet another example of where enterprise search will fail to satisfy, not because the search engine is broken but because the underlying data is inappropriately packaged for indexes to work as expected. Yet again, we come to the realization that we need people to recognize and fix the problem.
When interviewing search administrators who had also been part of product selection earlier this year, I asked about surprises they had encountered. Some involved the selection process but most related to ongoing maintenance and support. None commented on actual failures to retrieve content appropriately. That is a good thing, whether it was because they had already tested for retrieval during a proof of concept as part of due diligence, or because they were lucky.
Thinking about how product selections are made prompts me to comment on two major search product attributes that control the success or failure of search for an enterprise. One is the actual algorithms that control content indexing: what is indexed and how it is retrieved from the index (or indices). The second is the interfaces: interfaces for the population of searchers to execute selections, and interfaces for results presentation. On each aspect, buyers need to know what they can control and how best to execute it for success.
Indexing and retrieval technology is embedded within search products; the administrative options to alter search scalability, indexing and content selection during retrieval are limited or nonexistent. The "secret sauce" for each product is largely hidden, although it may have patented aspects available for researching. Until an administrator of a system gets deeply into tuning and experimenting with significant corpuses of content, it is difficult to assess the net effect of delivered tuning options. The time to make informed evaluations about how well a given product will retrieve your content when searched by your select audience is before a purchase is made. You can't control the underlying technology but you can perform a proof of concept (PoC). This requires:
- human resources and a commitment of computing resources
- a well-defined amount, type and nature of content (metadata plus full-text, or unstructured full-text only) to give a testable sample
- testers who are representative of all potential searchers
- a comparison of the results across three to four systems to reveal how well each retrieves the intended content targets
- knowledge of the content by testers and similarity of searches to what will be routinely sought by enterprise employees or customers
- search logs of previously deployed search systems, if they exist. Searches that routinely failed in the past should be used to test newer systems
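The comparison step in the list above can be scored mechanically once testers have paired each query with the document they expect to find. The following Python sketch assumes each candidate system can be wrapped in a function that takes a query and returns a ranked list of document IDs; the engine names, queries and IDs are all hypothetical stand-ins, and a real PoC would still rely on tester judgment alongside a number like this.

```python
def hit_rate(search, test_cases, top_k=10):
    """Fraction of test queries whose intended target appears in the top_k results."""
    if not test_cases:
        return 0.0
    hits = sum(1 for query, target in test_cases if target in search(query)[:top_k])
    return hits / len(test_cases)

def compare_systems(systems, test_cases, top_k=10):
    """Return (system name, hit rate) pairs, best first."""
    scores = {name: hit_rate(search, test_cases, top_k) for name, search in systems.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

def make_engine(canned_results):
    """Stand-in for a real search system: returns canned ranked results per query."""
    return lambda query: canned_results.get(query, [])

# Hypothetical (query, expected document) pairs, e.g. drawn from past failed searches.
TEST_CASES = [
    ("travel policy", "doc-17"),
    ("expense form", "doc-42"),
    ("benefits enrollment", "doc-08"),
]

# Hypothetical candidate systems with made-up results.
SYSTEMS = {
    "engine_a": make_engine({
        "travel policy": ["doc-17", "doc-03"],
        "expense form": ["doc-99"],
    }),
    "engine_b": make_engine({
        "travel policy": ["doc-03", "doc-17"],
        "expense form": ["doc-42"],
        "benefits enrollment": ["doc-08"],
    }),
}

for name, score in compare_systems(SYSTEMS, TEST_CASES):
    print(f"{name}: {score:.2f}")
```

Lowering `top_k` makes the test stricter, since a target buried on a later page of results is of little use to most searchers.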
Interface technology
Unlike the embedded search technology, search interfaces vary enormously, and buyers can exercise design control over them or hire a third party to produce them. What searchers experience when they first encounter a search engine, whether a search box at a portal or a completely novel variety of search options with search box, navigation options or special search forms, is within the control of the enterprise. This may be required if what comes "out-of-the-box" as the default is not satisfactory. You may find, at a reasonable price, a terrific search engine that scales well, indexes metadata and full-text competently and retrieves what the audience expects but requires a different look-and-feel for your users. Through an API (application programming interface), SDK (software development kit) or application connectors (e.g. Documentum, SharePoint), numerous customization options are delivered with enterprise search packages or are available as add-ons.
In either case, human resource costs must be added to the bottom line. A large number of mature software companies and start-ups are innovating with both their indexing techniques and interface design technologies. They are benefiting from several decades of search evolution for search experts, and now a decade of search experiences in the general population. Search product evolution is accelerating as knowledge of searcher experiences is leveraged by developers. You may not be able to control emerging and potentially disruptive technologies, but you can still exercise beneficial controls when selecting and implementing most any search system.
Just emerging from the Gilbane San Francisco conference, six sessions on search and a workshop I conducted, I want to share a couple of general impressions. Details and expanded reflections will follow in the days and weeks to come.
Search for the whole enterprise vs. point solutions was the subject of some discussion, especially since our keynote speaker, Stephen Arnold, gave strong guidance that you can't think about one search solution ("product") for the entire enterprise and all content. This is something with which I pretty much agree, in most cases. However, a question arose in one of the sessions in which a couple of presentations talked about a single search engine for what appeared to be the entire enterprise. A member of the audience asked for clarification in view of Arnold's earlier comments.
I chose to intercede so as not to put our speakers on the defensive about what, for their organizations, were very reasonable choices. Both of the cases were for research or professional services organizations with a high incidence of uniformity in the scope and type of content. They are relatively flat in structure with the bulk of the population being researchers: consultants, engineers, scientists. The applications were for intranets that were being leveraged to connect content and experts, so that from either direction (finding an expert and then looking at their content, or finding content to reveal expertise) other professionals could leverage organizational knowledge. It is a safe bet that other search does exist elsewhere in these companies, even if it is in stealth mode or embedded in other applications. Still, in general, large organizations with highly differentiated personnel and disparate functional content requirements will find value in point search solutions that may only have purpose in a single internal domain.
To that point, if you are a finance professional or business manager you might want to sign up for a webinar this Thursday, June 26th, when I will be laying out a business case for a particular kind of search solution that is targeted at your demographic. This Apps Associates sponsored webinar also describes a solution leveraging Oracle enterprise search, but the ideas in it will give you a sense of what search can provide in your domain.
Judging from the topics presented on search, the reasons and ways in which it is being applied are more diverse than even I imagined. Opinions about what is good/bad, appropriate or not, and how to approach search technology ran the gamut of simple to complex. Two strong points of view were expressed about taxonomy vs. just tagging or letting the search engine categorize. Neither side would give an inch to the other as having an approach that is often "good enough." It is pretty clear that hybrid solutions offering both a structured approach to search where a taxonomy is applied through metadata, and auto-categorization by the search engine without a supporting taxonomy in the background will be applied in many enterprises.
Hustling through my preparation list before the Gilbane San Francisco conference, I have come to the fifth session on enterprise search that I'll be moderating, Mining, Analyzing and Delivering Intelligent Content. It features Amin Negandi, Principal, Echelon Consulting LLC, speaking on Enterprise Search at A.T. Kearney, and Rob Joachim, Information Systems Engineering Lead, MITRE Corporation, presenting a case study on the development of An Expertise Finder Application Built on Enterprise Search. Having listened to both of them talk about their projects, I consider these "must-attend" presentations for those seeking to build search-based solutions for their organizations. Both are examples of the practical and real challenges that surround value-building projects. Both have positive outcomes but are hardly implementations that will become static legacy deployments; sustaining a value-based system is an ongoing activity.
As the session abstract states, there are as many technologies for finding content as there are types of content and types of enterprises. Locating a pile of links or citations is rarely the end game for those who really seek to leverage content. Both presenters in this session will talk about solutions that serve real and critical needs for one enterprise, in the first case being able to securely search content across a professional services firm in which collaboration is important within defined proprietary boundaries.
The second case also touches on the need for collaboration and sharing, in this case by enabling location of individuals who are experts. Using the context of content and associations to which they are linked for "defining" individual expertise, search filters relevant metadata to reveal those individuals. Connections are made to locate people and their professional work.
Delivering search results intelligently requires not only technology but also the art of the implementation team. Keeping the focus on specific business outcomes is the essence of ensuring that search delivers intelligent content. The stories of what problem was targeted, what tools were deployed, and how search was implemented by savvy search specialists are the most interesting and useful for learning. Finding out that serendipity also plays a role in getting closer to the best solution is always fun. We'll be listening on June 19th.
The topic of the month seems to be “social search;” I confess to being a willing participant in this new semantic framing of a rash of innovative new tools for enterprise search products. I would, however, defer to the professional intent of some great new features by stressing that this is really a next step in bringing collaboration closer to where expert knowledge workers do their work. As I view enterprises with a heavy research component, 10 – 30% of the average professional’s time is spent in a search environment. In other words, we all spend a lot of our day just looking for “stuff.” We also spend a significant amount of time in meetings, exchanging emails, and making presentations. More and more of us contribute to collaboration spaces where we work together on various types of document production.
Putting together the work habits and needs of a time-poor and information-rich community of knowledge workers in a post-processing environment where they can “mash up,” tag and comment on their search discoveries is a natural evolution of search technology. It is remarkable to see how search companies that are serious about the enterprise market (search within and for the enterprise) are rapidly turning out enhancements for their audiences, now that they are convinced that “Enterprise 2.0” has a boatload of early adopters in the wings. Search should always be about connecting experts and their content. Add collaboration and the ability for searchers to enrich search results for the benefit of their colleagues and you have a model for soon-to-be heavily adopted products.
That pretty much sums up how we should be thinking about “social search” in the enterprise. You can hear more of my views in a KMWorld Webinar, Using Social Search to Drive Innovation through Collaboration next Tuesday in a presentation sponsored by Vivisimo, one of the leaders in this area.
The week had plenty of virtual ink devoted to this topic, so you might want to check out these two articles with more commentary. The first was in eWeek, by Clint Boulton, Vivisimo Marries Search, Social Networking. The second shows that Google is on the bandwagon as well: Google Enterprise Search gets social, a blog entry at C|Net News.com by Rafe Needleman.
There is nothing more disappointing to a consultant than to learn that a project in which you gave significant guidance to a client is experiencing a project meltdown…except maybe having everything get off to a positive start only to falter due to problems with the technologies being implemented. I have been burned several times lately and that surprises me because, as a former software vendor myself, I have pretty deep skepticism when it comes to overblown claims and can usually spot the companies I wouldn’t want my clients to trust. This was not one of them.
It is hard to deal with situations that you didn’t consider likely. A big one is a broken promise, even if it is implicit, not explicit. For a vendor to deliver a solid CMS product with a buggy search interface to toggle between keyword and metadata search is one thing. My client spent months getting it to work so that users could seek by keyword or on explicit metadata fields. They rolled it out and it was “OK,” if not great. But after much discussion with the vendor about the bugs, my client was pressured into adopting an upgrade to “solve the problem.” Unfortunately, the upgrade was an experience from hell, but worse was the fact that the old search controls no longer worked and there was no way to search metadata any longer. Having predicated the procurement on being able to search metadata… well, you get the picture.
What happened to the old motto of “first do no harm?” In my world that means you never release an “upgrade” that subtracts functionality. In the words of my client, “we consider this a major regression.” I consider it a serious breach of trust between them and the supplier but also between me and my client. Why would they ever trust my guidance about the solidness of a vendor again? Guess I have my work cut out for me to find some recourse for my client.
On a much more positive note, I will be offering commentary on the subject of trust and technology solutions when I participate in my first Gilbane Webinar with Oracle’s Brian Dirking, Wednesday, October 10th. The title is The Trust Factor: Secure Enterprise Search for High-Value Content and it will include some key considerations when planning your path to a successful search implementation. I’m still optimistic and enthusiastic that you can implement an excellent search solution for your organization if you really choose your strategy, your technology and your business partners carefully, and I’m teaming with Oracle to reinforce that message.
It is free, so click on the title to sign up, even if you are in the beginning stages of your quest for a search product. I hope you will join us for the discussion.
Many organizations have experimented with a number of search engines for their enterprise content. When the search engine is deployed within the bounds of a specific content domain (e.g. a QuickPlace site) the user can assume that the content being searched is within that site. However, an organization’s intranet portal with a free-standing search box comes with a different expectation. Most people assume that search will find content anywhere in the implied domain, and most of us believe that all content belonging to that domain (e.g. a company) is searchable.
I find it surprising how many public Web sites for media organizations (publishers) don’t appear to have their site search engines pointing to all the sub-sites indicated in site maps. I know from my experience at client sites that the same is often true for enterprise searching. The reasons are numerous and diverse, commentary for another entry. However, one simple notation under or beside the search box can clarify expectations. A simple link to a “list of searchable content” will underscore the caveat or at least tip the searcher that the content is bounded in some way.
When users in an organization come to expect that they will not find, through their intranet, what they are seeking but know to exist somewhere in the enterprise, they become cynical and distrustful. Having a successful intranet portal is all about building trust and confidence that the search tool really works or “does the job.” Once that trust is broken, new attempts to change attitudes by deploying a new search engine, increasing the license to include more content, or doing better tuning to return more reliable results are not going to change minds without a lot of communication work to explain the change. I know that the average employee believes that all the content in the organization should be brought together in some form of federated search but now knows it isn’t. The result is that they confine themselves to embedded search within specific applications and ignore any option to “search the entire intranet.”
It would be great to see comments from readers who have changed a Web site search experience from a bad scene to one with a positive traffic gain with better search results. Let us know how you did it so we can all learn.
Preparing for two upcoming meetings with search themes (Gilbane San Francisco and Boston KM Forum) has brought to mind many issues of search usability. At the core is the issue of search literacy. Offering some fundamental searching tips to non-professional searchers often results in a surprised reaction. For example: when seeking information about a specific topic such as "industrial engineering," enclose it in quotes to limit the search to that phrase. Without quotes, you will get all content with “industrial” and “engineering” anywhere in the content with no explicit relationship implied.
If you are reading this you probably know that, but many do not. In order to learn what people search for on their company intranet and how they type their search requests, I spend time reading search log files. I do this for several reasons:
> To learn terminology searchers are using to guide taxonomy building choices
> To see the way searches are formulated, and followed up
> To inform design decisions about how to make searching easier
> To see what is searched but not found to inform future content inclusion
> To view the searcher’s next step when the results are zero or huge
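A first pass over a log for the purposes listed above can be quite simple. This Python sketch assumes the log has already been reduced to (query, result count) pairs, which is an assumption about format and not how any particular product writes its logs; the function names and sample entries are invented for illustration.

```python
import re
from collections import Counter

# Matches a run of text enclosed in straight or curly double quotes.
QUOTED = re.compile(r'["“][^"“”]+["”]')

def is_quoted_phrase(query):
    """True if the query contains text enclosed in double quotes."""
    return bool(QUOTED.search(query))

def summarize_log(entries, top_n=5):
    """Summarize (query, result_count) pairs from a search log."""
    multiword = [q for q, _ in entries if len(q.split()) > 1]
    quoted = [q for q in multiword if is_quoted_phrase(q)]
    # Queries that returned nothing are candidates for content or taxonomy gaps.
    zero_hits = Counter(q.strip().lower() for q, count in entries if count == 0)
    return {
        "total_queries": len(entries),
        "multiword_queries": len(multiword),
        "quoted_share_of_multiword": len(quoted) / len(multiword) if multiword else 0.0,
        "top_zero_result_queries": zero_hits.most_common(top_n),
    }

# Invented sample entries: (query as typed, number of results returned).
SAMPLE_LOG = [
    ("industrial engineering", 1200),
    ('"industrial engineering"', 35),
    ("vacation policy", 0),
    ("Vacation Policy", 0),
    ("engin", 0),
    ("benefits", 80),
]

print(summarize_log(SAMPLE_LOG))
```

Following up on a searcher's next step after a zero or huge result would need session identifiers and timestamps, which this sketch leaves out.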
Two results remain consistent: fewer than 1% of the searchers place a phrase inside quotations, even when there are multiple words; words are often truncated but do not include a truncation symbol (usually an asterisk, “*”). Both reveal a probable lack of understanding of search conventions, a search literacy problem. Here are a couple of possible solutions:
> Put into place better help and training mechanisms to help the lost find their way,
OR
> Remove the legacy practice of forcing command language type symbols on searchers for the most common search requests
Placing punctuation around a search string is a holdover from 30 years ago when searching was done using a command language. Since only a limited number of people ever knew this syntactical format, why does it persist as the default for a phrase search for Web-based search engines?
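To make the distinction concrete, the Python sketch below contrasts matching all words anywhere in a document with matching them as an adjacent phrase. It then shows one hypothetical way a search box could try the phrase first for multi-word input, so that searchers need not type quotation marks. The function names and sample documents are illustrative assumptions, not a description of how any search product behaves.

```python
import re

def tokens(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_all_terms(doc, query):
    """Unquoted behavior: every query word appears somewhere in the document."""
    doc_tokens = set(tokens(doc))
    return all(term in doc_tokens for term in tokens(query))

def matches_phrase(doc, phrase):
    """Quoted behavior: the query words appear adjacent and in order."""
    d, p = tokens(doc), tokens(phrase)
    if not p:
        return False
    return any(d[i:i + len(p)] == p for i in range(len(d) - len(p) + 1))

def search_phrase_first(docs, query):
    """Hypothetical default: prefer phrase matches, fall back to all-terms matches."""
    phrase_hits = [d for d in docs if matches_phrase(d, query)]
    if phrase_hits:
        return phrase_hits
    return [d for d in docs if matches_all_terms(d, query)]

# Invented sample documents.
DOCS = [
    "Careers in industrial engineering",
    "The industrial park hired a software engineering team",
    "Engineering handbook",
]

print([d for d in DOCS if matches_all_terms(d, "industrial engineering")])
print(search_phrase_first(DOCS, "industrial engineering"))
```

A phrase-first default trades recall for precision, so whether it suits a given audience is a design decision to test with real searchers.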
The solution of providing a better help page and getting people to actually use it is a harder proposition. This one from McGraw-Hill for BusinessWeek Online is pretty simple with just seven tips but who reads it? I expect very few, although it could dramatically improve their search results. http://search.businessweek.com/advanced.jsp.
If you are trying to improve the search experience for your intranet, there are two resources to consult for content usability on all fronts, not just search: useit.com, Jakob Nielsen's Website and Jared Spool’s UIEtips, User Interface Engineering’s free email newsletter. In the meantime, think about whether you need to demand more core search usability or tunable default options from vendors, or whether better interface design could guide searchers to better results.
I had a briefing this week from a vendor that is a strong contender for a piece of the enterprise search market. The offering is impressive, other reviews have given it high technical marks, and the pricing model is reasonable. But because I am currently immersed in the deployment of another enterprise search engine with a client, the issue of the vendor-client relationship is foremost in my focus.
I asked the CEO of this relatively new offering, what are the fundamental assumptions his company makes about customer technology environments (e.g. the mix of software applications, hardware environment) and the competencies required to integrate his software with that environment. His answer was given strictly in terms of what the IT staff needs to know to bring the product online. My question did have several levels of complexity and was probably badly phrased but I was trying to make a point by asking it.
There are three specific elements missing from search vendors:
> Documentation or explicit models for deployment in environments where there are numerous technological variables to be considered
> Availability of training that takes into consideration the context for enterprise search in a specific customer’s organization
> Frank discussions with customers that set expectations about deployment and implementation, potential bottlenecks, and the need for experienced searchers, search analysts and subject matter experts on the team with the IT group
Downloading software and using automatic installers has become routine; with the launch of a menu and a few simple clicks on boxes on an administrative screen, vendors can claim “out-of-the-box” functionality. Never mind that what you find when you first search your targeted domain is nonsensical, the software finds “stuff.” The IT guys are happy because it was easy to install, met their architecture requirement and, knowing little about the actual corpus of content, they are satisfied that everything works.
I am in a bit of a pickle with the current project, software from another vendor, because:
> What the documentation says will happen when I make certain choices in the set-up does not, in fact, happen when a search is executed
> My attempts through email and phone to schedule training have gone unanswered
> My messages to the support service citing problems also get no response
I’ve only spent two weeks trying to get this software working but three weeks ago, on a holiday, I got a briefing from two executives from this firm because they were “going to be in the area” and wanted face time with a search analyst. Knowing my role as an analyst and as a client you would think they’d answer my phone calls.
What is it that makes the customer experience so easily ignored? All these products look great in demos; what is under the hood is often technologically wonderful but, boy, getting them to work in my environment always seems to be one long nightmare. I wish I could find out what I really need to know. A terrific search engine might help.
