Curated for content, computing, and digital experience professionals

Category: Enterprise search & search technology (Page 39 of 60)

Research, analysis, and news about enterprise search and search markets, technologies, practices, and strategies, such as semantic search, intranet collaboration and workplace, ecommerce and other applications.

Before we consolidated our blogs, industry veteran Lynda Moulton authored our popular enterprise search blog. This category includes all her posts and other enterprise search news and analysis. Lynda’s loyal readers can find all of Lynda’s posts collected here.

For older, long form reports, papers, and research on these topics see our Resources page.

Will Steve Arnold Scare IT Into Taking Search in the Enterprise Seriously?

Steve Arnold of ArnoldIT struck twice in a big way last week, once as a contributor to the Bear, Stearns & Co. research report on Google and once as a principal speaker at Enterprise Search in New York. I’ve read a copy of the Bear Stearns report, which contains information that should make IT people pay close attention to how they manage searchable enterprise content. I can verify that this blog summary of Steve’s New York speech by Larry Digman sounds like vintage Arnold, to the point and right on it. Steve, not for the first time, is making points that analysts and other search experts routinely observe about the lack of serious infrastructure vested in making content valuable by enhancing its searchability.

First is the Bear Stearns report, summarized for the benefit of government IT folks with admonitions about how to act on the technical guidance it provides in this article by Joab Jackson in GCN. The report’s appearance in the same week as Microsoft’s acquisition of aQuantive is newsworthy in itself. Google really ups the ante with their plans to change the rules for posting content results for Internet searches. If Webmasters actually begin to do more sophisticated content preparation to leverage what Google is calling its Programmable Search Engine (PSE), then results using Google search will continue to be several steps ahead of what Microsoft is currently rolling out. In other words, while Microsoft is making its most expensive acquisition to tweak Internet searching in one area, Google is investing its capital in its own IP development to make search richer in another. Experience looking at large software companies tells me that IP strategically developed to be totally in sync with existing products have a much better chance of quick success in the marketplace than companies that do acquisitions to play catch up. So, even though Microsoft, in an acquiring mode, may find IP to acquire in the semantic search space (and there is a lot out there that hasn’t been commercialized), its ability to absorb and integrate it in time to head off this Google initiative is a real tough proposition. I’m with Bear Stearn’s guidance on this one.

OK, on to Arnold’s comments at Enterprise Search, in which he continues a theme to jolt IT folks. As, already noted, I totally agree that IT in most organizations is loath to call on information search professionals to understand the best ways to exploit search engine adoption for getting good search results. But I am hoping that the economic side of search, Web content management for an organization’s public facing content, may cause a shift. Already, I am experiencing Web content managers who are enlightened about how to make content more findable through good metadata and taxonomy strategies. They have figured out how to make good stuff rise to the top with guidance from outside IT. When sales people complain that their prospects can’t find the company’s products online, it tends to spur marketing folks to adjust their Web content strategies accordingly.

It may take a while, but my observation is that when employees see search working well on their public sites, they begin to push for equal quality search internally. Now that we have Google paying serious attention to metadata for the purpose of giving search results semantic context, maybe the guys in-house will begin to get it, too.

Mapping Search Requirements

Last week I commented on the richness of the search marketplace. However, diversity presents the enterprise buyer with pressure to be more focused on immediate and critical search needs.

The Enterprise Search Summit is being held in New York this week. Two years ago I found it a great place to see the companies offering search products, where I could easily see them all, and still attend every session in two days. This year, 2007, there were over 40 exhibitors, most offering solutions for highly differentiated enterprise search problems. Few of the offerings will serve the end-to-end needs of a large enterprise but many would be sufficient for medium to small organizations. The two major search engine categories used to be Web content keyword searching, and structured searching. Not only is my attention as an analyst being requested by major vendors offering solutions for different types of search but new products are being announced weekly. Newcomers include those describing their products as data mining engines, search and reporting “platforms,” BI intelligence engines, semantic and ontological search engines. This mix challenges me to determine if a product really solves a type of enterprise search problem before I pay attention.

You, on the other hand, need to do another type of analysis before considering specific options. Classifying search categories, taking a faceted approach will help you narrow down the field. Here is a checklist for categorizing what and how content needs to be found:

  • Content types (e.g. HTML pages, PDFs, images)
  • Content repositories (e.g. database applications, content management systems, collaboration applications, file locations)
  • Types of search interfaces and navigation (e.g. simple search box, metadata, taxonomy)
  • Types of search (e.g. keyword, phrase, date, topical navigation)
  • Types of results presentation (e.g. aggregated, federated, normalized, citation)
  • Platforms (e.g. hosted, intranet, desktop)
  • Type of vendor (e.g. search-only, single purpose application with embedded search, software as service – SaS )
  • Amount of content by type
  • Number and type of users by need (personas)

Then use any tools or resources at hand to harvest an understanding of the mapping results to learn who needs what type of content, in what format and its criticality to business requirements. Prioritizing the facets produces a multidimensional view of enterprise search requirements. This will go a long way to narrowing down the vendor list and gives you a tool to keep discussions focused.

There are terrific options in the marketplace and they will only become richer in features and complexity. Your job is to find the most appropriate solution for the business search problem you need to solve today, at a cost that matches your budget. You also want a product that can be implemented rapidly with immediate benefit linking to a real business proposition.

Will Search Technology Ever Become a Commodity?

This week, EMC announced a collaborative research network, with this headline: New EMC Innovation Network to Harness Worldwide Tech Resources, Accelerate Information Infrastructure Innovation. Among the areas that the research network will explore are Semantic Web, search, context, and ontological views.
There is a lot to feed on in this announcement but the most interesting aspect is the juxtaposition with other hardware giants’ forays into the world of document and content search software (e.g. IBM, CISCO), and recent efforts by software leaders Oracle and Microsoft to strengthen their offerings in the area of content and search.

One of the phrases in EMC’s announcement that struck me is the reference to “information infrastructure.” This phrase is used ubiquitously by IT folks to aggregate their hardware and network components with the assumption that because these systems store and transport data, they are information infrastructure. We need to recognize that there are two elements missing from this infrastructure view, skilled knowledge workers (e.g.content structure architects, taxonomists, specialist librarians) and software applications for content authoring, capture, organization, and retrieval. Judging from the language of EMC’s press release this might just be tacit recognition that hardware and networks do not make up an information infrastructure. But those of us in search and content management knew that all along; we don’t need a think tank to show us how the pieces fit together nor even how to innovate to create good infrastructure. Top notch professionals have been doing that for decades. Will this new network really reveal anything new?

EMC does not explicitly announce a plan to make search and information infrastructure product commodities but they do express the desire to build “commercial products” for this market. They have already acquired a few of the software components but have yet to demonstrate a tight integration with the rest of the company. Usually innovation comes from humble roots and grows organically through the sponsorship of a large organization, self-funding or other interested contributors. This effort to lead an innovation community to solutions for information infrastructure has the potential to spawn growth of truly innovative tools, methods and even standards for diverse needs and communities. Alternatively, it may simply be a push to bring a free-wheeling industry of multi-faceted components under central control with the result being tools and products that serve the lowest common denominator users.

From a search point of view, I for one am enjoying the richness of the marketplace and how varied the product offerings are for many specialized needs. For the time being, I remain skeptical that any hardware or software giant can sustain the richness of offerings that get to the heart of particular business search needs in a universal way. Commodity search solutions are a long way off for the community of organizations I encounter.

Siderean and Inxight Federal Systems Announce Partnership to Deliver Relational Navigation to Federal Government

Siderean Software announced that it has entered a reseller agreement with Inxight Federal Systems. Effective immediately, Siderean will be added to Inxight’s GSA-approved price list. Inxight’s software structures unstructured data by “reading” text and extracting important entities, such as people, places and organizations. It also extracts facts and events involving these entities, such as travel events, purchase events, and organizational relationships. Siderean’s Seamark Navigator then builds on this newly structured data, providing an relational navigational interface that allows users to put multi-source content in context to help improve discovery, access and participation across the information flow. Seamark Navigator uses the Resource Description Framework (RDF) and Web Ontology Language (OWL) standards developed by the World Wide Web Consortium (W3C). Siderean’s Seamark Navigator will provide an important add-on to Inxight’s metadata harvesting and extraction solutions. Inxight’s government customers will now be able to leverage Siderean’s relational navigation solutions to access more relevant and timely results derived from the full context and scope of information. As users refine their searches, Siderean dynamically displays additional navigation options and gives users summaries of those items that best match search criteria. Siderean also enables users to illuminate unseen relationships between sets of information and leverage human knowledge to explore information interactively. http://www.siderean.com, http://www.inxightfedsys.com

Turning Around a Bad Enterprise Search Experience

Many organizations have experimented with a number of search engines for their enterprise content. When the search engine is deployed within the bounds of a specific content domain (e.g. a QuickPlace site) the user can assume that the content being searched is within that site. However, an organization’s intranet portal with a free-standing search box comes with a different expectation. Most people assume that search will find content anywhere in the implied domain, and for most of us we believe that all content belonging to that domain (e. g. a company) is searchable.

I find it surprising how many public Web sites for media organizations (publishers) don’t appear to have their site search engines pointing to all the sub-sites indicated in site maps. I know from my experience at client sites that the same is often true for enterprise searching. The reasons are numerous and diverse, commentary for another entry. However, one simple notation under or beside the search box can clarify expectations. A simple link to a “list of searchable content” will underscore the caveat or at least tip the searcher that the content is bounded in some way.

When users in an organization come to expect that they will not find, through their intranet, what they are seeking but know to exist somewhere in the enterprise, they become cynical and distrustful. Having a successful intranet portal is all about building trust and confidence that the search tool really works or “does the job.” Once that trust is broken, new attempts to change the attitudes by deploying a new search engine, increasing the license to include more content, or doing better tuning to return more reliable results is not going to change minds without a lot of communication work to explain the change. I know that the average employee believes that all the content in the organization should be brought together in some form of federated search but now know it isn’t. The result is that they confine themselves to embedded search within specific applications and ignore any option to “search the entire intranet.”

It would be great to see comments from readers who have changed a Web site search experience from a bad scene to one with a positive traffic gain with better search results. Let us know how you did it so we can all learn.

The FAST acquisition of Convera

It has been a couple of weeks since the announcement that Fast Search & Transfer would acquire Convera’s RetrievalWare, a search technology built on the foundation of Excalibur and widely used in government enterprises.

At a recent Boston KM Forum meeting I asked Hadley Reynolds, VP & Director of the Center for Search Innovation at Fast, to comment on the acquisition. He indicated Fast’s interest in building up a stronger presence in the government sector, a difficulty for a Norwegian-based company. I remember Fast as a company launching in the U.S. with great fanfare in 2002 (http://newsbreaks.infotoday.com/nbreader.asp?ArticleID=17223 ) to support FirstGov.gov, a portal to multi-agency content of the U.S. Government. That site has recently been re-launched as http://www.usa.gov/ using the Vivisimo search portal. There must be a story behind the story, as I hope to learn.

To add to the discussion, last week I moderated a session at the Gilbane San Francisco conference at which Helen Mitchell, Senior Search Strategist for Credo Systems and Workgroup Chairperson for the Convera User Group, spoke. I asked Helen before the program about her reaction to the recent announcement. She had already been in contact with Fast and received assurances that Convera Federal Users would be well supported by Fast and they want to actively participate in conversations with the group through on-line and in-person meetings. Helen was positive about the potential for RetrievalWare users gaining from the best of Fast technology while still being supported with the unique capabilities of Convera’s semantic, faceted search.

Erik Schwartz, Director of Product Management from Convera, was also present; I encouraged him and Helen to leverage the RetrievalWare user community to make sure Fast really understands the unique and diverse needs of search within the enterprise. We are all well aware that in the rush to build up large customer bases with a solid revenue stream of maintenance, vendors are likely to sacrifice unique technologies that are highly valued by customers. A bottom-line round of pragmatic cost cutting usually determines what R&D a vendor will fund, foregoing the long term good will that could accrue if they would belly-up to integrating these unique features into their own platform.

Time will tell how serious Fast is in giving its new base a truly valuable customer experience. I would also note that this acquisition has also been observed by a broader information management industry publication, Information Week. See David Gardner’s article at http://www.informationweek.com/news/showArticle.jhtml?articleID=198701793.

Search Help and Usability

Preparing for two upcoming meetings with search themes (Gilbane San Francisco and Boston KM Forum) has brought to mind many issues of search usability. At the core is the issue of search literacy. Offering some fundamental searching tips to non-professional searchers often results in a surprised reaction. (e.g. When told, if seeking information about a specific topic such as “industrial engineering,” enclose it in quotes to limit the search to that phrase. Without quotes, you will get all content with “industrial” and “engineering” anywhere in the content with no explicit relationship implied.)

If you are reading this you probably know that, but many do not. In order to learn what people search for on their company intranet and how they type their search requests, I spend time reading search log files. I do this for several reasons:

  • To learn terminology searchers are using to guide taxonomy building choices
  • To see the way searches are formulated, and followed up
  • To inform design decisions about how to make searching easier
  • To see what is searched but not found to inform future content inclusion
  • To view the searcher’s next step when the results are zero or huge

wo results remain consistent: less than 1% of the searchers place a phrase inside quotations, even when there are multiple words; word are often truncated but do not include a truncation symbol (usually an asterisk, “*”). Both reveal a probable lack of search conventions understanding, a search literacy problem. Here are a couple of possible solutions:

  • Put into place better help and training mechanisms to help the lost find their way,

OR

  • Remove the legacy practice of forcing command language type symbols on searchers for the most common search requests

Placing punctuation around a search string is a holdover from 30 years ago when searching was done using a command language. Since only a limited number of people ever knew this syntactical format, why does it persist as the default for a phrase search for Web-based search engines?

The solution of providing a better help page and getting people to actually use it is a harder proposition. This one from McGraw-Hill for BusinessWeek Online is pretty simple with just seven tips but who reads it? I expect very few, although it could dramatically improve their search results. http://search.businessweek.com/advanced.jsp.

If you are trying to improve the search experience for your intranet, there are two resources to consult for content usability on all fronts, not just search: useit.com, Jakob Nielsen’s Website and Jared Spool’s UIEtips, User Interface Engineering’s free email newsletter. In the meantime, think about whether you need to demand more core search usability or tunable default options from vendors, or whether better interface design could guide searchers to better results.

« Older posts Newer posts »

© 2024 The Gilbane Advisor

Theme by Anders NorenUp ↑