Curated for content, computing, and digital experience professionals

Day: February 24, 2009

Federated Search: Setting User Expectations

In the past few months, I have rarely been briefed on an enterprise search product without hearing a claim to provide “federated search.” Having worked with the original ANSI standard, Z39.50, and served on one of its many review committees back in the early 1990s, I find that the topic always catches my attention.

Some of the history of search federation is described in this rather sketchy article at Wikipedia. However, I want to clarify the original call for such a standard. It comes from the days when public access to search technologies was available primarily through on-line catalogs in public and academic institutional libraries. Users demanded the ability to search not only their local library system and network (e.g. a university often standardized on one library system to include all the holdings of its own libraries), but also the holdings of other universities or major public libraries. The problem was that the data structures and protocols varied from one library system product to the next in ways that made it difficult for the search engine of one system to penetrate the database of records in another. Records might have been meta-tagged similarly, but the way the metadata were indexed meant that they were not accessible to another system’s retrieval algorithms without a translating layer between systems. Thus, the Z39.50 standard was established, originally to let a user of one library system search from that system into the contents of other libraries with different systems.

Ideally, results were presented to the searcher in a uniform citation format, organized to help the user easily recognize duplicated records, each marked with location and availability. In practice, the results presentation was usually so esoteric that only librarians and research scholars could readily interpret it.

Now we live in a digitized content environment in which the dissimilarities across content management systems, content repositories, publishers’ databases, and library catalogs have increased a hundredfold. The need for federating or translation layers to bring order to this metadata or metadata-less chaos has only become stronger. The ANSI standard is largely ignored by content platform vendors, thus leaving the federating solution to non-embedded search products. If you are buying search, you must do deep testing to determine whether the enterprise search engine you are acquiring actually stands up well under a load of retrieving across numerous disparate repositories. And you need a very astute and experienced searcher, with expert familiarity with the content in all the repositories, to evaluate the engine’s suitability for the circumstances in which it will be used.

So, let’s just recap what you need to know before you select and license a product claiming to support what you expect from search federation:

  • Federated search is a process for retrieving content either serially or concurrently from multiple targeted sources that are indexed separately, and then presenting results in a unified display. You can imagine that there will be a huge variation in how well those claims might be satisfied.
  • Federation is an expansion of the concept of content aggregation. It comes into play in a multi-domain environment of only internal sites OR a mix of internal and external sites that might include the deep (hidden) web. Across multiple domains, complete federation supports at least four distinct functions:
    • Integration of the results from a number of targeted searchable domains, each with its own search engine
    • Disambiguation of content results when similar but non-identical pieces of content might be included
    • Normalization of search results so that content from different domains is presented similarly
    • Consolidation of the search operation (standardizing a query to each of the target search engines) and standardizing the results so they appear to be coming from a single search operation
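
To make the four functions concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two "repositories" are stand-in functions returning canned results, the `Hit` record is a hypothetical common format, and matching titles case-insensitively is a deliberately crude form of disambiguation. A real federating layer would do far more.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """A normalized result record (hypothetical common format)."""
    title: str
    source: str
    score: float


# Hypothetical stand-ins for two separately indexed repositories.
# Each returns results in its own native shape.
def search_catalog(query):
    return [{"Title": "Annual Report 2008", "rank": 0.9},
            {"Title": "Search Strategy", "rank": 0.4}]


def search_wiki(query):
    return [("annual report 2008 ", 72), ("Federation Notes", 55)]


# Normalization: map each native shape onto the common Hit record.
def normalize_catalog(raw):
    return [Hit(r["Title"].strip(), "catalog", r["rank"]) for r in raw]


def normalize_wiki(raw):
    # This source scores on a 0-100 scale; rescale to 0-1.
    return [Hit(title.strip(), "wiki", score / 100.0) for title, score in raw]


SOURCES = [(search_catalog, normalize_catalog), (search_wiki, normalize_wiki)]


def federated_search(query):
    # Consolidation: one query is sent concurrently to every source.
    with ThreadPoolExecutor() as pool:
        futures = [(pool.submit(search, query), norm) for search, norm in SOURCES]
        hits = [hit for future, norm in futures for hit in norm(future.result())]

    # Disambiguation (crude): treat titles that match case-insensitively
    # as the same item and keep the highest-scoring copy.
    best = {}
    for hit in hits:
        key = hit.title.lower()
        if key not in best or hit.score > best[key].score:
            best[key] = hit

    # Integration: one merged list, ranked by normalized score.
    return sorted(best.values(), key=lambda h: h.score, reverse=True)


results = federated_search("annual report")
for hit in results:
    print(f"{hit.score:.2f}  {hit.title}  [{hit.source}]")
```

Even in this toy version, the hard questions show up quickly: how scores from different engines are made comparable, and when two records count as "the same" item. Those are exactly the points to probe when testing a vendor's federation claims.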

In order to do this effectively and cleanly, the federating layer of software, which probably comes from a third party such as MuseGlobal, must have “connectors” that recognize the structures of all the repositories that will be targeted from the “home” search engine.
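
The connector idea can be sketched as a small adapter interface. This is only an illustration under stated assumptions: the `Connector` class, its `translate` method, the two target query syntaxes, and the `intranet.example` URL are all hypothetical, and none of it represents how MuseGlobal or any other vendor actually builds connectors.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlencode


class Connector(ABC):
    """Hypothetical adapter between the home engine and one repository."""

    @abstractmethod
    def translate(self, terms):
        """Turn a list of search terms into the repository's native query."""


class KeywordUrlConnector(Connector):
    # Assumed target: a web search form that takes a single 'q' parameter.
    def __init__(self, base_url):
        self.base_url = base_url

    def translate(self, terms):
        return self.base_url + "?" + urlencode({"q": " ".join(terms)})


class FieldedConnector(Connector):
    # Assumed target: a catalog that expects fielded Boolean syntax.
    def __init__(self, field):
        self.field = field

    def translate(self, terms):
        return " AND ".join(f'{self.field}="{t}"' for t in terms)


connectors = [
    KeywordUrlConnector("http://intranet.example/search"),
    FieldedConnector("title"),
]
queries = [c.translate(["federated", "search"]) for c in connectors]
for q in queries:
    print(q)
```

The same two search terms come out as a URL query string for one target and a fielded Boolean expression for the other. Each additional repository means another connector to write and maintain, which is why connector coverage is worth asking about before licensing.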

Why is this relevant? In short, because users expect that the results they are looking at represent all the content from all the repositories they believed they were searching, in a format that makes sense to them. It is a very tall order for any search system to do this, but when enterprise information managers are trying to meet a business manager’s or executive’s lofty expectations, anything less is viewed as the failure of enterprise search. Otherwise, they had better set expectations lower.

Ingres Launches Open Source Enterprise Content Management Offering

Ingres Corporation announced the availability of the Ingres Icebreaker Enterprise Content Management (ECM) Appliance. Powered by Alfresco’s open source alternative software for enterprise content management, the Ingres Icebreaker ECM Appliance gives businesses a way to manage business content growth. Like other commercial open source solutions, the Ingres Icebreaker ECM Appliance lets IT purchasers pay only for the software and support they actually need. For the Ingres Icebreaker ECM Appliance, Ingres provides the open source database for a company’s advanced data repository needs, Alfresco provides the content management expertise, and their technology runs on the Ingres Database. It is an appliance that allows developers to bring two open technologies together on the open source Linux operating system. The Ingres Icebreaker ECM Appliance integrates the operating system, the database, and the ECM technology and is installed as a unit, managed as a unit, and maintained as a unit. The Ingres Icebreaker ECM appliance allows users to capture, store, and preserve data, assist in the management of data, and deliver data to customers and partners. To download the Ingres Icebreaker ECM appliance today, please go to http://esd.ingres.com, http://www.ingres.com

 

2009 Gilbane San Francisco Conference to Focus on Business Impact of Global Content Management and Social Media

One of the most closely watched technology trends for 2009 is the need for enterprises to leverage new web platforms while holding down the costs of managing their ever-proliferating content. Even in the midst of the current economic downturn, critical areas related to global business content will see double-digit growth over the next 12 months – including spending on search technologies, which is projected to represent almost half of all digital spending by businesses in 2009, along with rising corporate investment in social media tools (Source: Winterberry Group). Reflecting these fundamental shifts in the way enterprises engage with customers and disseminate information, the sixth annual Gilbane San Francisco conference, to be held June 2-4, 2009 at the Westin Hotel in San Francisco, will focus on the timely theme “Where Content Management Meets Social Media.” Produced by The Gilbane Group and Lighthouse Seminars, the 2009 Gilbane San Francisco conference will offer enhanced programs tied to the business issues surrounding a company’s marketing, technical and enterprise content. The conference tracks have been organized around the four major areas of how enterprises use Web and content technologies and where they are most likely to invest: Web Business & Engagement; Managing Collaboration & Social Media; Enterprise Content: Searching, Integrating & Publishing; and Content Infrastructure. Gilbane San Francisco brings together industry experts from leading technology, enterprise IT, analyst, and consulting firms who provide attendees with the latest successful content management and new media strategies, technologies and techniques. In addition to the latest best practices, technology coverage within these four tracks will include enterprise and site search; content globalization; semantic technologies; publishing; XML; and social media tools and platforms from Twitter to business blogs, project wikis and microformats.
The just-published schedule of conference sessions can be viewed at http://gilbanesf.com/conference-schedule.html.

Pre-conference workshops will feature industry thought leaders covering core topics in web content, new media, Sharepoint and more — the full schedule of workshops can be found at http://gilbanesf.com/workshops.html. IT and business professionals involved in content creation, management, delivery or analytics wishing to attend the conference may register at: http://gilbanesf.com/registration_information.html. Technology solution providers wishing to exhibit or sponsor should visit: http://gilbanesf.com/exhibitors_sponsors.html. Follow the conference on Twitter:
http://twitter.com/gilbanesf

 

© 2020 The Gilbane Advisor
