Curated for content, computing, data, information, and digital experience professionals

Category: Enterprise search & search technology

Research, analysis, and news about enterprise search and search markets, technologies, practices, and strategies, such as semantic search, intranet collaboration and workplace, ecommerce and other applications.

Before we consolidated our blogs, industry veteran Lynda Moulton authored our popular enterprise search blog. This category includes all her posts and other enterprise search news and analysis. Lynda’s loyal readers can find all of Lynda’s posts collected here.

For older, long form reports, papers, and research on these topics see our Resources page.

Search Industry in 2010

Just in from InformationWeek is this article (Exclusive: IBM Reorganizes Software Group) that prompted me to launch 2010 with some thoughts on where we are heading with enterprise search this year. When IBM does something dramatic, it impacts the industry because it makes others react.

I don’t make forecasts or try to guess whether strategic changes will succeed or fail, but a couple of years ago I blogged on IBM’s introduction of Yahoo OmniFind, a free offering, and then followed up with these comments just a few months ago. IBM’s moves make its competitors change course and try to outsmart, outguess, or copy it, just as changes at Microsoft or Google cause ripples in the industry.

Meanwhile, OpenText, another large software company with search offerings, is not going to offer search outside of its other product suites. [More is likely to come out after the scheduled analyst meetings today but I’m not there and can’t brief you on deeper intent.] We have recently seen an announcement about FAST being delivered with new SharePoint offerings, the first major release of FAST announced since Microsoft acquired them almost two years ago. While FAST is still available as a standalone product from MS, it and other search engines may be steadily moving into being embedded in suites by their acquirers.

Certainly IBM has a lot of search components that they have acquired, so continuing to bind with other content offerings is a probable strategy. Oracle and Autonomy may soon come up with similar suite offerings embedding search once again. Oracle SES (Secure Enterprise Search) does not appear to have a lot of traction and it’s possible that supporting pure search offerings may be a burden for Autonomy with its stable of many acquired content products.

All of this leads me to think that, since enterprise search has gotten such a bad reputation as a failed technology, the big software houses are going to bury it in point solutions. Personally, I don’t believe that enterprise search is a failed strategy, and SMBs can still find search engines that will serve the majority of their enterprise needs for several years to come. The same holds true for divisions or groups within large corporations.

Guidance: select and adopt one or more search solutions that fit your budget for small scale needs, point solutions and enterprise content that everyone in the organization needs to access on a regular basis. Learn how these products work and what they can and cannot deliver, making incremental adjustments as needs change and evolve. Do not install and think you are done, because you will never be done. Cultivate a few search experts to stick with the evolving landscape and give them the means to keep up with its changes. It is going to keep morphing for a long time to come.

Contegra and dtSearch Announce Faceted Search for dtSearch

Contegra Systems and dtSearch announced a faceted search add-on for dtSearch Developer Customers. Faceted search enables dynamic filtering of search results by attributes. Built on the dtSearch Engine APIs, Contegra Systems’ Kaleido Search now makes faceted search available to content-rich applications and e-commerce sites. Kaleido Search offers the ability to group search results by facet, the ability to “expand and collapse” facet selections, on-demand summaries of selected facets, and more. Kaleido Search enables these faceted search features in the context of a comprehensive solution for online data access that is customizable to suit any site. The dtSearch Engine can index over a terabyte of data in a single index, as well as create and instantly search an unlimited number of indexes. The software offers more than 25 search options, including Unicode support covering hundreds of international languages. Proprietary file format support highlights hits in popular file types. A built-in Spider supports searching of local and remote, public and secure, dynamic and static web data, with WYSIWYG hit-highlighted displays. The dtSearch Engine API supports .NET, Java, C++, SQL, etc., including native 64-bit Windows/Linux support. http://www.contegrasystems.com, http://www.dtsearch.com
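
For readers new to the idea, the two faceted search operations described above (grouping results by facet and filtering on a selected facet value) can be sketched in a few lines of Python. This is only an illustration of the general technique, not the Kaleido Search or dtSearch Engine API; the sample hits, the field names, and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical search hits; the records and field names are made up for illustration.
HITS = [
    {"title": "Annual report 2008", "type": "pdf", "year": 2008},
    {"title": "Annual report 2009", "type": "pdf", "year": 2009},
    {"title": "Pricing sheet", "type": "xls", "year": 2009},
    {"title": "Press release", "type": "html", "year": 2009},
]


def facet_counts(hits, field):
    """Count how many hits fall under each value of one facet field."""
    return Counter(hit[field] for hit in hits)


def filter_by_facets(hits, selections):
    """Keep only the hits that match every selected facet value."""
    return [
        hit for hit in hits
        if all(hit.get(field) == value for field, value in selections.items())
    ]


# Show the facet summary for all hits, then again after the user picks year 2009.
narrowed = filter_by_facets(HITS, {"year": 2009})
print(facet_counts(HITS, "type"))
print(facet_counts(narrowed, "type"))
```

A production engine would typically compute these counts from its index rather than by scanning result records, but the user-facing behavior is the same: each selection narrows the result set and the facet summaries are recomputed.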

Perst Embedded Database Integrated with Jease Content Management Framework

Jease, a content management framework based on open source Java technologies, has added support for the Perst object-oriented, open source embedded database system from McObject. When used with Jease, Perst becomes the persistence engine for highly customized, content- and database-driven Web applications that leverage the productivity and efficiency of working with “plain old Java objects” (POJOs). Jease (the name combines “Java” and “ease”) provides building blocks for developers with even a little Java experience to assemble Web applications tailored to specific needs. The goal of Jease is to offer a flexible content management framework rather than a full-blown content management system. Other open source software components used by Jease include Apache Lucene for full-text indexing and search, and the ZK Ajax + Mobile Java framework. Perst and Perst Lite are part of McObject’s family of small footprint, high performance embedded database software products. The eXtremeDB in-memory embedded database from McObject is used in devices including MP3 players, industrial automation solutions, digital TVs, telecom/network communications equipment and military/aerospace technology. Perst is available for Java and .NET, including Java ME and .NET Compact Framework. http://www.jease.org/, http://www.mcobject.com

In the end, good search may depend on good source.

As the world of search becomes more and more sophisticated (and that process has been underway for decades), we may be approaching the limits of what software alone can do to find what a searcher wants. If that is true, and I suspect that it is, we will finally be forced to follow the trail of crumbs up the content life cycle… to its source.

Indeed, most of the challenges inherent in today’s search strategy and products appear to grow from the fact that while we continually increase our demands for intelligence on the back end, we have done little if anything to address the chaos that exists on the front end. You name it: different word processing formats, spreadsheets, HTML tagged text, database delimited files, and so on are all dumped into what we think of as a coherent, easily searchable body of intellectual property. It isn’t, and it isn’t likely to become so any time soon unless we address the source.

Having spent some time in the library automation world, I can remember the sometimes bitter controversies over having just two major foundations for cataloging source material (Dewey and LC; add a third if you include the NICEM A/V scheme). Had we known back then that the process of finding intellectual property would devolve into the chaos we now confront, with every search engine and database product essentially rolling its own approach to rational search, we would have considered ourselves blessed. In the end, it seems, we must begin to see the source material, its physical formats, its logical organization and its inclusion of rational cataloging and taxonomy elements as the conceptual raw material for its own location.

As long as the word processing world teaches that anyone creating anything can make it look like it should in a dozen different ways, ignoring any semblance of finding-aid inclusion, we probably won’t have a truly workable ability to find what we want without reworking the content or wading through a haystack of misses to find our desired hits.

Unfortunately, the solutions of yesteryear, including after-creation cataloging by a professional cataloger, probably won’t work now either, for cost if no other reason. We will be forced to approach the creators of valuable content, asking them for a minimum of preparation for searching their product, and providing the necessary software tools to make that possible.

We can’t act too soon because, despite the growth of software elegance and raw computer power, this situation will likely get worse as the sheer volume of valuable content grows.

Regards, Barry

Read more: Enterprise Search Practice Blog: https://gilbane.com/search_blog/

W3C Publishes Drafts of XQuery 1.1, XPath 2.1

The World Wide Web Consortium (W3C) has published new Drafts of XQuery 1.1, XPath 2.1 and Supporting Documents. As part of work on XSLT 2.1 and XQuery 1.1, the XQuery and XSL Working Groups have published First Public Working Drafts of “XQuery and XPath Data Model 1.1,” “XPath and XQuery Functions and Operators 1.1,” “XSLT and XQuery Serialization 1.1” and “XPath 2.1.” In addition, the XQuery Working Group has updated drafts for “XQuery 1.1: An XML Query Language,” “XQueryX 1.1” and “XQuery 1.1 Requirements.” http://www.w3.org/News/2009#entry-8682

TEMIS Unveils Luxid Content Pipeline

TEMIS announced the launch of Luxid Content Pipeline, a new content collection module integrated within the latest version of its content discovery solution, Luxid 5.1. This platform collects content from a range of information sources and feeds it into Luxid. After annotating content with relevant metadata, Luxid then applies search, discovery and sharing tools to the enriched content and provides users with content analytics and knowledge discovery. Luxid Content Pipeline accesses content by three different methods. Structured Access connects and automates the collection of documents from structured content sources such as Dialog, DataStar, ISI Web of Knowledge, Ovid, STN, Questel, EBSCOhost, Factiva, LexisNexis, MicroPatent, Scopus, ScienceDirect, Minesoft, Esp@cenet, and PubMed. Enterprise Content Management Access connects to corporate knowledge repositories such as EMC Documentum, EMC Documentum CenterStage and Microsoft Office SharePoint Server. Finally, to be as compatible as possible with a wide variety of document sources, Luxid Content Pipeline also supports the integration of UIMA (Unstructured Information Management Architecture) collection readers, enabling the connection to these sources using UIMA standard protocol and format conversion. http://www.temis.com/

Layering Technologies to Support the Enterprise with Semantic Search

Semantic search is a composite beast, like many enterprise software applications. Most packages are made up of multiple technology components, often from multiple vendors. This raises some interesting thoughts as we prepare for Gilbane Boston 2009, to be held this week.

As part of a panel on semantic search, moderated by Hadley Reynolds of IDC, with Jeff Fried of Microsoft and Chris Lamb of the OpenCalais Initiative at Thomson Reuters, I wanted to give a high level view of semantic technologies currently in the marketplace. I contacted about a dozen vendors and selected six to highlight for the variety of semantic search offerings and business models.

One case study involves three vendors, each with a piece of the ultimate, customer-facing product. My research took me to one company that I had reviewed a couple of years ago, and they sent me to their “customer” and to the customer’s customer. It took me a couple of conversations and emails to sort out the connections; in the end the relationships made perfect sense.

On one hand we have conglomerate software companies offering “solutions” to every imaginable enterprise business need. On the other, we see unique, specialized point solutions to universal business problems with multiple dimensions and twists. Teaming by vendors, each with a solution to one dimension of a need, creates compound product offerings that are adding up to a very large semantic search marketplace.

Consider an example of data gathering by a professional services firm. Let’s assume that my company has tens of thousands of documents collected in the course of research for many clients over many years. Researchers may move on to greater responsibility or other firms, leaving content unorganized except around confidential work for individual clients. We now want to exploit this corpus of content to create new products or services for various vertical markets. To understand what we have, we need to mine the content for themes and concepts.

The product of the mining exercise may have multiple uses: helping us create a taxonomy of controlled terms, preparing a navigation scheme for a content portal, or providing a feed to business or text analytics tools that will help us create visual objects reflecting various configurations of content. A text mining vendor may be great at the mining aspect while other firms have better tools for analyzing, organizing and re-shaping the output.

Doing business with two or three vendors, experts in their own niches, may help us reach a conclusion about what to do with our information-rich pile of documents much faster. A multi-faceted approach can be a good way to bring a product or service to market more quickly than if we struggle with generic products from just one company.

When partners each have something of value to contribute, together they offer the benefits of the best of all options. This results in a new problem for businesses looking for the best in each area, namely, vendor relationship management. But it also saves organizations from dealing with huge firms offering many acquired products that have to be managed through a single point of contact, a generalist in everything and a specialist in nothing. Either way, you have to manage the players and how the components are going to work for you.

I really like what I see: semantic technology companies partnering with each other to give good-to-great solutions for all kinds of innovative applications. By the way, at the conference I am doing a quick snapshot on each: Cogito, Connotate (with Cormine and WorldTech), Lexalytics, Linguamatics, Sinequa and TEMIS.

Nstein Technologies Launches Semantic Site Search

Nstein Technologies Inc. announced the release of a new product, Semantic Site Search (3S). 3S leverages Nstein’s text-mining technology to power a faceted site search which returns results that are organized categorically. 3S can ingest content from many different indices from many different web publishing platforms, meaning it indexes material across multiple properties. 3S’ embedded Text Mining Engine (TME) identifies concepts, categories, proper names, places, organizations, sentiment and topics in particular content pieces and then annotates those documents, creating a semantic fingerprint that exposes underlying nuances and meaning in content. 3S also boasts a visual interface that is designed to allow administrators to tweak search sensitivity algorithms without having to modify hard code. 3S comes bundled with front-end widgets which could be used to point users to “similar content”, “most recent content”, or other identifying characteristics of content that one wants to promote. http://www.nstein.com

© 2025 The Gilbane Advisor