Curated for content, computing, and digital experience professionals

Category: Enterprise search & search technology (Page 27 of 61)

Research, analysis, and news about enterprise search and search markets, technologies, practices, and strategies, covering semantic search, intranet, collaboration and workplace, ecommerce, and other applications.

Before we consolidated our blogs, industry veteran Lynda Moulton authored our popular enterprise search blog. This category includes all her posts and other enterprise search news and analysis. Lynda’s loyal readers can find all of Lynda’s posts collected here.

For older, long form reports, papers, and research on these topics see our Resources page.

It Takes Work to Get Good-to-Great Enterprise Search

It takes patience, knowledge, and analysis to tell when search is really working. For the past few years I have seen a trend away from doing any “dog work” to get search solutions tweaked and tuned to ensure compliance with genuine business needs. People get cut, budgets get sliced, and projects get dumped because (fill in the excuse), and the message gets promoted that “enterprise search doesn’t work.” Here’s the secret: when enterprise search doesn’t work, chances are it’s because people aren’t working on what needs to be done. Everyone is looking for a quick fix, a shortcut, a “no thinking required” solution.

This plays out in countless variations but the bottom line is that impatience with human processing time and the assumption that a search engine “ought to be able to” solve this problem without human intervention cripple possibilities for success faster than anything else.

It is time for search implementation teams to get realistic about the tasks that must be executed and the milestones to be reached. Teams must know how they are going to measure success and reliability, then stick with it, demanding that everyone agree on the requirements before throwing in the towel at the first executive anecdote that the “dang thing doesn’t work.”

There are a lot of steps to getting even an out-of-the-box solution working well. But none is more important than paying attention to these:

  • Know your content
  • Know your search audience
  • Know what needs to be found and how it will be looked for
  • Know what is not being found that should be

The operative verb here is know, and really knowing anything takes work: brain work; iterative, analytical, and thoughtful work. When I see IT react to a search query that returns any results at all with “we’re done,” or “no error messages, good,” or “all these returns satisfy the query,” my reaction is:

  • How do you know the search engine was really looking in all the places it should?
  • What would your search audience be likely to look for and how would they look?
  • Who is checking to make sure these questions are being answered correctly?
  • How do you know if the results are complete and comprehensive?

It is the last question that takes digging and perseverance. It is pretty simple to look at search results and see content that should not have been retrieved and figure out why it was. Then you can tune to make sure it does not happen again.

To make sure you didn’t miss something takes systematic “dog work,” and you have to know the content. This means starting with a small body of content that you can know thoroughly. Begin with content representative of what your most valued search audience would want to find. Presumably, you have identified these people through establishing a clear business case for enterprise search. (This is not something for the IT department to do but for the business team that is vested in having search work for their goals.) Get these “alpha worker” searchers to show you how they would go about trying to find the material they need to get their work done every day, and to share some of what they consider the most valuable documents they have worked with over the past few years. (Yes, years – you need to work with veterans of the organization whose value is well established, as well as with legacy content that is still valuable.)

Confirm that these seminal documents are in the path of the search engine for the index build; see what is retrieved when they are searched for by the seekers. Keep verifying by looking at both content and results to be sure that nothing is coming back that shouldn’t and that nothing is being missed. Then double the content with documents on similar topics that were not given to you by the searchers, even material that they likely would never have seen that might be formatted very differently, written by different authors, and more variable in type and size but still relevant. Re-run the exact searches that were done originally and see what is retrieved. Repeat in scaling increments and validate at every point. When you reach points where content is missing from results that should have been found using the searcher’s method, analyze, adjust, and repeat.
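The scale-and-validate loop above can be sketched as a known-item recall check: pair each seeker’s query with the documents it must return, then re-run the whole set after every content increment and flag anything missed. Everything below is a toy stand-in, not a real engine; the `search` callable, corpus, and expectations map are all hypothetical.

```python
# Minimal known-item recall check, assuming a search(query) callable that
# returns document IDs from your engine (a toy in-memory engine is used here).

def recall_check(search, expected):
    """For each (query -> required doc IDs) pair, report expected documents
    missing from the results -- the 'what is not being found' question."""
    misses = {}
    for query, doc_ids in expected.items():
        returned = set(search(query))
        missing = set(doc_ids) - returned
        if missing:
            misses[query] = sorted(missing)
    return misses

# Toy corpus and engine standing in for the real index.
corpus = {
    "doc1": "quarterly invoice for widget assembly",
    "doc2": "engineering specification widget tolerances",
    "doc3": "travel policy memo",
}

def toy_search(query):
    terms = query.lower().split()
    return [d for d, text in corpus.items()
            if all(t in text for t in terms)]

expected = {"widget": ["doc1", "doc2"], "invoice widget": ["doc1"]}
print(recall_check(toy_search, expected))  # prints {} -- nothing was missed
```

After doubling the content, add the new must-find documents to `expected` and re-run; a non-empty result is the signal to analyze, adjust, and repeat.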

A recent project showed me how willing testers are to accept mediocre results once it becomes apparent how closely content must be scrutinized and peeled back to determine its relevance. They had no time for that and did not care how bad the results were because they had a pre-defined deadline. Adjustments may call for refinements in the query formulation, perhaps through an API to make it more explicit, or the addition of better category metadata with rich cross-references to cover vocabulary variations. Too often this type of implementation discovery signals a reason to shut down the project because all options require human resources and more time. Before you begin, know that this level of scrutiny will be necessary to deliver good-to-great results; set that expectation with your team and management so that the additional work needed to get it right will be acceptable when adjustments are called for. Just don’t blame it on the search engine – get to work, analyze, and fix the problem. Only then can you let search loose on your top target audience.

Lucid Imagination and ISYS Partner on Lucene/Solr

Lucid Imagination and ISYS Search Software announced a strategic partnership. The agreement enables Lucid Imagination to provide solutions that combine its core Lucene and Solr expertise with the ISYS File Readers document filtering technology. The flexibility of the architecture allows enterprises to develop sophisticated purpose-built search solutions. By offering ISYS File Readers as part of its Lucene/Solr solutions, Lucid Imagination gives users and developers out-of-the-box capability to find and extract virtually all of the content and formats that exist in their enterprise environment. The Lucid Imagination web site serves as a knowledge portal for the Lucene community, with a wide range of information, resources, and an information retrieval application, LucidFind, to help developers and search professionals access the information they need to design, build, and deploy Lucene- and Solr-based solutions. http://www.lucidimagination.com
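The filter-then-index architecture described here — format-specific file readers extract plain text, which the search engine then tokenizes and indexes — can be sketched roughly as follows. The filter registry and tiny inverted index are illustrative stand-ins, not the ISYS or Solr APIs.

```python
# Sketch of the filter-then-index pattern: per-format "readers" turn raw
# content into plain text, and the indexer only ever sees extracted text.
from collections import defaultdict

FILTERS = {
    ".txt": lambda raw: raw,                    # plain text passes through
    ".csv": lambda raw: raw.replace(",", " "),  # crude: cells become words
}

def extract_text(filename, raw):
    """Route a file to the reader for its format; unknown formats yield nothing."""
    ext = filename[filename.rfind("."):]
    reader = FILTERS.get(ext)
    return reader(raw) if reader else ""

def build_index(files):
    """Tiny inverted index over extracted text: term -> set of filenames."""
    index = defaultdict(set)
    for name, raw in files.items():
        for term in extract_text(name, raw).lower().split():
            index[term].add(name)
    return index

index = build_index({"a.txt": "solr search", "b.csv": "lucene,search"})
print(sorted(index["search"]))  # prints ['a.txt', 'b.csv']
```

The point of the partnership is the breadth of the `FILTERS` table: a format with no reader is invisible to search, no matter how good the engine is.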

MuseGlobal and Specialty Systems Partner

MuseGlobal announced a partnership with Specialty Systems, Inc., a company focused on innovative information systems solutions for Federal, State, and Local Government customers. Specialty Systems, Inc. is partnering with MuseGlobal to provide the systems integration expertise to engineer law enforcement and homeland security applications built on MuseGlobal’s MuseConnect, which provides federated search and harvesting technologies with a library of more than 6,000 pre-built source connectors. The applications resulting from this partnership will incorporate unified information access, allowing structured data from database sources; semi-structured data from spreadsheets, forms, and XML sources; unstructured data from web sites, documents, and email; and rich media such as images, video, and audio to be accessed simultaneously from internal databases and external sources. This information is gathered on the fly and unified for immediate presentation to the requestor. http://www.specialtysystems.com, http://www.museglobal.com
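The on-the-fly gathering and unification described above can be sketched as a fan-out-and-merge loop: send the query to every source connector, then merge the results into one deduplicated list. The connectors below are hypothetical stand-ins for MuseConnect’s pre-built source connectors.

```python
# Sketch of federated search unification: fan a query out to several
# source connectors and unify the hits, dropping duplicates by ID.

def db_connector(query):   # stand-in for a structured (database) source
    return [{"id": "db:42", "title": "Incident report 42"}]

def web_connector(query):  # stand-in for an unstructured (web) source
    return [{"id": "web:7", "title": "Agency bulletin"},
            {"id": "db:42", "title": "Incident report 42"}]  # duplicate hit

def federated_search(query, connectors):
    """Query every connector, then unify results in arrival order,
    keeping only the first occurrence of each document ID."""
    seen, unified = set(), []
    for connector in connectors:
        for hit in connector(query):
            if hit["id"] not in seen:
                seen.add(hit["id"])
                unified.append(hit)
    return unified

results = federated_search("incident", [db_connector, web_connector])
print([hit["id"] for hit in results])  # prints ['db:42', 'web:7']
```

A production system would query connectors concurrently and normalize source-specific fields before merging, but the unify-and-dedupe step is the heart of the pattern.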

SDL Tridion Integrates Q-go Natural Language Search into Web Content Management

SDL Tridion announced that it has partnered with Q-go to provide an integrated Natural Language Search engine within SDL Tridion’s web content management platform. The solution provides the online search environment within websites with only targeted and relevant search results. Q-go’s Natural Language Search is now accessible from within the SDL Tridion web content management environment. Content editors are able to create model questions in the Q-go component of the SDL Tridion platform. This means that the most common questions pertaining to products and the website itself can be targeted and answered by web content editors, creating streamlined content and vastly increased relevance of searches. The integration also means that only one interface is needed to update the entire website, which can be done anywhere, anytime. You can find more information on the integration in the eXtensions Community at http://www.sdltridionworld.com

Paying Attention to Enterprise Search Results

When planning and implementing enterprise search, presentation of search results is not often high on the list of design considerations. Learning about a new layer of software called Documill from CEO and founder Mika Könnölä caused me to reflect on possible applications in which his software would be a benefit.

There is one aspect of search output (results) that always makes an impression when I search. Sometimes the display is clear and obvious and other times the first thing that pops into my mind is “what the heck am I looking at” or “why did this stuff appear?” In most cases, no matter how relevant the content may end up being to my query, I usually have to plow through a lot (could be dozens) of content pieces to confirm the validity or usefulness of what is retrieved.

Admittedly, much of my searching is research or helping with a client’s intranet implementation, not just looking for a quick answer, a fact or specific document. When I am in the mode for what I call “quick and dirty” search, I can almost always frame the search statement to get the exact result I want very quickly. But when I am trying to learn about a topic new to me, broaden my understanding or collect an exhaustive corpus of material for research, sifting and validating dozens of documents by opening each and then searching within the text for the piece of the content that satisfied the query is both tedious and annoyingly slow.

That is where Documill could enrich my experience considerably, for it can be layered on any number of enterprise search engines to present results in the form of precise thumbnails that show where in a document the query criteria are located. In their own words, “it enhances traditional search engine result list with graphically accurate presentation of the content.”

Here are some ideas for its application:

  • In an application developed to find specific documents from among thousands that are very similar (e.g. invoices, engineering specifications), wouldn’t it be great to see only a dozen pages, already opened to the correct location where the data matches the query?
  • In an application of tens of thousands of legacy documents, OCRed for metadata extraction and displayable as PDFs, wouldn’t it be great to have the exact pages of the document that match the search displayed as visual images, opened to read in the results page? This is especially important in technical documents of 60-100 pages where the target content might be on page 30 or 50.
  • In federated search output, when results may contain many similar documents, the immediate display of just the right pages as images ready for review will be a time-saving blessing.
  • In a situation where a large corpus of content contains photographs or graphics, such as newspaper archives, scientific and engineering drawings, an instantaneous visual of the content will sharpen access to just the right documents.

I highly recommend that you ask your search engine solution provider about incorporating Documill into your enterprise search architecture. And, if you have, please share your experiences with me through comments to this post or by reaching out for a conversation.

Deep Web Technologies Introduces Automatic Federated Search Builder

Deep Web Technologies introduced Search Builder, an automatic deep web search-engine creation tool. This tool enables Deep Web Technologies’ clients to personalize the federated search experience for particular individuals, departments, classrooms, or other groups that require different default collections and/or a modified user experience when conducting deep web searches. In short, Search Builder makes it possible to deliver any number of individualized federated search portals within an organization. Search Builder allows administrators or authorized users to create as many federated search portals as needed, each a privately branded search page with its own default collections and search fields, as well as a customized user experience. Future plans include integrating Search Builder into Deep Web Technologies’ free media sites, Mednar (http://www.mednar.com/), Biznar (http://www.biznar.com/), and ScienceResearch.com (http://www.scienceresearch.com/). http://www.deepwebtech.com/
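The per-group portal idea can be sketched as a small configuration object: each portal carries its own branding, default collections, and searchable fields. The names below are illustrative, not the actual Search Builder API.

```python
# Sketch of per-group federated search portal configuration (hypothetical
# field names; Search Builder's real configuration model is not public here).
from dataclasses import dataclass, field

@dataclass
class SearchPortal:
    name: str
    brand: str
    default_collections: list
    search_fields: list = field(default_factory=lambda: ["title", "abstract"])

def build_portal(name, brand, collections, fields=None):
    """Create a privately branded portal with its own default collections,
    falling back to organization-wide default search fields."""
    return SearchPortal(name, brand, list(collections),
                        list(fields) if fields else ["title", "abstract"])

chem = build_portal("chem-dept", "Chemistry Library",
                    ["journals", "patents"], ["title", "author"])
print(chem.default_collections, chem.search_fields)
# prints ['journals', 'patents'] ['title', 'author']
```

Each group gets its own portal object while the underlying federated engine stays shared, which is the economy the product is promising.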

If a Vendor Spends Enough on Full-page Ads: Ink will Follow

Earlier comments in this blog referred to Autonomy ads in Information Week. They have continued throughout early 2009, with the latest proclaiming “Autonomy Dominates Enterprise Search” in bold red and black, two of my favorite, eye-catching colors. Having read the publication for over ten years, I notice when things are different, and a search company showing up repeatedly keeps me noticing, because Autonomy is the first to spend on major advertising like this in an IT publication.

This week the predictable happened: an article by Information Week‘s Sr. VP focusing on Autonomy’s terrific business run in a tough economy. Fair enough – it happens all the time for big spenders.

I just want to remind readers that if you are a small unit in a large organization, or a small or medium business, there are dozens of enterprise search solutions that will serve you extremely well, with much lower cost of ownership and startup effort than Autonomy. You do not need the biggest or fastest-growing company’s products to get good or even excellent solutions. Furthermore, the chances of getting superior customer support and services are much better from a more modest company focused exclusively on search excellence.

Be sure to check out the offerings at the Gilbane Conference in San Francisco next week. A lot more guidance and good case studies will give you an earful of what else to consider. The search headliners at the conference with Hadley Reynolds moderating are:

E8. Search Survival Guide: Delivering Great Results
Speakers: Randy Woods, Co-founder & Executive VP, non-linear creations, Best Practices for Tuning Enterprise Search and Miles Kehoe, President, New Idea Engineering

E9/I5. The Next Big Thing: Tomorrow’s Search Revealed
Speakers: Stephen Arnold, ArnoldIT, What You Need to Know About Google Dataspaces and Jeff Fried, Senior Product Manager, Microsoft

E10/I6. Bringing it All Together: Perils and Pitfalls of Search Federation
Speakers: Helen Mitchell Curtis, Senior Program Director of Enterprise Solutions, MacFadden, Federated Search in a Disparate Environment, Larry Donahue, Chief Operating Officer & Corporate Counsel, Deep Web Technologies, Federated Search: True Enterprise Search and Jeff Fried, Senior Product Manager, Microsoft

E11/I7. The Special Case of Categories – and Where To Find Them
Speakers: Joseph Busch, Founder, Taxonomy Strategies, Taxonomy Validation, and Arje Cahn, CTO, Hippo, Find What You Need in Unstructured Content with the Help of Others (and your CMS): Demo of Wikipedia with Faceted Search

E12/I8. It’s Easier with Structure: Leveraging Markup for Better Search
Speakers: Dianne Burley, Industry Specialist, Nstein Technologies, Semantic Search and J. Brooke Aker, CEO, Expert System, A 3-Step Walk Through ECM Using Semantics

E13/I9. Improving SharePoint Search & Navigation with a Taxonomy and Metadata

Have a question for our analyst panel?

Looking forward to seeing many of you next week at Gilbane San Francisco. Whether you will be there or not, you can suggest questions to ask our analyst panel. Each of the panelists has specific areas of expertise covering web content management, web governance, enterprise social software and social media, collaboration, and enterprise search. The panel is a keynote session after the two keynote presentations from Microsoft and Adobe, so we’ll also be covering reactions to those. You can submit your questions directly to me via a comment, email, or twitter (DM or post using the hashtag #gilbanesf).

Registration for the conference is still open and will be available on-site. If you register in advance you can still get a $200 discount using GILBANE as the discount code. There is no charge for the keynotes, the technology demonstrations, or the product labs.

K2. Keynote Analyst Panel
We invite industry analysts from many different firms to speak at all our events to make sure our conference attendees hear differing opinions from a wide variety of expert sources. A second, third, fourth or fifth opinion will ensure you don’t make ill-informed decisions about critical content and information technologies or strategies. This session will be a lively, interactive debate guaranteed to be both informative and fun.
Moderator: Frank Gilbane
Panelists:
Jeremiah Owyang, Senior Analyst, Social Computing, Forrester
Hadley Reynolds, Research Director, Search & Digital Marketplace Technologies, IDC
Larry Hawes, Lead Analyst, Collaboration and Enterprise Social Software, Gilbane Group
Lisa Welchman, Founding Partner, WelchmanPierpoint


© 2025 The Gilbane Advisor
