Curated for content, computing, and digital experience professionals

Category: Semantic technologies

Our coverage of semantic technologies goes back to the early 90s, when search engines built for structured data in databases were looking to support searching unstructured or semi-structured data. This early Gilbane Report, Document Query Languages – Why is it so Hard to Ask a Simple Question?, analyses the challenge as it stood back then.

Semantic technology is a broad topic that includes all natural language processing, as well as the semantic web, linked data processing, and knowledge graphs.


Siderean and Inxight Federal Systems Announce Partnership to Deliver Relational Navigation to Federal Government

Siderean Software announced that it has entered a reseller agreement with Inxight Federal Systems. Effective immediately, Siderean will be added to Inxight’s GSA-approved price list. Inxight’s software structures unstructured data by “reading” text and extracting important entities, such as people, places and organizations. It also extracts facts and events involving these entities, such as travel events, purchase events, and organizational relationships. Siderean’s Seamark Navigator then builds on this newly structured data, providing a relational navigation interface that allows users to put multi-source content in context to help improve discovery, access and participation across the information flow. Seamark Navigator uses the Resource Description Framework (RDF) and Web Ontology Language (OWL) standards developed by the World Wide Web Consortium (W3C). Siderean’s Seamark Navigator will provide an important add-on to Inxight’s metadata harvesting and extraction solutions. Inxight’s government customers will now be able to leverage Siderean’s relational navigation solutions to access more relevant and timely results derived from the full context and scope of information. As users refine their searches, Siderean dynamically displays additional navigation options and gives users summaries of those items that best match search criteria. Siderean also enables users to illuminate unseen relationships between sets of information and leverage human knowledge to explore information interactively. http://www.siderean.com, http://www.inxightfedsys.com
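To make the idea of navigating over extracted entities more concrete, here is a minimal Python sketch. It is not Siderean's or Inxight's implementation and does not use a real RDF library; the triples, predicate names (`mentionsPerson`, `mentionsPlace`) and function names are all hypothetical. It only illustrates how facts stored as subject-predicate-object statements could drive the "additional navigation options" shown as a user refines a search.

```python
from collections import Counter

# Hypothetical extracted facts as (subject, predicate, object) triples,
# in the spirit of RDF statements. The data is invented for illustration.
TRIPLES = [
    ("doc1", "mentionsPerson", "A. Smith"),
    ("doc1", "mentionsPlace", "Boston"),
    ("doc2", "mentionsPerson", "A. Smith"),
    ("doc2", "mentionsPlace", "Oslo"),
    ("doc3", "mentionsPerson", "B. Jones"),
    ("doc3", "mentionsPlace", "Boston"),
]


def matching_docs(triples, filters):
    """Return the subjects that carry every (predicate, object) pair in filters."""
    docs = {s for s, _, _ in triples}
    for pred, obj in filters:
        docs &= {s for s, p, o in triples if p == pred and o == obj}
    return docs


def facet_counts(triples, docs):
    """Count the (predicate, object) values found on the selected documents."""
    return Counter((p, o) for s, p, o in triples if s in docs)


if __name__ == "__main__":
    # The user narrows the result set to documents that mention Boston.
    selected = matching_docs(TRIPLES, [("mentionsPlace", "Boston")])
    # The remaining facet values become the next navigation options.
    for (pred, obj), count in facet_counts(TRIPLES, selected).items():
        print(pred, obj, count)
```

Each refinement is just another `(predicate, object)` pair added to the filter list, so the available options shrink along with the result set.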

Turning Around a Bad Enterprise Search Experience

Many organizations have experimented with a number of search engines for their enterprise content. When the search engine is deployed within the bounds of a specific content domain (e.g. a QuickPlace site) the user can assume that the content being searched is within that site. However, an organization’s intranet portal with a free-standing search box comes with a different expectation. Most people assume that search will find content anywhere in the implied domain, and most of us believe that all content belonging to that domain (e.g. a company) is searchable.

I find it surprising how many public Web sites for media organizations (publishers) don’t appear to have their site search engines pointing to all the sub-sites indicated in site maps. I know from my experience at client sites that the same is often true for enterprise searching. The reasons are numerous and diverse, and are commentary for another entry. However, one simple notation under or beside the search box can clarify expectations. A simple link to a “list of searchable content” will underscore the caveat, or at least tip off the searcher that the content is bounded in some way.

When users in an organization come to expect that they will not find, through their intranet, what they are seeking but know to exist somewhere in the enterprise, they become cynical and distrustful. Having a successful intranet portal is all about building trust and confidence that the search tool really works or “does the job.” Once that trust is broken, new attempts to change attitudes by deploying a new search engine, increasing the license to include more content, or doing better tuning to return more reliable results are not going to change minds without a lot of communication work to explain the change. I know that the average employee believes that all the content in the organization should be brought together in some form of federated search, but now knows it isn’t. The result is that they confine themselves to embedded search within specific applications and ignore any option to “search the entire intranet.”

It would be great to see comments from readers who have changed a Web site search experience from a bad scene to one with a positive traffic gain with better search results. Let us know how you did it so we can all learn.

The FAST acquisition of Convera

It has been a couple of weeks since the announcement that Fast Search & Transfer would acquire Convera’s RetrievalWare, a search technology built on the foundation of Excalibur and widely used in government enterprises.

At a recent Boston KM Forum meeting I asked Hadley Reynolds, VP & Director of the Center for Search Innovation at Fast, to comment on the acquisition. He indicated Fast’s interest in building up a stronger presence in the government sector, a difficulty for a Norwegian-based company. I remember Fast as a company launching in the U.S. with great fanfare in 2002 (http://newsbreaks.infotoday.com/nbreader.asp?ArticleID=17223) to support FirstGov.gov, a portal to multi-agency content of the U.S. Government. That site has recently been re-launched as http://www.usa.gov/ using the Vivisimo search portal. There must be a story behind the story, which I hope to learn.

To add to the discussion, last week I moderated a session at the Gilbane San Francisco conference at which Helen Mitchell, Senior Search Strategist for Credo Systems and Workgroup Chairperson for the Convera User Group, spoke. I asked Helen before the program about her reaction to the recent announcement. She had already been in contact with Fast and received assurances that Convera Federal Users would be well supported by Fast, and that Fast wants to participate actively in conversations with the group through on-line and in-person meetings. Helen was positive about the potential for RetrievalWare users gaining from the best of Fast technology while still being supported with the unique capabilities of Convera’s semantic, faceted search.

Erik Schwartz, Director of Product Management from Convera, was also present; I encouraged him and Helen to leverage the RetrievalWare user community to make sure Fast really understands the unique and diverse needs of search within the enterprise. We are all well aware that in the rush to build up large customer bases with a solid revenue stream of maintenance, vendors are likely to sacrifice unique technologies that are highly valued by customers. A bottom-line round of pragmatic cost cutting usually determines what R&D a vendor will fund, forgoing the long term good will that could accrue if they would belly up to integrating these unique features into their own platform.

Time will tell how serious Fast is in giving its new base a truly valuable customer experience. I would also note that this acquisition has been covered by a broader information management industry publication, Information Week. See David Gardner’s article at http://www.informationweek.com/news/showArticle.jhtml?articleID=198701793.

Search Help and Usability

Preparing for two upcoming meetings with search themes (Gilbane San Francisco and Boston KM Forum) has brought to mind many issues of search usability. At the core is the issue of search literacy. Offering some fundamental searching tips to non-professional searchers often results in a surprised reaction. For example: when seeking information about a specific topic such as “industrial engineering,” enclose it in quotes to limit the search to that phrase. Without quotes, you will get all content with “industrial” and “engineering” anywhere in the content, with no explicit relationship implied.
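The difference between the two behaviors can be sketched in a few lines of Python. This is only an illustration under simple assumptions (lowercased word tokens, exact matching, no stemming or ranking); real search engines differ in many details, and the function names and sample documents here are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all_terms(query, document):
    """True if every query word appears somewhere in the document."""
    doc_tokens = set(tokenize(document))
    return all(term in doc_tokens for term in tokenize(query))


def matches_phrase(query, document):
    """True if the query words appear adjacent and in the same order."""
    q = tokenize(query)
    d = tokenize(document)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


if __name__ == "__main__":
    docs = [
        "She holds a degree in industrial engineering.",
        "The industrial park hired a software engineering firm.",
    ]
    for doc in docs:
        print(
            matches_all_terms("industrial engineering", doc),
            matches_phrase("industrial engineering", doc),
            doc,
        )
```

Both sample documents satisfy the unquoted search, but only the first contains the two words side by side, which is what the quoted form asks for.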

If you are reading this you probably know that, but many do not. In order to learn what people search for on their company intranet and how they type their search requests, I spend time reading search log files. I do this for several reasons:

  • To learn terminology searchers are using to guide taxonomy building choices
  • To see the way searches are formulated, and followed up
  • To inform design decisions about how to make searching easier
  • To see what is searched but not found to inform future content inclusion
  • To view the searcher’s next step when the results are zero or huge
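Part of a log review like this can be automated. The Python sketch below is a rough illustration, not a description of any particular search engine's log format: it assumes each log record has already been reduced to a hypothetical `(query, result_count)` pair, and the `huge` threshold is an arbitrary placeholder.

```python
from collections import Counter

# Hypothetical log records: (raw query string, number of results returned).
SAMPLE_LOG = [
    ("industrial engineering", 5230),
    ('"industrial engineering"', 41),
    ("vacation policy", 0),
    ("benefits", 18000),
    ("expens report", 0),
]


def summarize(log, huge=10000):
    """Collect simple search-literacy signals from a query log."""
    queries = [q.strip() for q, _ in log]
    # Multi-word queries are the ones where quoting would matter.
    multiword = [q for q in queries if len(q.strip('"').split()) > 1]
    quoted = [q for q in multiword if q.startswith('"') and q.endswith('"')]
    # Term frequencies can feed taxonomy-building decisions.
    terms = Counter(
        word.lower() for q in queries for word in q.replace('"', " ").split()
    )
    return {
        "quoted_share": len(quoted) / len(multiword) if multiword else 0.0,
        "zero_results": [q.strip() for q, n in log if n == 0],
        "huge_results": [q.strip() for q, n in log if n >= huge],
        "top_terms": terms.most_common(3),
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE_LOG).items():
        print(key, value)
```

The zero-result and huge-result lists are the starting points for looking at what the searcher did next, which still requires session data that this sketch does not model.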

Two results remain consistent: less than 1% of searchers place a phrase inside quotation marks, even when there are multiple words; and words are often truncated but do not include a truncation symbol (usually an asterisk, “*”). Both reveal a probable lack of understanding of search conventions, a search literacy problem. Here are a couple of possible solutions:

  • Put into place better help and training mechanisms to help the lost find their way,

OR

  • Remove the legacy practice of forcing command language type symbols on searchers for the most common search requests

Placing punctuation around a search string is a holdover from 30 years ago when searching was done using a command language. Since only a limited number of people ever knew this syntactical format, why does it persist as the default for a phrase search for Web-based search engines?

The solution of providing a better help page and getting people to actually use it is a harder proposition. This one from McGraw-Hill for BusinessWeek Online is pretty simple, with just seven tips, but who reads it? I expect very few do, although it could dramatically improve their search results. http://search.businessweek.com/advanced.jsp.

If you are trying to improve the search experience for your intranet, there are two resources to consult for content usability on all fronts, not just search: useit.com (Jakob Nielsen’s Web site) and Jared Spool’s UIEtips (User Interface Engineering’s free email newsletter). In the meantime, think about whether you need to demand more core search usability or tunable default options from vendors, or whether better interface design could guide searchers to better results.

Fast to Acquire Convera’s RetrievalWare Business

Fast Search & Transfer announced its agreement to purchase selected assets of Convera Corporation. Under the terms of the signed agreement, FAST will acquire the assets of Convera’s RetrievalWare business which supports a wide range of mission-critical programs at government agencies and commercial enterprises. The acquisition, priced at $23 million, will help FAST expand its presence primarily in the government markets. Convera and FAST have also announced that Convera has licensed FAST Ad Momentum, a private-label contextual advertising and monetization platform developed with the support of online publishers. FAST Ad Momentum will be integrated with Convera’s hosted vertical search solution and its Publisher Control Panel. Expected to close in the second quarter, the acquisition is limited to Convera’s RetrievalWare business. Convera will continue to trade under the NASDAQ symbol CNVR. http://www.fastsearch.com, http://www.convera.com/

Google and Microsoft debate Enterprise Search in keynote at Gilbane San Francisco

Join us on April 11, 8:30 am at the Palace Hotel in San Francisco for Gilbane San Francisco 2007

We have expanded our opening keynote to include a special debate between Microsoft and Google on Enterprise Search and Information Access, in addition to our discussion on all content technologies with IBM, Oracle & Adobe.

You still have time to join us for this important and lively debate at the Palace Hotel, April 11. The keynote is open to all attendees, even those only planning to visit the technology showcase. The full keynote runs from 8:30am to 10:15am followed by a coffee break and the opening of the technology showcase, and now includes:

Keynote Panel: Content Technology Industry Update PART 2
Google and Microsoft are competing in many areas on many levels. One area in which both are ramping up quickly is enterprise search. In this part of the opening keynote, we bring the senior product managers face to face to answer our questions about their plans and what this means for enterprise information access and content management strategies.

Moderator: Frank Gilbane, Conference Chair, CEO, Gilbane Group, Inc.
Panelists:
Jared Spataro, Group Product Manager, Enterprise Search, Microsoft
Nitin Mangtani, Lead Product Manager, Google Search Appliance, Google

See the complete keynote description.

Gilbane San Francisco 2007
Content management, enterprise search, localization, collaboration, wikis, publishing …
Complete conference information is at http://gilbanesf.com/07/conference_grid.html

http://gilbanesf.com/07/

What is Under the Hood?

Last week I began this entry, re-considered how to make the point and tucked it away. Today I unearthed an article I had not gotten around to putting into my database of interesting and useful citations. Lisa Nadile in The ABCs of Search Engine Marketing, in CIO Magazine, hits the nail on the head with this statement, “Each search engine has its own top-secret algorithm to analyze this data…” This is tongue in cheek so you need to read the whole article to get the humor. Ms. Nadile’s article is geared to Internet marketing but the comments about search engines are just as relevant for enterprise search.

I may be an enterprise search analyst but there are a lot of things I don’t know about the guts of current commercial search tools. Some things I could know if I were willing to spend months studying patents and expensive reports, while other things are protected as trade secrets. I will never know what is under the hood of most products. Thirty years ago I knew a lot about relatively simple concepts like b-tree indexes and hierarchical, relational, networked and associative data structures for products I used and developed.

My focus has shifted to results and usability. My client has to be able to find all the content in their content repository or crawled site. If not, it had better be easy to discover why, and simple to take corrective actions with the search engine’s administration tools, if that is where the problem lies. If the scope of the corpus of content to be searched is likely to grow to hundreds of thousands of documents, I also care about hardware resource requirements and performance (speed) and scalability. And, if you have read previous entries, you already know that I care a lot about service and business relationships with the vendor because that is crucial to long term success. No amount of “whiz bang” technology will overcome a lousy client/vendor relationship.

Finding out what is going on under the hood with some imponderable algorithms isn’t really going to do me or my client any good when evaluating search products. Either the search tool finds stuff the way my client wants to find it, or it doesn’t. Whether it is a “black art,” a trade secret or “patent protected,” few of us would really understand the secret sauce anyway.


© 2024 The Gilbane Advisor
