Curated for content, computing, and digital experience professionals

Category: Semantic technologies

Our coverage of semantic technologies goes back to the early 90s when search engines focused on searching structured data in databases were looking to provide support for searching unstructured or semi-structured data. This early Gilbane Report, Document Query Languages – Why is it so Hard to Ask a Simple Question?, analyses the challenge back then.

Semantic technology is a broad topic that includes all natural language processing, as well as the semantic web, linked data processing, and knowledge graphs.


Search Behind the Firewall aka Enterprise Search

Called to account for the nomenclature “enterprise search,” which is my area of practice for The Gilbane Group, I will confess that the term has become as tiresome as any other category to which the marketplace gives full attention. But what is in a name, anyway? It is just a label and should not be expected to fully express every attribute it embodies. A year ago I defined it to mean any search done within the enterprise with a primary focus on internal content. “Enterprise” can be an entire organization, division, or group with a corpus of content it wants to have searched comprehensively with a single search engine.

A search engine does not need to be exclusive of all other search engines, nor must it be deployed to crawl and index every single repository in its path to be referred to as enterprise search. There are good and justifiable reasons to leave select repositories un-indexed that go beyond even security concerns, implied by the label “search behind the firewall.” I happen to believe that you can deploy enterprise search for enterprises that are quite open with their content and do not keep it behind a firewall (e.g. government agencies or not-for-profits). You may also have enterprise search deployed with one set of content for the public you serve and another for the internal audience. If the content being searched is substantively authored by the members of the organization or procured for their internal use, enterprise search engines are the appropriate class of products to consider. As you will learn from my forthcoming study, Enterprise Search Markets and Applications: Capitalizing on Emerging Demand, and that of Steve Arnold (Beyond Search), there are a great many flavors out there, so you’ll need to move down the food chain of options to get it right for the application or problem you are trying to solve.

OK! Are you convinced yet that Microsoft is pitting itself squarely against Google? The announcement of an offer to purchase Yahoo for something north of $44 billion makes the previous acquisition of FAST for $1.2 billion pale by comparison. But I want to know how this squares with IBM, which has a partnership with Yahoo in the Yahoo edition of IBM’s OmniFind. This keeps the attorneys busy. Or maybe Microsoft will buy IBM, too.

Finally, this dog fight exposed in the Washington Post caught my eye, or did one of the dogs walk away with his tail between his legs? Google slams Autonomy – now, why would they do that?

I had other plans for this week’s blog but all the Patriots Super Bowl talk puts me in the mood for looking at other competitions. It is kind of fun.

Beyond Search and Search

As many of you know from our press release at Gilbane Boston, two of the reports we will be publishing in the next few months have to do with search. Lynda Moulton, who runs our Enterprise Search consulting practice, is working on Enterprise Search Markets and Applications: Capitalizing on Emerging Demand, and our colleague Steve Arnold is writing Beyond Search: What to Do When Your Enterprise Search System Doesn’t Work. Lynda’s report covers the “Enterprise Search” market, what organizations are doing with the variety of technologies considered to be enterprise search products, and what their experiences have been. By the way, Lynda is collecting experiences about implementations and would love to hear about yours.

Steve’s report is a look at what is coming next, and is largely, but not only, based on an analysis of what Google is doing, what they are planning on doing, and the emerging ecosystem they are creating. This is fascinating stuff. Steve has recently launched a must-read blog, Beyond Search, where you can get a peek at some of what will be in our report. For example, see his thoughts on enterprise search terminology.

Both reports will be important tools for enterprise IT strategists and executives. We’ll keep you posted on their progress.

Search Adoption is a Tricky Business: Knowledge Needed

Enterprise search applications abound in the technology marketplace, from embedded search to specialized e-discovery solutions to search engines for crawling and indexing the entire intranet of an organization. So, why is there so much dissatisfaction with results and heaps of stories of buyer’s remorse? Are we on the cusp of a new wave of semantic search options or better ways to federate our universe of content within and outside the enterprise? Who are the experts on enterprise search anyway?

You might read this blog because you know me from the knowledge management (KM) arena, or from my past life as the founder of an integrated enterprise library automation company. In the KM world a recurring theme is the need to leverage expertise, best done in an environment where it is easy to connect with the experts but that seems to be a dim option in many enterprises. In the corporate library world the intent is to aggregate and filter a substantive domain of content, expertise and knowledge assets on behalf of the specialized interests of the enterprise, too often a legacy model of enterprise infrastructure. Librarians have long been innovators at adopting and leveraging advanced technologies but they have also been a concentrating force for facilitating shared expertise. In fact, special librarians excel at providing access to experts.

We are drowning in technological options, not the least of which is enterprise search and its complexity of feature-laden choices. However, it is darned hard to find instances of full search tool adoption or users who love the search tools they are given on their intranets. So, I am adopting my KM and library science modes to elevate the discussion about search to a decidedly non-technical conversation.

I really want to learn what you know about enterprise search, what you have learned, discovered and experienced over the past two or three years. This blog and the work I do with The Gilbane Group is about getting readers to the best and most appropriate search solutions that can make positive contributions in their enterprises. Knowing who is using what and where it has succeeded or what problems and issues were encountered is information I can use to communicate, in aggregate, those experiences. I am reaching out to you and those you refer to complete a five-minute survey to open the door to more discussion. Please use this link to participate right now: Click Here to take survey. You will then have the option to get the resulting details in my upcoming research study on enterprise search.

Just to prove that I still follow exciting technologies as well, I want to relay a couple of new items. First is a recent category in search, “active intelligence,” adopted as Attivio’s tag line. This is a start-up led by Ali Riaz and officially launched this week from Newton, MA. Then, to get a steady feed of all things enterprise search from guru Steve Arnold, check out his new blog, a lead-up to the forthcoming Beyond Search: What to Do When Your Search Engine Doesn’t Work, to be published by The Gilbane Group. You’ll be transported from the historical, to the here and now, to the newest tools on his radar screen as you page from one blog entry to another.

Nothing Like a Move by Microsoft to Stir up Analysis and Expectations

Since I weighed in last week on the Microsoft acquisition of FAST Search & Transfer, I have probably read 50+ blog entries and articles on the event. I have also talked to other analysts, received emails from numerous search vendors summarizing their thoughts and expectations about the enterprise search market, and had a fair number of phone calls asking questions about what it means. The questions range from “Did Microsoft pay too much?” to “Please define enterprise search,” to “What are the next acquisitions in this market going to be?” My short and flippant answers would be “No,” “Do you have a few hours?” and “Everyone and no one.”

I have seen some excellent analysis contributing relevant commentary to this discussion, some misinterpretation of what the distinctions are between enterprise search and Web search, and some conclusions that I would seriously debate. You’ll forgive me if I don’t include links to the pieces that influenced the following comments. But one piece, by Curt Monash on January 14, summarized the state of this industry and its already long history. It is noteworthy that while the popular technology press has only recently begun to write about enterprise search, it has been around for decades in different forms, and in a short piece he manages to capture the highlights and current state.

Other commentary seems to imply that Microsoft is not really positioning itself to compete with Google because Google is really about Web (Internet) searching and Microsoft is not. This implies that FAST has no understanding of Web searching. Several points must be made:

  1. FAST Search & Transfer has been involved in many aspects of search technologies for a decade. Soon after landing on our shores it was the search engine of choice for the U.S. government’s unifying search engine to support Internet-based searching of agency Web sites by the public. Since then it has helped countless enterprises (e.g. governments, manufacturers, e-commerce companies) expose their content, products and services via the Web. FAST knows a lot about how to make Web search better for all kinds of applications and they will bring that expertise to Microsoft.
  2. Google is exploiting the Web to deliver free business software tools that directly challenge Microsoft’s strongholds (e.g. email, word processing). This will not go unanswered by the largest supplier of office automation software.
  3. Google has several thousand Google Enterprise Search Appliances installed in all types of enterprises around the world, so it is already as widely deployed in enterprises in terms of numbers as FAST, albeit at much lower prices and for simpler applications. That doesn’t mean that they are not satisfying a very practical need for a lot of organizations where it is “good enough.”

For more on the competition between the two check this article out.

Enterprise search has been implied to mean only search across all content for an entire enterprise. This raises another fundamental problem of perception. Basically, there are few to no instances of a single enterprise search engine being the only search solution for any major enterprise. Even when an organization “standardizes” on one product for its enterprise search, there will be dozens of other instances of search deployed for groups, divisions, and embedded within applications. Just two examples: Vivisimo is now used for USA.gov to give the public access to government agency public content, even as each agency uses different search engines for internal use. And IBM offers the OmniFind suite of enterprise search products, but uses Endeca internally for its Global Services Business enterprise.

Finally, on the issue of expectations, most of the vendors I have heard from are excited that the Microsoft announcement confirms the existence of an enterprise search market. They know that revenues for enterprise search, compared to Web search, have been minuscule. But now that Microsoft is investing heavily in it, they hope that top management across all industries will see it as a software solution to procure. Many analysts are expecting other major acquisitions, perhaps soon. Frequently mentioned buyers are Oracle and IBM but both have already made major acquisitions of search and content products, and both already offer enterprise search solutions. It is going to be quite some time before Microsoft sorts out all the pieces of FAST IP and decides how to package them. Other market acquisitions will surely come. The question is whether the next to be acquired will be large search companies with complex and expensive offerings bought by major software corporations. Or will search products targeting specific enterprise search markets be a better buy, making an immediate impact for companies seeking a broader presence in enterprise search as a complementary offering to other tools? There are a lot of enterprise search problems to be solved and a lot of players to divvy up the evolving business for a while to come.

W3C Opens Data on the Web with SPARQL

W3C (The World Wide Web Consortium) announced the publication of SPARQL, the key standard for opening up data on the Semantic Web. With SPARQL query technology, pronounced “sparkle,” people can focus on what they want to know rather than on the database technology or data format used behind the scenes to store the data. Because SPARQL queries express high-level goals, it is easier to extend them to unanticipated data sources, or even to port them to new applications. Many successful query languages exist, including standards such as SQL and XQuery. These were primarily designed for queries limited to a single product, format, type of information, or local data store. Traditionally, it has been necessary to formulate the same high-level query differently depending on application or the specific arrangement chosen for the relational database. And when querying multiple data sources it has been necessary to write logic to merge the results. These limitations have imposed higher developer costs and created barriers to incorporating new data sources. The goal of the Semantic Web is to enable people to share, merge, and reuse data globally. SPARQL is designed for use at the scale of the Web, and thus enables queries over distributed data sources, independent of format. Because SPARQL has no tie to a specific database format, it can be used to take advantage of “Web 2.0” data and mash it up with other Semantic Web resources. Furthermore, because disparate data sources may not have the same ‘shape’ or share the same properties, SPARQL is designed to query non-uniform data. The SPARQL specification defines a query language and a protocol and works with the other core Semantic Web technologies from W3C: Resource Description Framework (RDF) for representing data; RDF Schema; Web Ontology Language (OWL) for building vocabularies; and Gleaning Resource Descriptions from Dialects of Languages (GRDDL), for automatically extracting Semantic Web data from documents. 
SPARQL also makes use of other W3C standards found in Web services implementations, such as Web Services Description Language (WSDL). http://www.w3.org/
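To make the announcement's point about querying merged, non-uniform data a little more concrete, here is a minimal sketch in Python. It is a toy triple-pattern matcher and not a real SPARQL engine; the data, the `ex:` prefix, and the helper names (`match_pattern`, `solve`, `optional`) are all hypothetical and chosen only for illustration. The SPARQL text it contains is shown for reference and is not executed.

```python
# Toy illustration only: a tiny triple-pattern matcher, NOT a real SPARQL engine.
# All data, names, and the "ex:" prefix are hypothetical.

# Two independent "sources" as (subject, predicate, object) triples.
# They do not share the same shape: nobody recorded an email for ex:bob.
source_a = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:email", "alice@example.org"),
]
source_b = [
    ("ex:bob", "ex:name", "Bob"),
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
]

# Merging graph-shaped data is simply a union of triples.
graph = set(source_a) | set(source_b)

# Roughly the query a SPARQL user would write (for reference, not executed):
SPARQL_TEXT = """
PREFIX ex: <http://example.org/>
SELECT ?name ?email WHERE {
  ?person ex:worksFor ex:acme .
  ?person ex:name ?name .
  OPTIONAL { ?person ex:email ?email }
}
"""


def match_pattern(pattern, triple, binding):
    """Return an extended copy of `binding` if `pattern` matches `triple`, else None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):            # variable: bind it or check consistency
            if new.setdefault(p, t) != t:
                return None
        elif p != t:                     # constant: must match exactly
            return None
    return new


def solve(patterns, graph, bindings=None):
    """Join a list of triple patterns against the graph."""
    bindings = [{}] if bindings is None else bindings
    for pattern in patterns:
        next_bindings = []
        for b in bindings:
            for triple in graph:
                extended = match_pattern(pattern, triple, b)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


def optional(pattern, graph, bindings):
    """Left join: keep a row unchanged when the optional pattern has no match."""
    out = []
    for b in bindings:
        extended = solve([pattern], graph, [b])
        out.extend(extended if extended else [b])
    return out


required = [("?person", "ex:worksFor", "ex:acme"), ("?person", "ex:name", "?name")]
rows = optional(("?person", "ex:email", "?email"), graph, solve(required, graph))
results = sorted((r["?name"], r.get("?email")) for r in rows)
print(results)
```

The query names what is wanted (people who work for a given organization, with an email if one exists) and says nothing about which source a fact came from or how it is stored. A production system would use an actual SPARQL implementation rather than hand-written matching like this.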

Microsoft and FAST

Yesterday was obviously a big day in the enterprise search space. “Enterprise search” news, as opposed to web search news, doesn’t usually make the New York Times, Wall Street Journal, Boston Globe, etc. We (especially Lynda!) spent a lot of time yesterday just dealing with all the inquiries. Lynda posted her initial thoughts before the analyst call yesterday, as did Steve Arnold. Both will certainly have more to say. In addition to their blogging, keep an eye out for the two reports we’ll be publishing this Spring: Enterprise Search Markets and Applications: Capitalizing on Emerging Demand, by Lynda Moulton, and Beyond Search: What to Do When Your Enterprise Search System Doesn’t Work, by Steve Arnold.

Microsoft Announces Offer to Acquire Fast

Microsoft Corp. (Nasdaq “MSFT”) announced that it will make an offer to acquire Fast Search & Transfer ASA (OSE: “FAST”) through a cash tender offer for 19.00 Norwegian kroner (NOK) per share. This offer represents a 42 percent premium to the closing share price on Jan. 4, 2008 (the last trading day prior to this announcement), and values the fully diluted equity of FAST at 6.6 billion NOK (or approximately $1.2 billion U.S.). FAST’s board of directors has unanimously recommended that its shareholders accept the offer. In addition, shareholders representing in aggregate 37 percent of the outstanding shares, including FAST’s two largest institutional shareholders, Orkla ASA and Hermes Focus Asset Management Europe, have irrevocably undertaken to accept the offer. The transaction is expected to be completed in the second quarter of calendar year 2008. In addition to bolstering Microsoft’s enterprise search efforts, this acquisition increases Microsoft’s research and development presence in Europe, complementing existing research teams in Cambridge, England, and Copenhagen, Denmark, with new capabilities in Norway. http://microsoft.com, http://www.fast.no/

Enterprise Search and Its Semantic Evolution

That the Gilbane Group launched its Enterprise Search Practice this year was timely. In 2007 enterprise search became a distinct market force, capped off with Microsoft announcing in November that it had definitively joined the market.

Since Jan. 1, 2007, I have tried to bring attention to those issues that inform buyers and users about search technology. My intent has been to make it easier for those selecting a search tool while helping them to get a highly satisfactory result with minimal surprises. Playing coach and lead champion while clarifying options within enterprise search is a role I embrace. It is fitting then, that I wrap up this year with more insights gained from Gilbane Boston; these were not previously highlighted and relate to semantic search.

The semantic Web is a concept introduced almost ten years ago reflecting a vision of how the World Wide Web (WWW) would evolve. In the beginning we needed a specific address (URL) to get to individual Web sites. Some of these had their own search engines while others were just pages of content we scrolled through or jumped through from link to link. Internet search engines like Alta Vista and Northern Light searched limited parts of the WWW. Then, Yahoo and Google came to provide much broader coverage of all “free” content. While popular search engines provided various categorizing, taxonomy navigation, keyword and advanced searching options, you had to know the terminology that content pages contained to find what you meant to retrieve. If your terms were not explicitly in the content, pages with synonymous or related meaning were not found. The semantic Web vision was to “understand” your inquiry intent and return meaningful results through its semantic algorithms.

The most recent Gilbane Boston conference featured presentations of commercial applications of various semantic search technologies that are contributing to enterprise search solutions. A few high level points gleaned from speakers on analytic and semantic technologies follow.

  • Jordan Frank, speaking on blogs and wikis in enterprises, articulated how they add context by tying content to people and other information like time. Human commentary is a significant content “contextualizer,” my term, not his.
  • Steve Cohen and Matt Kodama co-presented an application using technology (interpretive algorithms integrated with search) to elicit meaning from erratic and linguistically difficult (e.g. Arabic, Chinese) text in the global soup of content.
  • Gary Carlson gave us an understanding of how subject matter expertise contributes substantively to building terminology frameworks (aka “taxonomies”) that are particularly meaningful within a unique knowledge community.
  • Mike Moran helped us see how semantically improved search results can really improve the bottom line in the business sense in both his presentation and later in his blog, a follow-up to a question I posed during the session.
  • Colin Britton described the value of semantic search to harvest and correlate data from highly disparate data sources needed to do criminal background checks.
  • Kate Noerr explained the use of federating technologies to integrate search results in numerous scenarios, all significant and distinct ways to create semantic order (i.e. meaning) out of search results chaos.
  • Bruce Molloy energized the late sessions with his description of how non-techies can create intelligent agents to find and feed colleagues relevant information by searching in the background in ways that go far beyond the typical keyword search.
  • Finally, Sean Martin and John Stone co-presented an approach to computational data gathering and integrating the results in an analyzed and insightful format that reveals knowledge about the data not previously understood.

The point taken is that each example represents a building block of the semantic retrieval framework we will encounter on the Web and within the enterprise. The semantic Web will not magically appear as a finished interface or product but it will become richer in how and what it helps us find. Similar evolutions will happen in the enterprise with a different focus, providing smarter paths for operating within business units.

There is much more to pass along in 2008 and I plan to continue with new topics relating to contextual analysis, the value, use and building of taxonomies, and the variety of applications of enterprise search tools. As for 2007, it’s a wrap.


© 2024 The Gilbane Advisor
