We launched a new consulting practice and blog yesterday to focus on Enterprise Search. I am thrilled to have Lynda Moulton join us as Lead Analyst of the practice. Lynda has long and deep experience as an expert on research technologies, having worked as a software developer, entrepreneur, and consultant. We’ve been getting more calls for help with enterprise search over the past year, as well as increased interest in the Enterprise Search track at our conferences – the topic cried out for a dedicated focus. Visit our new Enterprise Search blog at https://gilbane.com/author/lynda-moulton/, and let Lynda know what questions you have about search technology and its enterprise application.
Category: Semantic technologies (Page 42 of 72)
Our coverage of semantic technologies goes back to the early 90s, when search engines built for structured data in databases were looking to support searching unstructured or semi-structured data. This early Gilbane Report, Document Query Languages – Why is it so Hard to Ask a Simple Question?, analyses the challenge as it stood back then.
Semantic technology is a broad topic that includes all natural language processing, as well as the semantic web, linked data processing, and knowledge graphs.
The Gilbane Group announced today that it has launched a new research and consulting practice covering Enterprise Search technologies and applications. The new practice is led by industry veteran and research expert Lynda Moulton. It complements existing Gilbane Group consulting services that cover a broad range of content technologies, as well as the Gilbane Group’s Publishing Technology and Strategy consulting practice. While the Gilbane Group has covered enterprise search technologies since 1993, today’s demand from a broad range of organizations for solid information and guidance needs to be met with a highly focused, dedicated effort. The Enterprise Search practice is supported by a new blog devoted to the topic as well as the Enterprise Search track at Gilbane conferences. The Enterprise Search blog went live on January 1 with an introductory entry by Lead Analyst Lynda Moulton. Visit the new blog at: https://gilbane.com/search_blog/. UPDATE: This blog has moved here.
The recent Web 2.0 conference predictably accelerated some prognostication on Web 3.0. I don’t think these labels are very interesting in themselves, but I do admit that the conversations about what they might be, if they had a meaningful existence, expose some interesting ideas. Unfortunately, they (both the labels and the conversations) also tend to generate a lot of over-excitement and unrealistic expectations, both in terms of financial investment and doomed IT strategies. Dan Farber does his usual great job of collecting some of the thoughts on the recent discussion in “Web 2.0 isn’t dead, but Web 3.0 is bubbling up”.
One of the articles Dan links to is a New York Times article by John Markoff, where John basically equates Web 3.0 with the Semantic Web. Maybe that’s his way of saying very subtly that there will never be a Web 3.0? No, he is more optimistic. Dan also links to Nick Carr’s post welcoming Web 3.0, but even Carr is gentler than he should be.
But here’s the basic problem with the Semantic Web – it involves semantics. Semantics are not static, language is not static, science is not static. Rules are not static either, although in some cases syntax and logical systems have longer shelf lives.
Now, you can force a set of semantics to be static and enforce their use – you can invent little worlds and knowledge domains where you control everything, but there will always be competition. That’s how humans work, and that is how science works as far as we can tell. Humans will break both rules and meanings. And although the Semantic Web is as much about computers as about humans (or more so), the more human-like we make computers, the more they will break rules and change meanings and invent their own little worlds.
This is not to say that the goal of a Semantic Web hasn’t and won’t generate some good ideas and useful applications and technologies – RDF itself is pretty neat. Vision is a good thing, but vision and near-term reality require different behavior and belief systems.
Index Data has released Zebra 2.0, a major upgrade of its Open Source database server and indexing engine. This upgrade makes index profiling much easier, supports more tuning of search results, incorporates XML technology into core functionality, and improves performance. Highlights of Zebra 2.0 over version 1.3 include: a 64-bit based index structure, elimination of the 2GB limit on register file size, a new on-disk format providing increased stability and faster indexing and retrieval, a new record filter using XSLT transformations to drive both indexing and retrieval, improved logging and analysis of external traffic, and revised and expanded documentation. Zebra 2.0 replaces the previous versions’ tight coupling to the Z39.50 BIB-1 attribute set with better XML support, making Zebra easy to use with such XML-based formats as Dublin Core, MODS, METS, MARCXML, OAI-PMH, RSS, etc. The software’s new plug-in architecture allows skilled users to write their own record indexing and retrieval filters as loadable modules. With the performance enhancements in version 2.0, Zebra can index and search faster than version 1.3. In a test of Zebra 2.0, the software built a 31 GB database of very large records in four elapsed hours on an 1800 MHz dual AMD box, processing an average of 2.2 MB of data per second. Zebra 2.0 also offers more precise logging of external traffic, access and indexing, and log messages are now printed in a style similar to Apache server logs. http://www.indexdata.com/zebra
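As a rough sanity check on the throughput figure quoted in the Zebra test, the arithmetic can be sketched as follows. This is only an illustration: the function name is hypothetical, and it assumes binary units (1 GB = 1024 MB) and a run of exactly four hours, neither of which the announcement specifies.

```python
def average_throughput_mb_per_s(size_gb: float, hours: float) -> float:
    """Average indexing throughput in MB/s.

    Assumption: 1 GB = 1024 MB (the announcement does not say which unit
    convention was used).
    """
    size_mb = size_gb * 1024   # total data indexed, in MB
    seconds = hours * 3600     # elapsed wall-clock time, in seconds
    return size_mb / seconds


# Figures taken from the announcement: 31 GB indexed in four elapsed hours.
rate = average_throughput_mb_per_s(31, 4)
print(f"{rate:.1f} MB/s")
```

Under these assumptions the result rounds to the 2.2 MB per second stated in the announcement. Using decimal units (1 GB = 1000 MB) gives a slightly lower value that rounds to the same figure.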
Our friends over at CMS Watch have released an updated version of their Enterprise Search Report. The report suggests a healthy enterprise market and covers 28 vendors. There is a free excerpt available. A few of the findings (taken from the press release) are:
– IBM, Oracle, and Microsoft continue to struggle to rationalize multiple search technologies and strategies. Oracle’s “Secure Enterprise Search 10g” product may be the most straightforward offering of the three, but it has not yet seen extensive customer testing.
– Smaller search vendors continue to exploit Microsoft’s inability to develop effective search solutions atop SharePoint. Mondosoft, Coveo, dtSearch, and others are likely to continue offering value-added capabilities after the release of Microsoft’s new search services in SharePoint 2007.
– Google’s search appliance has disrupted the market, but customer testing still often finds the appliance lacking in “tune-ability” and integration capabilities.
– Faceted or “guided” navigation capabilities originally associated with enterprise search vendor Endeca have gone from product differentiator to widespread feature. Customers can obtain faceted navigation capabilities from several low-cost search vendors. Now, the key differentiator is the extent to which a search system can successfully autogenerate a useful set of metadata “facets” with minimal customer intervention.
Steve Arnold, the main author of the report, will be leading a couple of sessions on Enterprise Search at our upcoming conference with CMS Watch in Washington DC, June 13-15. Join us there and get more details from Steve.
Fast Search & Transfer (FAST) announced that the company has acquired Kopek AS, a developer of a secure online content access platform. The content access platform incorporates security, transport, identification, access control, personalization, payments and clearing, all in one solution. The system works transparently with all types of media and content, including Web pages, Web services, images, email, live video streams, audio, applications, and Internet network access, across all device types. http://www.fastsearch.com
Endeca introduced the Endeca Information Access Platform (IAP) – a new platform built specifically to address an emerging market that is attempting to change the way people access and interact with information. The platform is designed to help people find, analyze and understand information in ways not possible with search engine, database and business intelligence solutions. Powered by Endeca’s MDEX Engine technology, it fuses the ease of search and browsing with the analytical capabilities of business intelligence. The Endeca Information Access Platform aids information-based problem solving across a wide variety of business processes, including eCommerce, marketing campaign analysis, product design and parts reuse, knowledge management, and customer service. To meet specific industry and application requirements, Endeca offers a range of Market Solutions, each designed to accelerate time-to-market. These solutions combine the benefits of the platform with unique application modules and deep market and application-specific services expertise. In a related announcement, Endeca has also unveiled a new, expanded line of these solutions. http://www.endeca.com
Yahoo jumps into the deep end of the pool. This puts the big three (Yahoo, Microsoft, Google) even more on the same path. Competition is a good thing, of course, and here it means that the expansion of content being indexed will only accelerate.