
Category: Semantic technologies

Our coverage of semantic technologies goes back to the early 90s, when search engines designed for querying structured data in databases were looking to add support for searching unstructured and semi-structured data. This early Gilbane Report, Document Query Languages – Why is it so Hard to Ask a Simple Question?, analyzes the challenge as it stood then.

Semantic technology is a broad topic that includes all natural language processing, as well as the semantic web, linked data processing, and knowledge graphs.


Google Executive to Provide Opening Keynote Address on Search Quality at Upcoming Gilbane San Francisco Conference

The Gilbane Group and Lighthouse Seminars announced that Udi Manber, a Google Vice President of Engineering, will kick off the annual Gilbane San Francisco conference on June 18th at 8:30am with a discussion of Google’s search quality and continued innovation. Now in its fourth year, the conference has rapidly gained a reputation as a forum that brings together vendor-neutral industry experts to share and debate the latest information technology experiences, research, trends, and insights. The conference takes place June 18-20 at the Westin Market Hotel in San Francisco.

Gilbane San Francisco helps attendees move beyond the mainstream content technologies they are familiar with to enhanced “2.0” versions, which can open up new business opportunities, keep customers engaged, and improve internal communication and collaboration. The 2008 event will have its usual collection of information and content technology experts, including practitioners, technologists, business strategists, consultants, and leading analysts from a variety of market and technology research firms.

Topics to be covered in depth at Gilbane San Francisco include: Web Content Management (WCM); Enterprise Search, Text Analytics, and Semantic Technologies; Collaboration, Enterprise Wikis & Blogs; “Enterprise 2.0” Technologies & Social Media; Content Globalization & Localization; XML Content Strategies; Enterprise Content Management (ECM); Enterprise Rights Management (ERM); and Publishing Technology & Best Practices. Details on the Google keynote session, as well as other keynotes and conference breakout sessions, can be found at http://gilbanesf.com/conference-grid.html

Only Humans can Ensure the Value of Search in Your Enterprise

While considering what is most important in selecting the search tools for any given enterprise application, I took a few minutes off to look at the New York Times. This article, He Wrote 200,000 Books (but Computers Did Some of the Work), by Noam Cohen, gave me an idea about how to compare Internet search with enterprise search.

A staple of librarians’ reference and research arsenal has been a category of reference material called “bibliographies of bibliographies.” These works, specific to a subject domain, are aimed at a usually scholarly audience to bring a vast amount of content into focus for the researcher. Judging from the article, that is what Mr. Parker’s artificial intelligence is doing for the average person who needs general information about a topic. According to at least one reader, the results are hardly scholarly.

This article points out several things about computerized searching:

  • It does a very good job of finding a lot of information easily.
  • Generalized Internet searching retrieves only publicly accessible, free-for-consumption, content.
  • Publicly available content is not universally vetted for accuracy, authoritativeness, trustworthiness, or comprehensiveness, even though it may be all of these things.
  • Vast amounts of accurate, authoritative, trustworthy, and comprehensive content do exist in electronic formats that search algorithms used by Mr. Parker or the rest of us on the Internet will never see. That is because the content is behind the firewall or accessible only through permission (e.g. subscription, need-to-know). None of his published books will serve up that content.

Another concept that librarians and scholars understand is that of primary source material. It is original content, developed (written, recorded) by human beings as a result of thought, new analysis of existing content, bench science, or engineering. It is often judged, vetted, approved, or otherwise deemed worthy of the primary source label by peers in the workplace, professional societies, or professional publishers of scholarly journals. It is often the substance of what gets republished as secondary and tertiary sources (e.g. review articles, bibliographies, books).

We all need secondary and tertiary sources to do our work, learn new things, and understand our work and our world better. However, advances in technology, business operations, and innovation depend on sharing primary source material in thoughtfully constructed domains within our businesses, healthcare organizations, and non-profits. A patient’s laboratory data or mechanical device test data that spark the creation of primary source content need surrounding context to be properly understood and assessed for value and relevance.

To be valuable, enterprise search needs to deliver context, relevance, opportunities for analysis and evaluation, and retrieval modes that give the best results for any user seeking valid content. There is a lot that computerized enterprise search can do to facilitate this type of research, but that is not the whole story. There must still be real people who select the most appropriate search product for that enterprise and that defined business case. They must also decide which content the search engine should index based on its value, how it can be secured with proper authentication, how it should be categorized, and so on. To throw a computer search application at any retrieval need without human oversight is a waste of capital. It will result in disappointment, cynicism, and skepticism about the value of automating search, because the resulting output will be no better than Mr. Parker’s books.

Semantic Technologies and our CTO Blog

We host a number of blogs, some more active than others. One of the least active (although it still gets a surprising amount of traffic) has been our CTO blog. However, I am happy to say that Colin Britton started blogging on semantic technologies yesterday. As co-founder and CTO of Metatomix, he led the development of a commercial product based on RDF, a not very well understood W3C semantic web standard. Colin’s first post on the CTO blog starts a series that will help shed a little more light on semantic technologies and their practical applications.

Some of you know that I remain skeptical of the new world “Semantic Web” vision, but I do think semantic technologies are important and have a lot to offer, and Colin will help you see why. Check out his first post and let him know what you think about semantic technologies and what you would like to know about.

Introduction to Semantic Technology

Ten years ago I came to believe that a metadata approach to managing enterprise information was a valid way to go. The various structures, relationships, and complexities of IT systems led to disjointed information. By relating the information elements to each other, rather than trying to synchronize the information, we _might_ stand a chance.

At the same time, a new set of standards was emerging to describe, relate, and query a new information model based on metadata. These became known as the Semantic Web, outlined in a 2001 Scientific American article (http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21 ).

Fast forward to 2008 – where are we with this vision? Part of me is thrilled, another part disappointed. We have adoption of these standards and this approach in everyday information management situations. Major software companies and startups alike are implementing Semantic Technology in their offerings and products. However, I am disappointed that we still find it hard to communicate what this semantic technology means and how valuable it is. Most technologists I meet glaze over at the mention of the Semantic Web or any of its standards, yet when asked if they think RSS is significant, they praise its contributions.

Over a series of posts on this blog, I would like to explain, share, and show some of the value of Semantic Technology and why you should be looking at it.

Let’s start with what Semantic Technology is and what standards define its openness. To quote Wikipedia: “In software, semantic technology encodes meanings separately from data and content files, and separately from application code.” This abstraction is a core tenet of, and a core value provided by, a semantic approach to information management. The idea that our database or programming patterns do not restrict the form or boundaries of our information is a large shift from traditional IT solutions. Likewise, the idea that our business logic should not be tied to the code that implements it, nor to the information it operates on, is delivered through this semantic representation. So firstly, ABSTRACTION is a key characteristic.

The benefit of this is that systems, machines, solutions – whatever term you wish to use – can interact with each other: share, understand, and reason, without having been explicitly programmed to understand each other.

With this you also get to better manage CHANGE. Your content and systems can evolve, with the changes managed through the Semantic Technology layer.
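
To make the abstraction and change points concrete, here is a minimal sketch using Python and the rdflib library (the namespace, identifiers, and properties are invented for illustration, not taken from any particular product):

    from rdflib import Graph, Literal, Namespace

    EX = Namespace("http://example.org/")  # hypothetical vocabulary
    g = Graph()

    # Each fact is a self-describing statement (triple); no table
    # schema fixes its shape in advance.
    g.add((EX.order17, EX.customer, EX.alice))
    g.add((EX.order17, EX.total, Literal(120.00)))

    # A new requirement arrives: record the sales channel. In a
    # relational design this would mean a schema migration; here it
    # is simply one more statement.
    g.add((EX.order17, EX.channel, Literal("web")))

Because nothing outside the graph constrains what statements may appear, the data can evolve without breaking the code that stores it.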

So what makes up Semantic Technology? One sees the word attached to a number of solutions and technologies – are they all created equal?

In my view, a technology can only truly claim to be semantic if it is based on and implements the standards laid out through the World Wide Web Consortium (W3C) standards process: http://www.w3.org/2001/sw/

The vision of the Semantic Web and the standards required to support it continue to expand, but the anchor standards have been laid out for a while.

RDF – The model and syntax for describing information. It is important to understand that the RDF standards define multiple things: the model (or data model), the syntax (how it is written/serialized), and the formal semantics (the logic described by the use of RDF). In 2004, the original RDF specification was revised and published as six separate documents, each covering an important area of the standard.
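
As a small illustration of the split between model and syntax, here is a sketch using rdflib (the product names are made up); the same set of triples can be serialized in more than one concrete syntax:

    from rdflib import Graph, Literal, Namespace, RDF, RDFS

    EX = Namespace("http://example.org/")
    g = Graph()
    g.bind("ex", EX)

    # The model: statements of the form (subject, predicate, object).
    g.add((EX.product42, RDF.type, EX.Product))
    g.add((EX.product42, RDFS.label, Literal("Widget, model 42")))

    # The syntax: one model, several serializations.
    print(g.serialize(format="xml"))     # RDF/XML
    print(g.serialize(format="turtle"))  # Turtle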

RDF-S – Provides a typing system for RDF and the basic constructs for expressing ontologies and relationships within the metadata structure.
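
For instance (again sketched with rdflib and hypothetical classes), RDF-S lets you declare classes, subclass relationships, and the domain and range of properties:

    from rdflib import Graph, Namespace, RDF, RDFS

    EX = Namespace("http://example.org/")
    g = Graph()

    # Classes and a subclass relationship.
    g.add((EX.Product, RDF.type, RDFS.Class))
    g.add((EX.Camera, RDF.type, RDFS.Class))
    g.add((EX.Camera, RDFS.subClassOf, EX.Product))

    # A property with a declared domain and range.
    g.add((EX.manufacturedBy, RDF.type, RDF.Property))
    g.add((EX.manufacturedBy, RDFS.domain, EX.Product))
    g.add((EX.manufacturedBy, RDFS.range, EX.Company))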

OWL – To quote the W3C, OWL “facilitates greater machine interpretability of Web content than that supported by XML, RDF, and RDF-S by providing additional vocabulary along with a formal semantics.”
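
Two examples of that additional vocabulary, sketched with rdflib and invented identifiers: owl:sameAs asserts that two identifiers denote the same thing, and owl:inverseOf relates a property to its inverse:

    from rdflib import Graph, Namespace, OWL

    EX = Namespace("http://example.org/")
    g = Graph()

    # Two identifiers that denote the same real-world entity.
    g.add((EX.acme, OWL.sameAs, EX.acmeCorporation))

    # A property and its inverse; a reasoner can infer statements
    # in one direction from statements in the other.
    g.add((EX.manufactures, OWL.inverseOf, EX.manufacturedBy))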

SPARQL – While everyone with a Semantic Technology solution invented their own query language (why wasn’t there one in the first place!), SPARQL, pronounced “sparkle,” is the W3C standardization of one. It is HUGE for Semantic Technology and makes all the effort with the other three standards worthwhile.
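
To close the loop, here is a sketch of a SPARQL SELECT query run over a small rdflib graph (again with made-up names), in place of the vendor-specific query languages that came before it:

    from rdflib import Graph, Literal, Namespace, RDF

    EX = Namespace("http://example.org/")
    g = Graph()
    g.add((EX.product42, RDF.type, EX.Product))
    g.add((EX.product42, EX.price, Literal(19.99)))

    # A standard SPARQL SELECT instead of a proprietary query language.
    results = g.query("""
        PREFIX ex: <http://example.org/>
        SELECT ?product ?price
        WHERE {
            ?product a ex:Product ;
                     ex:price ?price .
        }
    """)
    for row in results:
        print(row.product, row.price)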

These standards are quite a pile to sift through, and understanding the capabilities embodied in them takes significant effort, but it is the role of technologists in this arena to remove the need for you to understand them. It is our job to provide tools, solutions, and capabilities that leverage these standards, bringing semantic technology to life and delivering the power defined within them.

But that is the subject of another post. So what does this all mean in real life? In my next post I will lay out a concrete example using product information.


Parsing the Enterprise Search Landscape

Steve Arnold’s Beyond Search report has finally launched and is ready for purchase. Reviewing it gave me a different perspective on how to look at the array of 83 search companies I am juggling in my upcoming report, Enterprise Search Markets and Applications. For example, technological differentiators can channel your decisions about must-haves and have-nots in your system selection. Steve codifies these considerations and details 15 technology tips that will help you frame them.

We are getting ready for the third Gilbane Conference in which “search” has been a significant part of the presentation landscape, in San Francisco, June 17-20. Six sessions will be filled with case studies and enlightening “how-to-do-it-better” guidance from search experts with significant hands-on experience in the field. I will be conducting a workshop immediately after the conference, How to Successfully Adopt and Deploy Search. Presentations by speakers and the workshop will focus on users’ experiences and guidance for evaluating, buying, and implementing search. Viewing search from a usage perspective begs a different set of classification criteria for divvying up the products.

In February, Business Trends published an interview I gave them in December, Revving up Search Engines in the Enterprise. There probably isn’t much new in it for those who routinely follow this topic but if you are trying to find ways to explain what it is, why and how to get started, you might find some ideas for opening the discussion with others in your business setting. The intended audience is those who don’t normally wallow in search jargon. This interview pretty much covers the what, why, who, and when to jump into procuring search tools for the enterprise.

For my report, I have been very pleased with the discussions I’ve had with a couple dozen people immersed in evaluating and implementing search for their organizations. Hearing them describe their experiences suggests other ways to organize the potpourri of search products and how buyers should approach their selection. With over eighty products, we have a challenge in how to parse the domain. I am segmenting the market space along multiple dimensions, from the content type being targeted by “search” to the packaging models the vendors offer. By laying out a simple “ontology” of concepts surrounding the search product domain, I hope to clarify why there are so many ways of grouping the tools and products being offered. If vendors read the report to decide which buckets they belong in for marketing, and buyers are able to sort out the type of product they need, the report will have achieved one positive outcome. In the meantime, read Frank Gilbane’s take on the whole topic of “enterprise” tacked onto any group of products.

As serendipity would have it, a colleague from the Boston KM Forum, Marc Solomon, just wrote a blog post on a new way of thinking about the business of classifying anything, “Word Algebra.” And guess who gave him the inspiration: Mr. Search himself, Steve Arnold. As a former indexer and taxonomist I appreciate this positioning of applied classification. Thinking about why we search gives us a good idea of how to parse content for consumption. Our parameters for search selection must be driven by that WHY?

IBM Labs Announces ProAct, Text Analytics for Call Centers

Researchers at IBM’s India Research Laboratory have developed software technology that uses sophisticated math algorithms to extract and deliver business insights hidden within the information gathered by companies during customer service calls and other interactions. The new business intelligence technology, called ProAct, is a text analytics tool that automates previously manual analysis and evaluation of customer service calls and provides insight to help companies assess and improve their performance. ProAct provides an integrated analysis of structured information, such as agent and product databases, and unstructured data, such as email, call logs, and call transcriptions, to identify reasons for dissatisfaction, agent performance issues, and typical product problems.

Based on the Unstructured Information Management Architecture (UIMA) framework that IBM contributed to the open source Apache Software Foundation in 2006, the ProAct technology was initially developed as a service engagement. Now the new algorithms are being packaged in software and deployed with many IBM call center customers around the world.

UIMA is an open source software framework that helps organizations build new analysis technologies to gain more insight from their unstructured information by discovering relationships, identifying patterns, and predicting outcomes. IBM uses UIMA to enable text analysis, extraction, and concept search capabilities in other parts of its portfolio of enterprise search software products, including OmniFind Enterprise Edition, OmniFind Analytics Edition, and OmniFind Yahoo! Edition. http://www.research.ibm.com/irl/

Enterprise Whatever

As many of you know, we will be publishing a new report by Stephen Arnold in the next few weeks. The title, Beyond Search: What to do When Your Enterprise Search System Doesn’t Work, begs the question of whether there is such a thing as “enterprise search”. The title of Lynda’s consulting practice blog “Enterprise Search Practice Blog”, begs the same question. In the case of content management, a similar question is begged by AIIM – “The Enterprise Content Management Association” (ECM) and the recent AIIM conference.

The debate about whether “enterprise fill-in-your-favorite-software-application” makes any sense at all is not new. The terms “Enterprise Document Management” (EDM) and “Enterprise Resource Planning” (ERP) were first used in the 80s, and, at least in the case of EDM, were just as controversial. We have Documentum to thank for both EDM and ECM. Documentum’s original mission was to be the Oracle of documents, so EDM probably seemed like an appropriate term to use. Quickly however, the term was appropriated by marketing pros from many vendors, as well as analysts looking for a new category of reports and research to sell, and conference organizers keeping current with the latest buzzwords (I don’t exclude us from this kind of activity!). It was also naively misused by many enterprise IT (as opposed to “personal IT” I suppose) professionals, and business managers who were excited by such a possibility.

ECM evolved when the competition between the established EDM vendors and the fast-growing web content management vendors reached a point where both saw they couldn’t avoid each other (for market cap as well as user requirement reasons). Soon, any vendor with a product to manage any kind of information that existed outside of (or sometimes even in) a relational database was an “ECM” vendor. This is what led AIIM to adopt, and try to define and lay claim to, the term – it would cover all of the records management and scanner vendors who were their existing constituents, and allow them to appeal to the newer web content management vendors and practitioners as well.

We used to cover the question “Is there any such thing as ECM?” in our analyst panels at our conferences, and usually there would be some disagreement among the analysts participating, but our mainly enterprise IT audience largely became savvy enough to realize it was a non-issue.

Why is it a non-issue?

Mainly because the term has almost no useful meaning. Nobody puts all their enterprise content in a single ECM repository. It doesn’t even make sense to use the same vendor’s products across all departments, even in small organizations – that is why there is such a large variety of vendors with wildly different functionality at ECM events such as AIIM. The most you can assume when you hear “ECM vendor” is that they probably support more than one type of content management application, and that they might scale to some degree.

There are many who think it not unreasonable to have a single “enterprise search” application for all enterprise content. If you are new to search technology this is understandable, since you may think simple word or phrase search should work across repositories. But, of course, it is not at all that simple, and if you want to know why, see Stephen’s blog or Lynda’s blog, among others. Both Steve and Lynda are uncomfortable with “enterprise search”. Steve prefers the term “behind the firewall search”. Lynda sticks with the term but with a slightly different definition, although I don’t think they disagree at all on how the term is misused and misinterpreted.

Why use “Enterprise … Whatever” terms at all?

There is only one reason, and that is that buyers and users of technology use these terms as a shortcut, sometimes naively, but also sometimes with full understanding. There is just no getting around the barrier of actual language use. Clearly, using the shortcut is only the first step in communicating – more dialog is required for meaningful understanding.
