May 12, 2008

Powerset Unveils Semantic Search

Powerset released a service for searching Wikipedia that illustrates the capabilities of the semantic search engine they are developing. From their site: "Powerset's goal is to change the way people interact with technology by enabling computers to understand our language. ... Powerset is first applying its natural language processing to search, aiming to improve the way we find information by unlocking the meaning encoded in ordinary human language. ... Powerset's technology improves the entire search process. In the search box, you can express yourself in keywords, phrases, or simple questions. On the search results page, Powerset gives more accurate results, often answering questions directly, and aggregates information from across multiple articles. Finally, Powerset's technology follows you into enhanced Wikipedia articles, giving you a better way to quickly digest and navigate content." http://www.powerset.com/

April 22, 2008

Google Executive to Provide Opening Keynote Address on Search Quality at Upcoming Gilbane San Francisco Conference

The Gilbane Group and Lighthouse Seminars announced that Udi Manber, a Google Vice President of Engineering, will kick off the annual Gilbane San Francisco conference on June 18th at 8:30am with a discussion on Google's search quality and continued innovation. Now in its fourth year, the conference has rapidly gained a reputation as a forum for bringing together vendor-neutral industry experts who share and debate the latest information technology experiences, research, trends and insights. The conference takes place June 18-20 at the Westin Market Hotel in San Francisco. Gilbane San Francisco helps attendees move beyond the mainstream content technologies they are familiar with, to enhanced "2.0" versions, which can open up new business opportunities, keep customers engaged, and improve internal communication and collaboration. The 2008 event will have its usual collection of information and content technology experts, including practitioners, technologists, business strategists, consultants, and the leading analysts from a variety of market and technology research firms. Topics to be covered in depth at Gilbane San Francisco include: Web Content Management (WCM); Enterprise Search, Text Analytics, Semantic Technologies; Collaboration, Enterprise Wikis & Blogs; "Enterprise 2.0" Technologies & Social Media; Content Globalization & Localization; XML Content Strategies; Enterprise Content Management (ECM); Enterprise Rights Management (ERM); and Publishing Technology & Best Practices. Details on the Google keynote session as well as other keynotes and conference breakout sessions can be found at http://gilbanesf.com/conference-grid.html

Semantra Announces General Availability of Conversational Analytics Application for Microsoft Dynamics CRM

Semantra announced general availability of Semantra 2.0 for Microsoft Dynamics CRM. The application is a business intelligence tool that enables common language commands to retrieve specific information from back-end databases. Semantra 2.0 was specifically created to extend the value of Microsoft Dynamics CRM, enabling users to make ad hoc inquiries to retrieve precise results from any Microsoft Dynamics CRM database. Microsoft Dynamics CRM users can turn critical questions into precise and actionable information by entering familiar business terms into a search box. Semantra 2.0 will be distributed and supported by a national network of VARs and system integrators through the Semantra Reseller Program, which includes many of Microsoft's "Gold Certified" partners. Members of the program network are pre-qualified to install the product, conduct user training and provide a broad range of customization for users with specialized requirements. Semantra has initiated integration work with a variety of ERP and CRM applications, including Oracle Siebel and other Microsoft Dynamics solutions. http://www.semantra.com

March 17, 2008

IBM Labs Announces ProAct, Text Analytics for Call Centers

Researchers at IBM's (NYSE: IBM) India Research Laboratory have developed software technology that uses sophisticated math algorithms to extract and deliver business insights hidden within the information gathered by companies during customer service calls and other interactions. The new business intelligence technology, called ProAct, is a text analytics tool that automates previously manual analysis and evaluation of customer service calls and provides insight to help companies assess and improve their performance. ProAct provides an integrated analysis of structured information, such as agent and product databases, and unstructured data, such as email, call logs and call transcriptions, to identify reasons for dissatisfaction, agent performance issues and typical product problems. Based on the Unstructured Information Management Architecture (UIMA) framework that IBM contributed to the open source Apache Software Foundation in 2006, the ProAct technology was initially developed as a service engagement. Now the new algorithms are being packaged in software and deployed at many IBM call center customers around the world. UIMA is an open source software framework for building analysis technologies that help organizations gain more insight from their unstructured information by discovering relationships, identifying patterns, and predicting outcomes. IBM uses UIMA to enable text analysis, extraction and concept search capabilities in other parts of its portfolio of enterprise search software products, including OmniFind Enterprise Edition, OmniFind Analytics Edition, and OmniFind Yahoo! Edition. http://www.research.ibm.com/irl/

SAS Acquires Teragram to Strengthen Text Mining and Analytics

SAS announced the acquisition of privately held Teragram. The acquisition will enhance SAS' own text mining and analytical BI offerings, and extend them to enterprise and mobile search. Teragram, a 40-person firm headquartered in Cambridge, Mass., will be run as a SAS company. Terms of the acquisition deal were not disclosed. Teragram's natural language processing (NLP) technologies help turn text - in many languages and from many sources - into useable information. NLP enables richer data processing at the level of words, linguistic relations and word meanings. Teragram has developed and maintains large annotated dictionaries containing several hundred million words in more than 30 languages. Teragram's categorization technologies provide instant classification of documents according to custom criteria, applied throughout the organization. For enterprise search, Teragram's NLP technologies scan structured corporate databases and unstructured sources including text-based reports and Web pages to provide answers from these multiple information sources. Teragram's search capabilities deliver an easy-to-use environment for BI, extending the availability and use of BI throughout organizations. The combination of SAS and Teragram technologies provides indexing driven not just by a report's header, but by its actual content and the metadata associated with it. Teragram also brings SAS mobile search, helping individuals scan information remotely and get answers faster. Using Teragram's mobile search technology, individuals can store and retrieve information, connect to outside applications such as BI systems, and search databases from their BlackBerry, smart phone or other mobile device. http://www.sas.com, http://www.teragram.com/

March 12, 2008

SYSTRAN Launches Enterprise Server 6 Solution

SYSTRAN announced the release of SYSTRAN Enterprise Server 6, a solution that meets the full range of enterprise language translation needs. Enterprise Server 6 enables corporate users to understand multilingual information in real time and to deliver consistent and validated translations, enabling them to follow best business practices and communicate across different languages. Available in three editions targeted at small and midsized businesses and at enterprise platforms, Enterprise Server 6 addresses complex translation tasks and provides a workbench for managing translation projects. The solution automatically translates all types of documents and files, including manuals, procedures, reports, product and support information, content applications, websites, and other written texts. It translates most file types through a Web-based interface or a SYSTRAN Toolbar available on the user desktop. Corporations can integrate it into enterprise applications to drive multilingual information in and across channels, such as the enterprise content management system, portal, search and website. Common uses include adding an online translation service to the corporate intranet, on-demand website translation, localization for document workflows, and integration with content management systems, databases, and other enterprise applications. SYSTRAN Enterprise Server 6, Workgroup Edition is designed for the small enterprise Intranet with up to 100 users. Price starts at $15,000. SYSTRAN Enterprise Server 6, Standard Edition is designed for the midsize Intranet or Extranet with up to 2,500 users using the Online Tools and Application Packs. Price starts at $30,000. SYSTRAN Enterprise Server 6, Global Edition is designed for enterprises with advanced translation requirements with unlimited user access. Price starts at $150,000. http://www.systransoft.com/

March 10, 2008

IBM Upgrades Enterprise Search Software

IBM (NYSE: IBM) introduced a new version, 8.5, of its OmniFind enterprise search software. The OmniFind advancements support the latest Lotus collaboration and social software, allowing early adopters of Lotus tools such as Lotus Quickr and Lotus Connections to improve productivity, business networking and knowledge sharing. The new version also includes an interface that refines and graphically displays relevant search results; full global language support for Japanese, Chinese and Korean; and support for the latest versions of Red Hat Linux, Windows Server, IBM FileNet Enterprise Content Management software and the IBM Lotus Collaboration Suite. OmniFind Enterprise Edition also serves as a platform for versatile semantic search and content analytics solutions such as entity analytics, sentiment analysis, threat analysis and global name recognition, which are designed to help customers address industry-specific information management challenges. Among the new features of OmniFind Enterprise Edition 8.5 is OmniFind Top Results Analysis, which provides a graphical means of analyzing top search results based on metadata. In addition to generating a standard list of search results, results can be displayed graphically, allowing users to drill down further and interactively refine their search to find what they need faster. For example, a query for "enterprise search" will return a list of relevant results as well as a navigation pane with dynamic bar charts where results are organized by category, for example, Web search, Desktop search, eDiscovery, author, language or source. Drop-down menus are provided for dynamically selecting other fields for analysis. In addition to OmniFind Enterprise Edition, IBM offers a full range of search and content discovery software, including OmniFind Yahoo! Edition, OmniFind Discovery Edition, OmniFind Enterprise Starter Edition, and OmniFind Analytics Edition. The software is currently available from IBM and IBM Business Partners. http://www.ibm.com/software/

IAI and Across Systems Announce Strategic Partnership

IAI and Across Systems announced a strategic partnership and collaboration for interfaces between the CLAT (Controlled Language Authoring Tool) from IAI and the translation management system of Across. The integration of the two technologies enables automated and seamless inclusion of quality checks for extensive texts in the authoring or translation process, such as multilingual manuals or product catalogs. CLAT from IAI (Institute for Applied Information Science at Saarland University) facilitates quality proofing of texts, for example, to verify compliance with grammar rules or terminology conventions. CLAT is based on the same technology as the "DUDEN Korrektor," which is sold by Brockhaus Publishers. CLAT can also automatically check the uniformity of a text for company-specific standards such as corporate wording. Thus, there is no need for manual reconciliation with internal style guides. The Language Server of Across Systems is a software platform for all corporate language resources and translation processes. In addition to a translation memory and a terminology system, it comprises components for project management and workflow control. Industrial enterprises and language service providers use the Across Language Server for tasks such as the composition and translation of manuals, product catalogs, and other documents in multiple languages. The quality management of the source text and translations is complemented by the rules-based approach of CLAT. http://www.Across.net, http://www.iai-sb.de

March 4, 2008

Gilbane Group Announces New Report - "Beyond Search: What to Do When Your Enterprise Search System Doesn't Work"

Gilbane Group Inc. announced the launch of a new special report, "Beyond Search: What to Do When Your Enterprise Search System Doesn't Work", by Stephen Arnold. The 250-page report also includes a "beyond search" market map, a chapter on Google's next generation plans for behind-the-firewall search, and a glossary. According to Lynda Moulton, Gilbane's Lead Analyst for Enterprise Search, "Over the past decade, companies and government agencies that have invested in major search technology have done so at great expense. This study recognizes that there are many large search systems out there that are in need of serious remediation or replacement. Mr. Arnold devotes over 40 pages to remediation options before presenting 24 'beyond search' technologies. These include both mature versions of established search products that have evolved to a stage of easier deployment, implementation and maintenance, and some new entrants that support highly specialized retrieval challenges within organizations. He leaves the reader with a feast for thought about how to meet enterprise search needs head on with both fix and replace options." The report will be available for purchase and download in early April. A workgroup license (up to 10) is $895 ($795 if ordered before April 15, 2008); an enterprise site license is $1595 ($1495 if ordered before April 15, 2008). http://gilbane.com/beyond-search.html

February 25, 2008

Northern Light Launches MI Analyst 2.0 Offering Deeper, Faster 'Meaning Extraction' from Market Intelligence Search Results

Northern Light launched its second major release of MI Analyst, a "meaning extraction" application designed specifically for market intelligence, market research and product research. Meaning extraction imbues a search application with an in-depth understanding of the searched material. MI Analyst 2.0 adds many new "facets" (categories of terms) by which the software can instantly analyze search results, automatically extracting meaning from internal and research documents, licensed secondary research, news stories and Web sources. Joining the previously released facets (Companies, Venture-Funded Companies, IT Technologies, IT Markets), new and expanded facets include Government Agencies, Industries, Business Issues and Strategic Scenarios. Also new in MI Analyst 2.0 is a facility to improve the value of search results based on the proximity of specified terms or phrases to each other and, more importantly, to any of the terms in any of the facets in MI Analyst. With the 2.0 release, MI Analyst expands beyond its roots in the IT sector to pharmaceutical industry research. New facets relevant to pharmaceuticals include Human Anatomy, Diseases, Drugs, Cells, Cell Receptors, Proteins, Genes, Enzymes, Pharmaceutical Markets, Life Sciences Scenarios and Research Strategies and Therapeutic Approaches. MI Analyst can discern the tone of content - for example, assessing which market research reports and research analysts reflect a positive sentiment and which ones demonstrate a negative sentiment about a company and its competitors. MI Analyst is immediately available from Northern Light as an added-value option for SinglePoint enterprise market research portals, and as an integrated capability within Analyst Direct, Northern Light's subscription-based market research search engine. Unlimited enterprise-wide access to MI Analyst starts at $48,000 annually. http://www.northernlight.com/

Attensity Announces "VoC On-Demand" Software as a Service

Attensity announced Attensity VoC On-Demand, a new secure software as a service (SaaS) that enables users to access the company's "Voice of the Customer" (VoC) solution via the Web for on-demand customer feedback analysis. Enterprises can now extract and analyze data about their customers in Attensity's user interface and through customizable analytic dashboards. Attensity's VoC solution uses the company's Exhaustive Extraction engine to automatically identify facts, opinions, requests, trends and trouble areas from unstructured first person feedback found in surveys, service and call center notes, emails, web forums, blogs, news articles and other forms of customer contact. Attensity turns the first person feedback into "First Person Intelligence", enabling Attensity users to proactively understand and rapidly react to customer issues and requests. They also have the ability to discover product and/or service offering opportunities as well as potential areas for improvement. Attensity VoC On-Demand also offers a quick start implementation program, which includes appropriate data preparation - dictionary, domain and categorization development - to prepare data sets for extraction and output views and dashboards. Users can develop predefined analysis views, known as query templates, and dashboards tailored to the user organization's requirements. http://www.attensity.com

February 18, 2008

TEMIS and ANTIDOT Partner

TEMIS and ANTIDOT, a French provider of Enterprise Search Solutions specialized in intranet and web searches, announced they have signed a technology and business partnership agreement. To address global corporations, ANTIDOT decided to embed cross-lingual search functionality into its solution to enable users to submit a single query and get multilingual results. ANTIDOT integrated XeLDA technology, the semantic engine from TEMIS. XeLDA's Dictionary Look Up functionality offers a cross-lingual dimension by activating the correspondence between a term in its context and the translation in the selected target languages. This new functionality enhances the Antidot Finder Suite's unstructured information search (Web, Intranet), as well as its search and navigation capabilities through structured data (databases, knowledge bases). Beyond this XeLDA-based agreement, TEMIS and ANTIDOT are integrating their products in order to offer content enrichment solutions. Luxid Annotation Factory extracts entities and relationships from documents, using domain-specific and exhaustive annotators. Luxid Annotation Factory identifies value-added metadata like names of people, companies or locations, mergers & acquisitions, market shares, etc. As soon as documents are enriched, they are indexed with the AFS search engine, which takes into account all the metadata and knowledge added to the initial data. http://www.temis.com, http://www.antidot.net

February 6, 2008

SchemaLogic Teams with IBM

SchemaLogic announced an agreement with IBM to enable the IBM Classification Module software to be integrated with the SchemaLogic Enterprise Suite. This partnership allows customers to take advantage of SchemaLogic enterprise software to organize and manage corporate metadata while using the IBM Classification Module to automatically classify documents by understanding their full context and consistently applying the metadata maintained within SchemaLogic. http://www.schemalogic.com/

January 29, 2008

Recommind and iCONECT Partner to Create Global Document Review and Analysis Platform

Recommind and iCONECT Development, LLC jointly announced a strategic alliance that will combine Recommind’s Axcelerate eDiscovery application with iCONECT’s nXT and eXT Web-based review platforms to create a new litigation review and analysis solution. With the integration of these two products, customers may benefit from the extensive use of automatically generated insight into a document collection, including each document’s responsiveness, priority, privileged nature and relationship to issues and subject codes. iCONECT will utilize the rich data about people, documents, concepts and phrases that is automatically generated by Axcelerate eDiscovery throughout the nXT EDD platform, greatly extending the utility of this data throughout the review process. Axcelerate eDiscovery provides language-agnostic tools like conceptual search, First Pass Review and One Click Coding functionality to expedite document review. Axcelerate eDiscovery offers law firms and enterprises more accurate and extensive document culling and filtering of virtually all document types, including both structured and unstructured data. For example, the solution automatically filters duplicates and near-duplicates between and across custodians and parties, and offers contextual email thread analysis, reducing the number of documents to be reviewed. http://www.iconect.com, http://www.recommind.com

January 28, 2008

Silobreaker Announces Semantic Search Service

Silobreaker announced the official release of its new service. More than a news aggregator, Silobreaker provides relevance by looking at the data it finds as a person does. It recognizes people, companies, topics, places and keywords, understands how they relate to each other in the news flow, and puts them in context for the user. The graphical search results enable users to understand connections, trends and topics or navigate deeper into the stories most relevant to them. Silobreaker pulls content on global issues, science, technology and business from approximately 10,000 news, blog, research and multimedia sources. With the engine’s focus on finding and connecting related data in the information flow, Silobreaker’s user tools and visualizations are meant for bringing meaning to content from either today’s Web or the evolving Semantic Web, or both. http://www.silobreaker.com

January 22, 2008

Vivisimo Adds Clustering 2.0 to the Mix

Vivisimo announced a new technology, remix clustering, to help users find new topics and gain insights related to their search queries. Vivisimo’s standard clustering organizes search results into topical folders on the fly, without any pre-processing of source documents. Clustering gives a quick overview of the main topics, enables access to valuable but low-ranked search results, and groups together related documents for joint consideration. Remix clustering is intended to improve searchers’ productivity. It works like this: the user first sees clustered results in the usual style. Then, a single click on Remix reveals submerged or secondary topics that were not generated in the initial clustering. It works by feedback: the same search results are clustered again, but the topics that the user already saw are explicitly ignored. Remix clustering functionality is built into Velocity 6.0. It is also now available on Clusty.com, Vivisimo’s online web search site. Test the Remix on a familiar topic and see how unfamiliar or subtler topics emerge by repeated clicking on Remix. http://clusty.com
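The feedback step described above (cluster again while ignoring topics already shown) can be illustrated with a minimal Python sketch. This is not Vivisimo's algorithm, which the announcement does not describe; the term-frequency "clustering", the stopword list, and the names `cluster` and `remix` are all hypothetical stand-ins chosen to keep the example small.

```python
from collections import Counter

# Hypothetical stopword list; a real system would use a far richer one.
STOPWORDS = {"the", "a", "of", "and", "in", "for", "to"}

def cluster(results, ignore=frozenset(), max_topics=2):
    """Toy clustering: group results under their most frequent shared terms.

    Terms listed in `ignore` can never become topic labels.
    """
    counts = Counter()
    for text in results:
        for term in set(text.lower().split()):
            if term not in STOPWORDS and term not in ignore:
                counts[term] += 1
    # Most frequent first; ties broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    topics = [term for term, n in ranked if n >= 2][:max_topics]
    return {t: [r for r in results if t in r.lower().split()] for t in topics}

def remix(results, seen_topics, max_topics=2):
    """Cluster the same results again, ignoring topics already shown."""
    return cluster(results, ignore=frozenset(seen_topics), max_topics=max_topics)

results = [
    "jaguar car review",
    "jaguar car dealer",
    "jaguar cat habitat",
    "jaguar cat diet",
    "jaguar habitat rainforest",
]
first = cluster(results)                   # dominant topics
second = remix(results, seen_topics=first) # secondary topics surface
```

Each further call to `remix` with the accumulated set of seen topics would surface the next layer of less dominant topics, which mirrors the repeated clicking the announcement describes.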

January 15, 2008

W3C Opens Data on the Web with SPARQL

W3C (The World Wide Web Consortium) announced the publication of SPARQL, the key standard for opening up data on the Semantic Web. With SPARQL (pronounced "sparkle") query technology, people can focus on what they want to know rather than on the database technology or data format used behind the scenes to store the data. Because SPARQL queries express high-level goals, it is easier to extend them to unanticipated data sources, or even to port them to new applications. Many successful query languages exist, including standards such as SQL and XQuery. These were primarily designed for queries limited to a single product, format, type of information, or local data store. Traditionally, it has been necessary to formulate the same high-level query differently depending on the application or the specific arrangement chosen for the relational database. And when querying multiple data sources it has been necessary to write logic to merge the results. These limitations have imposed higher developer costs and created barriers to incorporating new data sources. The goal of the Semantic Web is to enable people to share, merge, and reuse data globally. SPARQL is designed for use at the scale of the Web, and thus enables queries over distributed data sources, independent of format. Because SPARQL has no tie to a specific database format, it can be used to take advantage of "Web 2.0" data and mash it up with other Semantic Web resources. Furthermore, because disparate data sources may not have the same 'shape' or share the same properties, SPARQL is designed to query non-uniform data. The SPARQL specification defines a query language and a protocol and works with the other core Semantic Web technologies from W3C: Resource Description Framework (RDF) for representing data; RDF Schema; Web Ontology Language (OWL) for building vocabularies; and Gleaning Resource Descriptions from Dialects of Languages (GRDDL), for automatically extracting Semantic Web data from documents. SPARQL also makes use of other W3C standards found in Web services implementations, such as Web Services Description Language (WSDL). http://www.w3.org/
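To make the idea of querying merged, non-uniform data concrete, here is a small plain-Python sketch of triple-pattern matching over two sources with different "shapes". It is not a SPARQL engine and does not use any SPARQL library; the sample data, the `ex:` identifiers, and the helper names are hypothetical. The lookup it performs has roughly the shape of the SPARQL query `SELECT ?name ?mbox WHERE { ?person foaf:name ?name . OPTIONAL { ?person foaf:mbox ?mbox } }`.

```python
# Two hypothetical sources: one records an email address, the other does not.
source_a = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
]
source_b = [
    ("ex:bob", "foaf:name", "Bob"),
]

def match(triples, pattern):
    """Yield variable bindings for one triple pattern.

    Pattern terms starting with '?' are variables; others must match exactly.
    """
    for triple in triples:
        binding = {}
        for want, got in zip(pattern, triple):
            if want.startswith("?"):
                if binding.get(want, got) != got:
                    break  # same variable bound to two different values
                binding[want] = got
            elif want != got:
                break
        else:
            yield binding

def names_with_optional_mbox(triples):
    """Return (name, mbox-or-None) rows; a missing mbox does not drop the row."""
    rows = []
    for b in match(triples, ("?person", "foaf:name", "?name")):
        mboxes = [m["?mbox"] for m in match(triples, (b["?person"], "foaf:mbox", "?mbox"))]
        rows.append((b["?name"], mboxes[0] if mboxes else None))
    return rows

# Because both sources are just sets of triples, merging is concatenation.
merged = source_a + source_b
rows = names_with_optional_mbox(merged)
```

The point of the sketch is that no per-source merge logic is needed: the same pattern runs over the combined triples, and the optional part tolerates sources that lack a property.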

OrcaTec LLC Partners with rPath to Launch V 2.0 of Its Information Retrieval Toolkit as a Software Appliance

OrcaTec LLC announced the release of Version 2.0 of the OrcaTec Information Retrieval Toolkit. The Toolkit will be distributed as an rPath-based software appliance, to make it simple to install and maintain. This software appliance provides an integrated collection of information analysis and management services, including concept search, near-duplicate clustering, language identification, and an interesting-phrase finder. These services are ideal for building scalable, reliable, and effective information analysis and management applications. The OrcaTec Information Retrieval Toolkit is designed to be a key component of systems for enterprise search, legal discovery, business intelligence, text data mining, content management, email archiving, knowledge management, and many other applications. OrcaTec Concept Searching learns the meaning of words from the documents that it reads. Concept searching allows users to find information even when they may not know exactly the specific words that a document's author used. Built on top of Lucene, the Toolkit also includes the full complement of Boolean and proximity searching. Version 2.0 supports data ingest rates as high as two million documents per day per system. These documents can be in any language from any source. The Toolkit is based on language modeling, which is the process of analyzing the patterns of language usage in a text and using these patterns to organize and retrieve it. The Toolkit has a REST-based API. http://www.rpath.com

December 19, 2007

Teragram Announces Microsoft Office SharePoint Server Integration for Automatic Categorization and Text Analytics

Teragram announced that its Automatic Metadata Generation Suite, which includes the company's automatic categorization, entities extraction and TK240 taxonomy management software, now fully integrates with the Microsoft Office SharePoint Server (MOSS) content management system and Microsoft SharePoint Enterprise Search. As a result of this integration, Teragram automatically adds metadata to documents stored in MOSS, enabling users to browse the content by topics and to perform faceted search with SharePoint Enterprise Search. Users can now view their search results broken down by category and linked to related topics. When documents are added to the MOSS content management system, Teragram's software automatically tags the content based on a set of predefined rules, which can be customized by industry and company name using the TK240 taxonomy management software. Teragram's industry-specific taxonomies can be used out-of-the-box with MOSS, or enterprises can create their own taxonomies using TK240 to automatically classify their content. Documents are then categorized based on the tags they receive. Finally, related terms are cross-referenced and linked by the TK240 system. These behind-the-scenes processes simplify enterprise search for end users, giving them a directory from which to choose the appropriate topic, in addition to the standard search bar. http://www.teragram.com/