<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Enterprise Search Blog</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/" />
    <link rel="self" type="application/atom+xml" href="http://gilbane.com/search_blog/atom.xml" />
    <id>tag:gilbane.com,2008-12-28:/search_blog//49</id>
    <updated>2012-01-31T21:36:56Z</updated>
    <subtitle>Analysis, opinion, and advice on enterprise search
technologies, applications, and practices
   </subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type Pro 4.32-en</generator>

<entry>
    <title>Helping Enterprise Searchers Succeed</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2012/01/helping_enterprise_searchers_succeed.html" />
    <id>tag:gilbane.com,2012:/search_blog//49.11083</id>

    <published>2012-01-31T21:17:43Z</published>
    <updated>2012-01-31T21:36:56Z</updated>

    <summary>I can only say that time is the enemy for medical staff. When questions were raised, the answers were in the system; in other words, &quot;search worked.&quot; What was not available to staff was time to study the whole patient record and understand overlapping and sometimes conflicting orders about care.

It is shortsighted for any institution to believe that it can squeeze professionals to &quot;think-fast,&quot; &quot;on-their-feet&quot; for hours on end with no time to consider the massive amounts of searchable results they are able to assemble. Human beings should not be expected to sacrifice their professional integrity and work standards because their employers have put them in a constant time bind.</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Search Problems/Solved Search Problems" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="informationinfrastructure" label="Information infrastructure" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="searchcasestudies" label="Search case studies" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="searchinfrastructure" label="Search infrastructure" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>I begin 2012 with a new perspective on enterprise search, one gained as purely an observer. The venues have all been medical establishments with multiple levels of complexity and healthcare workers. As the primary caregiver for a patient, and with some medical training, I take my role as observer and patient advocate quite seriously.</p>

<p>As soon as the patient was on the way to the emergency room, all of his medical records, insurance cards, medications, and contact information were assembled and brought to the hospital. With numerous critical care professionals intervening, and the patient being taken for various tests over several hours, I verbally imparted information I thought was important that might not yet show up in the system. Toward the end of the emergency phase, after being told several times that they had all his records available and "in the system" I relaxed to focus on the "next steps." </p>

<p>Numerous specialists were involved in the medical conditions, and the first three days passed without "a crisis," but little did we know that medication choices were beginning to cause some major problems. Apparently, some parts of the patient's medical history were not fully considered, and once the medications caused adverse outcomes, all kinds of other problems arose. </p>

<p>Fortunately, I was there to verbally share knowledge that was in the patient's medical records and get choices of medicine reversed. On several occasions, doctors' care orders had been "overlooked" and complicating interventions were executed because the healthcare person "in the moment" took an action without "seeing" those orders. I personally watched the extensive recording of doctors' decisions and confirmed with them changes that were being made to the patient's care, but repeatedly had to ask why a change was not being implemented.</p>

<p>Observing for six to eight hours on several care floors, I can only say that <strong>time</strong> is the enemy for medical staff. When questions were raised, the answers were in the system; in other words, "search worked." What was not available to staff was time to study the whole patient record and understand overlapping and sometimes conflicting orders about care.</p>

<p>It is shortsighted for any institution to believe that it can squeeze professionals to "think-fast," "on-their-feet" for hours on end with no time to consider the massive amounts of searchable results they are able to assemble. Human beings should not be expected to sacrifice their professional integrity and work standards because their employers have put them in a constant time bind.</p>

<p>My family member had me, but what of patients with no one, or no one versed in medical conditions and processes, to intervene? This extends to every line of business where risk is involved, from the practice of law to engineering, manufacturing, design, research and development, testing, technical documentation writing, etc.</p>

<p>I don't minimize how hard it is for businesses and professional services to stay profitable and competitive when they are being pressed to leverage technology for information resource management. However, one measure that every enterprise must embrace is educating its workforce about the use of the information technologies it employs. It is not enough to simply make a search engine interface accessible on the workstation. Every worker must be shown <u>how to search</u> for accurate information, authoritative information, and complete information, and be made aware of the ways to ingest and evaluate what they are finding. Finally, they must be given an alternative route to a more complete chronicle when the results don't match the need, even if that alternative is to seek another human being instead of a technology. </p>

<p>Search experts are a professionally trained class of workers who can fill the role of trainers, particularly if they have subject matter expertise in the field where search is being deployed. The risks to any enterprise of short-changing workers by not allowing them to fully exploit and understand results produced from search are long-term, but serious.</p>

<p>It is important to leave this entry with recognition that, due to wonderful healthcare professionals and support staff, the outcomes for the patient have been positive. People listened when I had information to share and respected my role in the process. That in no way absolves institutions and enterprises from giving their employees the autonomy and time to pay attention to all the information flooding their sphere of operation. In every field of endeavor, human beings need the time and environment to mindfully absorb, analyze and evaluate all the content available. Technology can aid but cannot carry out thoughtful professional practice.<br />
</p>]]>
        
    </content>
</entry>

<entry>
    <title>Making Search Play Well with Content Solutions</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2011/12/making_search_play_well_with_content_solutions.html" />
    <id>tag:gilbane.com,2011:/search_blog//49.11081</id>

    <published>2011-12-14T20:57:30Z</published>
    <updated>2011-12-14T21:25:37Z</updated>

    <summary>The guidance here is to choose a search services firm that will move you efficiently and effectively along the path of systems integration. Expertise is available and you do not need to struggle alone knitting together best-of-breed components. Do your research and understand the differentiators among the companies. High touch, high integrity and commitment for the long haul should be high on your list of requirements - and of course, look for experience and expertise in deploying the technology solutions you want to use and integrate.</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Search Research and Reference Sites" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="enterpriseapplications" label="Enterprise applications" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="searchinfrastructure" label="Search infrastructure" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="softwareapplicationintegration" label="Software application integration" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>In keynote sessions at the recent <a href="http://gilbaneboston.com/">Gilbane Boston Conference</a>, three speakers in a row made points about content management solutions that are also significant to selection and implementation of enterprise search. Here is a list of paraphrased comments.</p>

<ul>
	<li>From Forrester analyst <a href="http://gilbaneboston.com/speakers.html#spowers">Stephen Powers</a> came these observations: 1. The promise has been there for years for an ECM (enterprise content management) suite to <u>do everything</u>, but the reality is that no one vendor, even when they have all the pieces, integrates them well. 2. Be cautious about promises from vendors who claim to do it all; instead, focus on those who know how to do integration.</li>
	<li><a href="http://gilbaneboston.com/speakers.html#tbyrne">Tony Byrne</a> of the Real Story Group observed that Google's efforts in the enterprise frequently fail because Google doesn't really understand "how work gets done in the enterprise."</li>
	<li>Finally, <a href="http://gilbaneboston.com/speakers.html#SLiewehr">Scott Liewehr</a> of the Gilbane Group stated that a services firm selection is more important than the content management system application selection.</li>
</ul>

<p>Taken together, these statements may not substantiate the current state of the content management industry, but they do point to a trend. Evidence is accruing that products and product suppliers must focus on playing nice together and working <strong>for</strong> the enterprise. Most tend not to do well, out-of-the-box, without the help of expertise and experts. Nominally, vendors themselves have a service division to perform this function, but the burden falls on the buyer to make the "big" decisions about integration and deployment.</p>

<p>The real solution is waiting in the wings and I am increasingly talking to these experts, system integrators. They come in all sizes and configurations; perhaps they don't even self-identify as <em>system integrators</em>, but what they offer is deep expertise in a number of content software applications, including search.</p>

<p>Generally, the larger the operation the more substantial the number and types of products with which they have experience. They may have expertise in a number of web content management products or e-commerce offerings. A couple of large operations that I have encountered in Gilbane engagements are <a href="http://www.avalonconsult.com/">Avalon Consulting</a>, and <a href="http://www.searchtechnologies.com/">Search Technologies</a>, which have divisions each specializing in a facet of content management including search. You need to explore whether their strengths and expertise are a good fit with your needs.</p>

<p>The smaller companies specialize, such as working with several search engines plus tools to improve metadata and vocabulary management so content is more findable. Specialists in enterprise search must still have an understanding of content management systems (CMS) because those are usually the source of metadata that feed high quality search. I've recently spoken with several small service providers whose commentaries and case work illustrate a solid and practical approach. Those you might want to look into are: <a href="http://www.appliedrelevance.com/">Applied Relevance</a>, <a href="http://contegrasystems.com/">Contegra Systems</a>, <a href="http://www.findwise.com/">Findwise</a>, <a href="http://kapsgroup.com/index.shtml">KAPS Group</a>, <a href="http://www.lucidimagination.com/">Lucid Imagination</a>, <a href="http://www.ideaeng.com/">New Idea Engineering</a>, and <a href="http://www.tnrglobal.com/">TNR Global</a>.</p>

<p>Each of these companies has a specialty and niche, and I am not making explicit recommendations. The simple reason is that what you need and what you are already working on is unique to your enterprise. Without knowledge of your resources, special needs and goals my recommendations would be guesses. What I am sharing is the idea that you need experts who can give value when they are the right experts for your requirements.</p>

<p>The guidance here is to choose a search services firm that will move you efficiently and effectively along the path of systems integration. Expertise is available and you do not need to struggle alone knitting together best-of-breed components. Do your research and understand the differentiators among the companies. High touch, high integrity and commitment for the long haul should be high on your list of requirements - and of course, look for experience and expertise in deploying the technology solutions you want to use and integrate.</p>

<p>Next month I'll share some tips on evaluating possible service organizations starting with techniques for doing research on the Web.</p>]]>
        
    </content>
</entry>

<entry>
    <title>Why is it so Hard to &quot;Get&quot; Semantics Inside the Enterprise?</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2011/11/why_is_it_so_hard_to_get_semantics_inside_the_enterprise.html" />
    <id>tag:gilbane.com,2011:/search_blog//49.11078</id>

    <published>2011-11-10T15:24:48Z</published>
    <updated>2011-11-10T22:47:39Z</updated>

    <summary>In the enterprise, the same care must be given to metadata, search engine &quot;meaning&quot; analysis tools and query interpretation for successful outcomes. Magic does not happen without people behind the scenes to meet these three criteria executing linguistic curation, content enhancement and computational linguistic programming.
...
Content contributors and inquirers are all highly educated specialists seeking answers to questions that have never been asked before. Think about it: designing search engines to deliver results for frequently asked questions or to find content on popular topics is hard enough, but finding the answer to a brand new question is a quantum leap of difficulty in comparison. </summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Search Problems/Solved Search Problems" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="enterprisesearch" label="Enterprise search" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="semanticsearch" label="Semantic search" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="semanticsoftware" label="Semantic software" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p><a href="http://gilbane.com/Research-Reports.html#semantic"><em>Semantic Software Technologies: Landscape of High Value Applications for the Enterprise</em></a>  was published just over a year ago. Since then the marketplace has been increasingly active; new products emerge and discussion about what semantics might mean for the enterprise is constant. One thing that continues to strike me is the difficulty of explaining the meaning of, applications for, and context of semantic technologies. </p>

<p>Browsing through the topics in this excellent blog site, <a href="http://semanticweb.com">http://semanticweb.com</a>, it struck me as the proverbial case of the blind men describing an elephant. A blog, any blog, is linear. While there are tools to give a blog dimension by clustering topics or presenting related information, it is difficult to understand the full relationships of any one blog post to another. Without a photographic memory, an individual does not easily connect ideas across a multi-year domain of blog entries. Semantic technologies can facilitate that process.</p>

<p>Those who embrace some concept of semantics are believers that search will benefit from "semantic technologies." What is less clear is how evangelists, developers, searchers and the average technology user can coalesce around the applications that will semantically enable enterprise search. </p>

<p>On the Internet content that successfully drives interest, sales, opinion and individual promotion does so through a combination of expert crafting of metadata, search engine technology that "understands" the language of the inquirer and the content that can satisfy the inquiry. Good answers are reached when questions are understood first and then the right content is selected to meet expectations.</p>

<p>In the enterprise, the same care must be given to metadata, search engine "meaning" analysis tools and query interpretation for successful outcomes. Magic does not happen without people behind the scenes to meet these three criteria executing linguistic curation, content enhancement and computational linguistic programming.</p>

<p>Three recent meeting events illustrate various states of semantic development and adoption, even as the next conference, <a href="http://semtechbizdc2011.semanticweb.com/?c=stblfp">Semantic Tech & Business Conference</a> - Washington, D.C. on November 29 - is upon us:</p>

<p>Event 1 - A relatively new group, the <a href="http://www.iks-project.eu/"><em>IKS-Community</em></a> funded by the EU has been supporting open source software developers since 2009. In July they held a <a href="http://wiki.iks-project.eu/index.php/Workshops/EAworkshopParis"><em>workshop in Paris</em></a> just past the mid-point of their life cycle. Attendees were primarily entrepreneurs and independent open source developers seeking pathways for their semantically "tuned" content management solutions. I was asked to suggest where opportunities and needs exist in US markets. They were an enthusiastic audience and are poised to meet the tough market realities of packaging highly sophisticated software for audiences that will rarely understand how complex the stuff "under the hood" really is. My principal charge to them was to create tools that "make it really easy" to work with vocabulary management and content metadata capture, updates, and enhancements.</p>

<p>Event 2 - On this side of the pond, UK firm <a href="http://www.linguamatics.com/"><em>Linguamatics</em></a> hosted its <a href="http://www.linguamatics.com/welcome/events/users_conferences.html"><em>user group meeting</em></a> in Boston in October. Having interviewed a number of their customers last year to better understand their I2E product line, I was happy to meet people I had spoken with and see the enthusiasm of a user community vested in such complex technology. Most impressive is the respectful tone and thoughtful sharing between Linguamatics principals and their customers. They share the knowledge of how hard it is to continually improve search technology that delivers answers to semantically complex questions using highly specialized language. Content contributors and inquirers are all highly educated specialists seeking answers to questions that have never been asked before. Think about it: designing search engines to deliver results for frequently asked questions or to find content on popular topics is hard enough, but finding the answer to a brand new question is a quantum leap of difficulty in comparison. </p>

<p>To make matters even more complicated, answers to semantic (natural language) questions may be found in internal content, in published licensed content or some combination of both. In the latter case, only the seeker may be able to put the two together to derive or infer an answer. </p>

<p>Publishers of content for licensing play a convoluted game of how they will license their content to enterprises for semantic indexing in combination with internal content. The Linguamatics user community is primarily in life sciences; this is one more hurdle for them to overcome to effectively leverage the vast published repositories of biological and medical literature. Rigorous pricing may be good business strategy, but research using semantic search could make more headway with more reasonable royalties that reflect the need for collaborative use across teams and partners. </p>

<p>Content wants to be found and knowledge requires outlets to enable innovation to flourish. In too many cases technology is impaired by lack of business resources by buyers or arcane pricing models of sellers that hold vital information captive for a well-funded few. Semantically excellent retrieval depends on an engine's indexing access to <u><strong>all</strong></u> contextually relevant content.</p>

<p>Event 3 - Leslie Owens of Forrester Research, at the <a href="http://conferences.infotoday.com/documents/139/ESS11Fall_FinalProgram.pdf"><em>Fall 2011 Enterprise Search Summit</em></a>, conducted a very interesting interactive session that further affirms the elephant and blind men metaphor. Leslie is a champion of metadata best practices and writes about the competencies and expertise needed to make valuable content accessible. She engaged the audience with a series of questions about its wants, needs, beliefs and plans for semantic technologies. As described in an earlier paragraph about how well semantics serves us on the Web, most of the audience puts its faith in that model but is doubtful of how or when similar benefits will accrue to enterprise search. Leslie and a couple of others made the point that a lot more work has to be done on the back-end on content in the enterprise to get these high-value outcomes.</p>

<p>We'll keep making the point until more adopters of semantic technologies get serious and pay attention to content, content enhancement, expert vocabulary management and metadata. If it is automatic understanding of your content that you are seeking, the vocabulary you need is one that you build out and enhance for your enterprise's relevance. Semantic tools need to know the special language you use to give the answers you need.<br />
</p>]]>
        
    </content>
</entry>

<entry>
    <title>WHY ISN&apos;T ENTERPRISE SEARCH &quot;MISSION CRITICAL?&quot;</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2011/10/why_isnt_enterprise_search_mission_critical.html" />
    <id>tag:gilbane.com,2011:/search_blog//49.11047</id>

    <published>2011-10-04T19:11:10Z</published>
    <updated>2011-10-04T19:24:47Z</updated>

    <summary>From embedded search (within content management systems, archive and records management systems, museum systems, etc.), to standalone search engines designed to work well in discrete vertical markets or functional areas of enterprises (e.g., engineering, marketing, healthcare, energy exploration) buyers have a wealth of options from which to choose...Without the search component, all of the other technologies that have been so hot in the past are worthless.</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Search Technologies and Products" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="enterpriseapplications" label="Enterprise applications" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="enterprisesearchindustry" label="Enterprise search industry" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>Why isn't "search" the logical end-point in any content and information management activity? If we don't care about being able to find valued and valuable information, why bother with any of the myriad technologies employed to capture, organize, categorize, store, and analyze content? What on earth is the point of having our knowledge workers document the results of their business, science, engineering and marketing endeavors, if we never aspire to having it retrieved, leveraged or re-purposed by others?</p>

<p>However, in <u>Information Week</u>, an article in the September 5, 2011 issue entitled <a href="http://www.informationweek.com/news/software/info_management/231500395">"HP Transformation: Autonomy is a Modest Start"</a> gave me a jolt with this comment: <em>Autonomy has very sophisticated search capabilities including federation--the ability to search across many repositories and sources--and video and image search. But with all that said, enterprise search isn't a hot, mission-critical business priority. </em>[NOTE: in the print version the "call-out" box had slightly different phrasing but it jumped off the page, anyway.] This is pretty provocative and disappointing to read in the pages of this particular publication.</p>

<p>Over the past few months, I have been engrossed in working on several client projects related to taxonomy development, vocabulary management and integration with content and search systems. There is no doubt that every one of these institutions is focused with laser intensity on getting the search interface to deliver the highest value for the effort and dollars expended. In each case, the project involved a content management component for capturing metadata with solid uniformity, strong vocabulary control, and rich synonym tables for ensuring findability when a search query has different language than the content or metadata. Every step in each of these projects has come back to the acid test, "will the searcher be able to find what they are looking for?"</p>

<p>In past posts I have commented on the strength of enterprise search technologies, and the breadth of offerings that cover a wide array of content findability needs and markets. From embedded search (within content management systems, archive and records management systems, museum systems, etc.), to standalone search engines designed to work well in discrete vertical markets or functional areas of enterprises (e.g., engineering, marketing, healthcare, energy exploration) buyers have a wealth of options from which to choose. Companies that have formerly focused on web site management, business intelligence, data mining, and numerous other content related tools are redefining themselves with additional terminology like <em>e-discovery</em>, <em>360-degree views</em> (of information), <em>content accessibility</em>, and <em>unified information</em>.</p>

<p>Without the search component, all of the other technologies that have been so hot in the past are worthless. The article goes on to say that the <em>hottest areas</em> (of software growth) <em>are business analytics and big-data analysis</em>. Neither of these contributes business value without search underpinnings.</p>

<p>So, let's get off this kick of under-rating and marginalizing search as "not mission critical" and think very seriously about the consequences of trying to run any enterprise without being able to find the products of our intellectual work output. <br />
</p>]]>
        
    </content>
</entry>

<entry>
    <title>Collaboration, Convergence and Adoption</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2011/06/collaboration_convergence_and_adoption.html" />
    <id>tag:gilbane.com,2011:/search_blog//49.10995</id>

    <published>2011-06-30T19:28:55Z</published>
    <updated>2011-06-30T19:44:34Z</updated>

    <summary>Enterprises, which previously sent people to learn about technologies and products to earlier meetings, are now in the implementation and deployment stages. Thus, they are now able to contribute presentations with real experience and commentary about products. Presenters are commenting on adoption issues, usability, governance, successful practices and pitfalls or unresolved issues.</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Search Technologies and Products" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="contentmanagement" label="Content management" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="enterpriseapplications" label="Enterprise applications" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="enterprisesearchsummit" label="Enterprise Search Summit" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="textanalyticssummit" label="Text Analytics Summit" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>Here we are, halfway through 2011, and on track for a banner year in the adoption of enterprise search, text mining/text analytics, and their integration with collaborative content platforms. You might ask for evidence; what I can offer is anecdotal observations. Others track industry growth in terms of dollars spent but that makes me leery when, over the past half dozen years, there has been so much disappointment expressed with the failures of legacy software applications to deliver satisfactory results. My antennae tell me we are on the cusp of expectations beginning to match reality as enterprises are finding better ways to select, procure, implement, and deploy applications that meet business needs.</p>

<p>What follows are my happy observations, after attending the <a href="http://www.enterprisesearchsummit.com/Spring2011/Program.aspx">2011 Enterprise Search Summit</a> in New York and <a href="http://www.textanalyticsnews.com/text-mining-conference/">2011 Text Analytics Summit</a> in Boston. Other inputs for me continue to be a varied reading list of information industry publications, business news, vendor press releases and web presentations, and blogs, plus conversations with clients and software vendors. While this blog is normally focused on enterprise search, experiencing and following content management technologies and system integration tools yields valuable insights into all applications that contribute to search successes and frustrations.</p>

<p><em>Collaboration</em> tools and platforms gained early traction in the 1990s as technology offerings to the knowledge management crowd. The idea was that teams and workgroups needed ways to share knowledge through contribution of work products (documents) to "places" for all to view. Document management systems inserted themselves into the landscape for managing the development of work products (creating, editing, collaborative editing, etc.). However, collaboration spaces and document editing and version control activities remained applications more apart than synchronized. </p>

<p>The collaboration space has been redefined largely because SharePoint now dominates current discussions about collaboration platforms and activities. While early collaboration platforms were carefully structured to provide a thoughtfully bounded environment for sharing content, their lack of provision for idiosyncratic and often necessary workflows probably limited market dominance.</p>

<p>SharePoint changed the conversation to one of build-it-to-do-anything-you-want-the-way-you-want (BITDAYWTWYW). What IT clearly wants is a single-vendor architecture that delivers content creation, management, collaboration, and search. What end-users want is workflow efficiency and reliable search results. This introduces another level of collaborative imperative, since the <em>BITDAYWTWYW</em> model requires expertise that few enterprise IT support people carry and fewer end-users would trust to their IT departments. So, third-party developers or software offerings become the collaborative option. SharePoint is not the only collaboration software but, because of its dominance, a large second tier of partner vendors is turning SharePoint adopters on to its potential. <em>Collaboration</em> of this type in the marketplace is ramping up wildly.</p>

<p><em>Convergence</em> of technologies and companies is on the rise, as well. The non-Microsoft platform companies, OpenText, Oracle, and IBM, are basing their strategies on tightly integrating their solid cache of acquired mature products. These acquisitions have plugged gaps in text mining, analytics, and vocabulary management areas. Google and Autonomy are also entering this territory, although they are still short on the maturity model. The convergence of document management, electronic content management, text and data mining, analytics, e-discovery, a variety of semantic tools, and search technologies is shoring up the "big-platform" vendors to deal with "big-data." </p>

<p>Sitting on the periphery is the open source movement. It is finding ways to alternatively collaborate with the dominant commercial players, disrupt select application niches (e.g., WCM), and contribute solutions where neither the SharePoint model nor the big-platform, tightly integrated models can win easy adoption. Lucene/Solr is finding acceptance in the government and non-profit sectors but also appeals to SMBs.</p>

<p>All of these factors were actively on display at the two meetings but the most encouraging outcomes that I observed were:</p>

<ul>
	<li>Rise in attendance at both meetings</li>
	<li>More knowledgeable and experienced attendees</li>
	<li>Significant increase in end-user presentations</li>
</ul>

<p>The last of these brings me back to the <em>adoption</em> issue. Enterprises that previously sent people to earlier meetings to learn about technologies and products are now in the implementation and deployment stages. Thus, they are now able to contribute presentations with real experience and commentary about products. Presenters are commenting on adoption issues, usability, governance, successful practices, and pitfalls or unresolved issues.</p>

<p>Adoption is what will drive product improvements in the marketplace because experienced adopters are speaking out on their activities. Public presentations of user experiences can and should establish expectations for better tools, better vendor relationship experiences, more collaboration among products and, ultimately, reduced complexity in the implementation and deployment of products.</p>]]>
        
    </content>
</entry>

<entry>
    <title>Classifying Searchers - What Really Counts?</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2011/04/classifying_searchers_-_what_really_counts.html" />
    <id>tag:gilbane.com,2011:/search_blog//49.10937</id>

    <published>2011-04-14T00:06:28Z</published>
    <updated>2011-04-14T12:56:57Z</updated>

    <summary>There is a clear lesson here for seeking enterprise search solutions. Systems that favor one audience over another will always be problematic. Therefore, the questions of who needs what and how each group goes about searching need to be answered first, and the answers then matched to a product that can provide for all target groups.</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Product Selection" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="conferences" label="Conferences" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="searchproductprocurement" label="Search product procurement" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="searchusability" label="Search usability" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="usergroups" label="User groups" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>I continue to be impressed by the new ways in which enterprise search companies differentiate and package their software for specialized uses. This is a good thing because it underscores their understanding of different search audiences. Just as important is recognition that search happens in a context, for example:</p>

<p>•	Personal interest (enlightenment or entertainment)<br />
•	Product selection (evaluations by independent analysts vs. direct purchasing information)<br />
•	Work enhancement (finding data or learning a new system, process or product)<br />
•	High-level professional activities (e-discovery to strategic planning)</p>

<p>Vendors understand that there is a limited market for a product or suite of products that will satisfy every budget, search context and the enterprise's hierarchy of search requirements. The best vendors focus on the technological strengths of their search tools to deliver products packaged for a niche in which they can excel.</p>

<p>However, for any market niche, excellence begins with six basics:</p>

<p>•	Customer relationship cultivation, including good listening<br />
•	Professional customer support and services<br />
•	Ease of system installation, implementation, tuning and administration<br />
•	Out-of-the-box integration with complementary technologies that will improve search<br />
•	Simple pricing for licensing and support packages<br />
•	Ease of doing business, contracting and licensing, deliveries and upgrades</p>

<p>While any mature and worthy company will have continually improved on these attributes, there are contextual differentiators that you should seek in your vertical market:</p>

<p>•	Vendor subject matter expertise<br />
•	Vendor industry expertise<br />
•	Vendor knowledge of how professional specialists perform their work functions<br />
•	Vendor understanding of retrieval and content types that contribute the highest value</p>

<p>At a recent client discussion, the application of a highly specialized taxonomy was the topic. Their target content will be made available on a public-facing web site and also to internal staff. We began by discussing the various categories of terminology already extracted from a pre-existing system.</p>

<p>As we differentiated how internal staff needed to access content for research purposes and how the public is expected to search, patterns emerged for how differently content needs to be packaged for each constituency. For those of you who have specialized collections to be used by highly diverse audiences, this is no surprise. Before the client proceeds with decisions about term curation and the granularity of their metadata vocabulary, the high priority has become how the search mechanisms will work for different audiences.</p>

<p>For this institution, internal users must have pinpoint precision in retrieval on multiple facets of content to get to exactly the right record. They will be coming to search with knowledge of the collection and more certainty about what they can expect to find. They will also want to find their target(s) quickly. On the other hand, the public-facing audience needs to be guided in a way that leads them on a path of discovery, navigating through a map of terms that takes them from their "key term" query through related possibilities without demanding arcane Boolean operations or lengthy explanations for advanced searching.</p>

<p>There is a clear lesson here for seeking enterprise search solutions. Systems that favor one audience over another will always be problematic. Therefore, the questions of who needs what and how each group goes about searching need to be answered first, and the answers then matched to a product that can provide for all target groups.</p>

<p>We are in the season for conferences; there are a few next month that will be featuring various search and content technologies. After many years of walking exhibit halls and formulating strategies for systematic research and avoiding a swamp of technology overload, I try now to have specific questions formulated that will discover the "must have" functions and features for any particular client requirement. If you do the same, describing a search user scenario to each candidate vendor, you can then proceed to ask: <em>Is this a search problem your product will handle? What other technologies (e.g. CMS, vocabulary management) need to be in place to ensure quality search results? Can you demonstrate something similar? What would you estimate the implementation schedule to look like? What integration services are recommended?</em></p>

<p>These are starting points for a discussion and will enable you to begin to know whether this vendor meets the fundamental criteria laid out earlier in this post. It will also give you a sense of whether the vendor views all searchers and their searches as generic equivalents or knows that different functions and features are needed for special groups.</p>

<p>Look for vendors for enterprise search and search related technologies to interview at the following upcoming meetings:</p>

<p><a href="http://www.enterprisesearchsummit.com/Spring2011/">Enterprise Search Summit, New York, May 10 - 11</a> [...<em>where you will learn strategies and build the skill sets you need to make your organization's content not only searchable but "findable" and actionable so that it delivers value to the bottom line.</em>] This is the largest seasonal conference dedicated to enterprise search. The sessions are preceded by separate workshops with in-depth tutorials related to search. During the conference, focus on case studies of enterprises similar to yours for a better understanding of issues that you may need to address.</p>

<p><a href="http://www.textanalyticsnews.com/text-mining-conference/">Text Analytics Summit, Boston, May 18 - 19</a> I spoke with Seth Grimes, who kicks off the meeting with a keynote, asking whether he sees a change in emphasis this year from straight text mining and text analytics. You'll have to attend to get his full speech, but Seth shared that he sees a newfound recognition that "Big Data" is coming to grips with text source information as an asset that has special requirements (and value). He also noted that unstructured document complexities can benefit from text analytics to create semantic understanding that improves search, and that text analytics products are rising to the challenge of providing dynamic semantic analysis, particularly around massive amounts of social textual content.</p>

<p><a href="http://lucenerevolution.org/">Lucene Revolution, San Francisco, May 23 - 24</a> [...<em>hear from ... the foremost experts on open source search technology to a broad cross-section of users that have implemented Lucene, Solr, or LucidWorks Enterprise to improve search application performance, scalability, flexibility, and relevance, while lowering their costs.</em>] I attended this new meeting last year when it was in Boston. For any enterprise considering or leaning toward implementing open source search, particularly Lucene or Solr, this meeting will set you on a path for understanding what that journey entails.<br />
</p>]]>
        
    </content>
</entry>

<entry>
    <title>ETL and Building Intelligence Behind Semantic Search</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2011/03/etl_and_building_intelligence_behind_semantic_search.html" />
    <id>tag:gilbane.com,2011:/search_blog//49.10907</id>

    <published>2011-03-11T20:01:41Z</published>
    <updated>2011-03-14T14:03:06Z</updated>

    <summary>When documents and other forms of electronic content are fed to a knowledgebase for semantic retrieval, finely crafted metadata (data describing the content) and excellent vocabulary control add enormous value. These two content enhancers, metadata and controlled vocabularies, can transform good search into excellent search.
...
Tools that act as filters and statistical analyzers of text data warehouses will help reveal terminology for use in building specialized controlled vocabularies for use in auto-categorization.</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Search Technologies and Products" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="knowledgemanagement" label="Knowledge management" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="metadatamanagement" label="Metadata management" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="semanticsearch" label="Semantic search" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>A recent inquiry about a position requiring ETL (Extraction/Transformation/Loading) experience prompted me to survey the job market in this area. It was quite a surprise to see that there are many technical positions seeking this expertise, plus experience with SQL databases, and XML, mostly in healthcare, finance or with data warehouses. I am also observing an uptick in contract positions for metadata and taxonomy development.</p>

<p>My research on <a href="http://gilbane.com/Research-Reports.html#semantic">Semantic Software Technologies</a> placed me on a path for reporters and bloggers to seek my thoughts on the Watson-Jeopardy story. Much has been written on the story but I wanted to try a fresh take on the meaning of it all. There is a connection to be made between the ETL field and building a knowledgebase with the smarts of Watson. Inspiration for innovation can be drawn from the Watson technology but there is a caveat; it involves the expenditure of serious mental and computing perspiration.</p>

<p>Besides baked-in intelligence for answering human questions using natural language processing (NLP) to search, an answer-platform like Watson requires tons of data. Also, data must be assembled in conceptually and contextually relevant databases for good answers to occur. When documents and other forms of electronic content are fed to a knowledgebase for semantic retrieval, finely crafted metadata (data describing the content) and excellent vocabulary control add enormous value. These two content enhancers, metadata and controlled vocabularies, can transform good search into excellent search.</p>

<p>The irony of current enterprise search is that information is in such abundance that it overwhelms rather than helps findability. Content and knowledge managers can't possibly contribute the human resources needed to generate high quality metadata for everything in sight. But there are numerous techniques and technologies to supplement their work by explicitly exploiting the mountain of information.</p>

<p>Good content and knowledge managers know where to find top quality content but may not know that, for all common content formats, there are tools to extract key metadata embedded (but hidden) in it. Some of these tools can also text mine and analyze the content for additional intelligent descriptive data. When content collections are very large but still too small (under a million documents) to justify the most sophisticated and complex semantic search engines, ETL tools can relieve pressure on metadata managers by automating much of the mining and by extracting the entities and concepts needed for good categorization.</p>

<p>The ETL tool array is large and varied. Platform tools from Microsoft (<a href="http://technet.microsoft.com/en-us/library/cc917721.aspx">SSIS</a>) and IBM (<a href="http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r1/index.jsp?topic=/com.ibm.swg.im.iis.mdbbr.doc/topics/t_MB_ImportUsingDesigner.html">DataStage</a>) may be employed to extract, transform and load existing metadata. Other independent products such as those from <a href="http://www.pervasive.com/">Pervasive</a> and <a href="http://seal-software.com/index.html">SEAL</a> may contribute value across a variety of platforms or functional areas from which content can be dramatically enhanced for better tagging and indexing. The call for ETL experts is usually expressed in terms of engineers who would be selecting, installing and implementing these products. However, <u>it has to be stressed that subject and content experts are required to work with engineers. The role of these experts is to help tune and validate the extraction and transformation outcomes</u>, making sure terminology fits function.</p>

<p>Entity extraction is one major outcome of text mining to support business analytics, but tools can do a lot more to put intelligence into play for semantic applications. Tools that act as filters and statistical analyzers of text data warehouses will help reveal terminology for building specialized controlled vocabularies for use in auto-categorization. A few vendors that are currently on my radar to help enterprises understand and leverage their content landscape include <a href="http://www.entropysoft.net/cms/home/Product/contentetl">EntropySoft Content ETL</a>, <a href="http://www.infoextract.com/">Information Extraction Systems</a>, <a href="http://www.intelligenx.com/">Intelligenx</a>, <a href="http://www.isys-search.com/technology/isysfilereaders/index.html">ISYS Document Filters</a>, <a href="http://www.ramp.com/">RAMP</a>, and <a href="http://www.xsb.com/solutions_caseStudies.aspx#gmdf">XSB</a>; there is something here for everyone.</p>

<p>The diversity of emerging applications is a leading indicator that there is a lot of innovation to come with all aspects of ETL. While RAMP is making headway with video, another firm with a local connection is <a href="http://www.inforbix.com/">Inforbix</a>. I spoke with a co-founder, Oleg Shilovitsky, for my semantic technology research last year before they launched. As he then asserted, it is critical to preserve, mine and leverage the data associated with design and manufacturing operations. This area has huge growth potential and Inforbix is now ready to address that market.</p>

<p>Readers who seek to leverage ETL and text mining will gain know-how from the cases presented at the <a href="http://www.textanalyticsnews.com/text-mining-conference/">2011 Text Analytics Summit</a>, May 18-19 in Boston. As well, the exhibits will feature products to consider for making piles of data a valuable knowledge asset. I'll be interviewing experts who are speaking and exhibiting at that conference for a future piece. I hope readers will attend and seek me out to talk about your metadata management and text mining challenges. This will feed ideas for future posts.</p>

<p>Finally, I'm not the only one thinking along these lines. You will find other ideas and a nudge to action in these articles.</p>

<p><small>Boeri, Bob. <a href="http://www.slideshare.net/bboeri/d-1-boerifindabilitybehindfirewallwithnotes">Improving Findability Behind the Firewall</a>, 28 slides. Enterprise Search Summit 2010, NY, 05/2010.<br />
Farrell, Vickie. <a href="http://www.information-management.com/infodirect/20050909/1036703-1.html">The Need for Active Metadata Integration: The Hard Boiled Truth</a>. DM Direct Newsletter, 09/09/2005, 3p.<br />
McCreary, Dan. <a href="http://semanticweb.com/entity-extraction-and-the-semantic-web_b10675?red=su">Entity Extraction and the Semantic Web</a>. Semantic Universe, 01/01/2009.<br />
White, David. <a href="http://www.kmworld.com/Articles/Editorial/Feature/BI-or-bust-57555.aspx">BI or bust?</a> KMWorld, 10/28/2009, 3p.<br />
</small></p>]]>
        
    </content>
</entry>

<entry>
    <title>How Far Does Semantic Software Really Go?</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2011/02/how_far_does_semantic_software_really_go.html" />
    <id>tag:gilbane.com,2011:/search_blog//49.10882</id>

    <published>2011-02-04T01:00:09Z</published>
    <updated>2011-02-04T01:18:35Z</updated>

    <summary>State-of-the-art semantic software will have a back-end process for enabling implementer/administrators to use the results of search (direct commentary from users or indirectly by analyzing search logs) to discover where language has been misunderstood as evidenced by invalid results. Over time, more passes to update linguistic definitions, grammar rules, and concept relationships will continue to refine and improve the accuracy and comprehensiveness of search results.</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Search Problems/Solved Search Problems" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="ontologies" label="Ontologies" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="searchadministration" label="Search administration" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="semanticsearch" label="Semantic search" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="semanticsoftwareapplications" label="Semantic software applications" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>A discussion about semantic software technologies that began with a graduate scholar at George Washington University in November 2010 prompted him to follow up with some questions for clarification from me. With his permission, I am sharing three questions from Evan Faber and the gist of my comments to him. At the heart of the conversation we all need to keep having is this question: how far does this technology go, and does it really bring us any gains in retrieving information?</p>

<blockquote>1. <em>Have AI or semantic software demonstrated any capability to ask new and interesting questions about the relationships among information that they process?</em></blockquote>

<p>In several recent presentations and the Gilbane Group study on <a href="http://gilbane.com/Research-Reports.html#semantic">Semantic Software Technologies</a>, I share a simple diagram of the nominal setup for the relationship of content to search and the semantic core, namely a set of terminology rules or terminology with relationships. Semantic search operates best when it focuses on a topical domain of knowledge. The language that defines that domain may range from simple to complex, broad or narrow, deep or shallow. The language may be applied to the task of semantic search from a <em>taxonomy</em> (usually shallow and simple), a set of <em>language rules</em> (numbering thousands to millions) or from an <em>ontology</em> of concepts to a <em>semantic net</em> with millions of terms and relationships among concepts.</p>

<p>The question Evan asks is a good one with a simple answer, "Not without configuration." The configuration needs human work in two regions:<br />
•	Management of the linguistic rules or ontology<br />
•	Design of search engine indexing <u>and</u> retrieval mechanisms</p>

<p>When a semantic search engine indexes content for natural language retrieval, it looks to the rules or semantic nets to find concepts that match those in the content. When it finds concepts in the content with no equivalent language in the semantic net, it must find a way to understand where the concepts belong in the ontological framework. This discovery process for clarification, disambiguation, contextual relevance, perspective, meaning or tone is best accompanied with an interface making it easy for a human curator or editor to update or expand the ontology. A subject matter expert is required for specialized topics. Through a process of automated indexing that both categorizes and exposes problem areas, the semantic engine becomes a search engine <u>and</u> a questioning engine.</p>

<p>The entire process is highly iterative. In a sense, the software is asking the questions: "What is this?", "How does it relate to the things we already know about?", "How is the language being used in this context?" and so on.</p>

<blockquote>2. <em>In other words, once they [the software] have established relationships among data, can they use that finding to proceed - without human intervention - to seek new relationships?</em></blockquote>

<p>Yes, in the manner described for the previous question. It is important to recognize that the original set of rules, ontologies, or semantic nets that are being applied were crafted by human beings with subject matter expertise. It is unrealistic to think that any team of experts would be able to know or anticipate every use of the human language to codify it in advance for total accuracy. The term AI is, for this reason, a misnomer because the algorithms are not thinking; they are only looking up "known-knowns" and applying them. The art of the software is in recognizing when something cannot be discerned or clearly understood; then the concept (in context) is presented for the expert to "teach" the software what to do with the information.</p>

<p>State-of-the-art software will have a back-end process for enabling implementer/administrators to use the results of search (direct commentary from users or indirectly by analyzing search logs) to discover where language has been misunderstood as evidenced by invalid results. Over time, more passes to update linguistic definitions, grammar rules, and concept relationships will continue to refine and improve the accuracy and comprehensiveness of search results.</p>

<blockquote>3. <em>It occurs to me that the key value added of semantic technologies to decision-making is their capacity to link sources by context and meaning, which increases situational awareness and decision space. But can they probe further on their own?</em></blockquote>

<p>Good point on the value and in a sense, yes, they can. Through extensive algorithmic operations, instructions can be embedded (and probably are for high-value situations like intelligence work), <u>instructing the software what to do with newly discovered concepts</u>. Instructions might then place these new discoveries into categories of relevance, importance, or associations. It would not be unreasonable to then pass documents with confounding information off to other semantic tools for further examination. Again, without human analysis along the continuum and at the end point, no certainty about the validity of the software's decision-making can be asserted.</p>

<p>I can hypothesize a case in which a corpus of content contains random documents in foreign languages. From my research, I know that some of the semantic packages have semantic nets in multiple languages. If the corpus contains material in English, French, German and Arabic, these materials might be sorted and routed off to four different software applications. Each batch would be subject to further linguistic analysis, followed by indexing with some middleware applied to the returned results for normalization, and final consolidation into a unified index. Does this exist in the real world now? Probably there are variants but it would take more research to find the cases, and they may be subject to restrictions that would require the correct clearances.</p>

<p>Discussions with experts who have actually employed enterprise-specific semantic software underscore the need for subject expertise and some computational linguistics training, coupled with an aptitude for creative inquiry. These scientists informed me that individuals who are highly multi-disciplinary <u>and</u> facile with electronic games and tools did the best job of interacting with the software <u>and</u> getting excellent results. Tuning and configuration over time by the right human players is still a fundamental requirement.<br />
</p>]]>
        
    </content>
</entry>

<entry>
    <title>Enterprise Trends: Contrarians and Other Wise Forecasters</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2011/01/enterprise_trends_contrarians_and_other_wise_forecasters.html" />
    <id>tag:gilbane.com,2011:/search_blog//49.10854</id>

    <published>2011-01-13T19:44:59Z</published>
    <updated>2011-01-13T22:00:19Z</updated>

    <summary>It is time to stop treating enterprise search like a failed experiment and instead, leverage it to address some long-standing technology elephants roaming around our enterprises.</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Search Problems/Solved Search Problems" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="bi" label="BI" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="enterprisesearchforecasts" label="Enterprise search forecasts" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="governance" label="Governance" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="textanalytics" label="Text analytics" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>The gradual upturn from the worst economic conditions in decades is reason for hope. A growing economy, coupled with continued adoption of enterprise software in spite of the tough economic climate, keeps me tuned to what is transpiring in this industry. Rather than being cajoled into believing that "search" has become commodity software, which it hasn't, I want to comment on the wisdom of Jill Dyché and her <a href="http://www.information-management.com/blogs/business_intelligence_data_governance_MDM-10019396-1.html?ET=informationmgmt:e1954:2164046a:&st=email&utm_source=editorial&utm_medium=email&utm_campaign=IM_Blogs_082510_010511">Anti-predictions for 2011</a> in a recent <u>Information Management Blog</u>. There are important lessons here for enterprise search professionals, whether you have already implemented or plan to soon.</p>

<p>Taking her points out of order, I offer a bit of commentary on those that have a direct relationship to enterprise search. Based on past experience, Ms. Dyché predicts some negative outcomes but with a clear challenge for readers to prove her wrong. As noted, enterprise search offers some solutions to meet the challenges:<br />
<ol><br />
	<li><em>No one will be willing to shine a bright light on the fact that the data on their enterprise data warehouse isn't integrated</em>. It isn't just the data warehouse that lacks integration among assets; the same is true of all applications housing critical structured and unstructured content. This does not have to be the case. <u>Several state-of-the-art enterprise search products that are not tied to a specific platform or suite of products do a fine job of federating indexing of disparate content repositories</u>. In a matter of weeks or a few months, a search solution can be deployed to crawl, index and search multiple sources of content. Furthermore, newer search applications are being offered for pre-purchase testing for out-of-the-box suitability in pilot or proof-of-concept (POC) projects. Organizations that are serious about integrating content silos have no excuse for not taking advantage of easier-to-deploy search products.</li><br />
	<li><em>Even if they are presented with proof of value, management will be reluctant to invest in data governance</em>. Combat this entrenched bias with a strategy to overcome lack of governance; a cost cutting  argument is unlikely to change minds. However, <u>risk is an argument that will resonate</u>, particularly when bolstered with examples. Include instances when customers were lost due to poor performance or failure to deliver adequate support services, sales were lost because answers to qualifying questions could not be answered or were not timely, legal or contract issues could not be defended due to inaccessibility of critical supporting documents, or when maintenance revenue was lost due to incomplete, inaccurate or late renewal information getting out to clients. One simple example is the consequences of not sustaining a concordance of customer name, contact, and address changes. The inability of content repositories to talk to each other or aggregate related information in a search because a Customer labeled as <em>Marion University</em> at one address is the same as the  Customer labeled <em>University of Marion</em> at another address will be embarrassing in communications and, even worse, costly. Governance of processes like naming conventions and standardized labeling enhances the value and performance of every enterprise system including search.</li><br />
	<li><em>Executives won't approve new master data management or business intelligence funding without an ROI analysis</em>. This ties in with the first item because many enterprise search applications include excellent tools for performing business intelligence, analytics, and advanced functions to track and evaluate content resource use. The latter is an excellent way to understand who is searching, for what types of data, and the language used to search. <u>These supporting functions are being built into applications for enterprise search and do not add additional cost to product licenses or implementation</u>. Look for enterprise search applications that are delivered with tools that can be employed on an ad hoc basis by any business manager.</li><br />
	<li><em>Developers won't track their time in any meaningful way</em>. This is probably true because many managers are poorly equipped to evaluate what goes into software development. However, in this era of adoption of open source, particularly for enterprise search, organizations that commit to using Lucene or Solr (open source search) must be clear on the cost of building these tools into functioning systems for their specialized purposes. <u>Whether development will be done internally or by a third party, it is essential to place strong boundaries around each project and deployment, with specifications that stage development, milestones and change orders</u>. "Free" open source software is not free or even cost effective when an open meter for "time and materials" exists.</li><br />
	<li><em>Companies that don't characteristically invest in IT infrastructure won't change any time soon. So, the silo-ed projects will beget more silo-ed data...</em> Because the adoption rate for new content management applications is so high, and the ease of deploying them encourages replication like rabbits, it is probably futile to try to stanch their proliferation. <u>This is an important area for governance to be employed, to detect redundancy, perform analytics across silos, and call attention to obvious waste and duplication of content and effort</u>. Newer search applications that can crawl and index a multitude of formats and repositories will easily support efforts to monitor and evaluate what is being discovered in search results. Given a little encouragement to report redundancy and replicated content, every user becomes a governor over waste. Play on the natural inclination for people to complain when they feel overwhelmed by messy search results by setting up a simple (click a button) reporting mechanism that automatically issues a report or sets a flag in a log file when a search reveals a problem.</li></ol><br />
It is time to stop treating enterprise search like a failed experiment and instead, leverage it to address some long-standing technology elephants roaming around our enterprises.</p>

<p><br />
To follow other search trends for the coming year, you may want to attend a forthcoming webinar, <a href="http://www.coveo.com/en/news-and-events/events/es-20">11 Trends in Enterprise Search for 2011</a>, which I will be moderating on January 25th. These two blogs also have interesting perspectives on what is in store for enterprise applications: <a href="http://www.information-management.com/blogs/information_management_predictions-10019381-1.html">CSI Info-Mgmt: Profiling Predictors 2011</a>, by Jim Ericson and <a href="http://blogs.forrester.com/clay_richardson/10-12-23-the_hottest_bpm_trends_you_must_embrace_in_2011">The Hottest BPM Trends You Must Embrace In 2011!</a>, by Clay Richardson. Also, some of Ms. Dyché's commentary aligns nicely with "best practices" offered in this recent beacon, <a href="http://gilbane.com/beacons.html#entsearchsuccess">Establishing a Successful Enterprise Search Program: Five Best Practices</a> <br />
</p>]]>
        
    </content>
</entry>

<entry>
    <title>Focused on Unifying Content to Reduce Information Overload</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2010/12/focused_on_unifying_content_to_reduce_information_overload.html" />
    <id>tag:gilbane.com,2010:/search_blog//49.10842</id>

    <published>2010-12-10T00:37:58Z</published>
    <updated>2010-12-10T18:53:05Z</updated>

    <summary>Whether the process is referred to as unified indexing, federating content or information integration, each constitutes a similar focus among the vendors I took time to engage with at the conference (KMWorld 2010).</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Product Selection" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="accessinnovation" label="Access Innovation" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="attivio" label="Attivio" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="coveo" label="Coveo" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="enterprisesearchindustry" label="Enterprise search industry" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="searchmarketplace" label="Search marketplace" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>A theme running through the sessions I attended at <em>Enterprise Search Summit</em> and <em>KMWorld 2010</em> in Washington, DC last month was the diversity of ways in which organizations are focused on getting answers to stakeholders more quickly. Enterprises deploying content technologies, all with enterprise search as the end game, seek to narrow search results accurately to retrieve and display the best and most relevant content.</p>

<p>Whether the process is referred to as unified indexing, federating content or information integration, each constitutes a similar focus among the vendors I took time to engage with at the conference. Each is positioned to solve different information retrieval problems, and each was selected to underscore what I have tried to express in my recent <u>Gilbane Beacon</u>, <a href="http://gilbane.com/beacons.html#entsearchsuccess"><em>Establishing a Successful Enterprise Search Program: Five Best Practices</em></a>, namely the need to first establish a strategic business need. The best practices include the need to understand how existing technologies and content structures function in the enterprise before settling on any one product or strategy. The essential activity of conducting a proof of concept (POC) or pilot project to confirm product suitability for the targeted business challenge is clearly mandated.</p>

<p>These products, in alphabetic order, are all notable for their unique solutions tailored to different audiences of users and business requirements. All embody <u>an</u> approach to unifying enterprise content for a particular business function:</p>

<p><u>Access Innovations</u> (AI) was at KMWorld to demonstrate the aptly named product suite, <a href="http://www.dataharmony.com/Products.html">Data Harmony</a>. AI products cover a continuum of tools to build and maintain controlled vocabularies (AKA taxonomies and thesauri), to add content metadata through processes tightly integrated with the corresponding vocabularies, and to support search and navigation. Its vocabulary and content management tools can be layered to integrate with existing CMS and enterprise search systems.</p>

<p><a href="http://www.attivio.com/"><u>Attivio</u></a>, a company providing a platform solution known as Active Intelligence Engine (AIE), has developers specializing in open source tools for content retrieval solutions with excellent retrieval as the end point. AIE is a platform for enterprises seeking to unify structured and unstructured content across the enterprise, and from the web. By leveraging open source components they provide their customers with a platform that can be developed to enhance search for a particular solution, including bringing Web 2.0 social content into unity with enterprise content for further business intelligence analysis.</p>

<p><a href="http://www.coveo.com/en/"><u>Coveo</u></a> has steadily marched into a dominant position across all vertical industries with its efficiently packaged and reasonably priced enterprise search solutions, since I was first introduced to them in 2007. Their customers are always enthusiastic presenters at KMWorld, representing a population of implementers who seek to make enterprise search available to users quickly, and with a minimum of fuss. This year, <a href="http://www.enterprisesearchsummit.com/Fall2010/Program/Wednesday.aspx#session_3666">Shelley Norton</a> from Children's Hospital Boston did not disappoint. She ticked off steps in an efficient selection, implementation and deployment process for getting enterprise search up and running smoothly to deliver trustworthy and accurate results to the hospital's constituents. I always value and respect customer story-telling.</p>

<p><u>Darwin Awareness Engine</u> was named the <a href="http://blog.darwineco.com/2010/11/darwin-awareness-engine-wins-2010-km-promise-award.html">KMWorld Promise Award Winner for 2010</a>. Since their founder is local to our home-base and a frequent participant in the Boston KM Forum (KMF) meetings, we are pretty happy for their official arrival on the scene and the recognition. It was just a year ago that they <a href="http://kmforum.org/blog/?p=572">presented</a> the prototype at the KMF. Our members were excited to see the tool exposing layers of news feeds to home in on topics of interest and to see what was aggregated and connected in really "real-time." Darwin content presentation is unique in that the display reveals relationships and patterns among topics in the Web 2.0 sphere that are suddenly apparent due to their visual connections in the display architecture. The public views are only an example of what a very large enterprise might reveal about its own internal communications through social tools within the organization.</p>

<p>The newest newcomer, <a href="http://www.ramp.com/"><u>RAMP</u></a>, was introduced to me by Nate Treloar in the closing hours of KMWorld. Nate came to this start-up from Microsoft and the FAST group and is excited about this new venture. Neither exhibiting nor presenting, Nate was eager to reach out to analysts and potential partners to share the RAMP vision for converting speech from audio and video feeds to reliable searchable text. This would enable audio, video and other content to finally be unified and searched from its "full text" on the Web in a single pass. Today, we depend on the contribution of explicit metadata by contributors of non-text content. Having long awaited excellence in speech indexing for search, I was "all ears" during our conversation and look forward to seeing more of RAMP at future meetings.</p>

<p>Whatever the strategic business need, the ability to deliver a view of information that is unified, cohesive and contextually understandable will be a winning outcome. With the <u>Beacon</u> as a checklist for your decision process, information integration is attainable by making the right software selection for your enterprise application.</p>]]>
        
    </content>
</entry>

<entry>
    <title>Coherence and Augmentation: KM-Search Connection</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2010/11/coherence_and_augmentation_km-search_connection.html" />
    <id>tag:gilbane.com,2010:/search_blog//49.10829</id>

    <published>2010-11-29T21:59:17Z</published>
    <updated>2010-11-29T22:20:29Z</updated>

    <summary>Search is a human operation and begins with the workforce. Going back to Stewart who commented on the need to recognize different kinds of knowledge, I posit that different kinds of knowledge demand different kinds of search. This is precisely what so many &quot;enterprise search&quot; initiatives fail to deliver. Implementers fail to account for all the different kinds of search, search for facts, search for expertise, search for specific artifacts, search for trends, search for missing data, etc.
...
And when Snowden notes that &quot;There are limits to semantic technologies: Language is constantly changing so there is a requirement for constant tuning to sustain the same level of good results,&quot; he is reminding us that technology is only good for cognitive augmentation. Technology is not a &quot;plug &apos;n play,&quot; install and reap magical cognitive insights. It requires constant tuning to adapt to new kinds of knowledge.</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Management" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="knowledgemanagement" label="Knowledge management" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="searchinfrastructure" label="Search infrastructure" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="semanticsoftware" label="Semantic software" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>This space is not normally used to comment on knowledge management (KM), one of my areas of consulting, but a recent conference gives me an opening to connect the dots between KM and search. Dave Snowden and Tom Stewart always have worthy commentary on KM and as keynote speakers they did not disappoint at KMWorld. It may seem a stretch but by taking a few of their thoughts out of context, I can synthesize a relationship between KM and search.</p>

<p><em>KMWorld</em>, <em>Enterprise Search Summit</em>, <em>SharePoint Symposium</em> and <em>Taxonomy Boot Camp</em> moved to Washington D.C. for the 2010 Fall Conference earlier this month. I attended to teach a workshop on building a semantic platform, and to participate in a panel discussion to wrap up the conference with two other analysts, Leslie Owen and Tony Byrne with Jane Dysart moderating.</p>

<p>Comments from the first and last keynote speakers of the conference inspired my final panel comments, counseling attendees to lead by thoughtfully leveraging technology only to enhance knowledge. But there were other snippets that prompt me to link search and KM.</p>

<p>Tom Stewart's talk was entitled <em>Knowledge Driven Enterprises: Strategies & Future Focus</em>, which he couched in the context of achieving a "coherent" winning organization. He explained that reaching the coherence destination requires understanding different types of knowledge and how we need to behave to attain each type (e.g. "knowable complicated" knowledge calls for experts and research; "emergent complex" knowledge calls for leadership and "sense-making").</p>

<p>Stewart describes successful organizations as those in which "the opportunities <em>outside</em> line up with the capabilities <em>inside</em>." He explains that those "companies who do manage to reestablish focus around an aligned set of key capabilities" use their "intellectual capital" to identify their intangible assets: human capability, structural capital, and customer capital. They build relationship capital from among these capabilities to create a coherent company. Although Stewart does not mention "search," it is important to note that one means to identify intangible assets is well-executed enterprise search with associated analytical tools.</p>

<p>Dave Snowden also referenced "coherence" (<em>messy coherence</em>), even as he spoke about how failures tend to be more teachable (memorable) than successes. If you follow Snowden, you know that he founded <a href="http://www.cognitive-edge.com/">Cognitive Edge</a> and has developed a model for applying cognitive learning to help build resilient organizations. He has taught complexity analysis and sense-making for many years and his interest in human learning behaviors is deep.</p>

<p>To follow the entire thread of Snowden's presentation on "The Resilient Organization," follow <a href="http://www.cognitive-edge.com/presentationdetails.php?presentationid=72">this link</a>. I was particularly impressed with his statement about the talk, "one of the most heart-felt I have given in recent years." It was one of his best, but two particular comments bring me to the connection between KM and search.</p>

<p>Dave talked about technology as "cognitive augmentation," its only truly useful function. He also puts forth what he calls the "three Golden rules: Use of distributed cognition, <em>wisdom but not foolishness of crowds</em>; finely grained objects, <em>information and organizational</em>; and disintermediation, <em>putting decision makers in direct contact with raw data</em>."</p>

<p>Taking these fragments of Snowden's talk, a technique he seems to encourage, I put forth a synthesized view of how knowledge and search technologies need to be married for consequential gain.</p>

<p>We live and work in a highly chaotic information soup, one in which we are fed a steady diet of fragments (links, tweets, analyzed content) from which we are challenged as thinkers to derive coherence. The best knowledge practitioners will leverage this messiness by detecting weak signals and seeking out more fragments, coupling them thoughtfully with "raw data" to synthesize new innovations, whether they be practices, inventions or policies. Managing shifting technologies, changing information inputs, and learning from failures (our own, our institution's and others') all contribute to building a resilient organization.</p>

<p>So where does "search" come in? Search is a human operation and begins with the workforce. Going back to Stewart, who commented on the need to recognize different kinds of knowledge, I posit that different kinds of knowledge demand different kinds of search. This is precisely what so many "enterprise search" initiatives fail to deliver. Implementers fail to account for all the different kinds of search: search for facts, search for expertise, search for specific artifacts, search for trends, search for missing data, etc.<br />
 <br />
When Dave Snowden states that "all of your workforce is a human scanner," this could also imply the need for multiple, co-occurring search initiatives. Just as each workforce member brings a different perspective and capability to sensory information gathering, so too must enterprise search be set up to accommodate all the different kinds of knowledge gathering. And when Snowden notes that "There are limits to semantic technologies: Language is constantly changing so there is a requirement for constant tuning to sustain the same level of good results," he is reminding us that technology is only good for cognitive augmentation. Technology is not "plug 'n play": you cannot simply install it and reap magical cognitive insights. It requires constant tuning to adapt to new kinds of knowledge.</p>

<p>The point is one I have made before: it is the human connection, the human scanner and human understanding of all the kinds of knowledge that we need in order to bring coherence to an organization. The better we balance these human capabilities, the more resilient we'll be and the better skilled at figuring out what kinds of search technologies really make sense for today. Tomorrow, we had better be ready for another tool for new fragments and new knowledge synthesis.<br />
</p>]]>
        
    </content>
</entry>

<entry>
    <title>Lucene Open Source Community Commits to a Future in Search</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2010/10/lucene_open_source_community_commits_to_a_future_in_search.html" />
    <id>tag:gilbane.com,2010:/search_blog//49.10793</id>

    <published>2010-10-27T23:04:12Z</published>
    <updated>2010-10-28T00:51:08Z</updated>

    <summary>The argument would be to go with open source for many institutions when there is an imperative or call for major customization...This appears to be the case for two types of enterprises that were featured on the program: educational institutions and government agencies.</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Search Technologies and Products" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="enterpriseapplications" label="Enterprise applications" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="lucidimagination" label="Lucid Imagination" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="opensourcesearch" label="Open source search" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>It has been nearly two years since I <a href="http://gilbane.com/search_blog/2009/01/open_source_search_search_appl.html">commented</a> on an article in <u>Information Week</u>, <em>Open Source, Its Time has Come</em>, Nov. 2008. My main point was the need for deep expertise to execute enterprise search really well. I predicted the growth of service companies with that expertise, particularly for open source search. Not long after I <a href="http://gilbane.com/search_blog/2009/04/march_madness_in_the_search_industry.html">announced</a> that, Lucid Imagination was launched, with its focus on building and supporting solutions based on Lucene and, its more turnkey version, Solr.</p>

<p>It has not taken long for <a href="http://www.lucidimagination.com/">Lucid Imagination</a> (LI) to take charge of the Lucene/Solr community of practice (CoP), and to launch its own platform built on Solr, Lucidworks Enterprise. Open source depends on deep and sustained collaboration; LI stepped into the breach to ensure that the hundreds of contributors, users and <a href="http://producingoss.com/en/committers.html">committers</a> have a forum. I am pretty committed to CoPs myself and know that nurturing a community for the long haul takes dedicated leadership. In this case it is undoubtedly enlightened self-interest that is driving LI. They are poised to become the strongest presence for driving continuous improvements to open source search, with Apache Lucene as the foundation.</p>

<p>Two weeks ago LI hosted <a href="http://www.lucidimagination.com/events/revolution2010/presentation-abstracts">Lucene Revolution</a>, the first such conference in the US. More than 300 people attended in Boston, October 7-8, and I can report that this CoP is vibrant and enthusiastic. Moderated by <a href="http://arnoldit.com/wordpress/2010/10/">Steve Arnold</a>, the program ran smoothly, with excellent sessions. Those I attended reflected a respectful exchange of opinions and ideas about tools, methods, practices and priorities. While there were allusions to <u>vigorous</u> debate among committers about priorities for code changes and upgrades, the mood was collaborative in spirit and tinged with humor, always a good way to operate when emotions and convictions are on stage.</p>

<p>From my 12 pages of notes come observations about the three principal categories of sessions:<br />
1.	Discussions, debates and show-cases for significant changes or calls for changes to the code<br />
2.	Case studies based on enterprise search applications and experiences<br />
3.	Case studies based on the use of Lucene and Solr embedded in commercial applications</p>

<p>Since the first category was more technical in nature, I leave the reader with my simplistic conclusions: core Apache Lucene and Solr will continue to evolve in a robust and aggressive progression. There are sufficient committers to make a serious contribution. Many who have decades of search experience are driving the charge and they have cut their teeth on the more difficult problems of implementing enterprise solutions. In announcing <a href="http://www.lucidimagination.com/enterprise-search-solutions/lucidworks">Lucidworks Enterprise</a>, LI is clearly bidding to become a new force in the enterprise search market.</p>

<p>New and sustained build-outs of Lucene/Solr will be challenged by developers with ideas for diverging architectures, or "forking" code, on which Eric Gries, LI CEO, commented in the final panel. He predicted that forking will probably be driven by the need to solve specific search problems that current code does not accommodate. This will probably be more of a challenge for the spinoffs than for the core Lucene developers; given the difficulty of sustaining separate versions, such forks will ultimately fail.</p>

<p>Enterprise search cases reflected those for whom commercial turnkey applications will not or cannot easily be selected; for them open source will make sense. Coming from LI's counterpart in the Linux world, Red Hat, are <a href="http://www.informationweek.com/news/software/linux/showArticle.jhtml?articleID=227900364&cid=nl_IW_daily_2010-10-21_html">these earlier observations</a> about why enterprises should seek to embrace open source solutions, in short the sorry state of quality assurance and code control in commercial products. Add to that the cost of services to install, implement and customize commercial search products. The argument would be to go with open source for many institutions when there is an imperative or call for major customization.</p>

<p>This appears to be the case for two types of enterprises that were featured on the program: educational institutions and government agencies. Both have procurement issues when it comes to making large capital expenditures. For them it is easier to begin with something free, like open source software, then make incremental improvements and customize over time. Labor and services are cost variables that can be distributed more creatively using multiple funding options. Featured on the program were the <a href="http://www.lucidimagination.com/files/LuceneRevPreso_Wang_Smithsonian.pdf">Smithsonian</a>, <a href="http://www.lucenerevolution.org/sites/default/files/LuceneRevPreso_Arnold_Leveraging_Lucene_Government.pdf">Adhere Solutions</a> doing systems integration work for a number of government agencies, <a href="http://www.lucidimagination.com/files/Lucene Rev Preso Smiley Spatial Search.pdf">MITRE</a> (a federally funded research laboratory), <a href="http://www.lucenerevolution.org/sites/default/files/LuceneRevPreso_BurtonWest.pdf">U. of Michigan</a>, and <a href="http://www.lucenerevolution.org/sites/default/files/LuceneRevPreso_Lovins_Scaling_the_Stacks.pdf">Yale</a>. <a href="http://lucenerevolution.com/sites/default/files/slides/Lucene Rev Preso Sonali_Cisco.ppt">CISCO</a> also presented, a noteworthy commercial enterprise putting Lucene/Solr to work.</p>

<p>The third category of presenters was, by far, the largest contingent of open source search adopters, producers of applications that leverage Lucene and Solr (and other open source software) into their offerings. They are solidly entrenched because they are diligent committers, and share in this community of like-minded practitioners who serve as an extended enterprise of technical resources that keeps their overhead low. I can imagine the attractiveness of a lean business that can run with an open source foundation, and operates in a highly agile mode. This must be enticing and exciting for developers who wilt at the idea of working in a constrained environment with layers of management and political maneuvering.</p>

<p>Among the companies building applications on Lucene that presented were: <a href="http://www.lucidimagination.com/events/revolution2010/presentation-abstracts#leveraging-lucene">Access Innovations</a>, <a href="http://www.lucenerevolution.org/sites/default/files/Lucene Rev Preso Busch Realtime_Search_LR1010.pdf">Twitter</a>, <a href="http://lucenerevolution.com/sites/default/files/slides/Lucene Rev Preso Wang LinkedIn Search.pdf">LinkedIn</a>, <a href="http://lucenerevolution.com/sites/default/files/slides/Lucene Rev Preso Wolanin Drupal.pdf">Acquia</a>, <a href="http://www.lucenerevolution.org/sites/default/files/LuceneRevPreso_Verkaik_Rivet_Alfresco-solr-v6.pdf">RivetLogic</a> and <a href="http://www.lucidimagination.com/files/LuceneRev_BPress_SalesForceCloudSearch.pdf">Salesforce.com</a>. These stand out as relatively mature adopters with traction in the marketplace. There were also companies present that contribute their value through Lucene/Solr partnerships in which their products or tools are complementary including: <a href="http://www.lucidimagination.com/files/2010-10-Building-Global-Listening-Platform-with-Solr.pdf">Basis Technology</a>, <a href="http://www.documill.com/en/">Documill</a>, and <a href="http://www.lucidimagination.com/files/LuceneRevPreso_Gifford_Loggly.pdf">Loggly</a>.</p>

<p>Links to presentations by organizations mentioned above will take you to conference highlights. Some will appeal to the technical reader, for there was a lot of code sharing and technical tips in the slides. The diversity and scale of applications that are being supported by Lucene and Solr were impressive. Lucid Imagination and the speakers did a great job of illustrating why and how open source has a serious future in enterprise search. This was a confidence-building exercise for the community.</p>

<p>Two sentiments at the end summed it up for me. On the technical front Eric Gries observed that it is usually clear what needs to be core (to the code) and what does not belong. Then there is a lot of gray area, and that will contribute to constant debate in the community. For the user community, Charlie Hull, of <a href="http://www.flax.co.uk/">flax</a> opined that customers don't care whether (the code) is in the open source core or in the special "secret sauce" application, as long as the product does what they want.</p>]]>
        
    </content>
</entry>

<entry>
    <title>What an Analyst Needs to Do What We Do</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2010/09/what_an_analyst_needs_to_do_what_we_do.html" />
    <id>tag:gilbane.com,2010:/search_blog//49.10745</id>

    <published>2010-09-27T22:48:11Z</published>
    <updated>2010-09-27T22:58:24Z</updated>

    <summary>The most desirable contacts for learning about any product are customers with direct experience using the application. Sometimes we gain access to customers through vendor introductions but we also try very hard to get users to speak to us through surveys and interviews...</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Search Research and Reference Sites" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="analysts" label="Analysts" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="research" label="Research" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="searchmarketplace" label="Search marketplace" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="semanticsoftwareapplications" label="Semantic software applications" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p><a href="http://gilbane.com/Research-Reports.html"><em>Semantic Software Technologies: Landscape of High Value Applications</em></a> for the Enterprise is now posted for you to download for free; please do so. The topic is one I've followed for many years, and I was convinced that the information about it needed to be captured in a single study as the number of players and technologies had expanded beyond my capacity for mental organization.</p>

<p>As a librarian, I found it useful to employ a genre of publications known as "bibliography of bibliographies" on any given topic when starting a research project. As an analyst, gathering the baskets of emails, reports, and publications on the industry I follow serves a similar purpose. Without filtering and sifting all this content, it had become overwhelming to understand and comment on the individual components in the semantic landscape.</p>

<p>Relating to the process of report development, it is important for readers to understand how analysts do research and review products and companies. Our first goal is to avoid bias toward one vendor or another. Finding users of products and understanding the basis for their use and experiences is paramount in the research and discovery process. With software as complex as semantic applications, we do not have the luxury of routine hands-on experience, testing real applications of dozens of products for comparison.</p>

<p>The most desirable contacts for learning about any product are customers with direct experience using the application. Sometimes we gain access to customers through vendor introductions but we also try very hard to get users to speak to us through surveys and interviews, often anonymously so that they do not jeopardize their relationship with a vendor. We want these discussions to be frank.</p>

<p>To get a complete picture of any product, I go through numerous iterations of looking at a company through its own printed and online information, published independent reviews and analysis, customer comments and direct interviews with employees, users, former users, etc. Finally, I like to share what I have learned with vendors themselves to validate conclusions and give them an opportunity to correct facts or clarify product usage and market positioning.</p>

<p>One of the most rewarding, interesting and productive aspects of research in a relatively young industry like semantic technologies is having direct access to innovators and seminal thinkers. Communicating with pioneers of new software who are seeking the best way to package, deploy and commercialize their offerings is exciting. There are many more potential products than those that actually find commercial success, but the process for getting from idea to buyer adoption is always a story worth hearing and from which to learn.</p>

<p>I receive direct and indirect comments from readers about this blog. What I don't see enough of is posted commentary about the content. Perhaps you don't want to share your thoughts publicly, but any experiences or ideas that you want to share with me are welcome. You'll find my direct email contact information through <a href="http://gilbane.com/contact.html">Gilbane.com</a> and you can reach me on Twitter at <em>lwmtech</em>. My research depends on getting input from all types of users and developers of content software applications, so please raise your hand and comment or volunteer to talk.</p>]]>
        
    </content>
</entry>

<entry>
    <title>Leveraging Two Decades of Computational Linguistics for Semantic Search</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2010/09/leveraging_two_decades_of_computational_linguistics_for_semantic_search.html" />
    <id>tag:gilbane.com,2010:/search_blog//49.10722</id>

    <published>2010-09-14T18:27:56Z</published>
    <updated>2010-09-14T18:40:56Z</updated>

    <summary>Early adopters are key contributors to any software development. It is notable that Cognition has attracted experts in fields as diverse as medical research, legal e-discovery and Web semantic search. This gives the company valuable feedback for their commercial development. In any highly technical discipline, it is challenging and exciting to find subject experts knowledgeable enough to contribute to product evolution, and Cognition is learning from client experts where the best opportunities for growth lie.</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Search Technologies and Products" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="semanticsearch" label="Semantic search" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="semanticsoftwareapplications" label="Semantic software applications" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>Over the past three months I have had the pleasure of speaking with Kathleen Dahlgren, founder of Cognition, several times. I first learned about Cognition at the Boston Infonortics Search Engines meeting in 2009. That introduction led me to a closer look several months later when researching auto-categorization software. I was impressed with the comprehensive English language semantic net they had doggedly built over a 20+ year period. </p>

<p>A semantic net is a map of language that explicitly defines the many relationships among words and phrases. It might be very simple to illustrate something as fundamental as a small geographical locale and all named entities within it, or as complex as the entire base language of English with every concept mapped to illustrate all the ways that any one term is related to other terms, as illustrated in this <a href="http://upload.wikimedia.org/wikipedia/commons/thumb/6/67/Semantic_Net.svg/305px-Semantic_Net.svg.png">tiny subset</a>. Dr. Dahlgren and her team are among the few who have created a comprehensive semantic net for English.</p>

<p>In 2003, Dr. Dahlgren established Cognition as a software company to commercialize its semantic net, designing software to apply it to semantic search applications. As the Gilbane Group launched its new research on <em>Semantic Software Technologies</em>, Cognition signed on as a study co-sponsor and we engaged in several discussions with them that rounded out their history in this new marketplace. It was illustrative of pioneering in any new software domain. </p>

<p>Early adopters are key contributors to any software development. It is notable that Cognition has attracted experts in fields as diverse as medical research, legal e-discovery and Web semantic search. This gives the company valuable feedback for their commercial development. In any highly technical discipline, it is challenging and exciting to find subject experts knowledgeable enough to contribute to product evolution, and Cognition is learning from client experts where the best opportunities for growth lie.</p>

<p>Recent interviews with Cognition executives, and those of other sponsors, gave me the opportunity to get their reactions to my conclusions about this industry. These were the more interesting thoughts that came from Cognition after they had reviewed the Gilbane report:</p>

<p>•	Feedback from current clients and attendees at 2010 conferences, where Dr. Dahlgren was a featured speaker, confirms escalating awareness of the field; she feels that "This is the year of Semantics." It is catching the imagination of IT folks who understand the diverse and important business problems to which semantic technology can be applied.<br />
•	In addition to a significant upswing in semantics applied in life sciences, publishing, law and energy, Cognition sees specific opportunities for growth in <strong>risk assessment</strong> and <strong>risk management</strong>. Using semantics to detect signals, content salience, and measures of relevance is critical where the quantity of data and textual content is too voluminous for human filtering. There is not much evidence that financial services, banking and insurance are embracing semantic technologies yet, but these technologies could dramatically improve their business intelligence, and Cognition is well positioned to support them with its already tested tools.<br />
•	<strong>Enterprise semantic search</strong> will begin to overcome the poor reputation that traditional "string search" has suffered. There is growing recognition among IT professionals that in the enterprise 80% of the queries are unique; these cannot be interpreted based on popularity or social commentary. Determining relevance or accuracy of retrieved results depends on the types of software algorithms that apply computational linguistics, not pattern matching or statistical models.<br />
•	In Dr. Dahlgren's view, there is no question that a team approach to deploying semantic enterprise search is required. This means that IT professionals will work side-by-side with subject matter experts, search experts and vocabulary specialists to gain the best advantage from semantic search engines. <br />
•	The unique language aspects of an enterprise content domain are as important as the software a company employs. The Cognition baseline semantic net, out-of-the-box, will always give reliable and better results than traditional string search engines. However, it gives top performance when enhanced with enterprise language, embedding all the ways that subject experts talk about their topical domain, jargon, acronyms, code phrases, etc.</p>

<p>With elements of its software already embedded in some notable commercial applications like Bing, Cognition is positioned to deliver excellent semantic search for an enterprise. They are taking on opportunities in areas like risk management that have been slow to adopt semantic tools. They will deliver software to these customers together with services and expertise to coach their clients through the implementation, deployment and maintenance essential to successful use. The enthusiasm expressed to me by Kathleen Dahlgren about semantics confirms what I also heard from Cognition clients. They are confident that the technology, coupled with thoughtful guidance from their support services, will be the true added value for any enterprise semantic search application using Cognition.</p>

<p>The free download of the Gilbane study and deep-dive on Cognition was announced on their <a href="http://blog.cognition.com/?p=138">Web site at this page</a>.<br />
</p>]]>
        
    </content>
</entry>

<entry>
    <title>Semantically Focused and Building on a Successful Customer Base</title>
    <link rel="alternate" type="text/html" href="http://gilbane.com/search_blog/2010/09/semantically_focused_and_building_on_a_successful_customer_base.html" />
    <id>tag:gilbane.com,2010:/search_blog//49.10713</id>

    <published>2010-09-01T18:12:38Z</published>
    <updated>2010-09-02T13:59:39Z</updated>

    <summary>Now, as confidence in and understanding of the technology ramps up, Linguamatics are getting more complex and sophisticated questions from their customers and prospects. This is the exciting part as they are able to sell I2E&apos;s ability to &quot;synthesize new information from millions of sources in ways that humans cannot.&quot; This is done by using the technology to keep track of and process the voluminous connections among information resources that exceed human mental limits.</summary>
    <author>
        <name>Lynda Moulton</name>
        <uri>http://gilbane.com/blog/mt-cp.cgi?__mode=view&amp;blog_id=49&amp;id=14</uri>
    </author>
    
        <category term="Search Technologies and Products" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="lifesciencesindustries" label="Life sciences industries" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="semanticsoftwareapplications" label="Semantic software applications" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://gilbane.com/search_blog/">
        <![CDATA[<p>Dr. Phil Hastings and Dr. David Milward spoke with me in June 2010, as I was completing the Gilbane report, <em>Semantic Software Technologies: A Landscape of High Value Applications for the Enterprise</em>. My interest in a conversation was stimulated by several months of discussions with customers of numerous semantic software companies. Having heard perspectives from early adopters of Linguamatics' I2E and other semantic software applications, I wanted to get some comments from two key officers of Linguamatics about what I heard from the field. Dr. Milward is a founder and CTO, and Dr. Hastings is the Director of Business Development.</p>

<p>A company with sustained profitability for nearly ten years in the enterprise semantic market space has credibility. Reactions from a maturing company to what users have to say are interesting and carry weight in any industry. My lines of inquiry and the commentary from the Linguamatics officers centered around their own view of the market and adoption experiences.</p>

<p>When asked about growth potential for the company outside of pharmaceuticals where Linguamatics already has high adoption and very enthusiastic users, Drs. Milward and Hastings asserted their ongoing <em>principal focus in life sciences</em>. They see a lot more potential in this market space, largely because of the vast amounts of unstructured content being generated, coupled with the very high-value problems that can be solved by text mining and semantically analyzing the data from those documents. Expanding their business further in the life sciences means that they will continue engaging in research projects with the academic community. It also means that Linguamatics semantic technology will be helping organizations solve problems related to healthcare and homeland security.</p>

<p>The wisdom of a measured and consistent approach comes through strongly when speaking with Linguamatics executives. They are highly focused and cite the pitfalls of trying to "do everything at once," which would be the case if they were to pursue all markets overburdened with tons of unstructured content. While pharmaceutical <em>terminology</em>, a critical component of I2E, is complex and extensive, there are many aids to support it. The language of life sciences is in a constant state of being enriched through refinements to published thesauri and ontologies. However, <u>in other industries with less technical language, Linguamatics can still provide important support to analyze content in the detection of signals and patterns of importance to intelligence and planning</u>.</p>

<p>Much of the remainder of the interview centered on what I refer to as the "team competencies" of individuals who identify the need for any semantic software application; those are the people who select, implement and maintain it. When asked if this presents a challenge for Linguamatics or the market in general, Milward and Hastings acknowledged a learning curve and the need for a larger pool of experts for adoption. This is a professional growth opportunity for <em>informatics</em> and <em>library science</em> people. These professionals are often the first group to identify Linguamatics as a potential solutions provider for semantically challenging problems, leading business stakeholders to the company. They are also good advocates for selling the concept to management and explaining the strong benefits of semantic technology when it is applied to elicit value from otherwise under-leveraged content.</p>

<p>One Linguamatics core operating principle came through clearly when talking about the personnel issues of using I2E: the necessity of working closely with their customers. This means making sure that expectations about system requirements are correct, examples of deployments and "what the footprint might look like" are given, and best practices for implementations are shared. They want to be sure that their customers have a sense of being in a community of adopters and are not alone in the use of this pioneering technology. Building and sustaining close customer relationships is very important to Linguamatics, and that means an emphasis on services co-equally with selling licenses.</p>

<p>Linguamatics has come a long way since 2001. Besides a steady effort to improve and enhance their technology through regular product releases of I2E, there have been a lot of "show me" and "prove it" moments to which they have responded. Now, as confidence in and understanding of the technology ramps up, they are getting more complex and sophisticated questions from their customers and prospects. This is the exciting part as they are able to sell I2E's ability to "synthesize new information from millions of sources in ways that humans cannot." This is done by using the technology to keep track of and process the voluminous connections among information resources that exceed human mental limits.</p>

<p>At this stage of growth, with early successes and excellent customer adoption, it was encouraging to hear the enthusiasm of two executives for the evolution of the industry and their opportunities in it.</p>

<p>The Gilbane report and a deep dive on Linguamatics are available through this <a href="http://www.linguamatics.com/welcome/news/press_releases/Semantic_Software_Technologies_Study.html">Press Release</a> on their Web site.<br />
</p>]]>
        
    </content>
</entry>

</feed>

