As the world of search becomes more and more sophisticated (and that process has been underway for decades), we may be approaching the limits of software's ability to find what a searcher wants. If that is true, and I suspect that it is, we will finally be forced to follow the trail of crumbs up the content life cycle... to its source. Indeed, most of the challenges inherent in today's search strategies and products appear to grow from the fact that while we continually increase our demands for intelligence on the back end, we have done little if anything to address the chaos that exists on the front end. You name it: different word processing formats, spreadsheets, HTML-tagged text, delimited database files, and so on are all dumped into what we think of as a coherent, easily searchable body of intellectual property. It isn't, and it isn't likely to become so any time soon unless we address the source.
Having spent some time in the library automation world, I can remember the sometimes bitter controversies over having just two major foundations for cataloging source material (Dewey and LC; add a third if you include the NICEM A/V scheme). Had we known back then that the process of finding intellectual property would devolve into the chaos we now confront, with every search engine and database product essentially rolling its own approach to rational search, we would have considered ourselves blessed. In the end, it seems, we must begin to see the source material, its physical formats, its logical organization and its inclusion of rational cataloging and taxonomy elements as the conceptual raw material for its own location.
As long as the word processing world teaches that anyone creating anything can make it look the way it should in a dozen different ways, while ignoring any semblance of built-in finding aids, we probably won't have a truly workable ability to find what we want without reworking the content or wading through a haystack of misses to find our desired hits. Unfortunately, the solutions of yesteryear, including after-creation cataloging by a professional cataloger, probably won't work now either, for reasons of cost if nothing else. We will be forced to approach the creators of valuable content, asking them for a minimum of preparation to make their product searchable, and providing the software tools needed to make that possible. We can't act too soon because, despite the growth of software elegance and raw computer power, this situation will likely get worse as the sheer volume of valuable content grows.
Regards, Barry
Read more: Enterprise Search Practice Blog: http://gilbane.com/search_blog/
In the end, good search may depend on good source.