As the world of search becomes more and more sophisticated (and that process has been underway for decades), we may be approaching the limits of what software alone can do to find what a searcher wants. If that is true, and I suspect that it is, we will finally be forced to follow the trail of crumbs up the content life cycle… to its source.
Indeed, most of the challenges inherent in today’s search strategies and products appear to grow from the fact that while we continually increase our demands for intelligence on the back end, we have done little if anything to address the chaos that exists on the front end. You name it: different word processing formats, spreadsheets, HTML-tagged text, delimited database files, and so on are all dumped into what we think of as a coherent, easily searchable body of intellectual property. It isn’t, and it isn’t likely to become so any time soon unless we address the source.
Having spent some time in the library automation world, I can remember the sometimes bitter controversies over having just two major foundations for cataloging source material (Dewey and LC; add a third if you include the NICEM A/V scheme). Had we known back then that the process of finding intellectual property would devolve into the chaos we now confront, with every search engine and database product essentially rolling its own approach to rational search, we would have considered ourselves blessed. In the end, it seems, we must begin to see the source material, its physical formats, its logical organization, and its inclusion of rational cataloging and taxonomy elements as the conceptual raw material for its own location.
As long as the word processing world teaches that anyone creating anything can make it look like it should in a dozen different ways, ignoring any semblance of finding-aid inclusion, we probably won’t have a truly workable ability to find what we want without reworking the content or wading through a haystack of misses to find our desired hits.
Unfortunately, the solutions of yesteryear, including after-creation cataloging by a professional cataloger, probably won’t work now either, for cost if for no other reason. We will be forced to approach the creators of valuable content, asking them for a minimum of preparation for searching their product, and providing the necessary software tools to make that possible.
We can’t act too soon because, despite the growth of software elegance and raw computer power, this situation will likely get worse as the sheer volume of valuable content grows.
Regards, Barry
Read more: Enterprise Search Practice Blog: https://gilbane.com/search_blog/