Bitform Technology Inc. announced the availability of Bitform Extract SDK, a software component for high-performance access to the contents of the most important file types used by global organizations, including text, metadata and content structure information. Cross-platform support and a flexible API make Bitform Extract SDK suitable for a broad range of solutions that need access to file content. Target applications include analysis and inspection for security and compliance solutions, and text extraction for indexing, categorization and other search, knowledge management and content management processes.

On the surface, content extraction can be confused with “text filtering”. It is a relatively simple task to develop a program that opens files and treats every element inside as text. This approach can produce impressive throughput figures, measured in gigabytes of files processed per hour, but precision and accuracy suffer tremendously. Complex binary file formats such as Word, Excel, PowerPoint and PDF contain thousands of internal structures. A simple text filter can either bypass these structures, causing incomplete analysis of content, or incorrectly treat them as text, producing false positives. Bitform’s approach of completely modeling supported file types and taking into account the unique structure of each format produces high precision with performance that meets market needs in multiple application segments.

For security, policy and compliance applications, Bitform Extract SDK can be combined with Bitform’s metadata and hidden information inspection tool, Bitform Secure SDK, to provide a comprehensive content extraction, inspection and remediation solution.

Bitform Extract SDK is available immediately. http://www.bitform.net
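The difference between text filtering and structure-aware extraction described above can be illustrated with a small sketch. It does not use or represent the Bitform Extract SDK API, which is not described in this announcement; it relies only on the Python standard library. The function names and the sample container are hypothetical, and the container is a simplified stand-in for a Word file (a ZIP archive holding a `word/document.xml` part), not a complete, valid document.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the w:p (paragraph) and w:t (text) elements.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Runs of at least four printable ASCII bytes, similar to the Unix `strings` tool.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def build_sample_container(paragraphs):
    """Build a simplified, in-memory stand-in for a Word file (illustration only)."""
    body = "".join(f"<w:p><w:r><w:t>{p}</w:t></w:r></w:p>" for p in paragraphs)
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/document.xml", xml)
    return buf.getvalue()


def naive_text_filter(data):
    """Treat the raw bytes as text: keep every printable ASCII run."""
    return [m.decode("ascii") for m in PRINTABLE_RUN.findall(data)]


def structured_extract(data):
    """Follow the container structure: open the archive, parse the XML part,
    and return the text of each paragraph in document order."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        paragraphs.append("".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t")))
    return paragraphs


if __name__ == "__main__":
    sample = build_sample_container(
        ["Quarterly revenue forecast", "Internal use only"]
    )
    print("naive filter:", naive_text_filter(sample))
    print("structured  :", structured_extract(sample))
```

In this sketch the naive filter shows both failure modes at once: the paragraph text is stored compressed, so it is missed, while container bookkeeping such as the part name `word/document.xml` is reported as if it were content. The structured path returns only the paragraph text.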