CambridgeDocs Corp. announced the release of Version 2.01 of its xDoc PDF-XML Converter and its integration into the xDoc Converter Desktop and Server products, enhancing the company's platform for extracting document content to meaningful XML. Once converted to XML, content that was previously in PDF can be indexed by search engines, XML repositories and content management systems; for example, it can be stored as chapters, sections, tables or cells within any repository for fast, easy and accurate re-use.

The xDoc PDF-XML Converter extracts PDF content to XML and provides the following capabilities: stylistic XML, including format, layout and content information; extraction of financial data; organization of related XML “chunks”, such as financial tables; compatibility with existing target XML schemas or DTDs, such as DocBook or DITA; conversion to HTML/XHTML, with visual information that surpasses even Google’s “view as HTML” functionality; and conversion to simple text.

Version 2.01 adds the PDF-XML Converter as a special module in the xDoc Converter 2.01 platform and includes sample conversions of PDF documents into a variety of XML formats, such as DocBook and DITA. The release also adds a new and improved user interface, called the TableDef interface, for extracting financial data using positioning and textual clues. Integrating the PDF-XML Converter into the xDoc Converter makes its functionality easier to access by consolidating the download, installation and licensing processes. It also provides access to xDoc’s Visual Mapping tool and works with xDoc’s Adobe Acrobat plug-in. The PDF-XML Conversion functionality is available for download now at