CambridgeDocs xDoc Document Conversion to XML

xDoc converts content from a number of file formats into XML through an integrated multi-step process. A file is first converted into a stylistic XML format by the appropriate xDoc Java Preprocessing Conversion Driver, a Java application built to parse that particular file format.

Once the file has been transformed into the stylistic XML, it can then be converted into other XML formats such as DocBook and DITA, and/or it can be processed further.  The content can be indexed in a database, it can have custom data inserted, and it can be re-published in different document formats, depending upon your needs.
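As an illustration of the second step, the sketch below turns a small piece of stylistic XML into DocBook-style markup with an XSLT stylesheet, using the standard Java javax.xml.transform API. This is not the xDoc API or xDoc's own method: the input element names (doc, heading, para), the stylesheet, and the StylisticToDocBook class are all hypothetical, chosen only to show the shape of such a transformation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; not part of xDoc.
final class StylisticToDocBook {

    // Assumed stylesheet: maps invented stylistic elements (doc, heading, para)
    // onto DocBook-style elements (article, title, para).
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/doc'><article><xsl:apply-templates/></article></xsl:template>"
      + "<xsl:template match='heading'><title><xsl:value-of select='.'/></title></xsl:template>"
      + "<xsl:template match='para'><para><xsl:value-of select='.'/></para></xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML string and returns the result as a string.
    static String transform(String stylisticXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            // Leave out the <?xml ...?> declaration so the output is just the markup.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(stylisticXml)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String input = "<doc><heading>Intro</heading><para>Hello</para></doc>";
        System.out.println(transform(input));
    }
}
```

A DITA target would follow the same pattern with a different stylesheet; only the output element names change.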

The table below lists the conversions that xDoc supports.

Conversion                        Input File Extension    xDoc Conversion Driver
--------------------------------  ----------------------  ----------------------
Microsoft Word to XML             *.doc                   Java Word Driver
Adobe PDF to XML                  *.pdf                   Java PDF Driver
HTML to XML                       *.html                  Java HTML Driver
Adobe FrameMaker MIF to XML       *.mif                   Java MIF Driver
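The table can also be read as a simple lookup from file extension to driver. The sketch below encodes the four rows as a Java map and picks a driver name for a given file name. The DriverLookup class and its driverFor method are hypothetical and are not part of the xDoc API; only the extensions and driver names come from the table.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; not part of xDoc.
final class DriverLookup {

    // Extension-to-driver pairs taken from the conversion table.
    static final Map<String, String> DRIVERS = Map.of(
        "doc", "Java Word Driver",
        "pdf", "Java PDF Driver",
        "html", "Java HTML Driver",
        "mif", "Java MIF Driver");

    // Returns the driver name for a file, or empty if the file has no
    // extension or the extension is not in the table.
    static Optional<String> driverFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(DRIVERS.get(extension));
    }

    public static void main(String[] args) {
        System.out.println(driverFor("manual.mif").orElse("no driver listed"));
        System.out.println(driverFor("notes.txt").orElse("no driver listed"));
    }
}
```

Matching is case-insensitive on the extension, so REPORT.DOC and report.doc resolve to the same driver; a file with an extension outside the table yields an empty result rather than a guess.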

In addition to the conversions listed above, CambridgeDocs also provides xDoc Drivers that read and convert Microsoft WordML (*.xml) and RTF (*.rtf) files. If you work with those formats, consider using the xDoc Java Word Driver, as it is the most sophisticated and complete Java-based means of parsing Microsoft Word documents on the market today.