Salvage Text from Corrupt Word 2007 DOCX Format Files with Damaged DOCX2TXT

, MD (PressExposure) April 22, 2009 -- S2 Services has released Damaged DOCX2TXT, a program that recovers text from corrupt Word 2007 docx format files when Word 2007 itself refuses to salvage the text. The program requires Microsoft's .NET Framework version 2.0 to be installed.

Word 2007 docx format files are zipped collections of XML files. All of the text is contained in the document.xml file within this collection. XML is by design very unforgiving of file corruption. Judging from the errors returned when it attempts to salvage text from corrupt docx files, Word 2007 appears to use a standard XML parser. Damaged DOCX2TXT, on the other hand, uses Perl code to simply strip the XML markup from around the text in the document.xml file. It is based on a Perl script by Sandeep Kumar.
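The markup-stripping idea described above can be sketched in a few lines. This is an illustration in Python, not the Perl code that ships with Damaged DOCX2TXT; the function name strip_docx_markup and the choice to turn paragraph ends and line breaks into newlines are assumptions made for the example.

```python
import html
import re


def strip_docx_markup(xml_text):
    """Return the plain text found in a (possibly damaged) document.xml string.

    Hypothetical helper: it never parses the XML, so unbalanced or
    truncated markup cannot make it fail.
    """
    # Treat the end of each paragraph as a line break.
    text = re.sub(r"</w:p\s*>", "\n", xml_text)
    # Treat explicit break elements as line breaks and bare tab elements as tabs.
    text = re.sub(r"<w:br\b[^>]*/>", "\n", text)
    text = re.sub(r"<w:tab\s*/>", "\t", text)
    # Remove every remaining complete tag, including the XML declaration.
    text = re.sub(r"<[^>]*>", "", text)
    # Remove a tag that was cut off at the end of a truncated file.
    text = re.sub(r"<[^>]*$", "", text)
    # Decode entities such as &amp; and &lt; back into characters.
    return html.unescape(text)


if __name__ == "__main__":
    damaged = "<w:p><w:r><w:t>Fish &amp; chips</w:t></w:r></w:p><w:p><w:r><w:t>lost tail</w:t><w:r"
    print(strip_docx_markup(damaged))
```

Because the sketch relies on pattern matching rather than an XML parser, a missing closing tag costs nothing more than a missing line break, which is the kind of tolerance the paragraph above attributes to the Perl approach.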

Unlike Mr. Kumar's script, Damaged DOCX2TXT uses a command line unzipper that is tolerant of zip file corruption. Damaged DOCX2TXT is therefore tolerant of corruption in both the zip and the XML layers of a docx file. Another addition to Mr. Kumar's script is a GUI front end, which makes the program friendly to beginners.
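To give a sense of what a corruption-tolerant unzip step involves, here is a minimal Python sketch. It is not the command line unzipper the program uses, whose internals are not described here; it shows one common salvage technique, which is to ignore the zip central directory, scan for local file headers, and inflate whatever data can still be read. The function name salvage_member and the 4096-byte chunk size are assumptions for the example.

```python
import struct
import zlib

LOCAL_HEADER_SIG = b"PK\x03\x04"  # marks the start of each stored file
HEADER_FORMAT = "<4sHHHHHIIIHH"   # fixed 30-byte local file header
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def salvage_member(zip_bytes, wanted="word/document.xml"):
    """Return the bytes of `wanted` recovered from a damaged zip, or None.

    Hypothetical helper: it reads local file headers directly, so a
    missing or corrupt central directory does not matter.
    """
    pos = 0
    while True:
        pos = zip_bytes.find(LOCAL_HEADER_SIG, pos)
        if pos == -1 or pos + HEADER_SIZE > len(zip_bytes):
            return None
        fields = struct.unpack(HEADER_FORMAT, zip_bytes[pos:pos + HEADER_SIZE])
        method, comp_size = fields[3], fields[7]
        name_len, extra_len = fields[9], fields[10]
        name_start = pos + HEADER_SIZE
        name = zip_bytes[name_start:name_start + name_len].decode("utf-8", "replace")
        data_start = name_start + name_len + extra_len
        if name == wanted:
            if method == 0:  # stored without compression
                return zip_bytes[data_start:data_start + comp_size]
            if method == 8:  # deflate
                return _inflate_what_we_can(zip_bytes[data_start:])
            return None      # other compression methods are not handled here
        pos += len(LOCAL_HEADER_SIG)  # keep scanning for the next header


def _inflate_what_we_can(raw, chunk=4096):
    """Inflate raw deflate data chunk by chunk, keeping output produced before any error."""
    out = bytearray()
    inflater = zlib.decompressobj(-15)  # negative wbits: raw deflate, no zlib header
    for i in range(0, len(raw), chunk):
        try:
            out += inflater.decompress(raw[i:i + chunk])
        except zlib.error:
            break  # corruption reached; keep what was recovered so far
        if inflater.eof:
            break  # clean end of the compressed stream
    return bytes(out)
```

Feeding the compressed data in small chunks means that when the stream turns out to be damaged partway through, the text inflated before the damage is still returned instead of being discarded along with the error.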

The program can also double as a viewer and editor of the text found in Word 2007 docx format files. Changes made in the editor currently need to be saved from the right-click menu of the display, not from the File menu. This bug is being worked on.

The program was coded by Paul Pruitt of Bethesda, MD, USA, and is available from both his software page and the home page of his data recovery freeware list.

