Jacques B
Posted November 16, 2023

All,

I wrote a Python script that parses various artifacts from an MS Word document and dumps them to four worksheets in an Excel file. A few colleagues and I are using the script to test scenarios in MS Word and see how different actions affect those artifacts.

https://github.com/jjrboucher/MS-Word-Parser

For example, did you know that if you upload a DOCX to Google Docs and later download it back to your computer, Google Docs strips out core.xml and app.xml, so you lose the author and created date, among other metadata? If you subsequently edit that newly downloaded document with MS Word, Word adds core.xml and app.xml back and sets the created date to the date of that edit, because that is when it creates those two files.

I know of someone dealing with a situation where a LNK file shows a created date in June for a DOCX on a USB drive. The document was no longer on the drive, but they were able to recover it. The document's metadata shows a created date in July. After I explained the above scenario to them, they said it made perfect sense based on their knowledge of the case.

Best,
Jacques
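For anyone who wants to check a document for the scenario described above without running the full parser, here is a minimal sketch. It is not taken from the MS-Word-Parser repository. It assumes the two property files sit at their conventional locations inside the DOCX zip package (`docProps/core.xml` and `docProps/app.xml`), and the function name `inspect_docx_properties` is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumption: these are the conventional part names MS Word uses. A package
# can technically store them elsewhere and point to them from _rels/.rels.
CORE_PART = "docProps/core.xml"
APP_PART = "docProps/app.xml"

# Dublin Core namespaces used by the elements read from core.xml.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def inspect_docx_properties(source):
    """Report whether the property parts exist and what core.xml records.

    `source` is a file path or a binary file-like object holding a DOCX.
    (Hypothetical helper, named here for illustration only.)
    """
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
        result = {
            "has_core": CORE_PART in names,
            "has_app": APP_PART in names,
            "creator": None,
            "created": None,
        }
        if result["has_core"]:
            root = ET.fromstring(package.read(CORE_PART))
            # findtext returns None when the element is absent.
            result["creator"] = root.findtext("dc:creator", namespaces=NS)
            result["created"] = root.findtext("dcterms:created", namespaces=NS)
    return result
```

Calling `inspect_docx_properties("some.docx")` returns a small dictionary. If `has_core` and `has_app` are both `False`, the document may have passed through a tool that strips those parts. A `created` value later than other evidence, such as a LNK file, would be consistent with the parts having been re-added on a later edit, though it does not prove it.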