I am processing a 10GB PST file and am uncertain about the best way to deduplicate the files before export. Should I create a search on the entire PST and dedupe that, then select the files/show parents/export? Or should I create a search on the PST, then show parents/deduplicate/export?


Doing it the first way gives me far fewer parent files in the end, about 6,000 fewer. I'm just worried that I'm missing something important.


Any help would be greatly appreciated!

The second option sounds like the more logical one, but it really depends on what you're after. Assume you have two emails that have the same attachment, and a search that matches this attachment. Deduplicating the set first will remove one of the two matching attachments, so after showing parents the export will contain only one of the emails. Doing it the other way around (show parents first, then deduplicate) will get both emails in the export, because the two parent emails are not duplicates of each other.
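
To make the difference concrete, here is a small toy model of the two-emails example in Python. It is only a sketch under assumptions: the `Item` class, the `dedupe` and `show_parents` functions, and the hash values are all hypothetical names made up for illustration, and the real tool may identify duplicates differently (this sketch assumes "same content hash, keep the first one seen").

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for an item in the PST (email or attachment)."""
    item_id: str
    content_hash: str
    parent_id: Optional[str] = None  # None means a top-level email


def dedupe(items: List[Item]) -> List[Item]:
    """Keep the first item seen for each content hash (assumed dedupe rule)."""
    seen = set()
    kept = []
    for item in items:
        if item.content_hash not in seen:
            seen.add(item.content_hash)
            kept.append(item)
    return kept


def show_parents(hits: List[Item], all_items: List[Item]) -> List[Item]:
    """Replace each hit with its top-level parent, listing each parent once."""
    by_id = {item.item_id: item for item in all_items}
    seen = set()
    parents = []
    for item in hits:
        top = item
        while top.parent_id is not None:
            top = by_id[top.parent_id]
        if top.item_id not in seen:
            seen.add(top.item_id)
            parents.append(top)
    return parents


# Two different emails that carry the same attachment.
email1 = Item("email1", "hash-e1")
email2 = Item("email2", "hash-e2")
att1 = Item("att1", "hash-a", parent_id="email1")
att2 = Item("att2", "hash-a", parent_id="email2")
all_items = [email1, email2, att1, att2]

# The search matches the attachment, so it hits both copies.
hits = [att1, att2]

# Order 1: search -> dedupe -> show parents
dedupe_first = [p.item_id for p in show_parents(dedupe(hits), all_items)]

# Order 2: search -> show parents -> dedupe
parents_first = [p.item_id for p in dedupe(show_parents(hits, all_items))]

print(dedupe_first)   # ['email1']
print(parents_first)  # ['email1', 'email2']
```

In this toy model the first order drops email2 entirely, even though it is a distinct message, which would be consistent with the much smaller parent count you are seeing. Which result is "right" depends on whether you need every email that carries a matching attachment or just one copy of each matching file.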

