westerndigital Posted September 13, 2016
I am processing a 10 GB PST file and am uncertain about the best way to deduplicate the files before export. Should I create a search on the entire PST and dedupe that, then select the files/show parents/export? Or should I create a search on the PST/show parents/deduplicate/export? Doing it the first way gives me far fewer parent files in the end, about 6,000 fewer. I'm just worried that I'm missing something important. Any help would be greatly appreciated!
arjohn Posted September 13, 2016
The second option sounds like the most logical one, but it really depends on what you're after. Assume you have two e-mails that have the same attachment and a search that matches this attachment. Deduplicating the set first will remove one of the matching attachments, so the export will only contain one of the e-mails. Doing it the other way around will get both e-mails in the export.
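To make the difference concrete, here is a minimal Python sketch of the two orderings. It is only a toy model, not how Intella works internally: the `Item` class, the `content_hash` field, and the `deduplicate` and `show_parents` helpers are hypothetical names invented for illustration, and it assumes deduplication keeps the first item seen for each hash.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Item:
    item_id: str
    content_hash: str                 # hypothetical stand-in for the dedupe hash
    parent_id: Optional[str] = None   # None means a top-level e-mail


def deduplicate(items):
    """Keep the first item seen for each content hash (assumed behaviour)."""
    seen = set()
    kept = []
    for item in items:
        if item.content_hash not in seen:
            seen.add(item.content_hash)
            kept.append(item)
    return kept


def show_parents(items, index):
    """Map each item to its top-level parent (or itself), without repeats."""
    parents = []
    for item in items:
        top = item
        while top.parent_id is not None:
            top = index[top.parent_id]
        if top not in parents:
            parents.append(top)
    return parents


# Two different e-mails that carry the same attachment.
email_a = Item("email-a", "hash-a")
email_b = Item("email-b", "hash-b")
att_a = Item("att-a", "hash-report", parent_id="email-a")
att_b = Item("att-b", "hash-report", parent_id="email-b")
index = {i.item_id: i for i in (email_a, email_b, att_a, att_b)}

# The search matches the attachment, so it hits both copies.
search_hits = [att_a, att_b]

# Option 1: deduplicate the hits, then show parents.
dedupe_first = show_parents(deduplicate(search_hits), index)

# Option 2: show parents, then deduplicate the parents.
parents_first = deduplicate(show_parents(search_hits, index))

print([i.item_id for i in dedupe_first])   # ['email-a']
print([i.item_id for i in parents_first])  # ['email-a', 'email-b']
```

In this toy model, deduplicating first drops the second copy of the attachment before its parent is ever looked up, which matches the smaller parent count seen with the first workflow.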
jon.pearse Posted September 14, 2016
We actually have a video on creating load files that discusses deduplication; it may be helpful. http://community.vound-software.com/index.php?/topic/402-creating-a-load-file-in-intella/
westerndigital (Author) Posted September 14, 2016
Ah, thank you both. Your responses were very helpful!