Showing results for tags 'deduplication'.
Hello! My apologies if this has already been addressed, but I could not find it through search. I am dealing with MST Exchange emails. The emails contain a mix of standard SMTP email addresses and Exchange X.400-style addresses. De-duplication becomes a big problem here: emails that are otherwise identical get different message hashes when one copy has the SMTP address and the other has an X.400-style address. Is there currently any way to de-duplicate these? I know that as of the latest version of Intella you can configure Message Hash to ignore certain attributes (including headers and recipients). This should work, but I'd really like more fine-grained control than that. Ideally, Intella would recognize that two emails are identical even if they use a mix of SMTP and X.400-style addresses. In my experience, this issue is very common with Exchange exports. Any thoughts would be greatly appreciated. Thank you! Bryan
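To make the idea in the post above concrete, here is a minimal Python sketch of address-normalized hashing. It is not how Intella computes its Message Hash and does not use any Intella API. The `ADDRESS_MAP` table, the sample addresses, and the function names are all hypothetical; in practice the mapping from Exchange-style addresses to SMTP addresses would have to come from somewhere like a directory export.

```python
import hashlib

# Hypothetical lookup table: Exchange-style address -> SMTP address.
# The entry below is made up for illustration only.
ADDRESS_MAP = {
    "/o=example/ou=first administrative group/cn=recipients/cn=jsmith":
        "jsmith@example.com",
}


def normalize_address(addr: str) -> str:
    """Lower-case the address and map it to SMTP form when a mapping is known."""
    key = addr.strip().lower()
    return ADDRESS_MAP.get(key, key)


def message_hash(sender: str, recipients: list[str], subject: str, body: str) -> str:
    """Hash a message after normalizing every address (assumed scheme, not Intella's)."""
    parts = [
        normalize_address(sender),
        ",".join(sorted(normalize_address(r) for r in recipients)),
        subject.strip(),
        body.strip(),
    ]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


# Two copies of the same email, one with each address style.
smtp_copy = message_hash("jsmith@example.com", ["bob@example.com"], "Hi", "Body")
x400_copy = message_hash(
    "/O=EXAMPLE/OU=FIRST ADMINISTRATIVE GROUP/CN=RECIPIENTS/CN=JSMITH",
    ["bob@example.com"], "Hi", "Body",
)
print(smtp_copy == x400_copy)
```

Addresses with no entry in the table are only lower-cased, so unmapped Exchange-style addresses would still hash differently from their SMTP equivalents.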
Hi Guys, We have noticed that for eDisclosure jobs where a linear review of many documents is required, a facility to de-duplicate not only exact matches but also near matches is needed. This is most commonly requested when you have many mailboxes in which the subject line, sender, recipients and content of two emails are the same, and the only difference is the dates and times in the email headers. For many reviews these are 'identical' from a categorization point of view and should be filtered by some form of dedup functionality. Are there any plans for this on the horizon? Does anyone have any strategies for dealing with this currently? Best regards, Jason
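One possible workaround outside the tool, sketched here in Python purely as an illustration of the near-match idea described above: build a grouping key from the fields the post lists (subject, sender, recipients, content) and leave the dates and times out. The function name `near_dup_key` and the field layout are assumptions, not an existing Intella feature.

```python
import hashlib


def near_dup_key(sender: str, recipients: list[str], subject: str, body: str) -> str:
    """Key that ignores dates/times: built only from sender, recipients, subject, body."""
    parts = [
        sender.strip().lower(),
        ",".join(sorted(r.strip().lower() for r in recipients)),
        " ".join(subject.split()),  # collapse whitespace differences
        " ".join(body.split()),
    ]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


# Hypothetical sample data: the same email found in two mailboxes,
# differing only in the header date.
emails = [
    {"date": "Mon, 3 Mar 2014 09:00:01 +0000", "from": "a@example.com",
     "to": ["b@example.com"], "subject": "Q1 report", "body": "See attached."},
    {"date": "Mon, 3 Mar 2014 09:00:04 +0000", "from": "a@example.com",
     "to": ["b@example.com"], "subject": "Q1 report", "body": "See attached."},
]

# Group emails by key; every group with more than one member is a near-duplicate set.
groups: dict[str, list[dict]] = {}
for mail in emails:
    key = near_dup_key(mail["from"], mail["to"], mail["subject"], mail["body"])
    groups.setdefault(key, []).append(mail)

print(len(groups))
```

Because the date is never part of the key, copies that differ only in header timestamps fall into the same group, and a reviewer could then categorize one representative per group.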