jcoyne Posted September 5, 2016

Hi guys,

We have noticed that for e-disclosure jobs requiring a linear review of many documents, we need a facility that de-duplicates not only exact matches but also near matches. This is most commonly requested when you have many mailboxes in which two emails share the same subject line, sender, recipients and content, and differ only in the dates and times in the email headers. For many reviews these are 'identical' from a categorization point of view and should be filtered out by some form of dedup functionality.

Are there any plans for this on the horizon? Does anyone have a strategy for dealing with this currently?

Best regards,
Jason
Andrej Posted September 5, 2016

Hi Jason, the message hash that is calculated for emails gets close to this, but it includes the date and time, so it's too strict for your purpose. We're considering this feature.
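As an interim strategy outside the tool, a looser key can be computed on exported messages. The following is a minimal sketch only, assuming the emails are available as raw RFC 822 text; it is not the product's message hash, and the function names (`near_dup_key` and its helpers) are hypothetical. It hashes the subject, sender, recipients and plain-text body, and deliberately ignores the Date header and every other header.

```python
import hashlib
import re
from email import message_from_string
from email.utils import getaddresses


def _normalize(text):
    # Collapse whitespace and lowercase so trivial formatting differences vanish.
    return re.sub(r"\s+", " ", text or "").strip().lower()


def _addresses(msg, *headers):
    # Collect bare addresses from the given headers, lowercased and sorted,
    # so display names and recipient order do not affect the key.
    values = []
    for header in headers:
        values.extend(msg.get_all(header, []))
    return sorted({addr.lower() for _, addr in getaddresses(values) if addr})


def _body_text(msg):
    # Concatenate all text/plain parts; attachments and HTML parts are ignored.
    parts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            try:
                parts.append(payload.decode(charset, errors="replace"))
            except LookupError:  # unknown charset label
                parts.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(parts)


def near_dup_key(raw_email):
    """Hypothetical near-duplicate key: the Date header is NOT part of it."""
    msg = message_from_string(raw_email)
    fields = [
        _normalize(msg.get("Subject", "")),
        ",".join(_addresses(msg, "From")),
        ",".join(_addresses(msg, "To", "Cc")),
        _normalize(_body_text(msg)),
    ]
    # Join with a separator that cannot appear in normalized text.
    return hashlib.sha256("\x1f".join(fields).encode("utf-8")).hexdigest()
```

Messages that share a key can then be grouped and reviewed once. Whether Cc recipients or attachments should count toward "identical" is a review-policy decision; the choices above are assumptions, not a recommendation.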
AdamS Posted September 6, 2016

I know this was talked about last year, but I suspect other updates have pushed it back a bit. http://community.vound-software.com/index.php?/topic/252-additional-deduping/?p=1266 I also think the de-duping capabilities could use some enhancement for near dupes and also for email threading, as discussed in the link above.