
Matching items with MD5 Hash Values


philrodo


One other question. While exporting certain emails from Intella, I generated a CSV report. Looking at the report, I see two columns pertaining to MD5 hashes. One is labeled "MD5 Hash" and the other is labeled "Message Hash." What is the difference, since both MD5 values ostensibly pertain to the same message?


Phil, as it happens, I've spent the last few days delving very deeply into this in order to recover a case that went belly up after a Windows BSOD.

 

You can indeed import an MD5 hash list. It's one of the facet options, right near where you can import keyword lists, and it works in the same fashion: you can import a .csv file with one MD5 per line, and it must have the heading "MD5 hash" (if you want to see the format, just export a CSV with Intella and select only the MD5 value).

 

As to the hash question, I would wait for confirmation from the admins, but my suspicion is that the MD5 covers the entire file (the email including attachments) and the message hash covers either just the body of the email or the email sans attachments.

 

I always use the MD5 hash when importing/exporting file lists.


Hi,

 

> Is there a way to import a table containing various MD5 hash values and use it to match messages or attachments in an Intella case?

 

You can import an MD5 hash list into the "MD5 and Message Hash" facet panel and search with it. Please refer to section 13.1.13, "MD5 and Message Hash", of the Intella User Manual.

 

> One is labeled "MD5 Hash" and the other is labeled "Message Hash." What is the difference, since both MD5 values ostensibly pertain to the same message?

 

The MD5 hash captures every detail of the message (including source-specific technical information), which is often not very useful for finding duplicate items. For instance, it is not uncommon for two copies of the same message to get different MD5s when they are retrieved from different PST files. In contrast, Message Hashes are calculated only from the message body and the substantive headers, which allows loosely matching copies of a message to be found. See the same section of the User Manual for a detailed explanation of the difference between the two types of hashes.
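To make the distinction concrete, here is a rough Python sketch. It is not Intella's actual algorithm: the choice of headers in `HEADERS`, the whitespace trimming, and the function names `strict_md5` and `loose_message_hash` are all assumptions made purely for illustration. It only shows why a hash over the full raw bytes changes when source-specific details change, while a hash over the body and a few main headers does not.

```python
import hashlib
from email import message_from_bytes

# Assumption: which headers count as "substantive" is a guess for this sketch.
HEADERS = ("From", "To", "Subject")

def strict_md5(raw):
    """MD5 over every byte of the stored message."""
    return hashlib.md5(raw).hexdigest()

def loose_message_hash(raw):
    """Hypothetical loose hash: MD5 over selected headers plus the body only."""
    msg = message_from_bytes(raw)
    parts = [(msg.get(name) or "").strip() for name in HEADERS]
    # For a simple, non-multipart message this returns the body as bytes.
    body = msg.get_payload(decode=True) or b""
    material = "\n".join(parts).encode("utf-8") + b"\n" + body.strip()
    return hashlib.md5(material).hexdigest()
```

Two copies of a message that differ only in a store-specific header would get different `strict_md5` values but the same `loose_message_hash` value under these assumptions.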


Hello Adam,

 

That is correct: the calculation of an MD5 hash of a given binary is mathematically defined (see https://en.wikipedia.org/wiki/MD5 for more details) and should work across applications - that's even one of its intended purposes.
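As a quick way to check this yourself, the following sketch computes an MD5 with Python's standard `hashlib` module. The helper names `md5_of_bytes` and `md5_of_file` are just illustrative; any tool that hashes the same bytes should report the same hexadecimal value.

```python
import hashlib

def md5_of_bytes(data):
    """Return the MD5 of a bytes object as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()

def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 of a file's contents, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `md5_of_bytes(b"")` gives `d41d8cd98f00b204e9800998ecf8427e`, the well-known MD5 of empty input.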

 

What you should take care of is how each tool expects the MD5 to be encoded in the hash file. For Intella, a simple text file with one MD5 per line in hexadecimal notation (e.g. d41d8cd98f00b204e9800998ecf8427e) is best. You can also use the CSV that Intella creates when you export the MD5 and Message Hash columns, as Intella splits each line on commas and checks whether each value is a valid MD5 hash (so headers in the CSV are filtered out).
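If you want to pre-check a hash file before importing it, the filtering described above can be imitated in a few lines of Python. This is only a sketch of the described behaviour, not Intella's code; the name `extract_md5_values` and the rule "exactly 32 hexadecimal characters" are assumptions for the example.

```python
import re

# Assumption: a valid MD5 is written as exactly 32 hexadecimal characters.
MD5_HEX = re.compile(r"[0-9a-fA-F]{32}")

def extract_md5_values(lines):
    """Split each line on commas and keep only values that look like MD5 hashes."""
    hashes = []
    for line in lines:
        for value in line.split(","):
            candidate = value.strip()
            if MD5_HEX.fullmatch(candidate):
                hashes.append(candidate.lower())
    return hashes
```

Passing it the lines of a CSV export (for instance `extract_md5_values(open("export.csv"))`, where `export.csv` is a placeholder file name) returns just the hash values, with header cells and anything else that is not a 32-character hex string dropped.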

 

I have seen other tools that wrap hashes in quotes, mix MD5 hashes with other types of hashes in the same file, etc. It is better to remove such extras before importing.

