Re-processing after cracking passwords

Jacques B · January 9, 2023

I occasionally encounter encrypted PDFs that Intella was unable to decrypt. Naturally, I only know this after processing is done. I've had success cracking passwords of PDFs of bank statements where the password is numeric (part of the account number). Once cracked, I know I can add it to the keystore. But as far as I can tell, I then have to re-index the entire evidence item(s) with content that needs to be decrypted. I don't see any option to simply decrypt and index the 10, 20 or 30 files that are encrypted. I have to re-index tens or hundreds of thousands of files in the evidence source(s).

Is there a way to have Intella only re-index select items instead of all items in a source?

ShaunC · January 19, 2023

Personally, I would just use the facet to filter down to the encrypted items, export them, and add them as a new source.

Jacques B · January 20, 2023

Thanks. I’ve done that in the past. But the downside of that approach is that the decrypted item is not at the original path within the evidence. For example, if the original is an attachment in an email, the decrypted version won’t be if imported as a new source.

It would be great if Intella had the ability to index filtered files instead of needing to index all of them.

ShaunC · January 20, 2023

I wonder if you could script it as part of initial processing?

It would be pretty unintelligent, but I wonder if you could do something like (100% pseudo-code):

if item.encrypted = true
	wordlist = get-content item.parent (separator 'whitespace')
	foreach word in wordlist
		try item.decrypt word

You could build your wordlist in a way that makes sense. The above is hoping, for example, that the parent is an email and they've supplied the password in the email.
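As a rough illustration of the pseudo-code above, here is a minimal Python sketch. It does not use any Intella scripting API: `candidate_words` and `first_working_password` are hypothetical helper names, and `try_password` is a placeholder for whatever decryption check is actually available for the file type.

```python
from typing import Callable, Iterable, List, Optional

# Punctuation and quote characters that often surround a password in prose.
_WRAPPING = "\"'“”‘’.,;:!?()[]"


def candidate_words(parent_text: str) -> List[str]:
    """Split the parent message on whitespace and return unique candidates.

    Each token is kept both as-is and with surrounding punctuation removed,
    in first-seen order.
    """
    seen = set()
    words: List[str] = []
    for raw in parent_text.split():
        for word in (raw, raw.strip(_WRAPPING)):
            if word and word not in seen:
                seen.add(word)
                words.append(word)
    return words


def first_working_password(
    words: Iterable[str],
    try_password: Callable[[str], bool],
) -> Optional[str]:
    """Return the first word accepted by try_password, or None.

    try_password is a stand-in (assumption) for a real decryption attempt.
    """
    for word in words:
        if try_password(word):
            return word
    return None
```

Keeping the decryption check as a passed-in function means the same loop can be reused for PDFs, ZIPs or anything else, as long as a suitable check exists for that format.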

Jacques B · January 20, 2023

I’m not sure if Intella supports that type of scripting. In my case I’ve been using John the Ripper in a Linux VM to crack PDF docs typically. So I don’t think there would be any way to call upon it from Windows.

The other challenge is that in the case of PDF bank statements for example, the accompanying email from the bank usually provides the mask for the password (e.g., the middle six characters of the bank account number) which I use as a parameter for cracking the password. In other cases, I’ve found the password right in the email: “Hey John, here’s the encrypted spreadsheet for your review. The password is ‘abc123’.”
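To illustrate the mask idea, here is a small Python sketch that derives candidates for a "middle six characters of the account number" rule. The function names are hypothetical, and two things are assumptions rather than anything a bank documents: that separators such as dashes or spaces are ignored, and that "middle" means centred (rounding towards the start when it cannot be exactly centred).

```python
from itertools import product
from typing import Iterator


def middle_chars(account_number: str, length: int = 6) -> str:
    """Return the centred `length` digits of an account number.

    Non-digit characters (spaces, dashes) are dropped first (assumption).
    """
    digits = "".join(ch for ch in account_number if ch.isdigit())
    if len(digits) < length:
        raise ValueError("account number is shorter than the mask")
    start = (len(digits) - length) // 2
    return digits[start:start + length]


def numeric_candidates(length: int = 6) -> Iterator[str]:
    """Yield every all-digit string of the given length, lowest first.

    This is the fallback when the account number itself is not known.
    """
    for combo in product("0123456789", repeat=length):
        yield "".join(combo)
```

If the account number appears elsewhere in the evidence, `middle_chars` gives a single candidate to try first; otherwise a six-digit numeric space is only one million candidates.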

I wouldn’t want to delay Intella processing while it tries to brute force each time it finds an encrypted file it can’t automatically decrypt. I appreciate your suggestions as possible alternative options. The ideal solution rests with Vound adding the ability to process/re-process selected files. You would think you could choose only docs that it couldn’t decrypt and reprocess those with the keystore rather than having to reprocess every item in the data set.

Thanks again for taking the time to offer suggestions.

ShaunC · January 22, 2023

No worries at all and I agree; it would be best solved with that sort of mechanism - the permutations of what you would come across would be way too complex to script effectively.

Mateusz · February 3, 2023

Hey guys, I have to admit that these are all good ideas and I have sent these through to the dev team for consideration. Thanks!

Marco de Moulin (Guest) · February 6, 2023

Hello @Jacques B,

I just wanted to give you an update on the ability of our crawler scripts to decrypt password protected PDFs. Currently, the scripts do not have access to the native file, which is necessary for decryption. We are actively working on adding this capability to the scripting engine, so that you will be able to run code to decode a PDF. As soon as I have more information on this topic, I'll be sure to update you. Thank you for your patience and understanding!

Marco

Jacques B · February 6, 2023

Thanks Marco! Does this mean you'll be able to enter passwords in the keystore and then run it against specific files and process only those rather than having to re-process all items in a source?

Fortunately, it's not something I encounter frequently. But when I do and manage to crack a password (or get it from the email itself - people can be lazy sometimes), I will add that to the keystore and then re-process so that it's available to the investigator. Being able to selectively reprocess would be a huge time saver in those cases.

Jacques

Marco de Moulin (Guest) · February 6, 2023

Hi @Jacques B,

I wanted to clarify that the keystore is used for all encrypted files, not just PDFs. With a crawler script, you can create a custom procedure for each file. This means that the script would have access to the file, attempt to decrypt it, and then return it in a decrypted form. However, it does require some programming skills to implement once the capability is added to the scripting engine.

I also wanted to let you know that selective reprocessing is a high priority on our list of features that we're working to include in the future.

Marco

Jacques B · February 6, 2023

OK, thanks Marco. The selective reprocessing would be the ideal solution.

Adding the ability to try to crack it during processing is nice. But a one-size-fits-all decryption approach would be very difficult with a third-party tool such as John the Ripper, as it depends on the document type and on whether you have a mask for the password. And as you also know, password cracking can take a long time. You wouldn't want processing of the rest of the source to be held up by the attempt to crack a password. It would be important for processing to complete and make everything available to the investigator for review while password cracking goes on in the background.

If it is implemented so that processing stops while password cracking is attempted, that will cause an undesirable delay and make it impractical to use. If that's the only option, I would suggest putting that time into the selective reprocessing instead, as that will be far more useful. But if the script simply passes on the encrypted password to an external process and then carries on, then that's fine. But that also means at some point, it has to reprocess those files once the password is cracked.

I do have some programming skills (scripting skills - BASH, Python, and some light PowerShell). So I don't mind that.

Thanks,

Jacques

Marco de Moulin (Guest) · February 7, 2023

Hello @Jacques B,

Crawler scripts are executed when we index (crawl) the data. You are right, we do not want any unnecessary delays during this process.

One approach to use a crawler script is to copy all encrypted files that are discovered and execute a command to decrypt them. This approach is useful when there is no information about the password(s) required to decrypt the files. In such a scenario, brute forcing may be the only solution.
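The copy step could look roughly like the Python sketch below. It is not crawler script API code: `stage_encrypted_file`, the staging folder layout and the `manifest.jsonl` file are all hypothetical, and it assumes the script has been handed a path to the native file. The function only copies the file and records where it came from, so indexing is not held up; a separate process would do the actual decryption later.

```python
import json
import shutil
from pathlib import Path


def stage_encrypted_file(
    native_file: Path,
    original_location: str,
    staging_dir: Path,
) -> Path:
    """Copy one encrypted file into a staging folder and log its origin.

    Returns the path of the staged copy. No decryption is attempted here.
    """
    staging_dir.mkdir(parents=True, exist_ok=True)
    manifest_path = staging_dir / "manifest.jsonl"

    # Use the number of existing manifest entries as a unique prefix, so
    # files with the same name from different parents do not collide.
    index = 0
    if manifest_path.exists():
        with manifest_path.open(encoding="utf-8") as fh:
            index = sum(1 for _ in fh)

    staged = staging_dir / f"{index:06d}_{native_file.name}"
    shutil.copy2(native_file, staged)

    # One JSON object per line: which staged copy maps to which evidence path.
    entry = {"staged": staged.name, "original_location": original_location}
    with manifest_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return staged
```

Recording the original location next to each copy keeps the link back to the parent item, so a recovered password can be matched to the right evidence path afterwards.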

The keystore passwords can be used to decrypt supported files. With selective reprocessing this will become a lot more valuable, because at present you need to provide the passwords before processing a source. When using the keystore for passwords, it is recommended to keep the list of passwords short. This approach ensures that the process remains efficient and reduces the chances of delays. The keystore was designed to try out passwords that are already known, rather than for brute forcing.

Regards,

Marco

Jacques B · February 7, 2023

Thanks Marco for that follow-up.

To be honest, I had not considered using the keystore as a cracking dictionary :). I've only ever put in passwords I obtained from emails or that I've cracked.

Thanks,

Jacques

Jacques B · March 20, 2023

Hi Marco,

Providing an update on this. I am currently working a case where I have 5 PSTs in it. Intella identified 89 items that it could not decrypt. In looking at them, many are in a few ZIP files, so I gather it's the ZIP that needs to be cracked, not each file within it. At any rate, for some of the other encrypted items, the user sent the password in a separate message (email or Teams message) which is common.

My workflow when I have encrypted items from an Exchange mailbox is to look at the parent email of the encrypted item to see if the person shared the password, or references that it will be sent in another email. I was able to find a few passwords and added them to the key store. I could see from the Location facet that the encrypted items were spread across 3 of the 5 PSTs. This meant I had to reprocess those 3 PSTs to have Intella use the passwords in the key store to decrypt the items and then index them.
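For larger cases, the same check can be done outside the UI on an exported list of the encrypted items. This is only a sketch: the `Location` column name and the assumption that a location starts with the source name followed by a path separator are hypothetical and would need to match the actual export.

```python
import csv
from collections import Counter
from typing import TextIO


def encrypted_items_per_source(
    export: TextIO,
    location_column: str = "Location",
) -> Counter:
    """Count exported encrypted items per top-level source.

    Assumes each location looks like "<source>/<folder>/..." (assumption);
    backslashes are treated the same as forward slashes.
    """
    counts: Counter = Counter()
    for row in csv.DictReader(export):
        location = (row.get(location_column) or "").strip()
        if not location:
            continue  # skip rows without a usable location
        source = location.replace("\\", "/").split("/")[0]
        counts[source] += 1
    return counts
```

Sources missing from the result hold no encrypted items and would not need reprocessing.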

This is a prime example of where the current workflow used by Intella to deal with encrypted items is inefficient. We are not likely going to know which items are encrypted, much less the password for the items, until after it's in Intella and processed. In addition to being able to selectively re-process files rather than an entire source, it would be really helpful if Intella noted what processing was already done on that source (e.g., OCR, content analysis) and asked the user whether they want those additional processes to be run on the decrypted items as well.

I do see email threading as an exception here. You can't run email threading on only decrypted items. It has to be run against all emails in the case to get email threading across all your data.

Thanks,

Jacques
