Script for generating SHA1, SHA256, and SHA512 values during processing


import hashlib

from api.scripting import ScriptService
from api.scripting.ScriptService import (Action, CustomColumn,
                                         CustomColumnType, CustomColumnValue,
                                         FoundItemResult, ProcessedItemResult)

# Hashing functions: each reads the file in 4 KB chunks so large files
# are never loaded into memory all at once
def sha256(file):
    hash_sha256 = hashlib.sha256()
    with open(file, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash_sha256.update(chunk)
    return hash_sha256.hexdigest()

def sha1(file):
    hash_sha1 = hashlib.sha1()
    with open(file, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash_sha1.update(chunk)
    return hash_sha1.hexdigest()

def sha512(file):
    hash_sha512 = hashlib.sha512()
    with open(file, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash_sha512.update(chunk)
    return hash_sha512.hexdigest()

class ScriptHandler(ScriptService.Iface):

    def itemFound(self, item):
        return FoundItemResult(action=Action.Include)

    def itemProcessed(self, item):
        custom_columns = []

        if item.binaryFile is not None:
            # Generate SHA-256
            file_sha256 = sha256(item.binaryFile)
            sha256_column = CustomColumn("SHA-256", CustomColumnType.String, CustomColumnValue(value=file_sha256))

            # Generate SHA-1
            file_sha1 = sha1(item.binaryFile)
            sha1_column = CustomColumn("SHA-1", CustomColumnType.String, CustomColumnValue(value=file_sha1))

            # Generate SHA-512
            file_sha512 = sha512(item.binaryFile)
            sha512_column = CustomColumn("SHA-512", CustomColumnType.String, CustomColumnValue(value=file_sha512))

            # Add all custom columns
            custom_columns = [sha256_column, sha1_column, sha512_column]

        return ProcessedItemResult(action=Action.Include, customColumns=custom_columns)


Thanks for sharing. I'm looking at writing a crawler script for something else. Seeing examples helps.

The unfortunate aspect of crawler scripts is that you can only run one, and only while the case is being indexed. I remember Vound saying they hope to lift that restriction in the future, but for now indexing is the only time a script can run. I am currently running one that looks for emails with blank subjects and tags them accordingly. I have another I'd like to run that extracts a particular data point from MS Word DOCX files and adds it as a column. I'll have to either choose one or process the case twice to run both crawler scripts. With some cases taking 12+ hours to index, that's not a very attractive option.

One caveat with yours: it calculates three additional hashes per file, and each hash function opens and reads the file separately, which adds processing time. For a small case that likely won't be too noticeable. But for a case of 100 GB or more, the additional processing time will certainly be noticeable.
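
One way to trim some of that overhead might be to read each file once and feed every chunk to all three hash objects, rather than reading the file three times. The sketch below uses only the standard `hashlib` module. The function name `multi_hash` and its `algorithms` and `chunk_size` parameters are my own inventions for illustration, not part of the script above or of the scripting API, and I haven't timed it against a real case.

```python
import hashlib


def multi_hash(path, algorithms=("sha1", "sha256", "sha512"), chunk_size=65536):
    """Return {algorithm name: hex digest} for the file at `path`.

    Hypothetical helper: the file is read once, and every chunk is fed
    to each hash object in turn.
    """
    # One hash object per requested algorithm
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (end of file)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Inside `itemProcessed` you would presumably call `multi_hash(item.binaryFile)` once and build the three columns from the returned dictionary. The hashing work itself stays the same, so any saving would come from the disk reads only.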
