matt
Posted Thursday at 12:01 PM

import hashlib

from api.scripting import ScriptService
from api.scripting.ScriptService import (Action, CustomColumn, CustomColumnType,
                                         CustomColumnValue, FoundItemResult,
                                         ProcessedItemResult)


# Hashing functions
def sha256(file):
    hash_sha256 = hashlib.sha256()
    with open(file, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash_sha256.update(chunk)
    return hash_sha256.hexdigest()


def sha1(file):
    hash_sha1 = hashlib.sha1()
    with open(file, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash_sha1.update(chunk)
    return hash_sha1.hexdigest()


def sha512(file):
    hash_sha512 = hashlib.sha512()
    with open(file, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash_sha512.update(chunk)
    return hash_sha512.hexdigest()


class ScriptHandler(ScriptService.Iface):

    def itemFound(self, item):
        return FoundItemResult(action=Action.Include)

    def itemProcessed(self, item):
        custom_columns = []
        if item.binaryFile is not None:
            # Generate SHA-256
            file_sha256 = sha256(item.binaryFile)
            sha256_column = CustomColumn("SHA-256", CustomColumnType.String,
                                         CustomColumnValue(value=file_sha256))

            # Generate SHA-1
            file_sha1 = sha1(item.binaryFile)
            sha1_column = CustomColumn("SHA-1", CustomColumnType.String,
                                       CustomColumnValue(value=file_sha1))

            # Generate SHA-512
            file_sha512 = sha512(item.binaryFile)
            sha512_column = CustomColumn("SHA-512", CustomColumnType.String,
                                         CustomColumnValue(value=file_sha512))

            # Add all custom columns
            custom_columns = [sha256_column, sha1_column, sha512_column]

        return ProcessedItemResult(action=Action.Include, customColumns=custom_columns)
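A side note on the hashing helpers above: each helper opens and reads the file separately, so every item is read from disk three times. If indexing time matters (see the reply below on processing overhead), a single read loop can feed all three digests at once. The sketch below uses only the Python standard library and should produce identical digests; the function name all_hashes is my own, not part of the posted script.

import hashlib


def all_hashes(file):
    # One digest object per algorithm; all are fed from a single read pass,
    # so the file is opened and read only once.
    h_sha256 = hashlib.sha256()
    h_sha1 = hashlib.sha1()
    h_sha512 = hashlib.sha512()
    with open(file, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h_sha256.update(chunk)
            h_sha1.update(chunk)
            h_sha512.update(chunk)
    return h_sha256.hexdigest(), h_sha1.hexdigest(), h_sha512.hexdigest()

In itemProcessed, the three separate calls would then collapse into a single file_sha256, file_sha1, file_sha512 = all_hashes(item.binaryFile).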
Jacques B
Posted Thursday at 01:22 PM

Thanks for sharing. I'm looking at writing a crawler script for something else, and seeing examples helps.

The unfortunate aspect of crawler scripts is that you can only run one, and only while indexing the case. I remember Vound saying they hope this won't be the case in the future, but for now that's the only time you can run a script. I'm currently running one to look for emails with blank subjects and tag them accordingly (a minimal sketch follows below). I have another I'd like to run that extracts a particular data point from an MS Word DOCX file and adds it to a column. I'll either have to choose one or the other, or process the case twice to run both crawler scripts. With some cases taking 12+ hours to index, that's not a very attractive option.

One caveat with yours: you're calculating three additional hashes per file, which will add processing time. For a small case that likely won't be very noticeable, but on a case with 100 GB+ of data, for example, the additional processing time certainly will be.
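For readers curious what the blank-subject tagger mentioned above might look like, here is a minimal sketch following the same ScriptService pattern as matt's script. It assumes the item object exposes a subject field and that ProcessedItemResult accepts a tags list; both are assumptions about the API rather than anything confirmed in this thread, and the field names may differ between versions, so check your version's ScriptService definition before relying on either.

from api.scripting import ScriptService
from api.scripting.ScriptService import Action, FoundItemResult, ProcessedItemResult


class ScriptHandler(ScriptService.Iface):

    def itemFound(self, item):
        return FoundItemResult(action=Action.Include)

    def itemProcessed(self, item):
        # ASSUMPTION: item.subject holds the email subject and may be None.
        # ASSUMPTION: ProcessedItemResult accepts a tags list.
        # A production script would likely also check that the item is
        # actually an email before tagging it.
        subject = getattr(item, "subject", None)
        if subject is None or subject.strip() == "":
            return ProcessedItemResult(action=Action.Include, tags=["Blank subject"])
        return ProcessedItemResult(action=Action.Include)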