Crawler script help please - Want to filter data based on file extension, file path, file name, and file signature + date and time range

Jung Son

We are currently filtering our data using a Forensic Explorer (FEX) script.

It performs signature and metadata analysis, then keeps only data within a given date and time range.
After that, it excludes certain files based on their file extension, file name, or file path.
It then includes data based on file extension, file path, and file signature.

We are wondering whether the same filtering can be done with a crawler script. If so, could you give us some guidance on how to achieve it? One example of each filter would be greatly appreciated.

Hi Jung,

Thanks for posting this question here. Let me give you some pointers on where to start.

First of all, take a look at the GitHub repo: https://github.com/vound-software/intella-crawler-scripts. It has extensive documentation and a lot of samples that cover many scenarios.

Next, file type filtering (or signature filtering) can be done via the UI. So, if you know for sure that you are only interested in certain types, you could use that instead of scripting. The only caveat is that the UI setting can only stub the unwanted items, not remove them completely. If you want to remove them from the case, you would need to use scripting. See: https://www.vound-software.com/docs/intella/2.6.1/#_file_type_settings

In the upcoming 2.7 release, Intella will have a new feature that lets you exclude items by their name or extension.

To filter items by date please see this example: https://github.com/vound-software/intella-crawler-scripts/blob/main/samples/advanced/filter_date_toplevel.py.
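
As a rough illustration of the date check itself, here is a minimal sketch in plain Python. It does not use the Intella scripting API; `in_date_range`, `RANGE_START` and `RANGE_END` are hypothetical names, and the range values are placeholders. How you obtain the item's timestamp inside a crawler script is shown in the linked sample.

```python
from datetime import datetime, timezone

# Hypothetical range (placeholder values); replace with the range for your case.
RANGE_START = datetime(2020, 1, 1, tzinfo=timezone.utc)
RANGE_END = datetime(2021, 1, 1, tzinfo=timezone.utc)


def in_date_range(timestamp, start=RANGE_START, end=RANGE_END):
    """Return True if timestamp lies in the half-open range [start, end).

    Assumption: items without a date are treated as out of range.
    Note: timestamp must be timezone-aware here, because Python refuses
    to compare naive and aware datetimes.
    """
    if timestamp is None:
        return False
    return start <= timestamp < end
```

The half-open range is a design choice: it makes adjacent ranges (for example, one per year) non-overlapping. Whether undated items should be kept or dropped is something you would need to decide for your case.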

This article shows how to filter items by type and size:


Now, the only problematic part in your request is filtering by location (path). That is not currently fully supported. If you index disk images, then you could filter the top-level files in the disk image by path. But that's the only option at the moment. Please see an example of how to do that here: https://github.com/vound-software/intella-crawler-scripts/blob/main/samples/advanced/filter_fs_path.py
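
For the path test itself, a minimal sketch in plain Python could look like the one below. It is not taken from the linked sample and does not use the Intella scripting API; `EXCLUDED_DIRS`, `is_under` and `path_excluded` are hypothetical names, and I am assuming the path is given relative to the volume root (starting with `/` or `\`).

```python
from pathlib import PurePosixPath

# Hypothetical folders to exclude (placeholder values).
EXCLUDED_DIRS = [PurePosixPath("/Windows"), PurePosixPath("/Program Files")]


def is_under(path, directory):
    """Return True if path is the directory itself or lies anywhere below it.

    Backslashes are normalized to forward slashes first.
    The comparison is case-sensitive.
    """
    p = PurePosixPath(path.replace("\\", "/"))
    return p == directory or directory in p.parents


def path_excluded(path):
    """Return True if path falls under any of the excluded folders."""
    return any(is_under(path, d) for d in EXCLUDED_DIRS)
```

Comparing whole path components rather than using a plain string prefix avoids accidental matches, for example `/WindowsOld/file.txt` is not treated as being under `/Windows`.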

I would recommend first looking at the individual scripts above and testing them to make sure they work as you expect. Then you could combine them into a single final script.
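
To give an idea of how the combined decision could be structured, here is a minimal sketch in plain Python that follows the order from your post: date range first, then exclusions, then inclusions. Everything in it is an assumption for illustration: the `Item` class is a hypothetical stand-in and not the Intella item object, the rule sets are placeholders, and I am assuming that an item is included if it matches any one of the inclusion rules.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import PurePosixPath
from typing import Optional


@dataclass
class Item:
    """Hypothetical stand-in for an indexed item (not the Intella API)."""
    name: str
    path: str
    mime_type: str
    modified: Optional[datetime]


# Placeholder rules; all values are lowercase because inputs are lowercased.
EXCLUDE_EXTENSIONS = {".tmp", ".log"}
EXCLUDE_NAMES = {"thumbs.db"}
EXCLUDE_PATH_PREFIXES = ("/windows/",)
INCLUDE_EXTENSIONS = {".docx", ".pdf"}
INCLUDE_PATH_PREFIXES = ("/users/",)
INCLUDE_TYPES = {"application/pdf"}


def keep(item, start, end):
    """Return True if the item passes all three filter stages."""
    name = item.name.lower()
    ext = PurePosixPath(name).suffix
    path = item.path.replace("\\", "/").lower()

    # Stage 1: keep only items dated within [start, end].
    if item.modified is None or not (start <= item.modified <= end):
        return False

    # Stage 2: drop items matching any exclusion rule.
    if (ext in EXCLUDE_EXTENSIONS or name in EXCLUDE_NAMES
            or path.startswith(EXCLUDE_PATH_PREFIXES)):
        return False

    # Stage 3: keep items matching at least one inclusion rule.
    return (ext in INCLUDE_EXTENSIONS
            or path.startswith(INCLUDE_PATH_PREFIXES)
            or item.mime_type in INCLUDE_TYPES)
```

Keeping the decision in one function that returns a plain boolean makes it easy to test outside Intella before wiring it into the actual crawler script.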

Also, please take a look at the current limitations, specifically handling of duplicates: https://github.com/vound-software/intella-crawler-scripts#current-limitations

I hope this helps.
