Jung Son Posted October 1

Hi there,

I am working on developing a crawler script that can filter out certain file paths and extensions that we don't need, such as dll files and the windows\help folder. Once the data is processed, I can see a nice outcome in CSV format, which shows the files that have been included and the ones that have been skipped.

If some items are skipped during the filtering process and we later decide to process those particular items, is there a way to use a URI or ID to reprocess and include those items? For example, if I want to include two items under the prefetch folder, is there a way to re-index the case and include certain items based on their IDs or URIs, assuming the URIs won't change?

Any help you can provide or sample script would be greatly appreciated. Thanks!
igor_r Posted October 2

Hello Jung,

It is definitely possible. I don't have a ready-to-use script at the moment, but the idea is the following:

First, you need to parse the so-called "Script Log" produced by Intella. This is a CSV file where you can find all items that were skipped. After parsing the CSV, you can collect the IDs or URIs of the items that you want and save them to a separate file. Let's call this file "items-to-include.csv".

Now, you can modify your script to add a new condition: if the item ID is in that list, the item is always included. This check must run before the other checks, so that the exclusion rules cannot skip the item again. Use the "item.id" and "item.uri" attributes.

Then, you can simply re-index the source with the modified script and it should include the skipped items. It's important to remember that when you re-index an existing case, the item IDs and URIs do not change.

Here is a useful link if you need to parse a CSV in Python: https://www.digitalocean.com/community/tutorials/parse-csv-files-in-python
Jung Son Posted October 3

Great, thanks Igor. If you could share a sample script for this, that would be great.
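
The approach Igor describes could be sketched roughly as follows. This is not an official Intella script and it does not use the Intella scripting API. It only shows the two pieces of logic: reading "items-to-include.csv" and checking the include list before the exclusion rules. The column names `id` and `uri`, the helper names `load_items_to_include` and `should_include`, and the `path` argument are all assumptions made for illustration, so check them against the header of your own Script Log and the attributes your script actually has available.

```python
import csv
import io


def load_items_to_include(csv_file, id_column="id", uri_column="uri"):
    """Collect item IDs and URIs from an open CSV file.

    The column names are assumptions; adjust them to match the real file.
    """
    include_ids, include_uris = set(), set()
    for row in csv.DictReader(csv_file):
        item_id = (row.get(id_column) or "").strip()
        item_uri = (row.get(uri_column) or "").strip()
        if item_id:
            include_ids.add(item_id)
        if item_uri:
            include_uris.add(item_uri)
    return include_ids, include_uris


def should_include(item_id, item_uri, path, include_ids, include_uris):
    """Return True if the item should be indexed (hypothetical helper)."""
    # 1. Include-list check runs first, so listed items are never skipped.
    if str(item_id) in include_ids or item_uri in include_uris:
        return True

    # 2. The usual exclusion rules from the original question.
    normalized = path.lower().replace("/", "\\")
    if normalized.endswith(".dll"):
        return False
    if "\\windows\\help\\" in normalized:
        return False
    return True


# Small in-memory stand-in for "items-to-include.csv" (made-up values).
sample_csv = io.StringIO(
    "id,uri\n"
    "101,file:///C:/Windows/Prefetch/A.pf\n"
    ",file:///C:/Windows/Prefetch/B.pf\n"
)
include_ids, include_uris = load_items_to_include(sample_csv)
```

In a real crawler script you would open the file once at startup, for example with `open("items-to-include.csv", newline="", encoding="utf-8")`, and then call `should_include(item.id, item.uri, ...)` for each item, passing whatever path or file name value your script already uses for filtering.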