Search the Community
Showing results for tags 'temp'.
Hi! I have two questions regarding OCR.

First, is there an easy way to track progress and see how many docs remain to be OCRed? I usually OCR via a command-line script immediately after processing (using a Task file). The command output simply says "Post-processing", so I don't know how many OCR candidates were identified. I see that files are being created in the following folder in my current case: .\tmp\ocr-service9057841158103284728. It looks like the final OCR results are placed in the "ocr-results" folder there, so the file count in that folder seems to be a good indication of how many files have been OCRed so far. I just don't know how many files are still waiting to be processed.

Second, I notice that when OCR finishes, this "ocr-results" folder is immediately deleted. Is there any way to prevent this? I like to keep OCR results for future use. Sometimes we need to ingest new data that contains a lot of duplicates of files already OCRed. It would be fantastic to be able to import the existing OCR results for these rather than OCR them all over again.

I'd appreciate any ideas for the above. Thank you!

Bryan
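For anyone trying the same folder-watching workaround, here is a minimal Python sketch of it. It is not a feature of the product, and everything in it is an assumption: the function names `count_ocr_results` and `snapshot_ocr_results` are hypothetical, the folder layout (`tmp\ocr-service<digits>\ocr-results` under the case folder) is taken only from the observation in the post, and the backup location is whatever you pass in. It counts the result files written so far and copies any new ones elsewhere so they survive the cleanup; it cannot tell you how many files remain.

```python
import shutil
from pathlib import Path

# Assumed layout, based only on what was observed in the post:
#   <case_dir>/tmp/ocr-service<digits>/ocr-results/...
RESULTS_PATTERN = "tmp/ocr-service*/ocr-results"


def count_ocr_results(case_dir):
    """Count the files currently present in any matching ocr-results folder."""
    total = 0
    for results_dir in Path(case_dir).glob(RESULTS_PATTERN):
        total += sum(1 for p in results_dir.rglob("*") if p.is_file())
    return total


def snapshot_ocr_results(case_dir, backup_dir):
    """Copy result files that are not yet in backup_dir; return how many were copied."""
    backup = Path(backup_dir)
    copied = 0
    for results_dir in Path(case_dir).glob(RESULTS_PATTERN):
        for src in results_dir.rglob("*"):
            if not src.is_file():
                continue
            # Keep the ocr-service folder name so separate runs do not collide.
            dest = backup / results_dir.parent.name / src.relative_to(results_dir)
            if dest.exists():
                continue
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
            copied += 1
    return copied
```

Running both functions in a loop every few seconds while OCR is in progress would give a running count and a surviving copy of the results. One caveat: a file could be copied while it is still being written, so the snapshot may contain partial files.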
- 2 replies
- ocr
- command-line