OCR Text for Redacted documents


jcoyne

Hi Guys,

 

Is there a way to replace the OCR'd text within Intella for documents that were redacted within Intella?

 

For example:

 

Initially, TIFFs and empty PDFs were OCR'd and the text was read back into Intella.

 

The reviewers reviewed and redacted thousands of documents.

 

It's currently easy to locate the redacted documents in Intella, so I could export them and OCR them again. But then the text files exist only outside of Intella, and they will be named with either the MD5 hash or Item_ID, or some other variable.

 

When I export via load file, I will put placeholder text in the text files directory for the redacted items, but the redacted files are now among thousands of other image files, and probably named 000001.000001.0000034 or whatever. It is now quite difficult to marry up the redacted text files. If you OCR the redacted files at this point, it is not easy to locate them among the thousands of others.

 

It's even harder if you exported TIFF image files, as there will be one image file per page and the text file needs to contain the text for all of the pages in one file.

 

Has anyone found a good workflow for this scenario?

 

Best regards,

 

Jason 

 


Hi Jason,

 

Can you try to do the following:

1) Export the items to a load file and use an export set. Skip the text for redacted items.

2) Export all redacted items using the export set column, so they get proper names like 000001.000001.0000034.

3) Now OCR the files exported in step 2, and replace the placeholder text files in the load file with the new OCR'd files (the names should match).
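
The file swap in step 3 could be scripted rather than done by hand. Below is a rough Python sketch, not an Intella feature. It assumes the OCR tool writes one .txt file per document into a single folder, that each file's base name matches the export set number, and that base names are unique within the load file's text directory. The function name and folder layout are made up for illustration.

```python
from pathlib import Path
import shutil


def replace_placeholder_texts(ocr_dir, loadfile_text_dir):
    """Copy each OCR'd .txt over the same-named placeholder file.

    Returns (replaced, unmatched) lists of OCR file names.
    Hypothetical helper: the folder layout is an assumption.
    """
    ocr_dir = Path(ocr_dir)
    loadfile_text_dir = Path(loadfile_text_dir)

    # Index the load file's text files by base name,
    # e.g. "000001.000001.0000034", searching subfolders too.
    placeholders = {p.stem: p for p in loadfile_text_dir.rglob("*.txt")}

    replaced, unmatched = [], []
    for ocr_file in sorted(ocr_dir.glob("*.txt")):
        target = placeholders.get(ocr_file.stem)
        if target is None:
            # No placeholder with this name: report it, do not guess.
            unmatched.append(ocr_file.name)
            continue
        shutil.copyfile(ocr_file, target)  # overwrite the placeholder text
        replaced.append(ocr_file.name)
    return replaced, unmatched
```

Anything returned in the unmatched list has no placeholder with the same name, so it is worth checking those by hand before delivering the load file.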

 

Do you think it would work?
