Searching OCR post case conversion


Hi all,

I have a compound case that I have upgraded to 2.6 (I'm using Intella Pro).

I have yet to re-index the sources as it's hundreds of GB and we're actively using the case.

I've performed a keyword search for a phrase, and I get matches in several Word Documents.

There is a PDF document (OCR'd prior to case conversion) where I can see the phrase in the document's OCR tab; however, the PDF is "unresponsive" to the search.

I tried copying the text out of the PDF's OCR tab and pasting it into the search box, in case a character was being substituted (a lower-case l for a 1, for example), but the PDF still doesn't get returned.

I don't have anything de-selected in the searching options drop-down.

Checking the "Words" tab for the document just shows the words from the file's metadata. This could well be normal behaviour; to be honest, I've never checked before whether OCR words get added to the "Words" tab.

Something a bit interesting/weird is that once I search the phrase (with my "unresponsive" PDF already open in a preview window), that phrase gets highlighted in the document.

I then tried other OCR'd PDF files and the same thing happens - they are unresponsive but when previewed the phrase is highlighted anyway.

I'll kick off the re-index overnight and see if that helps.

Can anyone else replicate this out of curiosity?



I re-indexed the sources in both source cases (remember, I'm using a compound case here).

I also re-OCR'd the items in both source cases, although I believe it was set to skip items that had already been OCR'd.

I then opened the compound case and searched the phrase again, and it still did not return the PDF I expected.

I then re-OCR'd the item itself directly (while in the compound case), and that did enable the PDF to be returned in the phrase search.

Is this intended, and should it be an extra step in the case conversion process, or is this a bug and the PDF should have been responsive without my having to re-OCR it?


Hello ShaunC,

This feature has been updated in the newly released hotfix. Please update Intella, and the OCR functionality should work properly after conversion.

We advise that you convert the compound case once again after updating Intella; it should then carry the OCR'd items over as well.
