
OCR - ABBYY FineReader (embedded) - how to get support for Indian languages (e.g. Hindi or Marathi)?



Posted

From time to time I work on corporate cases where I need to OCR pictures and PDF documents containing various Indian languages. As far as I can tell, the built-in ABBYY FineReader (embedded) function in Intella Pro does not support, e.g., Hindi or Marathi.

Is there a way to expand the set of languages supported by the embedded OCR function?

Kind regards,

Jonas

Posted

I'm not sure what ABBYY FineReader uses for OCR (in-house, open source, or a combination). Tesseract is a free OCR solution that does support those languages.

https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html

I don't know whether a crawler script can be used to call an external OCR tool. The alternative (less desirable, of course) is to export those items, OCR them externally, and bring them back in as new items.

Posted

Hi Jacques, Thank you for your reply. I am already using an external "free" package to partly cover this need from time to time. What I was looking for was an answer on whether Intella's built-in capabilities can be expanded. Since no one else has responded, perhaps I will have to close this search for the time being. Kind regards, Jonas

Posted

OK. I know there are admins on here. I'm not sure if they work for Vound, or are volunteers. Hoping one of them can provide you with an answer.

Alternatively, you can submit a support ticket; they are responsive to those. Since this is a community board, it may rely more on community members than on Vound staff.

Posted

Hello Jonas,

ABBYY doesn't support Hindi or Marathi. You would need to look for an alternative OCR solution.

Please see the section on using an external OCR tool in the user manual: https://www.vound-software.com/docs/intella/2.7.1/#_using_an_external_ocr_tool

Basically, you can export the items that you need to OCR to plain text files, then run an OCR tool that supports Hindi, and then import the OCRed files back into Intella. That would be the best option in your case.
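For the middle step (running an external OCR tool over the exported items), a minimal sketch in Python might look like the following. It assumes the Tesseract command-line tool mentioned earlier in the thread is installed with the Hindi (`hin`) and Marathi (`mar`) language data, and that the exported items are image files. The function names and the output naming (same base name, `.txt` extension) are hypothetical choices for illustration only; the manual section linked above describes what Intella actually expects on re-import. PDFs would need to be converted to images first, which is not shown.

```python
import subprocess
from pathlib import Path

# Image types this sketch passes to Tesseract (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def build_command(image: Path, out_dir: Path, langs: str = "hin+mar") -> list[str]:
    """Build the Tesseract call for one image.

    Tesseract appends ".txt" to the output base on its own, so the base
    is passed without an extension. "hin+mar" asks for Hindi and Marathi.
    """
    out_base = out_dir / image.stem
    return ["tesseract", str(image), str(out_base), "-l", langs]


def ocr_folder(export_dir: Path, out_dir: Path, langs: str = "hin+mar") -> list[Path]:
    """Run Tesseract on every image in export_dir; return the text files written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for image in sorted(export_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that is not a supported image
        # check=True raises CalledProcessError if Tesseract fails on a file
        subprocess.run(build_command(image, out_dir, langs), check=True)
        results.append(out_dir / (image.stem + ".txt"))
    return results
```

Calling `ocr_folder(Path("exported_items"), Path("ocr_text"))` (both folder names are placeholders) would write one text file per image. Running `tesseract --list-langs` beforehand shows whether the `hin` and `mar` data files are actually installed.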
