Jonas, posted August 14

From time to time I work on corporate cases where I need to OCR pictures and PDF documents containing various Indian languages. As far as I can tell, the built-in (embedded) ABBYY FineReader function in Intella Pro has no support for, e.g., Hindi or Marathi. Is there a way to expand the languages supported by the embedded OCR function?

Kind regards,
Jonas
Jacques B, posted August 20

I'm not sure what ABBYY FineReader uses for OCR (in-house, open source, or a combination). Tesseract is a free OCR solution that does support those languages: https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html

I don't know whether a crawler script can be used to call an external OCR tool. Alternatively (less desirable, of course), you could export those items, OCR them externally, and bring them back in as new items.
Jonas, posted August 26

Hi Jacques,

Thank you for your reply. I am already using an external "free" package to partly cover this need from time to time. What I was looking for was an answer on whether Intella's built-in capabilities can be expanded. Since no one else has responded, perhaps I will have to close this search for the time being.

Kind regards,
Jonas
Jacques B, posted August 28

OK. I know there are admins on here, but I'm not sure whether they work for Vound or are volunteers. Hopefully one of them can provide you with an answer. Alternatively, you can submit a support ticket to Vound; they are responsive to those. Since this is a community board, it may rely more on community members than on Vound staff.
igor_r, posted August 29

Hello Jonas,

ABBYY doesn't support Hindi or Marathi, so you would need to look for an alternative OCR solution. Please see the "Using an external OCR tool" section in the user manual: https://www.vound-software.com/docs/intella/2.7.1/#_using_an_external_ocr_tool

Basically, you can export the items that you need to OCR to plain text files, then run an OCR tool that supports Hindi, and then import the OCRed files back into Intella. That would be the best option in your case.
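
For the middle step of that workflow (running an external OCR tool over the exported files), here is a minimal sketch using the Tesseract command-line tool mentioned earlier in the thread. It is only an illustration, not something taken from the Intella manual. Assumptions: Tesseract is installed and on the PATH, the `hin` and `mar` language data files are installed, and the exported items are image files (Tesseract does not read PDFs directly, so those would need converting to images first). The function names and folder names are hypothetical. The sketch reuses each exported file's name for the text output; check the manual section above for the file naming that Intella actually expects on import.

```python
import shutil
import subprocess
from pathlib import Path

# Image types handled by this sketch (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def build_tesseract_command(image, out_dir, langs=("hin", "mar")):
    """Build the Tesseract CLI call for one image.

    Tesseract appends ".txt" to the output base itself, so the base is
    the output folder plus the image's file name without its extension.
    """
    out_base = Path(out_dir) / Path(image).stem
    return ["tesseract", str(image), str(out_base), "-l", "+".join(langs)]


def ocr_folder(export_dir, out_dir, langs=("hin", "mar")):
    """Run Tesseract on every image in export_dir; return the text file paths."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on the PATH")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for image in sorted(Path(export_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that is not a supported image
        subprocess.run(build_tesseract_command(image, out, langs), check=True)
        written.append(out / (image.stem + ".txt"))
    return written
```

Calling `ocr_folder("exported_items", "ocr_text")` would then leave one `.txt` file per exported image in `ocr_text`, ready for the import step described above.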