Foreign Language PDF files

fuzed · July 6, 2018

I have a question from a new client. They have a number of foreign-language PDF files (Arabic, Chinese, Hebrew, etc.) and want to know how these can be made searchable. We've tested a few PDFs in Intella: the Word documents are searchable, but the PDFs do not seem to be.

Any thoughts on how this can be achieved? Is this a codepage issue, or do we need to do more work?

Alex · July 6, 2018

PDFs may contain photocopies or scanned images of documents. In that case, they need to be OCRed before they are searchable in Intella. Please note that if the embedded ABBYY FineReader method is used, all of the relevant foreign languages must be selected on the configuration page of the OCR wizard.

For PDF documents that contain parseable text, a non-English language should not be a problem. If such PDFs still cannot be searched, we suggest opening a support ticket and attaching a few examples of the problematic PDFs.
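
As a rough way to tell the two cases apart before opening a ticket, here is a minimal Python sketch. It is not part of Intella or ABBYY. It assumes you have already extracted the text of a page with some external tool, and the names `SCRIPT_PREFIXES`, `script_ratio`, `needs_ocr` and the 0.5 threshold are all hypothetical choices for illustration. It simply checks how much of the extracted text is in the expected script; an empty or mostly foreign-to-the-script result suggests a scanned image or a broken text layer.

```python
import unicodedata

# Hypothetical labels mapped to Unicode character-name prefixes.
SCRIPT_PREFIXES = {
    "arabic": "ARABIC",
    "hebrew": "HEBREW",
    "chinese": "CJK UNIFIED IDEOGRAPH",
}


def script_ratio(text: str, script: str) -> float:
    """Fraction of letter characters in `text` whose Unicode name matches `script`."""
    prefix = SCRIPT_PREFIXES[script]
    # Only letters count; digits, spaces and punctuation are ignored.
    letters = [ch for ch in text if unicodedata.category(ch).startswith("L")]
    if not letters:
        return 0.0
    hits = sum(1 for ch in letters if unicodedata.name(ch, "").startswith(prefix))
    return hits / len(letters)


def needs_ocr(text: str, script: str, threshold: float = 0.5) -> bool:
    """Guess that OCR is needed when little of the text is in the expected script."""
    return script_ratio(text, script) < threshold
```

For example, `needs_ocr("", "arabic")` flags a page with no extracted text at all, while a page whose extracted text is mostly Hebrew letters would pass for `"hebrew"`. This is only a heuristic: mixed-language documents will need a lower threshold.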

fuzed · July 6, 2018

Thanks Alex, I'll ask the client if I can send over a few examples. I've tried the built-in OCR tool with the correct language selected, but the examples we used didn't appear to OCR correctly.
