fuzed Posted July 6, 2018
I have a question from a new client. They have a number of foreign-language PDF files (Arabic, Chinese, Hebrew, etc.) and want to know how these can be made searchable. We've tested a few PDFs in Intella: Word documents are searchable, but the PDFs do not seem to be. Any thoughts on how this can be achieved? Is this a code page issue, or do we need to do more work?
Alex Posted July 6, 2018
PDFs may contain photocopies or scanned images of documents. In that case, they need to be OCRed before they are searchable in Intella. Please note that if the embedded ABBYY FineReader method is used, all of the foreign languages must be selected on the configuration page of the OCR wizard. For PDF documents that contain parseable text, a non-English language should not be a problem. Otherwise, we suggest opening a support ticket and attaching a few examples of the problematic PDFs.
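To tell the two cases apart (a PDF with a real text layer versus a scanned image that needs OCR), one rough approach is to look at whatever text an extractor returns for the file and count letters per script. The sketch below is not Intella functionality; it uses only the Python standard library, assumes the text has already been extracted by some other tool, and the names `script_counts`, `looks_searchable`, and the `min_letters` threshold are hypothetical choices made for illustration.

```python
import unicodedata
from collections import Counter

# Hypothetical labels, keyed by the prefix of the Unicode character name.
SCRIPT_PREFIXES = {
    "ARABIC": "arabic",
    "HEBREW": "hebrew",
    "CJK": "chinese",
    "LATIN": "latin",
}


def script_counts(text):
    """Count alphabetic characters in `text`, grouped by script label."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, and whitespace
        name = unicodedata.name(ch, "")
        for prefix, label in SCRIPT_PREFIXES.items():
            if name.startswith(prefix):
                counts[label] += 1
                break
        else:
            counts["other"] += 1
    return counts


def looks_searchable(text, min_letters=20):
    """Guess whether extracted text is a usable text layer.

    Assumption: a page that yields fewer than `min_letters` letters is
    probably a scanned image and would need OCR.
    """
    return sum(script_counts(text).values()) >= min_letters
```

If a file yields almost no letters, it is likely image-only and a candidate for OCR. If it yields plenty of letters in the expected script but still cannot be searched, that points to a different problem and is the kind of example worth attaching to a support ticket.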
fuzed Posted July 6, 2018
Thanks Alex, I'll ask the client if I can send over a few examples. I've tried the built-in OCR tool with the correct language selected, but the examples we used didn't appear to OCR correctly.