tufelkinder (posted June 12, 2013): I have received a load file with TIFF images and extracted text instead of a native document set. What is the best way to deal with/process these? Thanks in advance...
tufelkinder (posted June 14, 2013): I kind of hacked together a solution, if anyone is ever interested. I converted the TIFFs to PDFs and OCRed them, then wrote a Python script to parse the load file and join the pages into a single PDF per document, where they belonged. Worked pretty well, but a little tedious.
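For anyone landing on this thread later, the load-file step described above could look roughly like the sketch below. This is not the script from the post, which was never shared. It assumes an Opticon-style image load file (comma-separated rows of page ID, volume, image path, and a "Y" flag marking the first page of each document), and that each TIFF has already been converted to an OCRed single-page PDF with the same name next to it. The function names `group_pages`, `page_pdf_path` and `merge_document` are made up for illustration, and the merge step assumes the third-party pypdf package (version 3 or later) is installed.

```python
import csv
import os


def group_pages(load_file_lines):
    """Group page image paths by document.

    Assumes Opticon-style rows: page_id,volume,image_path,doc_break,...
    where doc_break == "Y" marks the first page of a new document.
    Returns a dict mapping first-page ID -> list of image paths, in order.
    """
    documents = {}
    current = None
    for row in csv.reader(load_file_lines):
        if len(row) < 4 or not row[0].strip():
            continue  # skip blank or malformed lines
        page_id, _volume, image_path, doc_break = (f.strip() for f in row[:4])
        if doc_break.upper() == "Y" or current is None:
            current = page_id
            documents[current] = []
        documents[current].append(image_path)
    return documents


def page_pdf_path(image_path):
    """Assumption: the OCRed PDF sits next to the TIFF with the same stem."""
    return os.path.splitext(image_path)[0] + ".pdf"


def merge_document(page_pdfs, output_path):
    """Join single-page PDFs into one file (requires pypdf >= 3.0)."""
    from pypdf import PdfWriter  # third-party; imported lazily

    writer = PdfWriter()
    for pdf in page_pdfs:
        writer.append(pdf)
    writer.write(output_path)
    writer.close()


if __name__ == "__main__":
    # Hypothetical load file contents, for illustration only.
    sample = [
        "DOC0001,VOL1,images/DOC0001.tif,Y,,,2",
        "DOC0002,VOL1,images/DOC0002.tif,,,,",
        "DOC0003,VOL1,images/DOC0003.tif,Y,,,1",
    ]
    for doc_id, pages in group_pages(sample).items():
        print(doc_id, [page_pdf_path(p) for p in pages])
        # merge_document([page_pdf_path(p) for p in pages], doc_id + ".pdf")
```

A Concordance-style DAT file or another load file layout would need a different parser, but the grouping idea stays the same: walk the pages in order and start a new output PDF at each document break.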
Chris (posted June 18, 2013): Hello Walt, that sounds like a good solution to me. In a future release we will add support for OCR-ing documents and images. Until then there are some workarounds, but the problem is often that the original item and its extracted text become separate units of information. Your solution resolves that issue.