wmfiske

Members
  • Posts

    9

Everything posted by wmfiske

  1. I have the same question about how this can be done when a reviewer is using Connect. I can limit the keyword list this way in Intella Viewer or Intella Pro. In my case, however, I need it to work for a reviewer using Connect.
  2. I am using 1.9.1 to read a CSV load file, and it does not read the Extracted Text field properly. A sample Extracted Text value is "Images\001\001\00000001.txt". During the Validation step, it says it cannot read the file, and for the example above it shows [path]\Images00100100000001.txt. The backslashes in the CSV file are not being read, so the path comes through as one long string.
  3. When I use Preview Item (CTRL+O), I can enter either an Item ID number or a URI. If I want to identify a series of Item IDs, I can add those Item IDs to a single text file and import that list via the Item ID Lists facet. Can you expand the Item ID Lists facet to accept a text file containing URIs?
  4. [disregard - wrong forum]
  5. I would like to open a community discussion on OCR settings and programs, as I have been doing some performance testing recently.

     There are two versions of ABBYY that I have been testing: FineReader Corporate (4 core) and Recognition Server (RS v4). My first assumption was that RS v4 would be faster, since it costs 4-5x as much as the 4-core Corporate version. I was using an unlimited-core version, and I liked the idea that I could export/import files directly from Intella v1.9.

     In one test, I sent 100 non-searchable PDF files to RS using the Intella interface. I preconfigured a workflow in RS to export to Text format. The PDF files were of random sizes, 4 had errors (corrupted), and they totaled 1,067 pages.

     TEST #1 (Good): RS, which was running on a separate server from Intella, completed the task in 26 minutes. (Note: one downside of using the Intella interface to export/import to RS was that I could not use Intella while it was processing.)

     TEST #2 (Better): Corporate, which was running the Hot Folder function on a separate server, completed the task in less than 19 minutes. The output and other settings were equivalent to the RS workflow.

     TEST #3 (Best): I then wanted to squeeze more performance out of Corporate Hot Folder. I created a batch file that split my PDF files into 4 subfolders based on the first character of the MD5 filename (16 possible values split 4 ways). Of course, that will not balance the workload equally, but it was good enough for testing. I started the 4 jobs in the Hot Folder interface at the same time (one job per subfolder). Although it was still limited to 4 cores, the split did make a difference: all jobs were completed in less than 10 minutes.

     This made me consider buying two Corporate 4-core licenses running on separate servers instead of using RS. If you wait, ABBYY often sells the 4-core version at a 40% discount, for $359/license. So roughly $700 for unlimited OCR, compared to RS pricing.

     Questions for the community:
     1) What do you use for OCR? Has it been a good ROI?
     2) What OCR settings do you use? What works best for an eDiscovery environment?

     Thanks for reading,
     Wm
  6. Does Intella calculate the total expanded size of data? I know that I can export the table to a CSV list with each entry and sum the size. That export takes some time.
  7. Igor, the recovery of deleted items is exactly what took the longest. It took a total of 9 hours to process the entire 30 GB OST file. According to the log, 7 of those hours were spent processing deleted items.
  8. The OST file was created by Outlook 2010.
  9. I am currently running ver 1.9 and indexing a 30 GB OST file. Intella was cranking along just fine for the first 80 minutes, then dropped to a minimal processing mode. The Index New Data window shows no new activity. However, in Resource Monitor I can see that Java is still reading the OST file, although extremely slowly by comparison. Is there a way to determine exactly what is going on? I am running it on a Windows 7 box (i7-5930 3.5 GHz with 64GB RAM) with evidence on one drive and temp on a 512 GB M.2 card. This is my second test on this OST file. When I first tested it, Intella ran for 13 hours and behaved the same way as described above. I finally stopped the process and figured there must be errors in the OST file. I then ran ScanPST on it multiple times to see if that would help.
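
The symptom in post 2 above (backslashes vanishing from the Extracted Text path) is consistent with a CSV parser that treats the backslash as an escape character. Whether Intella 1.9.1 actually does this is an assumption. The sketch below only reproduces the same effect with Python's standard `csv` module, using a made-up two-column row in which only the path value comes from the post.

```python
import csv
import io

# Hypothetical load-file row; only the path value is taken from the post.
ROW = '"DOC0001","Images\\001\\001\\00000001.txt"\n'


def parse(text, **reader_options):
    """Parse a single CSV row and return its list of fields."""
    return next(csv.reader(io.StringIO(text), **reader_options))


# Default dialect: the backslash is an ordinary character and survives.
literal = parse(ROW)[1]

# Backslash configured as the escape character: each backslash is
# consumed and the character that follows it is kept unchanged.
escaped = parse(ROW, escapechar="\\")[1]

print(literal)
print(escaped)
```

If a parser behaves like the second case, doubling the backslashes or switching to forward slashes in the load file are the usual things to try. Neither is confirmed here as a fix for Intella.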
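
The split used in TEST #3 of post 5 above (bucketing files by the first character of an MD5-based file name, 16 hex values split 4 ways) can be sketched as follows. The original was a batch file that is not shown, so this Python version, the function names, and the `job_1` to `job_4` folder names are all illustrative assumptions.

```python
import shutil
from pathlib import Path

HEX_DIGITS = "0123456789abcdef"


def bucket_for(filename, buckets=4):
    """Map the first hex character of a file name to a bucket index.

    With 4 buckets: 0-3 -> 0, 4-7 -> 1, 8-b -> 2, c-f -> 3.
    """
    first = filename[0].lower()
    if first not in HEX_DIGITS:
        raise ValueError(f"{filename!r} does not start with a hex digit")
    return HEX_DIGITS.index(first) * buckets // len(HEX_DIGITS)


def split_into_subfolders(source_dir, buckets=4):
    """Move each PDF in source_dir into job_1 .. job_N by its bucket."""
    source = Path(source_dir)
    # Materialize the listing first so moved files are not revisited.
    for pdf in sorted(source.glob("*.pdf")):
        target = source / f"job_{bucket_for(pdf.name, buckets) + 1}"
        target.mkdir(exist_ok=True)
        shutil.move(str(pdf), str(target / pdf.name))
```

As the post notes, this balances file counts only approximately and ignores page counts, so one subfolder can still finish well after the others.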
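
The manual route mentioned in post 6 above, exporting the table to CSV and summing the size column, could be scripted roughly like this. The column header `Size` and the idea that it holds plain byte counts are guesses about the export format, not something the post confirms.

```python
import csv
import io


def total_size(csv_file, column="Size"):
    """Sum a numeric column of an exported CSV, skipping blank cells.

    `column` is an assumed header name; adjust it to the real export.
    """
    total = 0
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip()
        if value:
            total += int(value)
    return total


# Made-up sample standing in for an exported item table.
sample = io.StringIO("Name,Size\na.msg,1024\nb.pdf,2048\nfolder,\n")
sample_total = total_size(sample)
print(sample_total)
```

For a real export, the same function can be given a file opened with `open(path, newline="", encoding="utf-8")`. This does not avoid the export time the post mentions, only the manual summing.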