markjrouse

Members
  • Posts

    102

Everything posted by markjrouse

  1. Hi, I would quite like to see the Smart Search (which to me is the near-dup detection) be more automated during processing. At the moment, Connect doesn't support Smart Search. If it did, and you had a large population of emails to review, you would effectively be asking the end users not only to review each email but also to manually review any possible near dups, which can seriously slow down your review. It would be better if Intella did near-dup detection automatically, perhaps with a near-dup hash, so that those who process the data can exclude near dups from the population that gets reviewed, and attorneys don't tell you that they have already seen this document, and that document. Regards
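To illustrate the kind of automatic near-dup check described above, here is a minimal sketch using word shingles and Jaccard similarity. This is a generic technique, not how Intella's Smart Search works; the function names, the shingle size `k` and the `threshold` value are all assumptions chosen for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a, text_b, threshold=0.8, k=3):
    """Flag two texts as near duplicates when their shingle sets overlap enough."""
    return jaccard(shingles(text_a, k), shingles(text_b, k)) >= threshold
```

A pairwise comparison like this does not scale to millions of items on its own; a real implementation would need some form of hashing or indexing to avoid comparing every pair.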
  2. Hi, I'm not sure if this has already been addressed, but I was wondering if there are any plans to introduce workflow management in Connect? What I mean by this is the ability for a Case Manager to assign a batch of documents to a reviewer, see review progress stats for each reviewer, auto-assign another batch of documents to a reviewer when the current batch is complete, and other workflow management tasks. At the moment I have to manually tag documents to each reviewer in order to assign them. I, as the Case Manager, have no way to monitor the review progress of each reviewer. When I log into Connect as Admin, I can see my own work on the dashboard, but not that of each reviewer. Instead I have to run manual searches. Also, the progress on the dashboard is a bit misleading. It tells me the progress, expressed as percentages, against the entire case file. So at the moment it tells me that 4% of 11 million items are tagged, but doesn't give me the same percentages against my review population, which is only 25k out of that 11 million. In this case, I'm not sure how useful it is to know that 4% of 11 million items are tagged. Regards
  3. Hi, On a current project we are faced with potentially having to identify internal emails so that they can be tagged accordingly. When I say internal, I mean employee A emailing employee B. I'm looking to do this through Connect. I had originally thought about the email addresses facet, but it seems that it just lists all email addresses. Maybe a combination such as "from:*@acme.com AND sender:*@acme.com" (without the speech marks) would work. Does that seem like a reasonable search, or is there a better way of doing it? There is another level of complexity as well. If the client bought another company, Acme GFG, after date X, then that company's email addresses should only be included after this date. So something like: "from:*@acme.com AND (sender:*@acme.com or sender:@acme-cfg.com)". I'm just not sure how to incorporate the date criteria into this search.
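The logic being asked for can be sketched outside of any query syntax. The snippet below is not Intella or Connect syntax; it is a plain Python illustration of the rule "all parties are on an internal domain, and an acquired domain only counts from the acquisition date onwards". The function name, the `INTERNAL_DOMAINS` mapping and the date used for "date X" are hypothetical placeholders, and requiring every recipient to be internal is an assumption.

```python
from datetime import date

# Hypothetical mapping: domain -> first date it counts as internal (None = always).
# The 2012-01-01 date is only a placeholder for the unspecified "date X".
INTERNAL_DOMAINS = {
    "acme.com": None,
    "acme-cfg.com": date(2012, 1, 1),
}


def is_internal(sender, recipients, sent_on, domains=INTERNAL_DOMAINS):
    """Return True when the sender and every recipient are internal on sent_on."""
    def internal(address):
        domain = address.rsplit("@", 1)[-1].lower()
        if domain not in domains:
            return False
        start = domains[domain]
        return start is None or sent_on >= start

    return internal(sender) and bool(recipients) and all(internal(r) for r in recipients)
```

For example, `is_internal("a@acme.com", ["b@acme-cfg.com"], date(2011, 6, 1))` is false under these placeholder values because the acquired domain does not yet count as internal on that date.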
  4. Hi, One suggestion for a future feature that might be worth adding is a folder hash, for when folders in email mailboxes contain several emails and you want to deduplicate these folders when facing multiple monthly backup copies of the same mailboxes, or a user's mailboxes from different locations, i.e. a mailbox on the server and a local copy as well. The reason for this suggestion is a current project I'm working on. We have multiple mailboxes from monthly backups. Naturally, we need to capture new emails and deleted emails, so we process all the mailboxes. We find, though, that in Jan's backup you have a folder XYZ which contains 25 emails and their attachments. Of course, in Feb's backup you have the same folder with the same 25 emails; or do you? It seems that one of our keywords hits on the XYZ folder name, as well as maybe 1 or 2 emails in this folder. So in our review population, we have the same folder appearing 12 times, the emails appearing several times, etc. And of course what has happened is that reviewer A is assigned 1 or 2 copies of the same folder, reviewer B gets assigned 1 or 2 copies of the same folder, and so on. At the moment there is no way to say for sure if folder XYZ is the same in each monthly backup. A user could delete an email or save new emails in the folder each month. Having a folder hash (maybe you can take the message hashes from the emails and MD5 hashes from attachments in a folder and use them to generate a folder hash) would help in deduplicating folders containing emails. Perhaps this could be useful as well when dealing with monthly backups of netshares. If you can say for sure that this folder is the same as that folder, you could include the Jan backup in your review, and tag Feb-Dec as dups, if those folders haven't changed over time of course. Regards
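The folder hash idea suggested above can be sketched as follows: collect the message hashes and attachment MD5 hashes for everything in a folder, sort them so the order of items does not matter, and hash the result. This is only an illustration of the suggestion, not an existing Intella feature; the function name and the choice of MD5 for the combined digest are assumptions.

```python
import hashlib


def folder_hash(item_hashes):
    """Combine the hex hashes of a folder's items into one order-independent hash.

    item_hashes: iterable of hex strings (message hashes and attachment MD5s).
    """
    digest = hashlib.md5()
    # Sorting makes the result independent of the order items are listed in.
    for item in sorted(h.lower() for h in item_hashes):
        digest.update(item.encode("ascii"))
        digest.update(b"\n")  # separator so adjacent hashes cannot run together
    return digest.hexdigest()
```

Two monthly copies of folder XYZ would get the same folder hash only if they contain exactly the same set of item hashes, so a deleted or newly saved email changes the result.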
  5. I can see several of the following entries in the Server log: java.io.IOException: An established connection was aborted by the software in your host machine. Could my XP laptop, where Connect is installed, be causing these performance issues?
  6. Hi Lukasz, I'm afraid Connect 1.8.2 doesn't seem any faster when multiple reviewers log into Connect, even when I turn off hit highlighting in List view. I don't have any other cases to test this on. It is random; it could take 25 mins to open up a simple email with no attachments. There doesn't seem to be any pattern as to why it runs slowly.
  7. Hi, We are using Pro to run some keyword searches, and when I now try to run a query it tells me "Invalid query: *@xxxx*", whereas this used to work in 1.8.1. Why does this query not work anymore? Is there a syntax I can use to correct this? My keyword, which is part of a keyword list, is meant to find all email addresses at domain xxxx.
  8. Hi, We have gone back into Pro, and when I now try to run a query it tells me "Invalid query: *@xxxx*", whereas this used to work before. Why does this query not work anymore? Is there a syntax I can use to correct this? My keyword, which is part of a keyword list, is meant to find all email addresses at domain xxxx.
  9. Hi, We installed the latest versions of Pro and Connect. I've turned off the hit highlighting in List view as suggested. We will know better tomorrow whether it has improved performance. However, from reviewing a couple of documents in Connect, we have noticed that when the Previewer is open it no longer advances to the next document when we click on a tag, even though that option is selected. Also, there appear to be spreadsheets that don't render properly: you can only see half the columns and there is no way to scroll to the right. Any thoughts?
  10. Thanks Lukasz. I've now updated all the review laptops to use the latest Chrome, which, judging from the Server Log, seems to resolve the broken connection issues. I'm currently downloading the latest Pro and Connect installers, so will try those along with turning off the hit highlighting in List view, and we will let you know the outcome. Thanks for your help.
  11. Can't use Win7 for Connect at the moment as I'm onsite and don't have a powerful enough Win7 laptop.
      Connect Server
      Intella Connect is run from a laptop with the following spec:
      • Intel Core i7 Processor
      • 16GB RAM
      • Win XP 64-Bit
      • 1 Gbps Ethernet port
      Intella Connect is not currently installed as a service.
      Data
      Evidence and Case File data sits on separate volumes within a Synology NAS RAID connected to a 1 Gbps Netgear switch. Evidence data is 871 GB in total. Case File is 161 GB.
      Network
      Network speed is 1 Gbps.
      Reviewer laptops
      Review laptops have the following specs:
      • Intel Centrino 2 Processor
      • Win 7 Enterprise
      • 2 GB RAM
      • 32-Bit
      • 1 Gbps Ethernet port
      • Microsoft Forefront Endpoint Protection
      • IE 9.0.8112
      If we shut down Connect and 1 person uses Pro, review is fast. There are no problems in opening documents or time delays. With Connect, we have two reviewers logged in and looking at documents. Both reviewers double click on a document to open it in the previewer. The previewer takes 25 minutes to display the contents of the documents. After 25 minutes the contents of the document each reviewer wanted to open appear on their respective screens simultaneously. Intella Pro has had no problems with the volume of data. It ran a 32 keyword list against 11.4 million items in 1 and a half minutes. When I look in the Server Log, I notice that I keep getting the following errors:
      • The connection was broken. It was probably closed by the client.
      • org.eclipse.jetty.io.EofException: null
      • Unable to read contents of config issue from disk
      • Unable to decrypt cookie credentials
      • java.lang.NumberFormatException: For input string: ""
  12. Any ideas on when 1.8.3 will be released? We are experiencing major performance issues onsite at a client. It is getting to the point where we just can't use the software because it takes Connect 25 minutes to open one document in the previewer.
  13. Hi, In List view mode the setting option is greyed out.
  14. We are currently running Connect 1.8.2 to review tagged documents in my case file. However, we have noticed a performance issue when more than 1 reviewer logs into Connect to review their assigned documents in the same case file. Documents seem to take a long time to open in the previewer. I can understand that for those with large attachments, but smaller documents also appear to take a while to load. We are running a small local network supported by a 1 Gbps switch, with 5 reviewer laptops. We are running Connect from a Win XP 64-bit laptop with 16GB RAM. Could these performance issues be related to running Connect on an XP machine? Are there any things I can check or do to make the platform run faster, or at least reduce the performance bottlenecks?
  15. Hi, It may have been a one-off, as other items seem to be tagged OK. But I was just wondering if there are any known issues that may cause this, so that I can troubleshoot appropriately. Of course, I can't be entirely sure that it was a one-off, so when the review is finished I'll have to do some further QA checks. Thanks.
  16. Hi, In Intella Pro, my case file has been set to do family tagging (Also tag all other nested items in the same top level item), which works perfectly in Pro. However, when I share my case file in Connect 1.8.2, it appears that the family tagging is ignored. So we have reviewed an email and tagged it as "False Hit", but when you look in the Tree tab, none of the attachments have had the "False Hit" tag assigned to them. Any thoughts?
  17. Is there any way of saving the keyword hits? At the moment I run a Keyword List of 32 keywords in Pro and it returns 25k hits, with keywords highlighted when in the previewer. But 25k is a lot for one person to review. Is there a way of assigning different reviewers a percentage of the 25k population while retaining the highlighted keyword hits? Without this functionality it is difficult to determine what an item matched on.
  18. I'm using the latest version of Connect and Chrome, and if I upload a keyword list and run the searches, Connect doesn't seem to do anything and no hits are returned. I urgently need to resolve this as it's important that search hits are highlighted so that the review process can start. I'm also concerned that if my search returns hits and I tag them as "Search Hits", then assign a percentage of "Search Hits" to a reviewer, the highlighted keyword hit is lost.
  19. Hi, I would like to add to the wish list, if it's not already there, the ability to add back into the case file items that could not be decrypted, which I subsequently export and password crack externally. So, much like the OCR mechanism already built into Intella, it would be useful if there was a similar mechanism that allowed me to export items that could not be decrypted, password crack them in an external tool, and then re-import those items back into their respective families, or at least some way to link them back to their respective parent families.
  20. So when someone asks me how many duplicates there are, what's the best figure to give: the figure on the processing status screen, or the one in Statistics?
  21. Thanks for this. But I was hoping that there is some query syntax that I could use as part of a keyword list. So if as part of the processing tasks, I want all hits that match *@domain* to be tagged with a certain tag, then having that query syntax in my keyword list will help to automate the process.
  22. Hi, I'm trying to find a way to search emails that have senders and receivers from a specific domain, or variations of a domain, using a wildcard. So for example, @domain* would return @domain.com or @domain-support.net. I've tried agent:@domain* but it doesn't seem to work. I'm wondering if there is a specific search syntax I could use to add to a keyword list.
  23. With the older versions of Excel or Word, what is the best way to automate the saving of, let's say, 570 Excel files in an older format into a new format? The only issue I can see with this approach is: how do you get the newly saved version back into the parent email to overwrite the existing one? Or if you import the 570 files newly saved as xlsx and add them as a new source, how do you then link them back to their respective parents? And of course, the metadata on the Excel files will be changed when you save them in a newer format.
  24. Hi, I'm having some issues reconciling the figures reported during processing with those I see in Statistics, and was hoping someone could explain the differences. The processing status screen, when all 11 steps have finished, tells me that there are 1,517,290 items in total, and that there are 1,081,355 duplicate items. So 1,517,290 minus 1,081,355 should leave me a unique deduplicated population of 435,935. In fact, on the processing status screen, Intella tells me that unique items are 435,935. However, when I go into the Statistics screen, yes, I'm told that there are 1,517,290 All items, but after deduplication there are 443,206. So there appears to be a difference of 7,271 in what the population is after deduplication. Does the processing status screen calculate unique items differently from the Statistics screen? If the population after deduplication is 443,206, then the duplicate items count is 1,074,084, and not 1,081,355. Similarly, for the reported exception items, the processing status screen gives me 115,408. I've naturally assumed that this is after deduplication. In the Statistics screen, Exception Items after Deduplication is 123,404. A difference of 7,996. Should there be a difference?
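The figures quoted above can be checked with a few lines of arithmetic. This only restates the numbers from the post (the variable names are illustrative); it does not explain why the two screens disagree.

```python
# Figures from the processing status screen
total_items = 1_517_290
status_duplicates = 1_081_355
status_unique = total_items - status_duplicates          # 435,935
status_exceptions = 115_408

# Figures from the Statistics screen
stats_unique = 443_206
stats_exceptions = 123_404

# Differences between the two screens
unique_gap = stats_unique - status_unique                # 7,271
implied_duplicates = total_items - stats_unique          # 1,074,084
exception_gap = stats_exceptions - status_exceptions     # 7,996

print(status_unique, unique_gap, implied_duplicates, exception_gap)
```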
  25. Hi, I've just finished processing a large dataset and, looking in the Exception Report, I have noticed the following errors. I'm not sure what they mean or how to resolve them:
      Processing Errors
      • java.lang.UnsupportedOperationException: Non-extended character Pascal Strings are not supported right now.
      • java.lang.IndexOutOfBoundsException: Unable to read 512 bytes from 20992 in stream of length 11638.
      • java.lang.NullPointerException
      • java.lang.ArrayIndexOutOfBoundsException
      • Not enough data (0) to read requested (2) bytes
      Unprocessable Items
      • The supplied spreadsheet seems to be Excel 5.0/7.0 (BIFF5) format. [I assume this is because it's an old Excel file, but what should one do when encountering these old Excel types?]
      • Expected to find a ContinueRecord in order to read remaining 7 of 13 chars.
      • Initialisation of record 0x55 left 2 bytes remaining still to be read
      • The document is too old - Word 95 or older. Try HWPFOldDocument instead?
      For some emails I get the following in the warning description: "In-Reply-To header". Does this mean it has found text or characters that can't be processed? Regards