AdamS Posted August 12, 2013

I've got a case running on Team 1.7.0 x64 with a number of email PST archives that I extracted from an EDB archive using Systools Exchange Recovery. The image I'm working from was taken with FTK Imager by another firm, so I'm unsure what process they used when imaging the Exchange Server, but I strongly suspect it was a live image of the system, so the EDB would have been live rather than unmounted, resulting in a dirty copy.

Some of the user PSTs exported nice and quick, yet others took 2 or 3 days. Admittedly some are large (28 GB PST archives), but on the whole the process was successful, bar one user whose mailbox can't be exported for some reason. I have imported all the PSTs into Intella, indexed them successfully, applied keywords and tagged the results.

So that's the background, so far so good. Now I'm trying to take advantage of Intella's ability to export lists so I can show the client the total keyword hit results, then give them the 'per mailbox or user' breakdown for each keyword.

My approach has been to select the user mailbox in the 'location' facet, then click 'include'. Then I go to the tags facet, select all tags and press search. Once they have loaded I can highlight all the tags in the bottom left panel, right click, choose export values, and I get a nice little CSV file with the total results for the keywords, plus another column with the numbers for the location I've currently selected to include. I can then repeat this process for each location and have the makings of a nice little spreadsheet neatly detailing all the results.

Problem is, Intella sort of locks up after I highlight all the keywords and press search. I've been able to successfully export 2 of 7 lists but can't do anything with the remaining 5. I'm assuming it's trying to work through the keywords, but it just seems to be taking a very long time to resolve.

The first lot only had about 1,500 to get through, and that took about 10 seconds to display the coloured balls before I could export the list. The next one had about 36,000 and only took about 20 seconds before I could export. The one I'm currently on has about 55,000 and I've been waiting for around 10 minutes now with no coloured balls and no response on screen. Task Manager shows the process as active though.

Are there known issues working with larger keyword results? I wouldn't have thought there would be any issue at these low numbers. I have also tried with 32-bit Intella but it makes no difference.
AdamS Posted August 12, 2013

Please disregard. If I reverse my process (include all tags, then search on location) there is no issue and the results can be exported instantly.
Chris Posted August 12, 2013

Still, thank you for the description of your process. It's always nice to know how people are using Intella.

We are asked regularly for functionality to produce cross tables, e.g. custodians x number of keyword hits. I think we should do something to better facilitate that.

As for the responsiveness of keyword lists, two factors determine how long it takes to draw a Cluster Map out of one:

1. Size and complexity of the keyword list. Know that wildcard searches are effectively translated to a Boolean OR of all matching terms, so terms with wildcards that match a lot of terms can take some time to evaluate.
2. Number of clusters in the graph. The time needed for calculating a layout is roughly quadratic w.r.t. the number of clusters.

It often occurs that people use the keyword list "as a whole", i.e. for their task it's not important which term caused the hit; just knowing that a document matches one of the terms in the list suffices. That's why we added a "Combine queries" checkbox at the bottom, which turns the keyword list into one giant OR query and produces only a single cluster.
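To make the two factors above concrete, here is a rough Python sketch. It is not Intella's implementation: the vocabulary, the `expand_wildcard` and `as_or_query` helpers, and the use of a pair count as a stand-in for "roughly quadratic" layout cost are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def expand_wildcard(pattern, vocabulary):
    """Return the indexed terms that a wildcard pattern matches (illustrative only)."""
    return sorted(term for term in vocabulary if fnmatchcase(term, pattern))


def as_or_query(pattern, vocabulary):
    """Render the wildcard expansion as one Boolean OR query string."""
    return " OR ".join(expand_wildcard(pattern, vocabulary))


def pair_count(clusters):
    """Number of distinct cluster pairs; a simple proxy for quadratic growth."""
    return clusters * (clusters - 1) // 2


# Made-up index terms; a real index would hold far more matching terms.
vocabulary = {"invoice", "invoices", "invoicing", "involve", "payment"}
query = as_or_query("invoic*", vocabulary)

# Ten times as many clusters means roughly a hundred times as many pairs.
small, large = pair_count(10), pair_count(100)
```

With these assumptions, a single wildcard term becomes an OR over every matching index term, and combining the whole list into one query keeps the cluster count at one, which avoids the quadratic part altogether.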
AdamS Posted August 13, 2013

Chris wrote: "We are asked regularly for functionality to produce cross tables, e.g. custodians x number of keyword hits. I think we should do something to better facilitate that."

This is something that I do as a matter of course for every job, mainly to show the clients the total hits (including duplicates), then pare it down to the more relevant data and remove the duplicates. I find this is a great way to show the value to our clients, i.e. instead of reading through 40,000 emails, because of this wonderful software there are now only 1,000 relevant emails to look at.

If we were able to export a list which includes these results (total plus hits for individual keywords), that would be fantastic. If the export itself was sent to Excel format with the filters enabled for each column, we would then have a nice, neat, customizable spreadsheet that we can quickly filter down to what we need, and save off copies showing the relevant information.
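Until such an export exists, the per-location CSV files described earlier in the thread could be stitched into one cross table with a short script. The Python sketch below is only a guess at the shape of the data: it assumes each export has a header row followed by three columns (keyword, total hits, hits in the selected location), and the column names, file names and numbers are invented for the example.

```python
import csv
import io


def read_export(text):
    """Parse one export: keyword, total hits, hits in the selected location."""
    rows = csv.reader(io.StringIO(text))
    next(rows)  # skip the header row (assumed to be present)
    return {kw: (int(total), int(local)) for kw, total, local in rows}


def cross_table(exports):
    """Build {keyword: {"Total": n, location: n, ...}} from per-location exports."""
    table = {}
    for location, text in exports.items():
        for kw, (total, local) in read_export(text).items():
            row = table.setdefault(kw, {"Total": total})
            row[location] = local
    return table


def to_csv(table, locations):
    """Write the cross table as CSV text, one row per keyword."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Keyword", "Total", *locations])
    for kw in sorted(table):
        counts = [table[kw].get(loc, 0) for loc in locations]
        writer.writerow([kw, table[kw]["Total"], *counts])
    return out.getvalue()


# Invented sample data standing in for two exported files.
exports = {
    "alice.pst": "Tag,Total,Selected\nfraud,12,5\ninvoice,30,10\n",
    "bob.pst": "Tag,Total,Selected\nfraud,12,7\ninvoice,30,20\n",
}
table = cross_table(exports)
report = to_csv(table, ["alice.pst", "bob.pst"])
```

The resulting CSV text opens directly in Excel, where column filters can be switched on by hand. Producing a native workbook with filters already enabled would need a spreadsheet library, which this sketch leaves out.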