
Leaderboard


Popular Content

Showing content with the highest reputation since 02/25/2019 in all areas

  1. 1 point
    Hello Bryan, Please try running the installer like this: setup-intella...exe /S It will run the installer in the background and install Intella in the default location. Some windows will still briefly open and close when certain settings are made, but no user interaction is necessary. Note: we have not tested this switch a lot and therefore we do not officially support it. It worked fine on my system though and I am quite confident that it will work on other systems.
  2. 1 point
    Has anyone successfully imported a Slack Enterprise messaging archive into Intella? It is in JSON format. Thanks for the help.
  3. 1 point
    We raised this requirement before too. It would be critical for Intella to use the Slack API with Legal-Hold privileges to select and pull data from Slack. Slack has become very big. So, count our vote on this too please. For API reference see: https://api.slack.com/
  4. 1 point
    I had been thinking a bit about this question and wanted to throw out an alternative approach. Of course, it's correct that Lucene does not directly support proximity searches between phrases. However, as has been previously mentioned in a pinned post, it does allow you to identify the words included in those phrases, as they appear in overall proximity to each other. Thus, your need to search for "Fast Pace" within 20 words of "Slow Turtle" should first be translated to: "fast pace slow turtle"~20. This search will identify all instances where these 4 words, in any order, appear within a 20 word boundary anywhere in your data set. Then, with this search in place, you can perform an additional search, applied via an Includes filter, to include your two specific phrases: "fast pace" AND "slow turtle". By doing this, you should be left with a very close approximation of the exact search you initially intended, with your results filtered to only show your exact phrase hits, but within the overall proximity boundary previously specified. Hope that helps!
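    The two-step approach above can be sketched as a small helper that builds both query strings. This is only an illustration of the string construction: the function name is hypothetical, nothing here is an Intella API, and the resulting strings would still be pasted into the search bar and the Includes filter by hand.

```python
def phrase_proximity_queries(phrase_a, phrase_b, distance):
    """Return (proximity_query, include_filter) for two phrases.

    Hypothetical helper: it only builds the two query strings
    described in the post above.
    """
    a = phrase_a.lower()
    b = phrase_b.lower()
    # Step 1: all words of both phrases, in any order, within `distance` words.
    proximity = '"{} {}"~{}'.format(a, b, distance)
    # Step 2: both exact phrases must be present (applied as an Includes filter).
    include_filter = '"{}" AND "{}"'.format(a, b)
    return proximity, include_filter


proximity, include_filter = phrase_proximity_queries("Fast Pace", "Slow Turtle", 20)
print(proximity)       # "fast pace slow turtle"~20
print(include_filter)  # "fast pace" AND "slow turtle"
```

    As the post notes, the combination is an approximation: an item can contain both phrases and also contain the four words close together somewhere else.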
  5. 1 point
    Hi John, That's strange though because we kept on searching and we found that we were able to use RegEx to search for properties using a different syntax in the search bar. If we surround the RegEx with a leading and a trailing forward slash "/", the RegEx expression also found hits in the properties.
  6. 1 point
    I think what Todd is likely referring to is a Relativity-centric concept rooted in the so-called search term report (STR), which calculates hits on search terms differently than Intella. I know I have communicated about this issue in the past via a support ticket, and created such a report manually in Intella, which is at least possible with some additional effort involving keyword lists, exclusion of all other items in the list, and recording the results manually. What the STR does is communicate the number of documents identified by a particular search term, and no other search term in the list. It is specifically defined as this: Unique hits - counts the number of documents in the searchable set returned by only that particular term. If more than one term returns a particular document, that document is not counted as a unique hit. Unique hits reflect the total number of documents returned by a particular term and only that particular term. I have been aware of this issue for years, and although I strongly disagree regarding the value of such data as presented in the STR (and have written about it extensively to my users), the fact is that, in ediscovery, groupthink is extremely common. The effect is that a kind of "requirement" is created that all practitioners must either use the exact same tools, or that all tools are required to function exactly the same (which I find to be in stark contrast to the forensics world). I actually found myself in a situation where, in attempting to meet and confer, an opposing "expert" was literally incapable of interpreting the keyword search results report we had provided because it was NOT in the form of an STR. In fact, they demanded we provide one, to such an extent that we decided the most expedient course of action was just to create a new column that provided those numbers (whether they provided any further insight or not). So in responding to Jon's question, I believe the answer is NO.
    In such a case, within the paradigm of the STR, a document that contains 5 different keywords from the KW list would actually be counted ZERO times. Again, what the STR does is communicate the number of documents identified by a particular search term, and no other search term in the list. I think it's a misleading approach with limited value, and is a way to communicate information outside of software. Further, and perhaps why it actually exists, is that it sidesteps the issue of hit totals in columns that add up to more documents than the total number identified by all search criteria. In other words, it doesn't address totals for documents that contain more than one keyword. This is in contrast to the reports Intella creates, where I am constantly warning users not to start totaling the columns to arrive at document counts, as real world search results almost inevitably contain huge numbers of hits for multiple terms per document. Instead, I point them to both a total and unique count, which I manually add to the end of an Intella keyword hit report, and advise them that full document families will increase this number if we proceed to a review set based on these criteria. Hopefully that clarified the issue and provided a little more context to the situation! Jason
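    To make the difference between per-term hits and STR-style unique hits concrete, here is a minimal sketch. The terms and document IDs are invented for illustration, and the function is hypothetical: it is not how Intella or Relativity compute their reports, only the counting rule quoted above applied to sets of document IDs.

```python
from collections import Counter


def hit_report(hits_by_term):
    """hits_by_term maps each search term to the set of document IDs it returns.

    Returns (per-term report, number of distinct documents hit by any term).
    """
    # How many different terms returned each document.
    terms_per_doc = Counter()
    for docs in hits_by_term.values():
        terms_per_doc.update(docs)

    report = {}
    for term, docs in hits_by_term.items():
        # Unique hit: a document returned by this term and by no other term.
        unique = {doc for doc in docs if terms_per_doc[doc] == 1}
        report[term] = {"hits": len(docs), "unique_hits": len(unique)}
    return report, len(terms_per_doc)


# Invented example data: document 3 is returned by all three terms.
hits = {"contract": {1, 2, 3}, "invoice": {3, 4}, "merger": {3}}
report, total_docs = hit_report(hits)
for term, counts in report.items():
    print(term, counts)
print("distinct documents:", total_docs)
```

    In this example the per-term hit column adds up to 6 even though only 4 distinct documents were found, and document 3, which matches every term, is counted as a unique hit for none of them.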
  7. 1 point
    I guess in the future I could select each of the individual MBOXs from the IMAP collection except the ALL MAIL MBOX, index the collection, and then add the ALL MAIL MBOX in as a second step. Anything that was a duplicate in ALL MAIL would be duped out. As a workaround, I showed the "duplicates" column in the listing pane, sorted based on location and tagged for export any item in the ALL MAIL location that did not show a duplicate, but did not tag any item that did show a duplicate. All other relevant items from other Gmail 'folders' were tagged and all tagged items were exported.
  8. 1 point
    In the ediscovery world, we are bombarded by both vendors and developers heralding the promise of advanced text analytics capabilities to effectively and intelligently reduce review volumes. First it was called predictive coding, then CAR, then TAR, then CAL, and now it's AI. Although Google and Facebook and Amazon and Apple and Samsung all admit to having major hurdles ahead in perfecting AI, in ediscovery, magical marketing tells us that everyone but me now has it, that it's completely amazing and accurate and that we are Neanderthals if we do not immediately institute and trust it. And all this happened in a matter of months. It totally didn't exist, and now it apparently does, and these relatively tiny developers have mastered it when the tech giants have not. Back in reality, I have routinely achieved with Intella that which I'm told is completely impossible. As Intella has evolved its own capabilities, I have been able to continually evolve my processes and workflows to take full advantage of its new capabilities. As a single user, and with Intella Pro, I have been able to effectively cull data in data sets up to 500 GB into digestible review sets, from which only a far smaller number of documents are actually produced. PSTs, OSTs, NSFs, file share content, DropBox, 365, Gmail, forensic images - literally anything made up of 1s and 0s. These same vendors claim I cannot and should not be doing this, it's not possible, not defensible, I need their help, etc. My response is always: in using Intella with messy, real-world data in at least 300 litigation matters, why has there never been a single instance where an opposing party produced a key document that was also in our possession, in Intella, but that we were unaware of? Of course, the answer is that, used to its fullest, with effectively designed, iterative workflows and QC and competent reviewers, Intella is THAT GOOD.
    In the process, I have made believers out of others, who had no choice but to reverse course and accept that what they had written off as impossible was in fact very possible, when they sat there and watched me do it, firsthand. However, where I see the greatest need for expanded capabilities with Intella is in the area of more advanced text analytics, to further leverage both its existing feature set, and the quality of Connect as a review platform. Over time, I have seen email deduplication become less effective, with the presence of functional/near duplicates plaguing review sets and frustrating reviewers. After all, the ediscovery marketing tells them they should never see a near duplicate document, so what's wrong with Intella? You told us how great it is! The ability to intelligently rank and categorize documents is also badly needed. I realize these are the tallest of orders, but after hanging around as Intella matured from version 1.5.2 to the nearly unrecognizable state of affairs today (and I literally just received an email touting AI for law firms as I'm writing this), I think that some gradual steps toward these types of features are no longer magical thinking. Email threading was a great start, but I need improved near duplicate detection. From there, the ability to identify and rank documents based on similarity of content is needed, but intelligently - simple metadata comparison is no longer adequate with ever-growing data volumes (which Intella can now process with previously unimaginable ease). So that's my highest priority wishlist request for the desktop software, which we see and use as the "administrative" component of Intella, with Connect being its review-optimized counterpart. And with more marketing materials touting the "innovative" time saving of processing and review in a unified platform, I can't help but think to respond, "Oh - you mean like Intella has been from the very first release of Connect?"
Would love to hear others share their opinions on this subject, as I see very little of this type of thing discussed here. Jason
  9. 1 point
    Introduction

    We receive numerous support tickets from our customers asking for advice on using proximity searches. The user manual provides the basic syntax, and there is additional information in these forum posts:

    http://community.vound-software.com/index.php?/topic/245-proximity-search-using-more-than-two-words/?hl=prox%2A
    http://community.vound-software.com/index.php?/topic/359-proximity-search-with-a-phrase-search/?hl=proximity

    In most cases we are provided with examples of the syntax which the customer has used. In some cases the syntax is very complex and, often, the syntax is incorrect. Some customers ask us whether the syntax is correct or ask why their proximity search is not working. This is something that we cannot answer on an individual basis. The point of this document is to provide examples that help our customers get a better understanding of proximity search syntax, so that they can create the correct syntax for the search that they want to perform.

    Note: Most of this information applies to all versions of Intella which support proximity searching. There is a known issue with hit highlighting in versions prior to 1.9.1. We recommend that you update to version 1.9.1 if you encounter this issue.

    What is a proximity search?

    A proximity search is a search query specifically crafted to find items based on words that are within a specified maximum distance from each other in the item's text. For example, if I wanted to find all items that have the words 'desktop' and 'application' within 10 words of each other, then I would use the following:

    "desktop application"~10

    A proximity search differs from a phrase search in that it does not matter whether 'desktop' is before or after the term 'application' in the text. For example, documents containing either of the passages of text below will be responsive to the proximity search above.

    "You must turn on your desktop computer before you can open an application."
    "I have copied the shortcut for the application onto the desktop."

    Using the Correct Proximity Syntax

    As mentioned above, we receive proximity search syntax from customers. A lot of the time we see that the customer has created search strings such as the examples provided below:

    (Baxter Jason) ~20 (article) OR (paper) OR (presentation) OR (public) OR (report)
    "national OR fire OR service"~30 (truck) OR (department)

    These examples have been sanitized and shortened; the original search strings contained several lines of OR statements. This makes the search string complex, cumbersome, prone to errors and difficult to troubleshoot.

    Example 1

    If we look at the first example above, we can see immediately that there are several issues which make this syntax incorrect. One issue is that the terms to be searched are not encased in double quotes. Another issue is that the number of words to be within (~20 in this case) is not at the end of the proximity search syntax, as there are several OR statements after this number. The user manual shows a basic example of the syntax: "desktop application"~10. Note that the structure is to have two (or more) search terms encased in double quotes, followed by the number of words that the terms must be within. The proximity string can be made more useful for larger queries by adding more search terms. The additional search terms need to be separated by the OR operator and encased in parentheses. For example, the first example above could be rewritten this way:

    "(Baxter OR Jason) (article OR paper OR presentation OR public OR report)"~20

    Because the user is looking for one of two terms within 20 words of one of several other terms, we have grouped the keywords by placing them in parentheses and separating the terms with the OR operator, e.g. (Baxter OR Jason) and (article OR paper OR presentation OR public OR report).
    Note: All of the search terms are still encased in double quotes, followed by the number of words that the terms must be within. This syntax will return any items where Baxter or Jason is within 20 words of article, paper, presentation, public or report.

    Example 2

    Again we see that there are issues with the search syntax in example 2. This time double quotes are used; however, they do not encase all of the search terms. Also, we see a similar trend to example 1, where there are several search terms within parentheses and separated by the OR operator. We see a lot of samples like this and wonder whether this format of proximity search has come from another tool. The way I read this example is as follows: find all items that have national, fire, or service within 30 words of truck or department. The syntax can be rewritten this way:

    "(national OR fire OR service) (truck OR department)"~30

    Again we use the parentheses to group the search terms into the two groups and make sure that all terms are encased in double quotes.

    Limitations

    Because the double quotes need to encase all of the search terms, you cannot have a search phrase within a proximity search. A search phrase would require double quotes, and you can't have nested double quotes within a proximity search. That said, you can use phrases in keyword lists (see below).

    In the past we have been provided with proximity search strings where the syntax contained over 40 words separated by the OR operator. As mentioned above, this format is not correct. Even if we corrected the syntax, 40 words in a proximity search makes the search string complex, cumbersome, prone to errors and difficult to troubleshoot. We have also received extremely long search syntax where all search terms contained wildcards. Such complex queries with many wildcards are known to have very poor performance, especially for hit highlighting in the Previewer window.
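    The grouped syntax used in the rewritten Examples 1 and 2 can be sketched as a small string builder. This is a hedged illustration only: the function names are hypothetical and not part of Intella, and the check simply mirrors the limitation above that phrases cannot be nested inside a proximity search.

```python
def or_group(terms):
    """Join single-word terms into a parenthesized OR group."""
    for term in terms:
        # A phrase would need nested double quotes, which a proximity
        # search does not allow, so reject anything that is not one word.
        if '"' in term or " " in term:
            raise ValueError("not a single word: {!r}".format(term))
    return "(" + " OR ".join(terms) + ")"


def grouped_proximity(left_terms, right_terms, distance):
    """Build "(a OR b) (c OR d)"~N from two lists of single-word terms."""
    return '"{} {}"~{}'.format(or_group(left_terms), or_group(right_terms), distance)


example_1 = grouped_proximity(
    ["Baxter", "Jason"],
    ["article", "paper", "presentation", "public", "report"],
    20,
)
example_2 = grouped_proximity(["national", "fire", "service"], ["truck", "department"], 30)
print(example_1)
print(example_2)
```

    Building the string from two lists keeps the quotes and the distance in the right place even when the lists grow, although very long groups still run into the performance concerns described above.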
    Workarounds

    There are a couple of methods one could use to manage complex proximity searches that contain a large number of search terms separated by the OR operator. One is to break down the search string, and the other is to use keyword lists.

    Breaking down the search string

    A complex search string can be broken down into several shorter proximity search strings. The shorter search strings are then placed into a keyword list, e.g.

    "Baxter article"~20
    "Baxter paper"~20
    "Baxter presentation"~20
    "Baxter public"~20
    "Baxter report"~20

    Intella will be able to process the list of shorter proximity searches more efficiently than one large complex search string. With a small amount of Excel work you can create a keyword list that includes all of your shortened proximity searches in a single list.

    Using keyword lists

    The idea behind using keyword lists is to reduce the number of items that your proximity search needs to search across. Two keyword lists can be created: one list which contains the search terms in the left group of a proximity search, and a second list which contains all the other terms in the right group, e.g.

    Keyword list 1    Keyword list 2
    Baxter            article
    Jason             paper
                      presentation
                      public
                      report

    Next, run the two keyword lists and Tag the overlapping cluster. This cluster will contain the items that have search terms from both keyword lists. Set this Tag as an 'Include' search and run the proximity search. This provides faster searching as you are not searching over the entire dataset. However, be aware that hit highlighting can still be slow or hang Intella if the proximity search is complex and contains wildcards.

    The advantage of using keyword lists is that you can use the following types of searches and operators:

    • Wildcards (article*, paper* etc.)
    • Phrases ("national fire", "fire service" etc.)
    • Other search operators
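    As an alternative to the Excel work mentioned under "Breaking down the search string", the list of short proximity searches can be generated with a few lines of script. This is a minimal sketch with a hypothetical function name. It assumes the keyword list is simply one query per line, so check the output against what your version of Intella expects before loading it.

```python
from itertools import product


def pairwise_proximity_list(left_terms, right_terms, distance):
    """Return one short proximity query for every left/right term pair."""
    return [
        '"{} {}"~{}'.format(left, right, distance)
        for left, right in product(left_terms, right_terms)
    ]


queries = pairwise_proximity_list(
    ["Baxter", "Jason"],
    ["article", "paper", "presentation", "public", "report"],
    20,
)
# One query per line (an assumed keyword list layout).
print("\n".join(queries))
```

    The number of queries is the product of the two list sizes (10 here), so two lists of 40 terms each would already produce 1,600 lines.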
  10. 1 point
    As an update, a customer has provided a solution where the screen resolution for the entire PC does not need to be changed. You can actually set Intella to not use the high resolution settings (thanks Chad). The solution is:
    1) Right click on the Intella shortcut and select Properties.
    2) Click on the Compatibility tab.
    3) Check the option to 'Override High DPI scaling behavior' and select the 'System' option from the dropdown (see the screenshot below).
  11. 1 point
    We have recently considered a new deployment scenario for CONNECT. It turned out not to be viable as it would require the purchase of many more Microsoft server CALs and other Microsoft licenses at significant cost. Hence I wanted to raise the question of what it would take to have the CONNECT server run on Linux instead of Windows (excluding index creation). As it is a Java application, it would seem to be portable (possibly with loss of functionality such as PST creation). Any thoughts?