
jasoncovey

Members
  • Posts

    34
  • Joined

  • Last visited

  • Days Won

    3

jasoncovey last won the day on March 19 2019

jasoncovey had the most liked content!

Profile Information

  • Gender
    Male
  • Location
    Atlanta, GA


  1. So this is a very basic task in Connect. First, you would retrieve the items tagged as YES under your potentially responsive category. Next, you would use the Show Family feature (in Connect, select all items in the table view, then right-click > Show Family...). Then, with deduplication disabled (which is important, as duplicate items that may exist across multiple families would otherwise not be selected, resulting in incomplete families), select the results that include those families (i.e. your original YES tag items, plus their families), then apply a new tag to the entirety of these items. The review batches can then be created based on this new tag, which will include the complete document families you're after. Hope that helps! Jason
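To make the logic of the family expansion concrete, here is a minimal Python sketch of the idea. It is not Intella or Connect code: the item IDs, the family IDs, and the `expand_to_families` helper are all hypothetical, and it only illustrates why every member of a family has to be pulled in once any one member carries the YES tag.

```python
from collections import defaultdict

def expand_to_families(tagged_ids, family_of):
    """Return every item whose family contains at least one tagged item."""
    # Group all items by their (hypothetical) family identifier.
    members = defaultdict(set)
    for item_id, family_id in family_of.items():
        members[family_id].add(item_id)
    # Pull in the complete family of each tagged item.
    selected = set()
    for item_id in tagged_ids:
        selected |= members[family_of[item_id]]
    return selected

# Hypothetical data: two emails, each with attachments.
family_of = {
    "email-1": "F1", "attach-1a": "F1", "attach-1b": "F1",
    "email-2": "F2", "attach-2a": "F2",
}
tagged_yes = {"attach-1a"}
review_set = expand_to_families(tagged_yes, family_of)
```

Here only one attachment was tagged, yet the resulting set contains the parent email and both attachments, which is what the new tag should cover before batching.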
  2. Although I haven't performed this specific task with CDR data, I'm familiar with it, and have performed similar tasks with all manner of misc. metadata. Right off the bat, the only issue I foresee that you might run into is the inability to map data to certain fields in Intella. Off the top of my head, I don't know of any fields that are essential to communication analysis that cannot be written to, but that's something you would need to investigate. Perhaps a mod could speak to that specific issue. That issue aside, the route to achieving this type of data import is via Intella's load file import functionality. I don't know if you have any experience in the creation of custom load files of this nature, as it's not an introductory-level task, but by creating a DAT or CSV load file, and probably some custom columns to accommodate any fields that cannot be directly mapped to certain of Intella's internal fields, you should be able to achieve what you want. I'm including a link to a presentation I did for Vound that is specific to load files, which will hopefully get you pointed in the right direction. Jason
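As a rough illustration of the CSV load file route, the following Python sketch turns a few call records into load file rows. Everything specific here is an assumption made for the example: the input keys (`caller`, `callee`, `start`, `seconds`), the column names (`DOCID`, `FROM`, `TO`, `DATE_SENT`, `CUSTOM_DURATION_SECONDS`), and the ID scheme are hypothetical and would need to be matched to your actual CDR export and to the field mapping you choose at import time.

```python
import csv
import io

# Hypothetical call detail records; real CDR exports will differ.
CDR_ROWS = [
    {"caller": "+14045550100", "callee": "+14045550199",
     "start": "2019-03-01T14:05:00", "seconds": 62},
]

def build_load_file(rows):
    """Write one load file row per call record and return the CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # Header row: the last column stands in for a custom column holding
    # data that has no obvious built-in field to map to.
    writer.writerow(["DOCID", "FROM", "TO", "DATE_SENT", "CUSTOM_DURATION_SECONDS"])
    for n, row in enumerate(rows, start=1):
        writer.writerow([f"CDR{n:06d}", row["caller"], row["callee"],
                         row["start"], row["seconds"]])
    return buf.getvalue()

load_file_text = build_load_file(CDR_ROWS)
```

Using the `csv` module rather than joining strings by hand means any value containing a comma or quote is escaped correctly.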
  3. In working with versions 2.3 and 2.3.1, I am encountering a new issue that is proving to be quite inconvenient, and think that some additional granularity with regard to permissions is the most obvious solution. I am finding it a problem that, as a case admin, I can no longer edit tags: (1) that are created by other users; or (2) that were created automatically, such as via import of a coding layout. Ideally, I would like the ability to prevent users from creating their own tags, which can be very important when working with less experienced users and keeping the Tags facet properly maintained. For example, many users don't understand nesting of tags, and their inter-relationship with coding layouts. If that were accomplished via an affirmative permission "Can create new tags," that would be fine. In addition, it's even more important that I be able to edit tags in order to enforce naming conventions, etc., and efficiently address issues that are encountered. Just today, I discovered that a user had created new tags and named them incorrectly. This should be a quick, direct fix for a user with case admin permissions, as has always been the case. Further, I have discovered that even the desktop software cannot override and permit the changes to be performed - at least when accessing the shared case via a Viewer license. ***EDIT: in performing some additional work today, I just realized that the inability to edit another user's tag introduces an additional limitation. I just had occasion to edit an existing tag that included a long tag description. Thus, I wanted to open it, copy that text, then use it as the basis for the description of a subsequent tag, only needing to modify a single word. Unfortunately, because the tag was not created by me, this is not possible via the UI. Again, this is in the context of a currently-shared case in Connect, accessed via a Viewer license rather than opening the unshared case directly with a desktop license. 
While I'm discussing tags, I would also like to add that Connect would benefit from the addition of the ability to edit the top-level tag AFTER a tag has been created, as is possible in desktop. As it stands now, if I forget to assign a parent tag at the time of the tag's creation, the only remedy is to delete the tag and re-create it. Thanks! Jason
  4. During recent reviews and in light of user feedback, I wanted to propose two coding layout improvements for a future version of Connect. The first has to do with adding a collapsing arrowhead for the top level tags, as they currently exist under the Tags facet. The use case for this has to do with the presence of long lists of coding options. Examples might be required references to corresponding Document Request numbers (a common component of production specifications), or complex issues tagging that reviewers apply during review in order to leverage in later stages of a litigation matter when document productions are complete. Allowing these lists to be collapsible would presumably make for an improved user experience, as well as make the coding panel less cluttered when these specific options are not in use, as different reviewers have different objectives at different times, etc. Another possible alternative to accommodate this might be allowing multiple, customizable tabs to be added, with certain coding layout content assigned to certain tabs. I know I have seen this approach used in some other review platforms, and it offered a reasonable way to fit more options into a smaller space. The second has to do with making better use of the available screen real estate for coding layout content. In its current iteration, there is a significant amount of blank space in the right side of the coding pane, which begs for an additional column, to display more options without requiring scrolling (which I have found to annoy users). In my estimation, the text size used in the coding layout is very generous, and larger than what I am used to seeing elsewhere. However, the value in having a second column of options outweighs that, in my mind. Regardless, perhaps having an option whether to force two columns would be another approach, perhaps in conjunction with the separate tabs idea. 
Regardless, the current iteration of the Review tab UI is the best ever, and we're looking forward to future improvements to make the most painful, expensive phase of the ediscovery process as simple and streamlined as possible for reviewers. Jason
  5. This one is actually easy. Do this: (1) pull up some items from your case; (2) highlight some for export and then right-click and select Export > selection... (you'll actually abort this operation, so don't worry about naming or the destination folder not being empty); (3) check the box for "Add to export set"; (4) then select the radio button for "Add to existing set," and then select the export set you want to delete from the drop-down menu; (5) when you select this radio button, the previously grayed-out "Remove" button will become active, which you can then click to delete the export set containing the error. Intella will give you a warning prompt to make sure you have selected the correct export set to remove. Once you proceed, Intella will irreversibly delete the export set, so just make sure you have selected the right one to delete. Hope that helps! Jason
  6. I had been thinking a bit about this question and wanted to throw out an alternative approach. Of course, it's correct that Lucene does not directly support proximity searches between phrases. However, as has been previously mentioned in a pinned post, it does allow you to identify the words included in those phrases, as they appear in overall proximity to each other. Thus, your need to search for "Fast Pace" within 20 words of "Slow Turtle" should first be translated to: "fast pace slow turtle"~20 . This search will identify all instances where these 4 words, in any order, appear within a 20 word boundary anywhere in your data set. Then, with this search in place, you can perform an additional search, applied via an Includes filter, to include your two specific phrases: "fast pace" AND "slow turtle" By doing this, you should be left with a very close approximation of the exact search you initially intended, with your results filtered to only show your exact phrase hits, but within the overall proximity boundary previously specified. Hope that helps!
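If you need to run this two-step approach for many phrase pairs, the two query strings can be generated rather than typed. The sketch below is only string building in Python; the `proximity_then_phrases` helper is a hypothetical name, and it assumes the phrases contain plain words with no characters that would need escaping in the query syntax.

```python
def proximity_then_phrases(phrase_a, phrase_b, distance):
    """Build the proximity query and the follow-up phrase filter query."""
    a, b = phrase_a.lower(), phrase_b.lower()
    # Step 1: all words from both phrases inside one proximity window.
    words = f"{a} {b}".split()
    proximity = '"' + " ".join(words) + f'"~{distance}'
    # Step 2: both exact phrases, to be applied as an Includes filter.
    phrase_filter = f'"{a}" AND "{b}"'
    return proximity, phrase_filter

proximity, phrase_filter = proximity_then_phrases("Fast Pace", "Slow Turtle", 20)
```

The first string is what you would run as the main search and the second is what you would apply as the filter, exactly as in the example above.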
  7. I think that would work! As long as we would still be able to accommodate the scenario you described in Item No. 6 in your list, that sounds like it would be a very simple solution. Jason
  8. So I read this several times today to make sure I understand everything that is being described, and I think it all sounds fantastic. This solves the majority of the problems that have been described in this thread. One additional issue came to mind when discussing internally and thinking about scenarios we have grappled with previously. In culling data and creating batched review sets, it's fairly common to run into a situation where, as a result of the entire content of a ZIP archive being included, or due to false positive search hits, a large portion, or even the entirety, of a batch can be determined to be non-responsive without a full, document-by-document review. In these situations, we are frequently asked if there is a way to bulk-code the documents as non-responsive and mark the batch as complete. This presents a problem because, although we train users to query for items into the main previewer where the bulk tag can then be applied, the limitation is that this action is not recorded back in the Review UI because: (1) check marks are not added to the now-coded documents; and (2) the batch is not marked complete as a result. What this results in looks like the following screenshots. First, in the All Batches view, despite the appearance of the progress, every single document has, in fact, been tagged (and I know there is a difference between tagged and coded) with either the responsive or non-responsive tag, from the same coding palette. Of course, end users can't begin to understand how "completed" work could look like this, and ask all kinds of questions that we can't really answer other than to say, ignore what you are seeing, everything is actually fine despite what you see. That doesn't go over well with lawyers! By the same token, people LOVE the progress and status data, it's the only such data Connect provides us, so it would obviously be ideal if it could be as accurate as possible and avoid ever being misleading. 
In the next screenshot, which was taken inside of the Wave 03 Email-7 batch, you can see what we are talking about. As it turns out, all of the spreadsheets are non-responsive, and there are literally hundreds of them, which exist across over two dozen batches. Since they can be identified as non-responsive at a glance, without reviewing at all, we can't spend the time coding each one individually. Therefore, we have to query for items and bulk tag in the main previewer for sake of efficiency. Unfortunately, we have found this to be a very common scenario across dozens of litigation matters, and have to have a way to address it. With all that explained, regarding No. 3 in your list, are you using the term "Closed" to mean the batch will be marked as "Completed" via the green button? In other words, this could be used by someone assigned this permission to address item batches that are not technically "completed" under the current definition? Another question is, when you are saying Closed, will the percentage also move to 100, or will it stay where it is, but just display the green Completed button? I know that our preference would definitely be for 100% due to the issues already described. On the subject of batching, generally, the only other thing we're in desperate need of is the ability to batch documents in sort orders OTHER THAN Family Date. This is particularly the case with incoming load file productions, which may or may not contain adequate metadata for Intella to calculate perfect family dates, or when such a production is not IN any particular date order, which definitely happens all the time. This puts us in a position of not being able to batch the documents in Bates-numbered order, which then breaks up document families and creates an extremely difficult situation for us to resolve. Hopefully that was instructive, and I'm looking forward to seeing these features make it into a future version of Connect! Jason
  9. With regard to 64 GB RAM, which I have installed on a high-end Dell rackmount physical workstation with dual Xeon E5-2600 v4 processors, for 32 total cores, I have not been able to realize the performance I had hoped for in a machine dedicated to processing performance. Not that it was bad - far from it! It's just that I was thinking that better use could be made of the RAM and number of processor cores. In reality, despite having 10K RPM internal, enterprise class, SAS rotational drives and a 15K RPM system drive, it seems like the disk IO simply cannot supply enough throughput to make effective use of that degree of CPU and RAM. I wish I had instead opted for SSD drives, which were more expensive at the time than they are now. The only way I have improved performance with this setup was when we filled out the remaining internal drive bays with 12 Gb/sec 10K RPM drives (vs. their 6 Gb predecessors). I think the information that Primoz has provided is directly in line with my own experiences, and generally cautions that the investment in massive RAM and CPU may not result in the kind of performance increases you might hope for (like I did). That said, if I was in your situation, I would go for the fastest SSDs I could get, probably go with the less expensive processor, and 32 GB RAM, and do some benchmarking vs. your current machines, while monitoring RAM usage. If you need more, you can presumably expand, provided you configure the machine in such a way that you have open slots. Hope that helps some with your decision. Good luck! Jason
  10. I think what Todd is likely referring to is a Relativity-centric concept rooted in the so-called search term report (STR), which calculates hits on search terms differently than Intella. I know I have communicated about this issue in the past via a support ticket, and created such a report manually in Intella, which is at least possible with some additional effort involving keyword lists, exclusion of all other items in the list, and recording the results manually. What the STR does is communicate the number of documents identified by a particular search term, and no other search term in the list. It is specifically defined as this: Unique hits - counts the number of documents in the searchable set returned by only that particular term. If more than one term returns a particular document, that document is not counted as a unique hit. Unique hits reflect the total number of documents returned by a particular term and only that particular term. I have been aware of this issue for years, and although I strongly disagree regarding the value of such data as presented in the STR (and have written about it extensively to my users), the fact is that, in ediscovery, groupthink is extremely common. The effect is that a kind of "requirement" is created that all practitioners must either use the exact same tools, or that all tools are required to function exactly the same (which I find to be in stark contrast to the forensics world). I actually found myself in a situation where, in attempting to meet and confer with an opposing "expert," they were literally incapable of interpreting the keyword search results report we had provided because it was NOT in the form of an STR. In fact, they demanded we provide one, and to such an extent that we decided that the most expedient course of action was just to create a new column that provided those numbers (whether they provided any further insight or not). So in responding to Jon's question, I believe the answer is NO. 
In such a case, within the paradigm of the STR, a document that contains 5 different keywords from the KW list would actually be counted ZERO times. Again, what the STR does is communicate the number of documents identified by a particular search term, and no other search term in the list. I think it's a misleading approach with limited value, and is a way to communicate information outside of software. Further, and perhaps why it actually exists, is that it sidesteps the issue of hit totals in columns that add up to more documents than the total number identified by all search criteria. In other words, it doesn't address totals for documents that contain more than one keyword. This is in contrast to the reports Intella creates, where I am constantly warning users not to start totaling the columns to arrive at document counts, as real world search results almost inevitably contain huge numbers of hits for multiple terms per document. Instead, I point them to both a total and unique count, which I manually add to the end of an Intella keyword hit report, and advise them that full document families will increase this number if we proceed to a review set based on this criteria. Hopefully that clarified the issue and provided a little more context to the situation! Jason
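To show how the two kinds of numbers relate, here is a small Python sketch that computes per-term hits, STR-style unique hits, and the true total document count from a mapping of terms to the documents they hit. The terms, document IDs, and the `hit_report` helper are hypothetical; this is not how Intella or Relativity compute their reports internally, only an illustration of the definition quoted above.

```python
from collections import Counter

def hit_report(hits_by_term):
    """Return per-term hit and unique-hit counts plus the total document count."""
    # How many different terms hit each document.
    terms_per_doc = Counter(doc for docs in hits_by_term.values() for doc in docs)
    report = {}
    for term, docs in hits_by_term.items():
        # A unique hit is a document returned by this term and no other term.
        unique = sum(1 for doc in docs if terms_per_doc[doc] == 1)
        report[term] = {"hits": len(docs), "unique_hits": unique}
    return report, len(terms_per_doc)

# Hypothetical results: D3 is hit by all three terms.
hits_by_term = {
    "merger": {"D1", "D2", "D3"},
    "invoice": {"D3", "D4"},
    "turtle": {"D3"},
}
report, total_docs = hit_report(hits_by_term)
```

In this example the hits column sums to 6 even though only 4 documents were found, and D3, which matches every term, contributes to no term's unique count.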
  11. I don't really have an answer based on what you have described, but here is what I would do:
     • Perform some proximity searches to see what those identify (e.g. "JR 0000"~3)
     • Go to the Words tab for some of the documents at issue and see if the search text appears there, and if so, in what fields
     • Take a close look at the native files and investigate the presence of formulas that might be causing the issue
     • Make sure the items aren't categorized as having some kind of processing error
     • I see that you mentioned Connect. Although it shouldn't be an issue, if possible, I would attempt to duplicate the issue in the Intella desktop software
     • As a last resort, if you're seeing the text but not finding anything, you might want to export their extracted text and see if they contain the text you're after. If so, you could either search those with another tool, or with Intella, and then tag those items in the original database via MD5, etc.
     If you're still coming up empty after all that, the support team would probably be interested in examining a sample file to investigate further. Hope that helps! Jason
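For the last-resort option of searching exported extracted text with another tool, a few lines of Python are often enough. This sketch assumes the exported text is a folder of UTF-8 `.txt` files whose names carry an identifier (for example the item's MD5 hash) that lets you find the items again in the original case; the file names, the sample contents, and the `find_text_hits` helper are all hypothetical.

```python
import re
import tempfile
from pathlib import Path

def find_text_hits(folder, pattern):
    """Return the names of .txt files in folder whose content matches pattern."""
    regex = re.compile(pattern)
    return sorted(
        path.name
        for path in Path(folder).glob("*.txt")
        if regex.search(path.read_text(encoding="utf-8", errors="replace"))
    )

# Demonstration on two throwaway files standing in for exported text.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "a1b2.txt").write_text("Ref JR 0000 approved", encoding="utf-8")
    Path(folder, "c3d4.txt").write_text("nothing relevant", encoding="utf-8")
    hits = find_text_hits(folder, r"JR\s*0000")
```

The resulting list of file names is what you would then use to locate and tag the corresponding items.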
  12. So Intella doesn't have any rules-based features to accommodate exactly what you're asking. However, if I'm understanding you correctly, the simplest solution is to revise the coding palette to make the Privilege parent tag not required. That way, if the doc is non-responsive (aka Not Relevant), they can apply that tag, then move on. It IS a good idea to make the responsiveness tag required. Another good practice is to make that tag with radio buttons so that multiple selections are not possible. It never ceases to amaze us how many documents are coded as both responsive and non-responsive at the completion of a review. An example of what I'm talking about is shown below. In my experience, mandatory tags should be used very sparingly, as they quickly frustrate reviewers, which appears to be the case here. Conversely, it's up to reviewers to code documents accurately. Thus, if a document is tagged as non-responsive, no additional effort is warranted, and they move on to the next document. If they happen to see that the document also contains privileged content, even if non-responsive, they could tag it. However, for sake of efficiency, the presumption has to be made that a document, if not explicitly tagged as privileged, is inherently understood to be not privileged. Thus, there is no need to require the addition of a tag that states the obvious. Unsure would be the only exception in this scenario. Again, it's a primary responsibility of the reviewer to code documents accurately, and there is no substitute for their attention to detail. Also, since it sounds like this is a review for a legal proceeding, you might want to take the privilege tag a step further and provide two options (assuming that the standard categories for ediscovery would apply here): Privileged - attorney-client, and Privileged - work product. 
That way, if they are required to create a privilege log down the road, they will have captured their specific assessment of the privilege type at the time it was clear in their mind. These can then be included in a tag group as part of a CSV export of the metadata, and provide a giant head start in the creation of their future privilege log. Hopefully that explanation will be somewhat helpful for you!
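If the palette was not set up with radio buttons, conflicting coding can still be caught after the fact from a CSV export. The Python sketch below assumes a hypothetical export layout with an `ItemID` column and a `Tags` column whose values are separated by semicolons; the column names, the separator, and the tag names are assumptions to adapt to your own export.

```python
import csv
import io

# Hypothetical CSV export of item IDs and their tags.
EXPORT = """ItemID,Tags
1001,Responsive
1002,Responsive;Non-Responsive
1003,Non-Responsive;Privileged - work product
"""

def conflicting_items(csv_text, exclusive=("Responsive", "Non-Responsive")):
    """Return IDs of items carrying more than one mutually exclusive tag."""
    conflicts = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        tags = {tag.strip() for tag in row["Tags"].split(";")}
        if len(tags & set(exclusive)) > 1:
            conflicts.append(row["ItemID"])
    return conflicts

conflicts = conflicting_items(EXPORT)
```

Running a check like this before a production gives you a short list of documents to send back for re-coding.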
  13. This is a particularly significant and ongoing issue for my users, as well. Although the simplified Review UI has been universally well-received by reviewers, and is now our default approach for even single-user document reviews, a lack of flexibility has created several challenges. It's difficult for a reviewer to understand why they can't return to a document batch they previously reviewed and make tagging changes based on new information that has come to light, which is totally normal given the constantly-moving goalpost in litigation and ediscovery (as well as the other contexts mentioned previously). In fact, it's not uncommon for certain documents to be subject to multiple coding changes as new facts and information become available during the course of discovery. I have recently provided Jon with some extremely detailed explanations with respect to these issues, which I hope will be helpful in making Connect more flexible for these types of workflows.
  14. Hello! I thought that your questions could be best addressed visually, so I took a few screenshots of the load file export dialogues that address the settings you're interested in. With regard to providing TIFF and PDF, you simply need to check the option for "Also include PDF versions of images" in the load file options dialogue. I always specify another location for these files, as the scenario you're describing is not uncommon with productions in litigation matters. This provides maximum flexibility in your scenario. With regard to file naming, although not incorrect at a technical level, the options you are describing aren't really the industry standard with regard to ediscovery, and will likely cause confusion. It's just not what they're expecting to see, so it will likely raise questions, and you don't want to have to provide that type of explanation. Check out what I did in the file naming and numbering dialogue, which includes the syntax for what you're after. Specify your prefix, use the syntax to specify the number of digits, and move forward. Everything happens in that one text field, under the advanced setting. Hope that helps! P.S. Be sure to create an export set with each production, and save the log in at least CSV format as SOP. Jason Covey
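For anyone unfamiliar with the prefix-plus-fixed-digits convention, this short Python sketch shows what the resulting numbers look like. It does not reproduce Intella's naming syntax from the dialogue; the `bates_numbers` helper, the `ABC` prefix, and the seven-digit width are purely illustrative.

```python
def bates_numbers(prefix, start, count, digits):
    """Return sequential document numbers with a prefix and zero-padded digits."""
    return [f"{prefix}{n:0{digits}d}" for n in range(start, start + count)]

# Three numbers with a hypothetical prefix and seven digits.
numbers = bates_numbers("ABC", 1, 3, 7)
```

Zero-padding to a fixed width is what keeps the files sorting correctly by name in any file browser or review tool.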
  15. So you are correct that Intella cannot process PDF Portfolios. It can extract neither the individual PDFs that make up the Portfolio, nor the native file attachments to the individual PDF-converted emails (if that's the manner in which the PDFs were created). Although there are some workarounds, they are pretty complicated depending on how far you want to take things in order to restore proper functionality. Before you set off on such a journey, not knowing the context of the production, if metadata was to be provided with the production, you would certainly be better off to go back to the producing party and ask them to produce again in a more accessible format. Assuming that's not an option, you'll want to check out these two Adobe Acrobat plug-ins from EverMap: http://evermap.com/AutoPortfolio.asp and http://evermap.com/AutoSplit.asp The former provides the most advanced functionality for working with PDF Portfolios, whereas the latter's is limited, but also includes a number of other features. The main problem you're going to run into involves metadata. If you need to transform the production into a fully functional ESI data set in Intella, it requires the tedious creation of a custom load file. Although I've done it a few times, if you don't have extensive experience with from-scratch load file creation, it wouldn't be realistic to go down that road. Nonetheless, with enough effort, some creative RegEx searching and data manipulation, it IS possible. A middle ground approach might be this: use one of the two aforementioned tools to extract the PDF-converted emails and any native attachments to a folder. Although the file naming options aren't unlimited, you can achieve something that retains the document order/hierarchy with numeric prefixes. Hopefully the producing party was kind enough to create the portfolio in some kind of chronological order, which would then be preserved by this process. 
With that done, you could then just process the resulting files into Intella as a folder source, where proper sorting will be achieved by file name. Of course, this won't give you accurate family dates or file types or permit full functionality of Intella's Tree view or parent-child tracking, all of which would require the load file route. In a perfect world, Intella would support every possible file type. However, in this case, I'm really on the fence about whether this is worth the effort given that: (1) it's a very rare production "format"; and (2) it's arguably not a legitimate production format in that it makes essential metadata inaccessible. That being the case, I would rather see the dev team working on what I think are some much higher priority features. Still, in your case, in light of the amount of work that's required to make a PDF Portfolio production of email functional within an ESI platform, as well as the lack of accessible metadata (you basically have to extract it from the body text of each individual "email"), you would be in a strong position to ask the opposing party to re-produce the data in a format that is reasonably accessible. And the larger the Portfolio or email volume, the stronger that position would be. Hope that helps! Jason Covey
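To give a sense of the RegEx work involved in the load file route, here is a minimal Python sketch that pulls header fields out of the text of one PDF-converted email. The header layout (`From:`, `Sent:`, `To:`, `Cc:`, `Subject:` at the start of a line) and the sample message are assumptions; real productions vary, and the pattern would need adjusting to whatever the converting software actually printed.

```python
import re

# Assumed header layout: "Name: value" at the start of a line.
HEADER = re.compile(r"^(From|Sent|To|Cc|Subject):[ \t]*(.*)$", re.MULTILINE)

def extract_headers(body_text):
    """Return the first value found for each recognized header field."""
    fields = {}
    for name, value in HEADER.findall(body_text):
        # Keep the first occurrence so quoted replies lower down are ignored.
        fields.setdefault(name, value.strip())
    return fields

# Hypothetical text extracted from one PDF-converted email.
SAMPLE = """From: Alice Smith <alice@example.com>
Sent: Monday, March 4, 2019 9:15 AM
To: Bob Jones <bob@example.com>
Subject: Q1 forecast

Please see the attached figures.
"""
headers = extract_headers(SAMPLE)
```

Each resulting dictionary would become one row of the custom load file, which is why the effort scales with the size and inconsistency of the production.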