
Bulldawg

Members
  • Posts

    13

Everything posted by Bulldawg

  1. Jon and Alex, Thank you both. I cannot believe I missed the export words option. I did look at the content analysis facet, but the problem I have with it is that it does not take into consideration the OCR'd files I've imported. It showed one fewer SSN hit than I had before I imported the OCR'd files, and it still shows that same count after the import, which is about 200 fewer than the regular expression search now returns.
  2. We have a case in which we've identified PII was stolen. I have the e-mails and documents in Intella and I have used the (experimental) regular expression searches to identify social security numbers and potential credit card numbers contained within the items. My next step is to dedupe the results. Not deduplicate items, but regular expression hits. For instance, the same social security number may appear in multiple different e-mails, but that's only one person to notify their information has been breached. My thought is to simply export a list of the text that matches the regular expression, but I do not see a way to do this. I would then remove duplicate SSNs using an external tool, like Excel. Is it possible to export the words that hit on a regular expression search? My fall back option is to export the entire index of words and use grep (or something similar) to pull out the SSNs and credit card numbers and then dedupe that using another tool. I have not been able to determine a way to do this. Does anyone know how to pull the words out of the index? Thanks for your help.
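A minimal sketch of the external dedupe step described in this post, assuming the matching text has already been exported to plain text. The pattern and the `unique_ssns` helper are illustrative assumptions and not part of Intella; the pattern only covers the common 3-2-4 digit layouts and will also match nine-digit numbers that are not SSNs.

```python
import re

# Assumed pattern: SSNs written as 123-45-6789, 123 45 6789, or 123456789.
SSN_PATTERN = re.compile(r"\b(\d{3})[- ]?(\d{2})[- ]?(\d{4})\b")


def unique_ssns(text):
    """Return the distinct SSN-like strings in text, normalized to NNN-NN-NNNN.

    First-seen order is preserved so the result is easy to compare with the source.
    """
    seen = set()
    ordered = []
    for area, group, serial in SSN_PATTERN.findall(text):
        normalized = f"{area}-{group}-{serial}"
        if normalized not in seen:
            seen.add(normalized)
            ordered.append(normalized)
    return ordered


# Made-up sample text standing in for an export of regex hits.
sample = (
    "Employee SSN 123-45-6789 on file.\n"
    "Fwd: SSN 123 45 6789 again, plus 987-65-4321.\n"
)
print(unique_ssns(sample))
```

Normalizing before deduplicating means the same number written with dashes, spaces, or no separator counts as one person to notify.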
  3. I ran a backup with 1.7.1 this morning. That took 1:48 (an hour and forty-eight minutes) to an empty folder on the NAS. I ran a backup with 1.7.2 just now--same case, no modifications, just opened and closed the case, backing up to a new, empty folder on the NAS. That backup only took 0:12. For the record, a backup to a local RAID 5 array took about 0:07. It is interesting that it only reached peak performance when copying itemcontent.dat. The smaller files, even big ones like strings.dat (I assume the one in index\locations since it's the biggest of the strings.dat files), topped out at about 500 Mbit/s, but when copying itemcontent.dat, it reached near 1 Gbit/s (actually about 900 Mbit/s). I saw similar differences in speed with the local backup. itemcontent.dat is about 25 GB in this case. Despite still being slower than the local backup, 1.7.2 backs up much faster than 1.7.1 when backing up to an SMB share. The remaining gap is probably mostly down to TCP and SMB overhead.
  4. John, That's great. Would you mind sharing a few details about what's causing it? I'm trying to troubleshoot all the software and devices on my network that are showing slowness. Any insights into what Vound is doing would be appreciated. - I'm seeing the same issue with EnCase 7.08.01, although EnCase's backup process creates tens of thousands of little files, so I suspect their problem is a little different than yours. - I'm able to image a drive to the CIFS (SMB) share from a TD3. It starts out slow, about 30 MB/s, but picks up speed later to end with an average of 70-90 MB/s. Although that's a bit slower than it should be, it eliminates the copy process if I image to a bare drive, so I'm considering using my TD3 to image straight to my evidence share on the NAS in the future. I haven't tried imaging through a computer directly to a share yet. As to your comment that you've not seen the problem to the same extent I'm seeing, I'll point out again that performance does vary widely, even backing up the same case to new empty folders on the share. After the NAS has been on for a few days, the slowness is less pronounced.
  5. Wireshark has definitely shown SMB overhead is a problem. During a Windows Explorer file copy there is a lot of [TCP segment of a reassembled PDU], which I believe is the last TCP packet in a chunk of data. I see these packets about once every 0.000001 to 0.000002 of a second. During an Intella backup, I see these same packets, but they are about 0.0005 of a second apart. If I'm reading this right, the Windows Explorer file copy is transferring data about 250-500 times faster than the Intella backup. For some reason, this particular copy did seem even slower than normal. In between the [TCP segment of a reassembled PDU] packets, I consistently see this pattern:

     Protocol  Length  Info
     SMB       258     Write AndX request, FID: 0x4b14, 4096 bytes at offset 2916352
     TCP       60      microsoft-ds > 64249 [ACK] Seq=1411882 Ack=111323736 Win=65535 Len=0
     SMB       105     Write AndX request, FID:0x41b14, 4096 bytes

     This pattern repeats, with little variation, between each and every data PDU on the slow backup. I do see some of the TCP and less frequent "Write AndX" packets in the faster copy, but they are much fewer and don't appear between each data packet. If anyone is better than I am at interpreting this or knows what the problem is, please let me know. My Googling isn't turning up much.
  6. Time to dust off my packet analysis book. I've never been much of a Wireshark expert, but it should be interesting to watch. A user on the EnCase forum suggested I try a product called ProcessActivityView from Nirsoft. It allows me to monitor all the files a process opens, closes, writes, reads, etc. I've learned a lot about EnCase and Intella just watching them work through ProcessActivityView. For instance, the reason EnCase takes so long to open a case is that it's reading exactly 16KB of data from what looks like every file in the evidence cache. In this particular case, that's about 70,000 files. Even off an SSD that takes a while. I've also learned that the EnCase backup process writes tens or even hundreds of thousands of files. SMB overhead may indeed be an issue there. Intella is much nicer to the file system during backup. It's just copying the contents of the case to a backup location. Thankfully, most of the case is in itemcontent.dat, so it's only a few hundred files to copy. I'm going to keep going on this until I figure it out and will report back. In the meantime, if anyone has any other ideas, I'm open.
  7. So, more interesting is this-- When I created an iSCSI target on the NAS and backed up to that, I got the full expected speed--limited only by the speed of the hard drive the case is stored on. Why am I getting full speed connected to the same NAS on the same RAID array through an iSCSI target and only a small fraction of full speed when connected through a shared folder?
  8. Thanks, that's what I'll do. What I meant by inefficient was if I were searching one keyword at a time and needing to repeat all the steps for each keyword. I often need to produce e-mails in files that are split by what keyword was the hit. It would be quicker if I could search something like "keyword where parent.email.to:address@example.com"
  9. I'm having some issues with backing up directly to a network share. By "issues" I mean it's taking too long. I'm working with my NAS vendor as well, but even backing up to another computer on the same network is much slower than I think it should be.

     Hardware: Examiner PC with two 1 Gbit ports teamed with link aggregation. Case is stored on single internal SATA drive, not the OS drive. Switch supports link aggregation and is properly configured (Juniper EX2200). My NAS is a Synology RS3413xs+ with a six drive RAID 6 array.

     1. Backing up to a local RAID 5 array takes about 8 minutes for this 17 GB case.
     2. Backing up to the NAS takes about 11 hours for the same case.
     3. Backing up to a shared drive on another computer takes about 40 minutes.

     The network activity seems artificially limited to about 12 Mbit/s during the backup. I can copy the locally created backup from #1 above to the NAS in about 10 minutes. Like I said, I'm working with the NAS vendor as to why this is happening, but is Intella doing anything that could cause such dramatic slowdowns when backing up over a network share? I'm also experiencing exactly the same problem with EnCase 7.08.01 in this setup. I'll be asking EnCase for help with this too, but someone must have an answer to this.
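To put the three backup times in this post on a common scale, here is a rough sketch that converts each one to an average throughput. It assumes decimal units (1 GB = 1000 MB) and that the full 17 GB is written each time; the function name and dictionary labels are illustrative. These are averages over the whole run, so they will not match peak rates seen in a network monitor.

```python
def throughput_mbit_per_s(size_gb, minutes):
    """Average throughput in Mbit/s, taking 1 GB as 1000 MB (decimal units)."""
    megabits = size_gb * 1000 * 8
    return megabits / (minutes * 60)


CASE_SIZE_GB = 17  # case size reported in the post

# Durations reported in the post, converted to minutes.
durations_min = {
    "local RAID 5": 8,
    "NAS share": 11 * 60,
    "share on another PC": 40,
}

for target, minutes in durations_min.items():
    rate = throughput_mbit_per_s(CASE_SIZE_GB, minutes)
    print(f"{target}: {rate:.1f} Mbit/s")
```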
  10. I'm wondering if there is an efficient way to return hits on e-mails if the e-mail includes a particular sender or receiver AND an attachment to the e-mail contains a search term. I could accomplish this by searching for the search term, viewing all the parent items and then searching those parent items for the desired addresses, but with a long keyword list this is inefficient. I may end up doing it anyway, but I want to make sure I'm not missing something obvious. Is there any way to return hits on the parent e-mail when the child attachment contains search terms?
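The logic being asked for can be sketched outside Intella as a single pass over exported items. This is not Intella query syntax: the `Email` and `Attachment` classes and the `emails_by_keyword` function are hypothetical stand-ins for whatever an export would actually contain. The point is only that the address test and the attachment keyword test can be combined per e-mail, with results grouped by keyword, instead of repeating the manual steps once per keyword.

```python
from dataclasses import dataclass, field


@dataclass
class Attachment:
    name: str
    text: str  # extracted text of the attachment


@dataclass
class Email:
    sender: str
    recipients: list
    attachments: list = field(default_factory=list)


def emails_by_keyword(emails, addresses, keywords):
    """Map each keyword to the e-mails that involve one of the addresses
    and have at least one attachment whose text contains the keyword."""
    wanted = {a.lower() for a in addresses}
    hits = {kw: [] for kw in keywords}
    for email in emails:
        parties = {email.sender.lower(), *(r.lower() for r in email.recipients)}
        if not parties & wanted:
            continue  # none of the addresses of interest is on this e-mail
        for kw in keywords:
            if any(kw.lower() in att.text.lower() for att in email.attachments):
                hits[kw].append(email)
    return hits


# Made-up data for illustration only.
mail = Email(
    sender="address@example.com",
    recipients=["someone@example.com"],
    attachments=[Attachment("ledger.txt", "Invoice total overdue")],
)
result = emails_by_keyword([mail], ["address@example.com"], ["invoice", "refund"])
print({kw: len(found) for kw, found in result.items()})
```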
  11. I am interested, but am unsure how I'm going to allow secure access between my lab network and the corporate network without causing any security issues. I'll wait to get it so I can see how it works and ask some more pointed questions about this.
  12. Sorry to dredge up an old topic, but I have the same need as Hagrid. I have a case with over 170,000 e-mails spread over 7 years. E-mails sent after 4:30 PM or on weekends are going to be very important to the case, and I do not see how I can use Intella to efficiently search on these terms. For a bit of background, it is very common for someone who is committing fraud to work extra hours when the business is closed to hide his or her fraud. One of the first things I look for in a lot of fraud cases is transactions that occurred when the business was closed. I'd like to do the same with e-mails. With this many e-mails (and this isn't even a big case), manually tagging e-mails on the weekend or after 4:30 is not going to work. That's too time consuming. This is a task at which computers excel, so please add this enhancement to Intella ASAP. Thanks.
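The test described in this post is simple to express in code. Below is a minimal sketch, assuming sent times are available as `datetime` values already converted to the business's local time zone. The `is_off_hours` helper is hypothetical and not an Intella feature, and the 4:30 PM cutoff is taken from the post.

```python
from datetime import datetime, time

CLOSING_TIME = time(16, 30)  # 4:30 PM, from the post


def is_off_hours(sent):
    """True if sent falls on a Saturday or Sunday, or after 4:30 PM on any day."""
    is_weekend = sent.weekday() >= 5  # Monday is 0, so 5 and 6 are Sat/Sun
    return is_weekend or sent.time() > CLOSING_TIME


# Made-up timestamps for illustration only.
sent_times = [
    datetime(2013, 6, 3, 17, 0),   # weekday evening
    datetime(2013, 6, 4, 10, 0),   # weekday morning
    datetime(2013, 6, 8, 9, 0),    # Saturday morning
]
flagged = [t for t in sent_times if is_off_hours(t)]
print(flagged)
```

Applied to an exported list of sent dates, a filter like this could produce the set of items to tag. Time zone handling is the main thing to get right before trusting the result.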