Chris (Administrators, 208 posts, 10 days won)

Posts posted by Chris

  1. Still, thank you for describing your process; it is always nice to know how people are using Intella :)

     

    We are asked regularly for functionality to produce cross tables, e.g. custodians x number of keyword hits. I think we should do something to better facilitate that.

     

    As for responsiveness of keyword lists: two factors determine how long it takes to draw a Cluster Map out of it:

    • Size and complexity of the keyword list. Note that a wildcard search is effectively translated to a Boolean OR of all matching terms, so a wildcard that matches many terms can take some time to evaluate.
       
    • Number of clusters in the graph. The time needed for calculating a layout is roughly quadratic w.r.t. the number of clusters.

    It often occurs that people use the keyword list "as a whole", i.e. for their task it's not important which term caused the hit, just knowing that a document matches one of the terms in the list suffices. That's why we added a "Combine queries" checkbox at the bottom, turning the keyword list into one giant OR query, producing only a single cluster.
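
    As a rough illustration of the two points above, here is a minimal Python sketch of how a wildcard term can be expanded into a Boolean OR of matching terms, and how a whole keyword list can be combined into one OR query. It is not Intella's implementation: the function names and the tiny term index are hypothetical, and the matching uses Python's `fnmatch` rather than a real search index.

```python
from fnmatch import fnmatchcase


def expand_wildcard(pattern, indexed_terms):
    """Return the indexed terms matching a wildcard pattern such as 'invest*'."""
    return sorted(t for t in indexed_terms if fnmatchcase(t, pattern))


def to_or_query(terms):
    """Join terms into a single parenthesised Boolean OR expression."""
    return "(" + " OR ".join(terms) + ")"


def combine_keyword_list(keywords, indexed_terms):
    """Expand every keyword and merge the whole list into one OR query."""
    expanded = []
    for keyword in keywords:
        if "*" in keyword or "?" in keyword:
            expanded.extend(expand_wildcard(keyword, indexed_terms))
        else:
            expanded.append(keyword)
    # dict.fromkeys drops duplicates while keeping first-seen order
    return to_or_query(list(dict.fromkeys(expanded)))


# Hypothetical set of terms that occur in the indexed documents
index = {"invest", "investor", "investment", "invoice", "fraud"}

print(to_or_query(expand_wildcard("invest*", index)))
# (invest OR investment OR investor)
print(combine_keyword_list(["invest*", "fraud"], index))
# (invest OR investment OR investor OR fraud)
```

    The more terms a wildcard matches, the longer the OR expression becomes, which is why broad wildcards take longer to evaluate. The combined form yields one result set, and therefore one cluster, instead of one per keyword.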

  2. Hi Adam,

     

    Intella has the ability to search with keyword lists. There is a special facet dedicated to this, called Keyword Lists (i.e. this functionality is not inside the Search box at the top). You simply browse to a text file with keywords and it is then available as a keyword list in the case.

     

    However, reading your earlier post, I think you want to be able to edit this list as well, am I right? Once the list is added to the case, it is static. So if you want to change the list, that can only be done by removing the list and adding the updated file back in.

     

    Also, these lists are in no way connected to the search history of the search field at the top. That's an interesting idea though...

  3. Hello Kevin,
     
    Apologies for the late reply!
     
    At the moment this cannot be done, but I indeed think we need to make that possible in the future.
     
    For now, what you can do is gather the Passware-processed items in a folder and add that as a new source. Then only these items will be indexed.
     
    The drawback is that you'll have the items twice: once still encrypted and in their original location and once decrypted and in an artificial location. Not perfect, but certainly faster to index.

     

  4. Hi Adam,

     

    I have attached two samples of the experiments I did myself with Gephi a while ago, using Intella's social graph export.

     

    The first is a visualization of all emails in the well-known Enron case that mention the term FERC (= Federal Energy Regulatory Commission). I tuned the layout settings to find cliques of people and applied a coloring scheme that statistically tries to determine such cliques. As you can see, such visualizations immediately raise questions such as: what determines these cliques (perhaps it reflects organizational structure, or perhaps the term was relevant at different points in time with different employees involved), why are certain cliques loosely connected and through whom?

     

    The second example is a visualization of all email correspondents in a test PST containing emails from mailing lists. In some mailing lists the direct communication structure is clearly visible, in others everyone seems to communicate with a single contact only: the list server.

     

    I have much more insightful visualizations, but unfortunately these are made from private data sets and therefore cannot be shared.

     

    Does this explain what you can achieve with social graph visualizations? Do you see a use for those? We love to hear your feedback as it helps us shape our solutions.

    ENRON, FERC, degree filtered.png

    mailing lists.png

  5. Hi Adam,

     

    The 1.7 version improves on this by showing a tooltip on the rows, revealing the full value.

     

    Of course this only helps with the interactive version, not with the saved version. I will see what we can do about that. Your suggestion sounds good to me.

  6. Hello Adam,

     

    Some Timeline improvements have already been made and will be part of the Intella 1.7 release: deduplication; visualization of cellphone-related information (calls, SMS messages, ...); optional use of email Sender headers (emails can have both a From and a Sender header); improved rendering and a legend; and better tooltips. I like your idea of exporting the image together with links to its items! I see how this makes sense, and we may add it in a future release.

     

    The social graph export is a step towards a full-blown social graph visualization that we want to add in the future. The social graph differs from the timeline in that it abstracts from the actual items (emails, calls, ...) and concentrates only on who contacts whom.

     

    The Gephi visualization should work, though Gephi may have a bit of a learning curve. Have you looked into their tutorials? If all nodes are on a line, perhaps these items did not have any social data associated with them. Can you check the content of the produced file with a text editor?

  7. Hello Walt,

     

    That sounds like a good solution to me. In a future release we will add support for OCR-ing documents and images. Until then there are some workarounds, but the problem is often that the original item and its extracted text become separate units of information. Your solution resolves that issue.

  8. Hello all,
     
    I am pleased to announce that Intella Connect Beta 2 is now available for download. Forum users who would like to be involved in testing are invited to reply to this topic or send me a private message. I will reply with a private message with instructions.
     
    Some highlights of Beta 2 (the actual list of changes is much longer):
    • The results table can now be deduplicated.
    • Keyword hits are highlighted in the Previewer, with hit markers showing the position in the document.
    • Added a Thumbnails view as an alternative search result display.
    • Improved security with SSL/HTTPS support and Digest Authentication.
    • Added several facets and other search functionalities.
    • Allows sharing of two cases simultaneously.
    • Lots of user interface improvements.

    We welcome all feedback!

  9. Hello Adam,

     

    When you check the "Keep location structure" checkbox, the PST will have a structure that is equal to what you see in the Location facet: different folders for each source PST/OST/NSF/..., followed by the folder structure of that original evidence file.

     

    So the folder structure should be retained and you should also be able to see what custodian the mail came from.

     

    If it does not work like this in your case, perhaps you can start a support ticket and send us log files and screenshots?

  10. Hello Paul,

     

    The time zone is not derived from the evidence file(s). It is initialized to the time zone of the investigator's PC instead.

     

    Note that you can easily change the time zone of a source after indexing it, using the Source Editor (press Ctrl+E or use the Sources menu).
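
    To see why the chosen time zone matters, here is a small Python sketch showing one instant rendered in two zones. The timestamp and the fixed UTC offsets are made-up examples, and `render` is a hypothetical helper; this only illustrates the effect, not how Intella stores or displays dates.

```python
from datetime import datetime, timedelta, timezone


def render(instant, zone):
    """Format an aware datetime as it would read in the given time zone."""
    return instant.astimezone(zone).strftime("%Y-%m-%d %H:%M %Z")


# Hypothetical email timestamp, stored as an absolute instant in UTC
sent = datetime(2013, 3, 1, 23, 30, tzinfo=timezone.utc)

# Fixed offsets used for simplicity (no daylight saving rules)
investigator_zone = timezone(timedelta(hours=1), "CET")
custodian_zone = timezone(timedelta(hours=-5), "EST")

print(render(sent, investigator_zone))  # 2013-03-02 00:30 CET
print(render(sent, custodian_zone))     # 2013-03-01 18:30 EST
```

    The same email appears to be sent on a different calendar day depending on the zone, so it is worth setting the source's time zone to match the evidence.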

  11. "Is there a way to save keyword search queries like in the desktop version? Usually, I would right-click the top right search box and select "save queries" to get a CSV of the hit counts for each keyword. A client wanted to do this, but I cannot find whether that functionality is included."

     

    That functionality is indeed not yet available.

     

    Note that you can already save a query by clicking the Save button in the Searches Navigator. The query is stored on the server and is immediately available to other investigators connected to the same case.

  12. Hello Adam,

     

    About the Includes and Excludes: when you include a facet or keyword, only results matching that criterion will be shown in the Cluster Map and Details table. E.g. when you Include the PDF type, only PDFs will be shown in the results. Nothing needs to be excluded to make that happen.

     

    I think your method is fine, but it really depends on getting all the details right. That is hard to judge from a distance.

     

    I think there is an easier way. Suppose you want to search for all mail from john.doe@host.com but want to exclude the Gmail and Hotmail domains. The following query would do that in one go:

     

    (from:john.doe@host.com OR sender:john.doe@host.com) NOT agent:*@gmail.com NOT agent:*@hotmail.com
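
    If you need the same kind of query for several addresses or domains, it may be easier to generate the query text than to type it each time. Below is a minimal Python sketch; `build_sender_query` is a hypothetical helper that only assembles a string in the syntax shown above, which you would then paste into the search box.

```python
def build_sender_query(address, excluded_domains):
    """Build a from/sender query that excludes mail involving the given domains."""
    sender_part = f"(from:{address} OR sender:{address})"
    exclusions = " ".join(f"NOT agent:*@{domain}" for domain in excluded_domains)
    # strip() avoids a trailing space when there is nothing to exclude
    return f"{sender_part} {exclusions}".strip()


print(build_sender_query("john.doe@host.com", ["gmail.com", "hotmail.com"]))
# (from:john.doe@host.com OR sender:john.doe@host.com) NOT agent:*@gmail.com NOT agent:*@hotmail.com
```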

     

    As for "from" and "sender" not working, while "authors & email addresses" does work: please note that the from/sender/to/cc/bcc search options were only added in Intella 1.6.3 and require that the case was either made or reindexed with that version. If you open a case made with an older version without reindexing it, these searches will return zero results. This is because they need some extra databases that a (re)index creates.

     

    Your suggestions for the Email Address facet are already on the wishlist for a future release.

     

    Thanks again for your feedback!

  13. "Another query/wish

     

    "When previewing an email, the formatting and spacing generally seem to be thrown out the window. If I'm previewing an email chain that has been forwarded or replied to many times, I end up with a single paragraph of solid text. When I open that email in Outlook I can easily read the chain and separate the individual emails, but in the preview this is very difficult.

     

    "Is there any way the preview of the emails can be made to preserve the spacing and layout to more closely represent the true view?

     

    "Edit: I just found some emails that are previewing correctly, so obviously this is not something that happens with every email.

     

    "Also, any feedback on the suggestions so far? Are they possible or not, and are they planned for a future update?"

     

    The Contents tab shows the extracted text, not a native rendering of the email. This has the advantage that all text is shown, including any text that has been intentionally obscured (e.g. white text on a white background), but clearly it loses some of the formatting. In my experience the paragraph structure is usually kept intact; it is mostly table structures that are lost. Sloppy use of the HTML standard by the original mail client can also cause this. If you can provide us with sample emails, we can see whether it is something we can improve.

     

    A native rendering of the email, as we already have for most common document formats, is indeed on the wishlist. At this moment I cannot predict yet when we will have this.

     

    "And lastly, I had previously been told about the 'empty document' filter option to help identify emails with no body text. This appears not to be working in version 1.6.3. Is anyone else noticing this or is it just me?

     

    "Oh, and I love the new ability to search the individual to/from/cc etc. fields."

    Can you describe the emails that fail to be classified as empty documents? E.g. do they show a Contents tab, and if so, what is shown in it?

  14. Hello,

     

    I tried a sample document containing the text "是否可以安排在星期一前完成" and it is (correctly) not returned when searching for "安 NOT 安排". What Intella version are you using? Can you send us a complete sample document?

     

    The issue is most likely in the way Chinese, Japanese and Korean documents are indexed. As these languages do not require whitespace or other characters to separate words, searching on words becomes hard. This is "solved" by breaking up the text into so-called bi-grams, basically all pairs of two adjacent characters that occur in the document, and processing them as if they were words. If you look at the Previewer's Words tab, you will see what "words" are extracted from this text. This method does not give perfect results, but it often produces a reasonable result.
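
    To make the bi-gram idea concrete, here is a minimal Python sketch applied to the sample text above. `cjk_bigrams` is a hypothetical helper that treats every character as CJK; a real indexer handles mixed scripts, punctuation and whitespace differently, so take this only as an illustration of the principle.

```python
def cjk_bigrams(text):
    """Return every pair of adjacent characters, in document order."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


sample = "是否可以安排在星期一前完成"
tokens = cjk_bigrams(sample)

print(tokens)
# ['是否', '否可', '可以', '以安', '安排', '排在', '在星', '星期', '期一', '一前', '前完', '完成']
print("安排" in tokens)  # True: the two-character word is one of the extracted "words"
print("安" in tokens)    # False: a single character is not a bi-gram in this sketch
```

    Note that pairs such as '否可' or '排在' are not real words, which is why the method gives reasonable rather than perfect results.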

  15. We will see what we can do once we tackle the shortcuts in our code. One reason to use modifier keys like Alt and Ctrl is that the other keys used in the key stroke have very common uses, e.g. moving a cursor around. However, as the Previewer is largely a read-only component, maybe we can simplify it without causing other usability issues. We'll see.

  16. Hello Adam,

     

    You can use the Empty Documents category in the Features facet to find all "document-like items" that have no document body. You can intersect this with the Email category if necessary, or just sort on the Type column.

     

    Note that you can use Alt+Left and Alt+Right in the Previewer to go back and forth one item. The button tooltips tell you what the hotkeys are. A hotkey for flagging makes sense though.

  17. Hello Adam,

     

    The hotkeys are indeed available in the Previewer only at the moment.

     

    I like the idea of having more hotkeys available in other places. The challenge is then how to make it clear that this functionality is available (who reads user manuals?). Perhaps a few fast tag buttons can be placed somewhere in the Details table, with tooltips mentioning the hotkeys. Ideas more than welcome!

     

    As for your earlier suggestion on an extended HTML report: I made a note of that as well!
