Redacting documents - share your ideas!


Hi all!
We are currently in the process of building a feature for redacting e-mails, documents and all other items that you can view in Intella.
In essence, you will be able to redact a PDF version of any item and export it with redaction marks covering selected areas. The original items will remain untouched.
Now is the time for you to share your ideas and suggestions regarding redaction. What are your expectations?
  • Which parts of e-mails or other documents would you expect to be able to redact? Only the content, or also the metadata?
  • What would be your preferred or expected work flow for redacting documents?
As an example, we could propose the following work flow:
A user selects 3 items from the Items table and groups them into a Redaction set. The set now contains newly created PDF versions of the chosen items. Then, for that set, the user chooses to redact the basic headers (From, To, Subject, Date) and the content of those items. The Redaction set is shown in a Facet. By choosing the items belonging to the set, he/she performs the redaction for each item in the set, as required.
Finally, the user exports the 3 redacted items using an export wizard. The items are exported as new PDF documents with the redacted text removed.
What do you think of the work flow described above? We are looking forward to hearing your opinions and suggestions.
To better illustrate the new functionality, we have prepared a sneak peek of the Redaction tab.


Great News!


The workflow you suggest sounds reasonable.


I think only the content needs to be redacted; I can't currently see (for my purposes) why metadata would need redaction.


I have been using the Adobe XI redaction and that works well (certainly worth looking at for ideas).


It would be good if you could search for text and globally replace it with a redaction mark, say, redact all "Ms Lizza Wong".
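
To make the "redact all" idea concrete, here is a minimal Python sketch. It is not Intella code: `find_phrase_spans` is a hypothetical helper that works on plain text already extracted from an item, and it only reports where the phrase occurs so that a redaction mark could be placed there. It assumes case-insensitive matching and tolerates line breaks between the words.

```python
import re


def find_phrase_spans(text, phrase):
    """Return (start, end) offsets of every occurrence of phrase in text."""
    words = [re.escape(w) for w in phrase.split()]
    if not words:
        return []
    # Any run of whitespace may separate the words, so a phrase that is
    # wrapped across two lines is still found.
    pattern = re.compile(r"\s+".join(words), re.IGNORECASE)
    return [m.span() for m in pattern.finditer(text)]


sample = "Ms Lizza Wong met Mr Smith. Later, ms lizza\nwong left."
spans = find_phrase_spans(sample, "Ms Lizza Wong")
for start, end in spans:
    print(start, end, repr(sample[start:end]))
```

Text that only exists inside an image would not be found this way, so the hits would still need a manual check.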


Make sure the redactions are 'burnt in'. Last month I saw some redactions that were just floating blocks of colour: if you scrolled up and down fast enough, you could cause lag and see the text underneath.


It's bound to be asked: what is the ETA for this feature?

Best regards

Looks great, and I like the work flow. I would also echo jcoynes' comment about having a search and "replace all" option for the redact set. It would be a great way to quickly remove a set of data, perhaps with the ability to integrate it with a keyword list to redact multiple words/phrases easily and quickly.

Adam, so you would be interested in a fully automated process? We have thought about that, but we are curious to learn how often you would use it.


If you think about it, this operation could be a bit risky. If the phrase "test" is present both in the text of some e-mail and in an image (a flat bitmap), then automated redaction might miss the image. So I assume that, regardless of the level of automation, there will always be some manual work involved in overseeing the results. Am I right?

Hi Lukasz


I have been pushing Connect quite heavily to my clients and giving lots of demos, and redaction has been top of the list of functionality requests from most of them. I would never want a fully automated process only since, as you've rightly pointed out, text in pictures can easily be missed. However, as many of my clients are lawyers, there are times when some automated processing would be extremely valuable.


For example, redacting witness statements to remove protected witnesses' names, addresses, phone numbers, etc. Many LPP matters contain privileged data, which can be chemical formulas and the like, that holds no evidentiary value and would need to be removed.


There will always be a manual inspection, of course, to check pictures, tables and other areas which may not be covered, but automating what can be automated would be a fantastic addition to the normal redaction. Couple that with the ability to use the keyword list and you have a great way to redact large data sets very quickly.
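
As a rough illustration of driving redaction from a keyword list, here is a small Python sketch. Everything in it is an assumption for the sake of the example: `keyword_hits` is a hypothetical helper, the input is plain extracted text, and matching is case-insensitive and literal. Besides the spans to redact, it returns a count per keyword, which could help the reviewer see what the automated pass actually found.

```python
import re
from collections import Counter


def keyword_hits(text, keywords):
    """Return (spans, counts) for all keyword occurrences in text."""
    # Longest first, so a longer phrase is preferred over a shorter
    # keyword that is its prefix (e.g. "Wong Ltd" over "Wong").
    cleaned = sorted({k.strip() for k in keywords if k.strip()},
                     key=len, reverse=True)
    if not cleaned:
        return [], Counter()
    pattern = re.compile("|".join(re.escape(k) for k in cleaned),
                         re.IGNORECASE)
    spans, counts = [], Counter()
    for m in pattern.finditer(text):
        spans.append(m.span())
        counts[m.group(0).lower()] += 1
    return spans, counts


statement = "Jane Doe called. Phone: 555-0100. jane doe left; Doe stayed."
spans, counts = keyword_hits(statement, ["Doe", "Jane Doe", "555-0100"])
print(spans)
print(dict(counts))
```

A keyword with zero hits is worth a second look during the manual inspection, since the text may be present only as an image or spelled differently.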


I'm not sure what the plan for the redaction implementation is, but I think it would be a good idea to make redaction a two-step process, i.e. the first stage simply highlights the text you want to redact (whether it was selected automatically or manually) with a bright yellow background or something similar. You then have the option to review before committing the redaction, or you can choose to apply it without review.
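
A minimal Python sketch of the two-step idea, under stated assumptions: `Mark` and `Draft` are hypothetical names, the document is plain text, marks do not overlap, and "[REDACTED]" stands in for a real redaction mark. Proposing a mark never changes the text; only `commit` produces redacted output, either for approved marks or for all marks when review is skipped.

```python
from dataclasses import dataclass, field


@dataclass
class Mark:
    start: int
    end: int
    source: str            # "auto" or "manual"
    approved: bool = False


@dataclass
class Draft:
    text: str
    marks: list = field(default_factory=list)

    def propose(self, start, end, source="manual"):
        """Step 1: highlight only, the text itself is untouched."""
        self.marks.append(Mark(start, end, source))

    def commit(self, skip_review=False):
        """Step 2: apply approved marks (or all, if review is skipped)."""
        chosen = [m for m in self.marks if skip_review or m.approved]
        out = self.text
        # Work from right to left so earlier offsets stay valid.
        for m in sorted(chosen, key=lambda m: m.start, reverse=True):
            out = out[:m.start] + "[REDACTED]" + out[m.end:]
        return out


draft = Draft("Call Ann on 555-0100 today")
draft.propose(5, 8, source="manual")     # "Ann"
draft.propose(12, 20, source="auto")     # "555-0100"
draft.marks[0].approved = True           # reviewer accepts the first mark

print(draft.commit())
print(draft.commit(skip_review=True))
```

Keeping the `source` on each mark would let a reviewer filter for automatically proposed marks, which are the ones most in need of a second look.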

Yes, redaction has been requested previously; it will be good to have.


Developing AdamS's idea further, there should be two separate features with a similar interface: highlighting and redacting.

So, for the workflow:


Do a keyword search with optional facet filtering, select all (or some) of the documents, then right click -> highlight results.

This would create PDFs linked to the originals, with the hits marked in, say, 50% transparent yellow, and add them to a (new) special property "highlighted", selectable from the facet list.


Clear all keyword searches, lists, etc. (required!), select some documents from the highlighted facet, then right click -> redact the highlighted parts.

That would change the 50% yellow to 100% black (assuming B/W documents; it will not work on colourful HTML), then cut the text, replace it with, say, * (including spaces), save the PDF and add it to a new special property "redacted".


Add an "Export only redacted data" checkbox to the appropriate dialogue box, or put some other safety net on exporting.
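
The last two steps above could be sketched in Python roughly as follows. This is only an illustration on plain text, not on real PDFs: `mask_spans` and `items_to_export` are hypothetical helpers, the item dictionaries with a `redacted` flag are an assumed data shape, and keeping line breaks unmasked is my own choice.

```python
def mask_spans(text, spans, mask_char="*"):
    """Overwrite every character inside the given (start, end) spans."""
    chars = list(text)
    for start, end in spans:
        for i in range(start, min(end, len(chars))):
            # Spaces are masked too, so word boundaries are not visible.
            # Line breaks are kept (an assumption) so the layout is stable.
            if chars[i] != "\n":
                chars[i] = mask_char
    return "".join(chars)


def items_to_export(items, only_redacted=True):
    """Safety net: with only_redacted on, unredacted items are left out."""
    return [it for it in items if it["redacted"] or not only_redacted]


original = "Ms Lizza Wong called"
masked = mask_spans(original, [(0, 13)])
items = [
    {"name": "mail-1.pdf", "redacted": True},
    {"name": "mail-2.pdf", "redacted": False},
]
print(masked)
print([it["name"] for it in items_to_export(items)])
```

Replacing the characters rather than just drawing over them is what makes the redaction 'burnt in': the exported text no longer contains the original words.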

