
Proximity Searches in Intella - A Better Understanding

We receive numerous support tickets from customers asking for advice on using proximity searches. The user manual provides the basic syntax, and there is additional information at these Forum posts.

There is also a webinar on using proximity searches in Intella here. 



In most cases we are provided with examples of the syntax that the customer has used. In some cases the syntax is very complex, and often it is incorrect.


Some customers ask us whether the syntax is correct or ask why their proximity search is not working. This is something that we cannot answer on an individual basis. The point of this document is to provide examples to help our customers to get a better understanding of proximity search syntax so that they can create the correct search syntax for the search that they want to perform.


Note: Most of this information applies to all versions of Intella which support Proximity searching. There is a known issue with hit highlighting in versions prior to 1.9.1. We recommend that you update to version 1.9.1 if you encounter this issue.



What is a proximity search?

A proximity search is a query that finds items in which certain words occur within a specified maximum distance of each other in the item's text. For example, if I wanted to find all items that have the words 'desktop' and 'application' within 10 words of each other, I would use the following:


"desktop application"~10


A proximity search differs from a phrase search in that it does not matter whether 'desktop' is before or after the term 'application' in the text. For example, documents containing either of the passages of text below will be respondent to the proximity search above.


"You must turn on your desktop computer before you can open an application."


"I have copied the shortcut for the application onto the desktop."
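The order-independent matching described above can be illustrated with a small Python sketch. This is a simplified model only, not how Intella evaluates queries internally: the function names `tokenize` and `within_distance` are hypothetical, and "within N words" is assumed here to mean that the two word positions differ by at most N.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def within_distance(text, term_a, term_b, max_distance):
    """Return True if term_a and term_b occur at word positions that differ
    by at most max_distance, regardless of which term comes first.

    Assumption: this positional difference is only an approximation of
    the distance that a real search engine computes.
    """
    words = tokenize(text)
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(i - j) <= max_distance
               for i in positions_a for j in positions_b)


passage_1 = "You must turn on your desktop computer before you can open an application."
passage_2 = "I have copied the shortcut for the application onto the desktop."

# Both passages match, even though the terms appear in opposite orders.
print(within_distance(passage_1, "desktop", "application", 10))  # True
print(within_distance(passage_2, "desktop", "application", 10))  # True
```

In the first passage 'desktop' comes before 'application' and in the second it comes after, yet both satisfy the same distance check.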



Using the Correct Proximity Syntax

As mentioned above, we receive proximity search syntax from customers. A lot of the time we see that the customer has created search strings such as the examples below:

  1. (Baxter Jason) ~20 (article) OR (paper) OR (presentation) OR (public) OR (report)
  2. "national OR fire OR service"~30 (truck) OR (department)

These examples have been sanitized and shortened; the original search strings contained several lines of OR statements. This makes the search string complex, cumbersome, prone to errors and difficult to troubleshoot.


Example 1
If we look at the first example above, we can see immediately that there are several issues which make this syntax incorrect. One issue is that the terms to be searched are not encased in double quotes. Another issue is that the number of words to be within (~20 in this case) is not at the end of the proximity search syntax as there are several OR statements after this number.


The user manual shows a basic example of the syntax "desktop application"~10. Note that the structure is to have two (or more) search terms encased in double quotes followed by the number of words that the terms must be within.


The proximity string can be made more useful for larger queries by adding more search terms. The additional search terms need to be separated by the OR operator and encased in parentheses. For example, the first example above could be rewritten this way: "(Baxter OR Jason) (article OR paper OR presentation OR public OR report)"~20. Because the user is looking for one of two terms within 20 words of one of several other terms, we have grouped the keywords by placing them in parentheses and separating the terms with the OR operator, i.e. (Baxter OR Jason) and (article OR paper OR presentation OR public OR report).


Note: All of the search terms are still encased in double quotes, followed by the number of words that the terms must be within. This syntax will return any items where Baxter or Jason is within 20 words of article, paper, presentation, public or report.


Example 2
Again we see that there are issues with the search syntax in example 2. This time double quotes are used; however, they do not encase all of the search terms. Also, we see a similar trend to example 1 where there are several search terms within parentheses and separated by the OR operator. We see a lot of samples like this and wonder whether this format of proximity search has come from another tool.


The way I read this example is as follows: Find all items that have national, fire, or service within 30 words of truck or department. The syntax can be rewritten this way:  "(national OR fire OR service) (truck OR department)"~30. Again we use the parentheses to group the search terms into the two groups and make sure that all terms are encased in double quotes.
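The structure used in both rewritten examples (groups in parentheses joined by OR, everything inside one pair of double quotes, distance at the end) can be assembled mechanically. The sketch below is one possible way to do that in Python; `build_proximity_query` is a hypothetical helper, not part of Intella, and it rejects terms containing spaces or quotes on the assumption that phrases cannot be nested inside a proximity search.

```python
def build_proximity_query(groups, distance):
    """Build a proximity search string from groups of alternative terms.

    groups   -- list of lists; the terms inside one list are alternatives (OR)
    distance -- maximum number of words between the groups
    """
    if len(groups) < 2:
        raise ValueError("a proximity search needs at least two groups")
    parts = []
    for terms in groups:
        if not terms:
            raise ValueError("each group needs at least one term")
        for term in terms:
            # Assumption: nested phrases are not allowed, so reject
            # anything that would need its own double quotes.
            if '"' in term or " " in term.strip():
                raise ValueError(f"phrases are not supported here: {term!r}")
        if len(terms) == 1:
            parts.append(terms[0])
        else:
            parts.append("(" + " OR ".join(terms) + ")")
    # All groups go inside a single pair of double quotes,
    # and the distance comes last.
    return '"' + " ".join(parts) + '"~' + str(distance)


print(build_proximity_query(
    [["Baxter", "Jason"],
     ["article", "paper", "presentation", "public", "report"]], 20))
print(build_proximity_query(
    [["national", "fire", "service"], ["truck", "department"]], 30))
```

The two calls print the same strings as the corrected versions of example 1 and example 2.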




  • Because the double quotes need to encase all of the search terms, you cannot have a search phrase within a proximity search. A search phrase would require double quotes and you can't have nested double quotes within a proximity search. That said, you can use phrases in keyword lists (see below).
    UPDATE: See the post from 5 July 2021 below. From version 2.5 of Intella you will be able to use phrases when creating proximity searches, so this will no longer be a limitation. We will post some examples of the search syntax for phrases when version 2.5 has been released. 
  • In the past we have been provided with proximity search strings where the syntax contained over 40 words separated by the OR operator. As mentioned above, this format is not correct. Even if we corrected the syntax, 40 words in a proximity search makes the search string complex, cumbersome, prone to errors and difficult to troubleshoot.
  • We have also received extremely long search syntax where all search terms contained wildcards. Such complex queries with many wildcards are known to have very poor performance, especially for hit highlighting in the Previewer window.




There are a couple of methods one could use to manage complex proximity searches that contain a large number of search terms separated by the OR operator. One is to break down the search string, and the other is to use keyword lists.


Breaking down the search string
A complex search string can be broken down into several shorter proximity search strings. The shorter search strings are then placed into a keyword list. E.g.


"Baxter article"~20
"Baxter paper"~20
"Baxter presentation"~20
"Baxter public"~20
"Baxter report"~20

Intella will be able to process the list of shorter proximity searches more efficiently than one large complex search string.


With a small amount of Excel work you can create a keyword list that includes all of your shortened proximity searches in a single list.
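As an alternative to the Excel work, the shortened searches can be generated with a few lines of Python. This is a minimal sketch: `expand_proximity_pairs` is a hypothetical helper, and it assumes that the keyword list is a plain list with one search per line.

```python
from itertools import product


def expand_proximity_pairs(left_terms, right_terms, distance):
    """Return one simple two-term proximity search for every
    combination of a left term and a right term."""
    return [f'"{left} {right}"~{distance}'
            for left, right in product(left_terms, right_terms)]


queries = expand_proximity_pairs(
    ["Baxter"],
    ["article", "paper", "presentation", "public", "report"],
    20,
)

# One search per line, ready to paste into a keyword list.
print("\n".join(queries))
```

Adding "Jason" to the first list would double the output to ten lines, so the list grows with the product of the two group sizes.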


Using keyword lists
The idea behind using keyword lists is to reduce the number of items that your proximity search needs to search across. Two keyword lists can be created, one list which contains the search terms in the left group of a proximity search and a second list which contains all the other terms in the right group, e.g.


Keyword list 1        Keyword list 2
Baxter                article
Jason                 paper


Next, run the two keyword lists and Tag the overlapping cluster. This cluster will contain the items that have search terms from both keyword lists.


Set this Tag as an 'Include' search and run the proximity search. This provides faster searching as you are not searching over the entire dataset. However, be aware that hit highlighting can still be slow or hang Intella if the proximity search is complex and contains wildcards.
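The narrowing step can be pictured as a set intersection: only items that contain a term from both lists are kept as candidates for the proximity search. The Python sketch below models this on a toy set of items. The names `matches_any` and `overlap_candidates`, and the sample texts, are made up for illustration and do not reflect how Intella stores or tags items.

```python
import re


def matches_any(text, terms):
    """Return True if the text contains at least one of the terms as a whole word."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return any(term.lower() in words for term in terms)


def overlap_candidates(items, list_one, list_two):
    """Return the ids of items that contain a term from each keyword list.

    items -- dict mapping an item id to its text (toy data for this sketch)
    """
    return sorted(item_id for item_id, text in items.items()
                  if matches_any(text, list_one) and matches_any(text, list_two))


items = {
    1: "Baxter wrote the article",
    2: "Jason went home",
    3: "The paper was late",
    4: "Jason reviewed a paper",
}

# Only items 1 and 4 contain a term from both lists, so only these
# would need to be checked by the proximity search.
print(overlap_candidates(items, ["Baxter", "Jason"], ["article", "paper"]))  # [1, 4]
```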


The advantage of using keyword lists is that you can use the following types of searches and operators:

  • Wildcards (article*, paper*, etc.)
  • Phrases ("national fire", "fire service", etc.)
  • Other search operators

  • 1 year later...



Since phrases are not currently supported in proximity searches (fingers crossed that's on the way!), the idea of grouping terms is intriguing.


Your example of  

"(Baxter OR Jason) (article OR paper OR presentation OR public OR report)"~20 

only uses the OR operator.


If what I needed to find was actually any item with BOTH Baxter AND Jason within 20 words of any of the others, would an AND operator in the first group suffice?:

"(Baxter AND Jason) (article OR paper OR presentation OR public OR report)"~20 


Unfortunately, it will not work. The AND operator has no meaning within phrase and proximity searches.


  • 2 years later...
  • 5 months later...

I would appreciate some help getting around a proximity search limitation involving a phrase and a term.

An example (one I was provided to search) is: “contact list w/2 Wilson”

They would like all items containing the phrase "contact list" within 2 words of the term Wilson.

Seems simple enough, but I can not get my head around not being able to use nested double quotes < ""contact list" Wilson"~2 >.

Suggestions? Thanks

BTW: I am using Intella Pro 2.4.2


I had not read the two linked help pages at the top of this topic before posting. Is the following the best workaround?

proximity search 1: "Wilson contact"~2

normal include search 2: "contact list"




Hi, I would start with the search for "contact list". I would tag the results, then run a proximity search for "wilson contact"~2 only over those results. 

The limitation is that you could get false positives with this approach. For example, there may be a document that has the phrase "contact list", which is respondent for the first search. But that same document may also have the word 'contact' within two words of 'wilson' (e.g. something like "I asked him to contact Wilson...."). In this case the proximity search will have a positive hit on this, but it is not what you are looking for (i.e. "contact list" within two words of Wilson). That said, I think this situation would be very rare given what you are looking for, and it would be easy to review such a low number of false positives if they occur.
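To make the false positive concrete, here is a small Python sketch of the two-step workaround on sample text. It is a simplified model with hypothetical function names, and it assumes that "within 2 words" means the word positions differ by at most 2, which may not match the engine's exact distance calculation.

```python
import re


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def has_phrase(text, phrase):
    """Return True if the words of the phrase appear consecutively in the text."""
    words, target = tokens(text), tokens(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def near(text, term_a, term_b, max_distance):
    """Return True if the two terms occur within max_distance positions, in either order."""
    words = tokens(text)
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(i - j) <= max_distance for i in pos_a for j in pos_b)


def passes_workaround(text):
    """Step 1: phrase search for "contact list". Step 2: "wilson contact"~2."""
    return has_phrase(text, "contact list") and near(text, "wilson", "contact", 2)


wanted = "Send Wilson the contact list"
false_positive = ("Please update the contact list today. "
                  "Later I asked him to contact Wilson about it.")

print(passes_workaround(wanted))          # True
print(passes_workaround(false_positive))  # True, although the phrase is not near Wilson
```

The second text passes both steps because each step is satisfied by a different occurrence of 'contact', which is exactly the case that needs manual review.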


Hi, some good news, and not so good news. The good news is that we have actually improved proximity searches to do what you need to do. In this example I have searched for  'personal information' within 5 words of 'protected' and this is the result.


The not so good news is that this is a very recent change and it won't be available until the next release (version 2.5). We are aiming for this release in September, provided all tasks and testing have been completed.
