Jorge Zeledon-Castillo

Proximity Search with Phrases


Hello Fellow Intella Peeps,

A question about how other users are getting around the limitation of not being able to search for two phrases within a certain number of words of each other in Intella. In other applications you could simply type:

"Fast Pace" within20 "Slow Turtle"

I have ways of pulling the documents that contain all the words, but no easy way to avoid false hits. Just curious, as this is a very common request, and because Intella uses the quotes to run a proximity search, this type of search cannot be run on a phrase.


Hi,

You can take a look at the regular expression-based Content Analysis feature, which allows you to search for complex patterns like the one in your case.

The regular search does not support proximity searches with phrases and word distances. Only single terms and character distances are allowed:

"Pace Slow"~20

("Pace" and "Slow" within 20 characters in the extracted text)
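
To illustrate the regular expression route, here is a minimal sketch of the kind of pattern that could express "phrase A within N words of phrase B". It uses Python's `re` module purely for illustration; the regex syntax that Content Analysis accepts may differ, and the helper names (`phrase_pattern`, `phrases_within`) are hypothetical, not part of Intella.

```python
import re

def phrase_pattern(phrase):
    # Turn "Fast Pace" into a pattern that tolerates any whitespace between the words.
    return r"\s+".join(re.escape(word) for word in phrase.split())

def phrases_within(text, phrase_a, phrase_b, max_words=20):
    """Return True if the two phrases occur, in either order,
    with at most max_words words between them."""
    a = phrase_pattern(phrase_a)
    b = phrase_pattern(phrase_b)
    # Up to max_words intervening words, then the separator before the second phrase.
    gap = r"(?:\W+\w+){0," + str(max_words) + r"}?\W+"
    pattern = (
        r"\b" + a + r"\b" + gap + r"\b" + b + r"\b"
        + "|"
        + r"\b" + b + r"\b" + gap + r"\b" + a + r"\b"
    )
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

near = "The fast pace of the hare beat the slow turtle."
far = "fast pace " + "word " * 25 + "slow turtle"

print(phrases_within(near, "Fast Pace", "Slow Turtle"))  # True
print(phrases_within(far, "Fast Pace", "Slow Turtle"))   # False
```

The pattern checks both orders explicitly, because a regular expression has no built-in notion of "in any order".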

On 1/28/2019 at 1:14 PM, Alex said:

The regular search does not support proximity searches with phrases and word distances. Only single terms and character distances are allowed:


"Pace Slow"~20

("Pace" and "Slow" within 20 characters in the extracted text)

This is a word distance, not a character distance.


I have been thinking a bit about this question and wanted to throw out an alternative approach. Of course, it's correct that Lucene does not directly support proximity searches between phrases. However, as has been previously mentioned in a pinned post, it does allow you to find the individual words of those phrases where they appear in overall proximity to each other. Thus, your need to search for "Fast Pace" within 20 words of "Slow Turtle" should first be translated to: "fast pace slow turtle"~20. This search will identify all instances where these 4 words, in any order, appear within a 20-word boundary anywhere in your data set.

Then, with this search in place, you can perform an additional search, applied via an Includes filter, to include your two specific phrases: "fast pace" AND "slow turtle"

By doing this, you should be left with a very close approximation of the exact search you initially intended, with your results filtered to only show your exact phrase hits, but within the overall proximity boundary previously specified.
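
To show why this is an approximation rather than an exact phrase proximity search, here is a rough sketch of the two-step logic in Python. It is a simplified model under stated assumptions: the proximity step is treated as a plain sliding window of words, which is not how Lucene actually computes slop, and all function names are hypothetical.

```python
import re

def tokens(text):
    # Lowercased word tokens; a stand-in for the real index tokenizer (assumption).
    return re.findall(r"\w+", text.lower())

def words_within_window(text, words, window):
    # Step 1: do all the words occur, in any order, inside some run of `window` tokens?
    toks = tokens(text)
    wanted = {w.lower() for w in words}
    return any(wanted <= set(toks[i:i + window]) for i in range(len(toks)))

def contains_phrase(text, phrase):
    # Step 2 helper: does the exact phrase occur anywhere in the text?
    toks, p = tokens(text), tokens(phrase)
    return any(toks[i:i + len(p)] == p for i in range(len(toks) - len(p) + 1))

def two_step_match(text, phrase_a, phrase_b, window=20):
    words = tokens(phrase_a) + tokens(phrase_b)
    return (
        words_within_window(text, words, window)
        and contains_phrase(text, phrase_a)
        and contains_phrase(text, phrase_b)
    )

good = "The fast pace of the hare beat the slow turtle."
far = "fast pace " + "filler " * 30 + "slow turtle"
# The four words sit close together, but the exact phrase "fast pace" only occurs much later.
tricky = ("The pace was fast, but the slow turtle kept going. "
          + "filler " * 30 + "Later a fast pace returned.")

print(two_step_match(good, "fast pace", "slow turtle"))    # True
print(two_step_match(far, "fast pace", "slow turtle"))     # False
print(two_step_match(tricky, "fast pace", "slow turtle"))  # True, a false positive
```

The last case is the kind of document that can still slip through: both filters pass, but the two phrases themselves are not within the boundary of each other.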

Hope that helps!    

