Jorge Zeledon-Castillo Posted January 25, 2019

Hello fellow Intella peeps,

I have a question about how other users get around the limitation of not being able to search for two phrases within a certain number of words of each other in Intella. In other applications you could simply type:

"Fast Pace" within20 "Slow Turtle"

I have ways of pulling the documents that contain all the words, but no easy way to avoid false positives. Just curious, as this is a very common request, and because Intella uses the quotes to run a proximity search, this type of search cannot be performed on a phrase.
Alex Posted January 28, 2019

Hi,

You can take a look at the regular expression-based Content Analysis feature, which allows you to search for complex patterns like the one in your case. The regular search does not support proximity searches with phrases and word distances. Only single terms and character distances are allowed:

"Pace Slow"~20 ("Pace" and "Slow" within 20 characters in the extracted text)
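To illustrate the regular-expression idea mentioned above, here is a minimal Python sketch of a pattern that matches two phrases separated by at most a given number of words, in either order. It is not Intella's Content Analysis syntax; the function name `phrase_proximity_pattern` and its parameters are hypothetical, and "within N words" is assumed here to mean at most N words between the two phrases.

```python
import re


def phrase_proximity_pattern(phrase_a, phrase_b, max_words):
    """Compile a regex matching both phrases with at most max_words
    words between them, in either order (hypothetical helper)."""

    def phrase_re(phrase):
        # Escape each word and allow any whitespace between the words.
        return r"\s+".join(re.escape(word) for word in phrase.split())

    a = phrase_re(phrase_a)
    b = phrase_re(phrase_b)
    # A separator, then up to max_words intervening words.
    gap = r"\W+(?:\w+\W+){0," + str(max_words) + r"}?"
    pattern = r"\b(?:" + a + gap + b + "|" + b + gap + a + r")\b"
    return re.compile(pattern, re.IGNORECASE)


text = "The fast pace of the hare beat the slow turtle."
print(bool(phrase_proximity_pattern("Fast Pace", "Slow Turtle", 20).search(text)))  # True
print(bool(phrase_proximity_pattern("Fast Pace", "Slow Turtle", 2).search(text)))   # False
```

The regex flavor available in a given tool may differ from Python's `re`, so the pattern would likely need adapting before use elsewhere.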
Michael Lees Posted February 18, 2019

On 1/28/2019 at 1:14 PM, Alex said:

The regular search does not support proximity searches with phrases and word distances. Only single terms and character distances are allowed: "Pace Slow"~20 ("Pace" and "Slow" within 20 characters in the extracted text)

This is a word distance, not a character distance.
Alex Posted February 18, 2019

Sorry, I was wrong. Yes, this is a word distance.
jasoncovey Posted February 22, 2019

I had been thinking a bit about this question and wanted to throw out an alternative approach. Of course, it's correct that Lucene does not directly support proximity searches between phrases. However, as has been previously mentioned in a pinned post, it does allow you to identify the words included in those phrases as they appear in overall proximity to each other.

Thus, your need to search for "Fast Pace" within 20 words of "Slow Turtle" should first be translated to:

"fast pace slow turtle"~20

This search will identify all instances where these 4 words, in any order, appear within a 20-word boundary anywhere in your data set. Then, with this search in place, you can perform an additional search, applied via an Includes filter, to include your two specific phrases:

"fast pace" AND "slow turtle"

By doing this, you should be left with a very close approximation of the exact search you initially intended, with your results filtered to only show your exact phrase hits, but within the overall proximity boundary previously specified.

Hope that helps!
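The two-step approach described above can be sketched as a small Python helper that builds both query strings from a pair of phrases. The function name `build_phrase_proximity_queries` is hypothetical and only assembles text; running the queries and applying the Includes filter still happens in Intella itself.

```python
def build_phrase_proximity_queries(phrase_a, phrase_b, distance):
    """Return (proximity_query, include_filter_query) for two phrases.

    Hypothetical helper: it only formats the two query strings
    used in the two-step search described in the post above.
    """
    # Normalize case and whitespace so the output matches the examples.
    a = " ".join(phrase_a.lower().split())
    b = " ".join(phrase_b.lower().split())
    # Step 1: all words of both phrases within `distance` words.
    proximity_query = '"{} {}"~{}'.format(a, b, distance)
    # Step 2: both exact phrases, to be applied as an Includes filter.
    include_filter_query = '"{}" AND "{}"'.format(a, b)
    return proximity_query, include_filter_query


proximity, include_filter = build_phrase_proximity_queries("Fast Pace", "Slow Turtle", 20)
print(proximity)       # "fast pace slow turtle"~20
print(include_filter)  # "fast pace" AND "slow turtle"
```

Because both steps match per document, the result stays an approximation: a document in which the four words fall within the boundary in one place while the exact phrases appear elsewhere would still be returned.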
jon.pearse Posted February 24, 2019

Thanks for your example, Jason! If anyone wants to see the pinned post that Jason was talking about, you can see it here.