Regex for advanced searching


AdamS

I have a client who wants some fairly complex searches run, and at the moment this is causing me some issues.

 

I'll give you an example. Below is what they've asked for:

"job method" OR "risk assessment" AND client OR power OR corporation -within 100 characters of 1280 OR placename OR car 7, 8 or 9

 

I have broken the search down thusly

"job method" OR "risk assessment"

Then

Client OR power OR corporation

Then

ignore the within 100 characters

Then 

1280 OR placename OR "car 7"~1 OR "car 8"~2 OR "car 9"~3

 

Then, using the Venn diagram, I select the region where all 3 searches intersect, and that is my final result.
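For anyone following along, the intersection step amounts to a plain set intersection of the three result sets. A minimal sketch in Python, where the item IDs are made up purely for illustration and have nothing to do with how Intella stores results:

```python
# Hypothetical item IDs returned by each of the three searches above.
phrase_hits = {101, 102, 105, 110}  # "job method" OR "risk assessment"
party_hits = {102, 105, 107, 110}   # client OR power OR corporation
site_hits = {105, 110, 111}         # 1280 OR placename OR "car 7" ...

# The overlapping region of the Venn diagram is the items found by all three.
final = phrase_hits & party_hits & site_hits
print(sorted(final))
```

Note that this only tells you an item contains a term from each group. It says nothing about how close together the terms are, which is exactly the gap described next.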

 

That is one of the simpler searches they are asking for. The main problem is the proximity search they want, which I can't do with those search terms.

 

This led me to start looking a little more seriously at regex. However, the link provided in the help manual for the regex syntax that Intella supports is dead:

 

http://lucene.apache.org/core/4_3_0/queryparser/org/apache/lucene/queryparser/classic/packagesummary.html#Regexp_Searches

 

Any good resources I should be looking at for some guidance here?


Hello,

 

The current implementation of regular expression search lets you specify patterns for individual tokens (words) only, so at most it is an advanced version of the wildcard syntax. I doubt it would help with complex proximity searches like the one in your example.
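To illustrate the limitation, here is a rough Python sketch of what token-level matching means. The tokenizer below is a simplistic stand-in and the function name `token_regex_hits` is hypothetical; this is not how Intella or Lucene is implemented, only a model of the behaviour:

```python
import re

def token_regex_hits(pattern: str, text: str) -> list[str]:
    """Return the tokens that fully match the pattern, one token at a time."""
    # Assumption: a crude lowercase word tokenizer stands in for the real one.
    tokens = re.findall(r"\w+", text.lower())
    rx = re.compile(pattern)
    return [t for t in tokens if rx.fullmatch(t)]

text = "Risk assessment for placename, car 7"

# A pattern describing a single word can match a token.
single = token_regex_hits(r"assess\w*", text)

# A pattern that tries to span several words never sees more than one
# token at once, so it cannot match anything.
spanning = token_regex_hits(r"risk.{0,100}car", text)

print(single, spanning)
```

Because each token is tested in isolation, a "within 100 characters" condition has nothing to work with.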

 

We are currently working on an alternative implementation that will be part of the Content Analysis functionality. It will let you specify regular expressions that work on the entire item content, without the limitations of the token-based approach. Hopefully this will solve the problem of complex proximity queries, among other things. This feature will be included in the next release.
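As a rough idea of what a content-level proximity pattern could look like, here is a sketch using Python's `re` module. The term groups come from the example query earlier in the thread; the names `LEFT`, `RIGHT`, `GAP` and `has_proximity_hit` are made up for illustration, and the regex dialect accepted by Intella may differ from Python's, so treat this as an assumption to test rather than a ready-made query:

```python
import re

# Term groups taken from the example query in this thread.
LEFT = r"(?:job method|risk assessment)"
RIGHT = r"(?:1280|placename|car [789])"
GAP = r".{0,100}?"  # at most 100 characters between the two matches

# "Within" is not directional, so allow the groups in either order.
PROXIMITY = re.compile(
    rf"\b{LEFT}\b{GAP}\b{RIGHT}\b|\b{RIGHT}\b{GAP}\b{LEFT}\b",
    re.IGNORECASE | re.DOTALL,  # DOTALL lets the gap cross line breaks
)

def has_proximity_hit(text: str) -> bool:
    """True if a LEFT term and a RIGHT term occur within 100 characters."""
    return PROXIMITY.search(text) is not None

near = has_proximity_hit("The risk assessment for car 7 is attached.")
far = has_proximity_hit("risk assessment " + "x" * 200 + " 1280")
print(near, far)
```

The client/power/corporation condition is left out here; it could be applied as a separate ordinary search and intersected with the regex results.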

 

Thanks for reporting the dead link. It will be removed from the next version of the User Manual.

 

Good guides and tutorials on regular expression syntax can be found at http://regexone.com/.


  • 3 weeks later...

Hi Adam (and others),

 

Just a note to say that along with a bunch of other features, customisable Regex queries will be available in the next release of Intella (1.9.2).

 

Regex will be in the Content Analysis facet and we have included a testing page where Regex queries can be tested.

 

Jon
