Regular text words and special characters search
One of the more frequent questions our Support Team receives concerns searching for keywords separated by special characters.
Typical Support Question
Dear Support, we need your advice on creating a search that will only find documents that contain exactly this phrase:
happy-day
We tried the following queries, but none of them seems to produce the result we want:
happy-day - It looks like it's evaluated as happy AND day.
happy\-day - It looks like it's evaluated as happy AND day.
Answer
Note that during indexing, some special characters are filtered out and never make it into the index.
The rules for handling specific characters depend on the context in which they occur. For instance, punctuation
characters such as dots ('.') or dashes ('-') are significant within numbers, email addresses or host names, but are
ignored (i.e. interpreted as whitespace) between regular text words. In the latter case, escaping those characters
will not make them searchable.
The same happens with the phrase happy-day: the dash is interpreted as whitespace, so its representation
inside the index is the same as for happy day. That is why the phrase search "happy day" returns all instances
of both happy day and happy-day. This is actually the closest you can get to what you want.
The two search queries you provided are both interpreted as:
happy day
which is actually the same as
happy AND day
In general, you can think of it as if all special characters between text words, including those in search terms, were replaced with spaces,
so it is not possible to search for special characters between regular text words.
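The mental model described above can be illustrated with a short Python sketch. It is only an approximation for regular text words, not the product's actual tokenizer: the function name normalize_terms is hypothetical, and the sketch deliberately ignores the special handling of numbers, email addresses and host names.

```python
import re

def normalize_terms(text: str) -> list[str]:
    """Replace every run of non-alphanumeric characters with a space,
    then split the result into lowercase words (simplified model only)."""
    return re.sub(r"[^0-9A-Za-z]+", " ", text).lower().split()

# The two queries from the support question and the plain two-word query
# all reduce to the same pair of terms.
for query in ["happy-day", r"happy\-day", "happy day"]:
    print(query, "->", normalize_terms(query))
```

Under this model all three queries yield the terms happy and day, which is why escaping the dash makes no difference.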
Now the question arises where special characters can actually be used:
- [+-.,%$] are significant in numerals. Example search term: -100.0
- [.-@] are significant in email addresses and host names. Example search term: info@vound-software.com
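To make the context dependence more concrete, here is a minimal Python sketch of a tokenizer that keeps special characters inside numerals and email addresses but drops them between regular words. The regular expressions and the tokenize function are assumptions made for illustration; they are not the rules the indexer actually applies, and host names are left out for brevity.

```python
import re

# Hypothetical patterns, tried in order at each position.
TOKEN = re.compile(
    r"""
    [A-Za-z0-9._-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+   # email address
  | [+-]?[$]?\d+(?:[.,]\d+)*%?                          # numeral, e.g. -100.0
  | [A-Za-z0-9]+                                        # regular text word
    """,
    re.VERBOSE,
)

def tokenize(text: str) -> list[str]:
    """Return lowercase tokens; characters that match no pattern are skipped."""
    return [m.group(0).lower() for m in TOKEN.finditer(text)]

print(tokenize("-100.0"))                   # ['-100.0']
print(tokenize("info@vound-software.com"))  # ['info@vound-software.com']
print(tokenize("happy-day"))                # ['happy', 'day']
```

In this sketch the dash survives in -100.0 and in the email address because it occurs in a recognized context, while in happy-day it sits between two regular words and is simply skipped.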