NOT operator in Keyword Search


kevinma

I tried to use the NOT operator with keywords to find all the emails that contain the word "subscribe" but not "unsubscribe".

 

Keywords: subscribe NOT unsubscribe

However, this technique does not work for some Simplified Chinese characters, for instance with the following keywords:

 

Simplified Chinese Keywords: 安 NOT 安排

Simplified Chinese sentence: 是否可以安排在星期一前完成

 

The results still highlight the word 安, even though the email content contains the whole sentence ...安排在...

 

The problem may be that those characters are the same in Simplified Chinese and Traditional Chinese.

Please try it in Google Translate at http://translate.goo...%AE%89%E6%8E%92

 

Traditional Chinese Keywords: 電子 NOT 電子郵件


Hello,

 

I tried a sample document containing the text "是否可以安排在星期一前完成" and it is (correctly) not returned when searching for "安 NOT 安排". What Intella version are you using? Can you send us a complete sample document?

 

The issue most likely lies in the way Chinese, Japanese and Korean documents are indexed. Because these languages do not require whitespace or other characters to separate words, searching on words becomes hard. This is "solved" by breaking up the text into so-called bigrams, basically all pairs of two adjacent characters that occur in the document, and processing them as if they were words. If you look at the Previewer's Words tab, you will see which "words" are extracted from this text. This method does not give perfect results, but it often produces reasonable ones.
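
To make the bigram idea concrete, here is a minimal Python sketch. It is only an illustration of overlapping character bigrams, not Intella's actual indexing code; the function name `cjk_bigrams` is made up for this example, and a real indexer may treat single characters, punctuation and mixed-script text differently.

```python
def cjk_bigrams(text: str) -> list[str]:
    """Return every overlapping pair of adjacent characters in text.

    Hypothetical helper for illustration only; it assumes the input is a
    run of CJK characters with no whitespace or punctuation.
    """
    return [text[i:i + 2] for i in range(len(text) - 1)]


# The sample sentence from the question above.
sentence = "是否可以安排在星期一前完成"
tokens = cjk_bigrams(sentence)

# Each token is treated by the index as if it were a "word".
print(tokens)

# The two-character term from the NOT clause is one of the tokens,
# so a document containing this sentence matches "安排".
print("安排" in tokens)

# The lone character is not a token in a pure bigram scheme.
print("安" in tokens)
```

In this simplified scheme the sentence yields 安排 as a token, which is consistent with the document being excluded by "NOT 安排". How a single-character term such as 安 is matched depends on the indexer, so the Words tab is the place to check what was actually extracted.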
