It would be great to be able to specify a custom min_gram parameter for different token types. My use case is a text field that contains words in multiple languages. I use the ICU tokenizer with the Edge N-gram filter, and I am forced to set min_gram to 1 to support ideograms. What I want is a min_gram of 2-3 for alphanumeric tokens and a min_gram of 1 for ideographic tokens.
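To make the request concrete, here is a rough sketch of the behaviour in plain Python. It is not Elasticsearch code and does not reflect any existing filter option. The function names, the `min_gram_by_type` mapping, and the type labels `<ALPHANUM>` and `<IDEOGRAPHIC>` are assumptions made for illustration, and the input is assumed to be already tokenized into (text, type) pairs.

```python
from typing import Dict, List, Tuple

# Hypothetical fallback for token types that have no explicit entry.
DEFAULT_MIN_GRAM = 1


def edge_ngrams(token: str, min_gram: int, max_gram: int) -> List[str]:
    """Return the prefixes of `token` with lengths min_gram..max_gram."""
    upper = min(max_gram, len(token))
    # Tokens shorter than min_gram yield no grams in this sketch.
    return [token[:n] for n in range(min_gram, upper + 1)]


def typed_edge_ngrams(
    tokens: List[Tuple[str, str]],
    min_gram_by_type: Dict[str, int],
    max_gram: int = 10,
) -> List[str]:
    """Emit edge n-grams, choosing min_gram from each token's type.

    `tokens` is a list of (text, type) pairs. The type labels are
    illustrative placeholders, not a guaranteed tokenizer output.
    """
    grams: List[str] = []
    for text, token_type in tokens:
        min_gram = min_gram_by_type.get(token_type, DEFAULT_MIN_GRAM)
        grams.extend(edge_ngrams(text, min_gram, max_gram))
    return grams


if __name__ == "__main__":
    sample = [("search", "<ALPHANUM>"), ("東", "<IDEOGRAPHIC>")]
    limits = {"<ALPHANUM>": 3, "<IDEOGRAPHIC>": 1}
    print(typed_edge_ngrams(sample, limits, max_gram=5))
```

With this mapping, the single-character ideographic token still produces a gram, while alphanumeric tokens no longer produce one- or two-character prefixes.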
This has been open for quite a while, and hasn't had a lot of interest. For now I'm going to close this as something we aren't planning on implementing. We can re-open it later if needed.