Currently we have three analyzers (fi, sv, en), all based on the NLTK SnowballStemmer. This leads to duplicated code, and adding more languages would mean adding yet more analyzers.
We could instead have just one SnowballAnalyzer that takes a language parameter. Then it could be configured like this:
analyzer=snowball(finnish)
or, for a language we don't have an analyzer for yet:
analyzer=snowball(french)
A similar approach would work for libvoikko (#37) which supports several languages:
analyzer=voikko(fi)
The parameter would be passed directly to the algorithm so all languages supported by the stemmers/lemmatizers would be available.
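To make the proposal a bit more concrete, here is a rough sketch of how the `name(parameter)` syntax could be parsed and handed to a single Snowball-based analyzer. The names `parse_analyzer_spec`, `get_analyzer`, `SnowballAnalyzer` and `normalize_word` are placeholders made up for illustration, not existing code in this project; the only real API assumed is NLTK's `SnowballStemmer(language)` and its `stem()` method.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# Hypothetical spec format: "name" or "name(param)", e.g. "snowball(finnish)".
_SPEC_RE = re.compile(r"^\s*(\w+)\s*(?:\(\s*([\w-]*)\s*\))?\s*$")


def parse_analyzer_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split an analyzer spec into (name, parameter); parameter may be None."""
    match = _SPEC_RE.match(spec)
    if match is None:
        raise ValueError(f"Invalid analyzer specification: {spec!r}")
    name, param = match.group(1), match.group(2)
    # Treat a missing or empty parameter ("snowball" / "snowball()") as None.
    return name, (param or None)


class SnowballAnalyzer:
    """One analyzer for every language the Snowball stemmer supports."""

    def __init__(self, language: str) -> None:
        # Imported lazily so the spec parsing works without NLTK installed.
        from nltk.stem.snowball import SnowballStemmer

        self.language = language
        # The language is passed straight through; NLTK raises ValueError
        # if it does not support the requested language.
        self._stemmer = SnowballStemmer(language)

    def normalize_word(self, word: str) -> str:
        return self._stemmer.stem(word.lower())


# Hypothetical registry; a voikko entry could be added the same way.
_ANALYZERS: Dict[str, Callable[[str], object]] = {
    "snowball": SnowballAnalyzer,
}


def get_analyzer(spec: str) -> object:
    """Create an analyzer from a spec string such as "snowball(french)"."""
    name, param = parse_analyzer_spec(spec)
    if name not in _ANALYZERS:
        raise ValueError(f"Unknown analyzer: {name!r}")
    if param is None:
        raise ValueError(f"Analyzer {name!r} requires a language parameter")
    return _ANALYZERS[name](param)
```

With something like this, `get_analyzer("snowball(french)")` would work without any French-specific code, and an unsupported language would fail with the stemmer's own error instead of needing a per-language check on our side.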