Custom preprocessing in Live Test #3
Comments
The method used to recognize/parse learned word n-grams was improved. The new code is easier to follow, clearer, and semantically more accurate. Furthermore, tokenization and visualization of single words and word n-grams in the Live Test tool were also improved by this new code. Before this modification, there was a bug in the Live Test tool introduced by the addition of custom preprocessing support (#3). Some words were visualized incorrectly, especially those ending with question marks, as in the sentence "Great atmosphere then ?????". In addition, the user can now put an arbitrary number of spaces between words, and they will be ignored when recognizing word n-grams. For instance, before this modification, if the user entered "machine learning" with extra spaces between the two words, it was not recognized as a bigram.
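The whitespace-tolerant n-gram recognition described above can be illustrated with a minimal sketch. This is not the project's actual implementation: the function names `tokenize` and `find_ngrams`, the `learned_ngrams` set, and the greedy longest-match strategy are all assumptions made for illustration only.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens.

    Any run of whitespace (and punctuation such as "?????") acts as a
    separator, so extra spaces between words have no effect.
    """
    return re.findall(r"\w+", text.lower())


def find_ngrams(text, learned_ngrams, max_n=3):
    """Greedily match the longest learned n-gram at each token position.

    `learned_ngrams` is a hypothetical set of token tuples standing in for
    whatever n-grams a trained model has learned.
    """
    tokens = tokenize(text)
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first; fall back to a single word.
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if n == 1 or candidate in learned_ngrams:
                found.append(" ".join(candidate))
                i += n
                break
    return found


learned = {("machine", "learning")}
print(find_ngrams("machine      learning is   fun", learned))
print(find_ngrams("Great atmosphere then ?????", learned))
```

Because tokenization happens before matching, the bigram is found regardless of how many spaces separate its words, and trailing question marks do not attach to the preceding word.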
Hi @enthussb! I've added this feature in the new version, and also took the opportunity to incorporate some other things that were pending. Namely, what's new in this version is:
Update your package version using the Let me know if everything worked OK ☕
@all-contributors would you add @enthussb for ideas to the README file? It helped to make this project better by suggesting this cool feature 👍
I've put up a pull request to add @enthussb! 🎉
@sergioburdisso I updated the package and ran the code. Everything is working fine, although my accuracy has been reduced quite a bit. I guess it might be due to the latest n-gram and tokenization changes. Could you please have a look at that?
I was about to tell you to perform a hyperparameter optimization using the
Okay no problem 👍, till then I can work on the previous version where I had achieved great accuracy!
I've just finished making those changes and released the new version (0.5.9). I've also updated the notebook, adding a "Hyperparameter Optimization" section. Try performing hyperparameter optimization similar to what I did in that notebook and let me know. If you are still getting bad accuracy, please share some more details, such as part of the actual code and the actual accuracy before and after the changes. It would be much easier to help that way. I hope you achieve great accuracy again 😢 🤞 🍀
Sure I will check the updated package and revert.
@sergioburdisso
It would be a great feature to have custom preprocessing in the Live Test.
This will enable us to visually understand the words, sentences, and paragraphs that helped the model to classify a particular document after custom preprocessing.