Raise warning to advise to retrain project after vocabulary update #485

Open
juhoinkinen opened this issue Apr 26, 2021 · 0 comments

In the previous Finto AI model update round the same mistake was made twice: the training of a (base) project was interrupted, but this was not immediately noticed because an old model with the same project id already existed. The mistake was not easy to notice from the suggestion or evaluation results either, because the old model produced sensible suggestions that came from the new vocabulary. The vocabulary had of course been loaded before the training (updating the vocabulary was introduced in #274/#383).

Annif could emit a warning when suggesting with a model whose vocabulary has been modified since the model was trained.

The implementation could compare the timestamps of the model and vocabulary files in the project and vocabulary directories, which would be straightforward. However, in at least two cases the model files could have later timestamps than the vocabulary files even though the model has not been retrained, and the warning would then be missing:

  • if learn has been used with a learning backend
  • if (re)training has been interrupted, but the backend has created some temporary files in the project directory (like fasttext-train9ic43tsy.txt) that remain
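
A minimal sketch of the timestamp comparison, to make the idea and its second caveat concrete. This is not Annif code: the function names (`newest_mtime`, `vocab_modified_after_model`, `warn_if_stale`) and the directory arguments are hypothetical. It assumes that a project's model files and a vocabulary's files each live in their own directory.

```python
import warnings
from pathlib import Path


def newest_mtime(directory):
    """Return the latest modification time of any regular file under
    `directory`, or None if the directory contains no files."""
    mtimes = [p.stat().st_mtime for p in Path(directory).rglob("*") if p.is_file()]
    return max(mtimes) if mtimes else None


def vocab_modified_after_model(project_dir, vocab_dir):
    """True if the newest vocabulary file is newer than the newest file in
    the project directory (hypothetical layout, not Annif's actual API)."""
    model_time = newest_mtime(project_dir)
    vocab_time = newest_mtime(vocab_dir)
    if model_time is None or vocab_time is None:
        # Nothing to compare, so no conclusion can be drawn.
        return False
    return vocab_time > model_time


def warn_if_stale(project_dir, vocab_dir):
    """Emit a warning when the vocabulary looks newer than the model."""
    if vocab_modified_after_model(project_dir, vocab_dir):
        warnings.warn(
            "Vocabulary has been modified since the model was trained; "
            "consider retraining the project."
        )
        return True
    return False
```

Because this sketch looks at the newest file in the project directory, a temporary file left behind by an interrupted training run counts as a "model file" and suppresses the warning, which is exactly the second case listed above. A real implementation would need some way to tell completed model files apart from such leftovers.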
@osma osma added this to the Long term milestone Feb 14, 2022