Is it possible to fine-tune a model that has once been fine-tuned, thus incrementally improving it gradually? #29
Comments
It should be possible to further fine-tune a model that has already been fine-tuned, but the files left in the model directory by the first fine-tuning interfere with the second fine-tuning process. You could open the fine-tuned model's directory by clicking Open model directory and then remove all the newly created files (produced by the fine-tuning) except for the actual model file (the largest file in the folder, several hundred megabytes in size). After that you should be able to fine-tune again. However, repeated fine-tuning will probably degrade quality eventually, which is why I haven't enabled this out of the box. I suspect you would get better results by simply fine-tuning the base English -> Catalan model again with the extended data set.
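The cleanup described above could be sketched as follows. This is a hypothetical helper, not part of OPUS-CAT itself: it assumes the "largest file in the folder" heuristic identifies the model file, and it is destructive, so back the directory up first.

```python
from pathlib import Path

def clean_finetuned_dir(model_dir: Path) -> Path:
    """Delete everything in model_dir except its largest file.

    Keeps only the actual model file (the largest file, several
    hundred MB) so the directory can be fine-tuned again.
    Destructive: back up the directory before running this.
    """
    files = [p for p in model_dir.iterdir() if p.is_file()]
    # Assumption: the model file is the single largest file.
    model_file = max(files, key=lambda p: p.stat().st_size)
    for p in files:
        if p != model_file:
            p.unlink()  # remove files produced by the previous fine-tuning
    return model_file
```

Point it at the directory opened via Open model directory; it returns the path of the file it kept.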
Dear Tommi Nieminen,
It might be possible to change the displayed language pair for the models, although you might have to change things in multiple places. For OPUS-MT models (the older models), this can probably be done simply by changing the language codes in the model directory name, e.g. renaming the directory en-ca to en-ia. With the newer Tatoeba models there is also a model configuration file (the *.yml file with the same name as the model directory) that you might have to modify in addition to renaming the directory.
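A minimal sketch of that renaming step, under the assumptions that the models live under a common root directory and that the Tatoeba .yml file sits alongside the model directory (its location, and any language codes inside it, may still need checking by hand):

```python
from pathlib import Path

def relabel_model(models_root: Path, old_pair: str, new_pair: str) -> None:
    """Rename an OPUS-MT model directory, e.g. 'en-ca' -> 'en-ia'.

    For the newer Tatoeba models, also renames the *.yml configuration
    file that shares the directory's name; the file's contents may
    need further manual edits.
    """
    (models_root / old_pair).rename(models_root / new_pair)
    yml = models_root / f"{old_pair}.yml"
    if yml.exists():  # present only for Tatoeba models
        yml.rename(models_root / f"{new_pair}.yml")
```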
Is it possible to fine-tune a model that has already been fine-tuned, thus improving it incrementally? If I have several already fine-tuned models, can I use them simultaneously, or do I have to combine the two new .tmx files into one .tmx file before fine-tuning? The question concerns the fine-tuning in OPUS-CAT as described in the article "OPUS-CAT: Desktop NMT with CAT integration and local fine-tuning."
Tommi Nieminen
The University of Helsinki, Yliopistonkatu 3, 00014 University of Helsinki, Finland
[email protected]
https://aclanthology.org/2021.eacl-demos.34.pdf
I have tried to use this excellent interactive software to build an English -> Interlingua translator (see www.interlingua.com) by fine-tuning English -> Catalan; see my paper at
https://www.interlingva.cz/Le_experimentos_con_le_machina_OPUS_CAT.pdf, which is written in Interlingua (a simplified Latin, similar to Catalan). The translation showed surprisingly good results. I repeatedly wanted to fine-tune this model again with additional EN -> IA phrase pairs, but the process started and never finished. I tried more combinations of CZ (my mother tongue), IA, and EN, and it never completed successfully when the source model had been fine-tuned before. I would appreciate any advice on further improving the Interlingua translators. See also the other papers on the page https://www.interlingva.cz/Le_Experimentos_con_traduction_automatic.pdf . Many thanks, Bohdan Smilauer
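On the question of combining several .tmx files before fine-tuning: one simple approach is to append the translation units of one file to the body of another. This is only a sketch, assuming both files are well-formed TMX with compatible language codes; it keeps the first file's header unchanged.

```python
import xml.etree.ElementTree as ET

def merge_tmx(path_a: str, path_b: str, out_path: str) -> None:
    """Append all translation units (<tu>) of path_b to path_a.

    Assumes both files are valid TMX documents whose <tu> elements
    use compatible language codes; the header of path_a is kept.
    """
    tree_a = ET.parse(path_a)
    body_a = tree_a.getroot().find("body")
    for tu in ET.parse(path_b).getroot().find("body").findall("tu"):
        body_a.append(tu)
    tree_a.write(out_path, encoding="utf-8", xml_declaration=True)
```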