Finetuning says: "not enough parallel segments in the tmx" #88
Comments
It needs at least 1000 TUs (translation units).
Hi, the fine-tuning needs a certain amount of data to work with, so there's a minimum requirement of 1000 translation units (pairs of source and target language segments). This is an arbitrary number, and you probably need more than 1000 to have a noticeable effect. If you still want to try it with 600 translation units, you can change the FinetuningSetMinSize setting in the OpusCatMTEngine.exe.config file.
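To check a TMX against that minimum before starting a fine-tuning run, something like the following minimal Python sketch could help. It assumes a standard TMX layout (`<tu>` elements containing `<tuv>`/`<seg>` children, no XML namespace on the element names) and counts only units with at least two non-empty segments. The function name `count_translation_units` is made up for illustration, and OPUS-CAT's own counting may differ (for example, it may filter duplicates or overlong segments).

```python
import xml.etree.ElementTree as ET

# Default minimum mentioned in this thread; the actual value is whatever
# FinetuningSetMinSize is set to in OpusCatMTEngine.exe.config.
DEFAULT_MIN_SIZE = 1000


def count_translation_units(tmx_source):
    """Count <tu> elements that hold at least two non-empty <seg> texts.

    tmx_source may be a file path or a binary file object.
    Hypothetical helper, not part of OPUS-CAT.
    """
    root = ET.parse(tmx_source).getroot()
    count = 0
    for tu in root.iter("tu"):
        # itertext() also collects text nested inside inline tags such as <bpt>.
        non_empty = [
            seg for seg in tu.iter("seg")
            if "".join(seg.itertext()).strip()
        ]
        if len(non_empty) >= 2:
            count += 1
    return count


def has_enough_units(tmx_source, min_size=DEFAULT_MIN_SIZE):
    """Return True if the TMX meets the given minimum number of units."""
    return count_translation_units(tmx_source) >= min_size
```

For example, `count_translation_units("project.tmx")` on the TMX from the original post should report a little over 600, which explains the error message.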
Hi, thanks for the answers, everyone. I actually tried the other function instead, uploading a source file and a corresponding target file derived from the same TM, and it worked: it improved the translations even with this small amount of data. But I might also try this other setting, thank you.
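For anyone who wants to derive such a source/target file pair from a TMX by script, here is a rough Python sketch. It assumes plain-text output with one segment per line and matching line numbers in both files. Whether that is exactly the format the upload function expects is an assumption, not something confirmed in this thread. The helper names `extract_pairs` and `write_parallel_files` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Fully qualified name that ElementTree uses for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def extract_pairs(tmx_source, src_lang, tgt_lang):
    """Return a list of (source, target) strings from a TMX.

    Languages are matched on their primary subtag, so "en" matches "en-US".
    Units missing either side are skipped.
    """
    pairs = []
    for tu in ET.parse(tmx_source).getroot().iter("tu"):
        texts = {}
        for tuv in tu.iter("tuv"):
            # Older TMX versions use a plain "lang" attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Collapse whitespace so each segment stays on one line.
                text = " ".join("".join(seg.itertext()).split())
                texts[lang.split("-")[0]] = text
        src, tgt = texts.get(src_lang), texts.get(tgt_lang)
        if src and tgt:
            pairs.append((src, tgt))
    return pairs


def write_parallel_files(pairs, src_path, tgt_path):
    """Write the pairs to two line-aligned UTF-8 text files."""
    with open(src_path, "w", encoding="utf-8") as src_file, \
            open(tgt_path, "w", encoding="utf-8") as tgt_file:
        for src, tgt in pairs:
            src_file.write(src + "\n")
            tgt_file.write(tgt + "\n")
```

Usage would be along the lines of `write_parallel_files(extract_pairs("project.tmx", "en", "de"), "source.txt", "target.txt")`, with the language codes adjusted to the project.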
Hello HMueller007 and all. What I sometimes do to get around this is to import a simple two-column TB (glossary) into memoQ for the same job as the translation job I'm doing, and then export all of that to the TB for the same job. The segments are small, of course, but they are very relevant to the job, and as Opus does not currently have any TB function to instruct the MT engine, this feels like an intuitive way to proceed. This often gets the TB to exceed the minimum number of segments.
@SafeTex That's a good tip, will try this, thanks.
Hi,
when I want to fine-tune the model with a TMX from a (Wordfast) project it says: "not enough parallel segments in the TMX".
It has more than 600 bilingual segments (so about 1300 segments in total if you count source and target language segments separately) from a finished project. Is this really not enough? How many do you need?