Add non-translatable functionality #70
These are proper nouns that have probably never occurred in the training material, so the NMT system has no clear examples of how to handle them. Ideally the system still learns to identify unseen proper nouns (probably based on features such as capitalization and certain trigger words) and to copy them into the translation in the same form. But the process is fuzzy, by necessity, since proper noun translation is itself pretty fuzzy: consider e.g. organization names that ARE translated, like the UN. Here the model has learnt a weird mixed behavior, where it corrupts the proper noun while still keeping it in Swedish. Some kind of named entity recognition, combined with an option to specify whether entities need to be translated or copied into the translation, might be a good idea. I'll mark this as a potential improvement (it also has some synergies with the terminology support).
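One way to picture the "copy instead of translate" option is placeholder masking: swap each protected term for an opaque token before translation and restore it afterwards. The following is only a minimal sketch under that assumption, not how Fiskmö or any existing plugin does it. The function names, the `__NT0__` token format and the `translate` callable are all hypothetical, and a real NMT model may itself corrupt such tokens unless it was trained to pass them through.

```python
import re


def mask_terms(text, terms):
    """Replace protected terms with placeholder tokens in a single pass.

    Returns the masked text and a token -> original term mapping.
    The __NT<n>__ token format is an illustrative assumption.
    """
    # Longest terms first, so a long name wins over a shorter one it contains.
    ordered = sorted(set(terms), key=len, reverse=True)
    if not ordered:
        return text, {}
    pattern = re.compile("|".join(re.escape(t) for t in ordered))
    token_for = {}  # term -> token
    mapping = {}    # token -> term

    def repl(match):
        term = match.group(0)
        if term not in token_for:
            token = f"__NT{len(token_for)}__"
            token_for[term] = token
            mapping[token] = term
        return token_for[term]

    return pattern.sub(repl, text), mapping


def unmask_terms(text, mapping):
    """Put the original terms back in place of their placeholder tokens."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    return text


def translate_with_protection(text, terms, translate):
    """Mask the terms, call the (hypothetical) translate callable, then unmask."""
    masked, mapping = mask_terms(text, terms)
    return unmask_terms(translate(masked), mapping)


# Stand-in for an MT engine, for illustration only: it "translates" one phrase.
def fake_translate(segment):
    return segment.replace("Företaget är medlem i", "The company is a member of")


source = "Företaget är medlem i Guldsmedsbranschens Leverantörsförening."
protected = ["Guldsmedsbranschens Leverantörsförening"]
result = translate_with_protection(source, protected, fake_translate)
```

With the stand-in translator above, the organization name reaches the output unchanged because the engine only ever sees the placeholder. Whether an entity is masked (copied) or left for the engine (translated) would then be the user-facing choice described in the comment.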
Hello Tommi and all,
Just in case you don't know, memoQ also has a "non-translatable" feature that is separate from its TB (termbase). I know that's asking a lot (again), but if I don't mention it and send you such a file, then there's even less chance of Opus being able to handle such a file. But as it's a text file, I guess that translators could remove the header and tags, if that is what it takes to load such a file into Opus in one go.
Regards
Hello Tommi and all
In my present job, which contains a lot of Swedish proper nouns for organizations, associations etc., Fiskmö changes words that it can't understand but does not actually translate them, as in:
I can understand that if the MT engine could translate, say, 90% of such proper nouns, it might be programmed or tempted to do so. But it's much more debatable here, as Fiskmö has not translated any part of the word(s).
Would it not be better for Fiskmö to leave the word unchanged then? On what basis does it change a word without ever translating it? It seems strange, especially in the second example, "Guldsmedsbranschens Leverantörsförening" > "Goldsmedsbrakensförening," for reasons evident to you as a Swedish speaker.
What do you make of this, Tommi and others?
Thanks