Bad translations using marian-decoder #29
You need to preprocess the string first using the provided SentencePiece model of the source language. Our models don't support internal SentencePiece segmentation, so this needs to be done before piping input to the decoder.
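For illustration, here is a minimal Python sketch of that workflow: segment the input with the source-language SentencePiece model, pipe the pieces to `marian-decoder`, then join the output pieces back into plain text. It is a sketch under assumptions, not the project's official pipeline: the file names `source.spm` and `decoder.yml`, the helper names, and the use of the `sentencepiece` Python package are all assumptions here, so adjust them to whatever the downloaded model package actually contains. It also only covers SentencePiece-based models; output containing `@@` markers suggests a BPE-segmented model, which needs its own BPE preprocessing instead.

```python
import subprocess


def encode_with_sentencepiece(lines, spm_model_path):
    """Segment raw sentences into space-separated SentencePiece pieces."""
    # Third-party package (pip install sentencepiece); imported lazily so the
    # pure-Python helper below can be used without it.
    import sentencepiece as spm

    sp = spm.SentencePieceProcessor(model_file=spm_model_path)
    return [" ".join(sp.encode(line, out_type=str)) for line in lines]


def decode_pieces(line):
    """Join SentencePiece pieces back into plain text.

    Pieces are separated by spaces, and the marker U+2581 stands for a
    real space in the original text.
    """
    return line.replace(" ", "").replace("\u2581", " ").strip()


def translate(lines, spm_model_path="source.spm",
              marian_cmd=("marian-decoder", "-c", "decoder.yml")):
    """Preprocess, run the decoder over stdin/stdout, and postprocess.

    The default paths and command are assumptions for illustration only.
    """
    pieces = encode_with_sentencepiece(lines, spm_model_path)
    result = subprocess.run(
        list(marian_cmd),
        input="\n".join(pieces) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return [decode_pieces(out) for out in result.stdout.splitlines()]
```

Under those assumptions, `translate(["Привет, мир!"])` would return a list with one detokenized English sentence. Feeding unsegmented text to the decoder means most words fall outside the model's subword vocabulary, which is consistent with the degenerate, repetitive output described in the question.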
Hi, I've loaded the models from the following directory: https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/ru-en
When I try some of them, I often get translations like: "▁Y O O O O O O O O O O O O O O O O O O O O" or "I 'm b@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@"
Then I tried to load the model from the Hugging Face site: but I get very similar outputs, whereas the Hugging Face framework itself gives good translations. Probably something is wrong with the config.
I launch it using the Marian library. For example:
So what could be wrong?
Do I perhaps need to do some preprocessing and postprocessing?