Bad translations using marian-decoder #29

koren-v · 2020-10-06T13:13:13Z

Hi, I've downloaded the models from the following directory: https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/ru-en
When I try some of them, I often get translations like: "▁Y O O O O O O O O O O O O O O O O O O O O" or "I 'm b@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@"
Then I tried loading the model from the Hugging Face site, but I get very similar output with marian-decoder, whereas the Hugging Face framework itself gives good translations. Probably something is wrong with the config.
I run it using the Marian library. For example:

 echo "привет" | ./marian-decoder -c /path/to/opus_models/opus-2019-12-05-ru-en/decoder.yml

So what could be wrong?

Do I need to do some preprocessing and postprocessing myself?

jorgtied · 2020-11-25T22:09:02Z

You need to preprocess the string first using the provided SentencePiece model for the source language. Our models don't support internal SentencePiece segmentation, so this needs to be done before piping input to the decoder.
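As a rough illustration of the advice above, the pipeline could look something like the sketch below. It makes several assumptions: that the SentencePiece command-line tool `spm_encode` is installed, that `marian-decoder` is on the `PATH`, and that the model package contains a source-side SentencePiece model named `source.spm` (check the actual file name in your download). The `segment`, `desegment`, and `translate` function names are made up for this example. Packages that produce `@@` markers appear to be BPE-based rather than SentencePiece-based and would need a different segmentation step.

```shell
#!/usr/bin/env bash
# Sketch only: segment the input before decoding, then undo the segmentation.
# MODEL_DIR and the file name "source.spm" are assumptions; adjust to your package.
MODEL_DIR="${MODEL_DIR:-/path/to/opus_models/opus-2019-12-05-ru-en}"

# Split raw text into SentencePiece pieces (space-separated, "▁" marks word starts).
segment() {
    spm_encode --model="$MODEL_DIR/source.spm"
}

# Undo the segmentation on the decoder output:
# drop the spaces between pieces, turn "▁" back into spaces, trim the leading space.
desegment() {
    sed 's/ //g; s/▁/ /g; s/^ //'
}

# Full pipeline: preprocess, decode, postprocess.
translate() {
    segment | marian-decoder -c "$MODEL_DIR/decoder.yml" | desegment
}

# Usage (not executed here):
#   echo "привет" | translate
```

The model packages may also ship their own pre- and postprocessing scripts; if so, those are probably the safer choice, since they match how the model was trained.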
