Tag positions in Opus and "\tag" in NET Regular Expressions #72
Comments
Thanks, I'll keep the \tag convention in mind; it seems pretty useful. The tag functionality in OPUS-CAT should currently position tags according to the word alignments it generates. I haven't checked, but appending tags to the end is probably the fallback behavior, so something seems to be interfering with tag restoration. Which model are you using when this happens?
Hello Tommi, I'm using a trained Swedish-to-English model, and all tags are always put at the end. Thanks.
Hello Tommi and all
Opus seems to put all the tags in a source segment at the end of the target segment.
I can understand that where words or phrases are tagged, it must be very hard for any MT engine to reposition the tags correctly in the target.
But I'd like to look at the case where a source segment opens and closes with a tag and Opus puts both of these tags at the end of the target segment. Can this be improved?
Also, I could not do anything about this today in Phrase (formerly Memsource) except move the tags manually.
However, in memoQ I can deal with these simpler cases, as memoQ has added "\tag" to its .NET regular expressions engine. So:
Find in target: ^(.+)(\tag)(\tag)$
Replace with: $2$1$3
worked, and in semi-automatic mode I was able to deal with the majority of cases.
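For anyone without memoQ, the same find/replace can be approximated outside the CAT tool. The following is only a minimal Python sketch: Python's `re` module has no `\tag` token, so it assumes tags show up as placeholders like `<1>`, `</1>` or `<2/>`. The `TAG` pattern and the function name `move_first_tag_to_start` are hypothetical and would need adapting to however your tool exports tags.

```python
import re

# Assumption: inline tags appear as placeholders such as <1>, </1> or <2/>.
# This pattern is a stand-in for memoQ's \tag token, which Python's re lacks.
TAG = r"</?\d+/?>"

# Same shape as ^(.+)(\tag)(\tag)$ with the stand-in tag pattern.
TRAILING_PAIR = re.compile(rf"^(.+)({TAG})({TAG})$")


def move_first_tag_to_start(target: str) -> str:
    """Move the first of two trailing tags to the start of the segment.

    Mirrors the replacement $2$1$3; segments that do not end in
    exactly this tag-tag shape are returned unchanged.
    """
    return TRAILING_PAIR.sub(r"\2\1\3", target)


if __name__ == "__main__":
    print(move_first_tag_to_start("Hello world<1></1>"))
```

As in memoQ, this only handles the simple case of two tags stranded at the end of the target, so the results still need checking segment by segment.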
All that to ask whether Opus could perhaps protect tags at the very start and end of segments in the future, and to inform you, in case you did not know, of "\tag" in memoQ, which you might find useful for Opus in the future.
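To illustrate what I mean by protecting tags at the segment edges, here is a rough Python sketch. It is not how OPUS-CAT works internally; it assumes tags are placeholders like `<1>` and `</1>`, and `translate` is a hypothetical callable standing in for the MT engine. The idea is simply to peel off leading and trailing tags, translate the rest, and put the tags back where they were.

```python
import re
from typing import Callable

# Assumption: inline tags appear as placeholders such as <1>, </1> or <2/>.
TAG = r"</?\d+/?>"
LEADING_TAGS = re.compile(rf"^(?:{TAG})+")
TRAILING_TAGS = re.compile(rf"(?:{TAG})+$")


def translate_protecting_edge_tags(source: str,
                                   translate: Callable[[str], str]) -> str:
    """Translate a segment while keeping edge tags at the edges.

    `translate` is a hypothetical stand-in for the MT call.
    """
    lead = LEADING_TAGS.match(source)
    prefix = lead.group(0) if lead else ""
    rest = source[len(prefix):]

    trail = TRAILING_TAGS.search(rest)
    suffix = trail.group(0) if trail else ""
    core = rest[:len(rest) - len(suffix)]

    # Only the text between the edge tags is sent to the engine.
    translated = translate(core) if core else ""
    return prefix + translated + suffix


if __name__ == "__main__":
    # str.upper is used here only as a dummy "translation".
    print(translate_protecting_edge_tags("<1>hej</1>", str.upper))
```

Tags inside the segment would still go through the engine's normal tag handling; this only covers the opening and closing tags.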
Regards
SafeTex