Tag positions in Opus and "\tag" in NET Regular Expressions #72

SafeTex · 2023-03-17T21:33:57Z

Hello Tommi and all

Opus seems to put all the tags in a source segment at the end of the target segment.

I can understand that where words or phrases are tagged, it must be very hard for any MT engine to reposition the tags correctly in the target.

But I'd like to look at the case where a source segment opens and closes with a tag, and Opus puts both of these tags at the end of the target segment. Can this be improved?

Also, I could not do anything today about this in Phrase (formerly MemSource) except to move the tags manually.

However, in memoQ, I can deal with these simpler cases, as memoQ has added "\tag" to its .NET regular expressions engine. So:

Find in target: ^(.+)(\tag)(\tag)$
Replace with: $2$1$3

This worked, and in semi-automatic mode I was able to deal with the majority of cases.
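
For anyone who wants to try the same idea outside memoQ, here is a minimal Python sketch of that find/replace. "\tag" is memoQ's own extension and does not exist in Python's re module, so the sketch assumes (purely for illustration) that tags show up in the text as numbered placeholders such as {1} and {2}. The name wrap_with_tags is hypothetical.

```python
import re

# Assumption: each tag is a numbered placeholder like {1} or {2}.
# This stands in for memoQ's "\tag", which Python's re module does not have.
TAG = r"\{\d+\}"

# Equivalent of the memoQ pattern ^(.+)(\tag)(\tag)$
PATTERN = re.compile(rf"^(.+)({TAG})({TAG})$")


def wrap_with_tags(target: str) -> str:
    """Move the first of two trailing tags to the start of the segment.

    Segments that do not end in two consecutive tags are returned unchanged.
    """
    return PATTERN.sub(r"\2\1\3", target)


print(wrap_with_tags("This is a translated segment.{1}{2}"))
```

As in memoQ, this only looks at the last two tags of a segment, so segments with more than two trailing tags would still need a manual check.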

All that to ask whether Opus could perhaps protect tags at the very start and end of segments in the future, and to let you know, in case you did not already, about "\tag" in memoQ, which you might find useful for Opus.

Regards

SafeTex

TommiNieminen · 2023-03-23T10:11:00Z

Thanks, I'll keep the \tag convention in mind; it seems pretty useful. The tag functionality in OPUS-CAT should currently position tags according to the word alignments it generates. I haven't checked, but the behavior where tags are added to the end is probably the fallback behavior. So something seems to be interfering with the tag restoration. What model are you using when this happens?
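
To illustrate the general idea of alignment-based tag placement with an end-of-segment fallback, here is a rough Python sketch. It is not OPUS-CAT's actual code: the function name place_tags, the data layout (a dict from source token index to target token index, and tags recorded as "tag text, index of the source token it precedes") and the example sentence are all assumptions made for illustration.

```python
def place_tags(target_tokens, alignment, tags):
    """Insert tags into the target using word alignments.

    target_tokens: list of target-side tokens.
    alignment: dict mapping a source token index to a target token index
               (hypothetical format).
    tags: list of (tag_text, source_index) pairs, where the tag precedes
          the source token at source_index.

    A tag whose source token has no alignment falls back to the end
    of the segment.
    """
    before = {}    # target index -> tags to emit before that token
    trailing = []  # tags that could not be positioned
    for tag_text, source_index in tags:
        target_index = alignment.get(source_index)
        if target_index is None:
            trailing.append(tag_text)
        else:
            before.setdefault(target_index, []).append(tag_text)

    output = []
    for index, token in enumerate(target_tokens):
        output.extend(before.get(index, []))
        output.append(token)
    output.extend(trailing)
    return output


# Source "ett rött hus" with <b> before "rött" and </b> before "hus".
target = ["a", "red", "house"]
tags = [("<b>", 1), ("</b>", 2)]

print(place_tags(target, {0: 0, 1: 1, 2: 2}, tags))  # alignments available
print(place_tags(target, {}, tags))                  # no alignments: fallback
```

In this sketch, a missing or empty alignment sends every tag to the end of the segment, which would look like the symptom described above.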

SafeTex · 2023-03-23T10:23:48Z

Hello Tommi

I'm using a trained Swedish-to-English model, and all tags are always put at the end.
I even had a job where commas and full stops were tagged, due to a perceived difference in font size from an OCR scan, and even these tags ended up at the end of segments.
How can I overcome this?

Thanks
