[REQUEST] TMX Generator - Bitextor #40

Open
drkhateeb opened this issue Feb 14, 2021 · 1 comment

Comments

@drkhateeb

Hello developers,
I suggest creating a GUI for this code, turning it into a tool that harvests multilingual websites and creates TMX files to train MT systems.
Kindly check Bitextor, which generates translation memories from multilingual websites:
https://github.com/bitextor/bitextor
For example, you could extract parallel text from this medical website:
https://www.mayoclinic.org/
It covers English-Arabic and other languages and could be used to train NMT to increase the translation accuracy, for testing:
https://webisearch.com/
Regards

@jorgtied
Member

jorgtied commented Sep 9, 2021

Integrating Bitextor here would be a major task, and I would rather keep the bitext harvesting procedures outside of the core translation service. It could be an interesting feature, but it sounds very complex to me.
