[Feature Request] Wrapper around training #508
Comments
What exactly are you looking for? For training with artificial data, there is already a Python package (https://github.com/tesseract-ocr/tesstrain/tree/main/src). For training with real data, there is currently mostly just a Makefile. If I remember the discussions in some PRs correctly, one collaborator has plans to move everything to Python and provide it in one package, but there are no results so far. That being said, I see no real value in pytesseract adding functionality like this.
Hi @stefan6419846, thank you for sharing this information. The training documentation is confusing and scattered across three repos (tesseract, tessdoc and tesstrain). It documents only the Makefiles. It would be worth documenting the Python options as well. Thanks again. Closing this issue.
tessdoc documents the training process with the Python package in a basic manner without any actual references to
It would be great if pytesseract offered a wrapper around the training functionality of Tesseract (https://github.com/tesseract-ocr/tesstrain).
Since training is not done often in Tesseract, the option could be added as a package extra, e.g. installed as
pip install pytesseract[training]
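To make the request more concrete, here is a rough sketch of what such a wrapper might look like: a thin layer that shells out to the tesstrain Makefile. This is not an existing pytesseract API. The function names `build_training_command` and `train` and the `tesstrain_dir` argument are hypothetical, and the sketch assumes a local checkout of tesstrain with `make` available. `MODEL_NAME`, `START_MODEL` and `MAX_ITERATIONS` are the Makefile variables described in the tesstrain README.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_training_command(
    model_name: str,
    start_model: Optional[str] = None,
    max_iterations: Optional[int] = None,
) -> List[str]:
    """Build the `make training` invocation for the tesstrain Makefile.

    Hypothetical helper: only the Makefile variable names come from tesstrain.
    """
    cmd = ["make", "training", f"MODEL_NAME={model_name}"]
    if start_model is not None:
        # Fine-tune from an existing traineddata model instead of from scratch.
        cmd.append(f"START_MODEL={start_model}")
    if max_iterations is not None:
        cmd.append(f"MAX_ITERATIONS={max_iterations}")
    return cmd


def train(tesstrain_dir: str, model_name: str, **options) -> subprocess.CompletedProcess:
    """Run the training inside a local tesstrain checkout (assumed to exist)."""
    cmd = build_training_command(model_name, **options)
    # check=True raises CalledProcessError if make exits with a non-zero status.
    return subprocess.run(cmd, cwd=Path(tesstrain_dir), check=True)
```

A call might then look like `train("/path/to/tesstrain", "my_model", start_model="eng")`, where the path and model name are placeholders.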