get_languages #551

Closed
Larbo53 opened this issue Jun 15, 2024 · 10 comments

Comments

@Larbo53

Larbo53 commented Jun 15, 2024

Hi,

The call print(pytesseract.get_languages(config='')) returns an empty list.
I am using Python 3.9.
I uninstalled pytesseract, restarted my MacBook, then reinstalled pytesseract, but I still have the same problem. How can I install the language files?
Thanks for your feedback.
Sincerely

@stefan6419846
Contributor

This seems to be a Tesseract issue, not a pytesseract one.

Please verify that Tesseract can find any language files using tesseract --list-langs from the terminal. If this does not yield any languages, please install the language files the same way you installed Tesseract (using the same source usually ensures that the hard-coded data directory is valid) or download them manually and use the TESSDATA_PREFIX environment variable to point to them.
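
The same check can be run from Python. The following is only a minimal sketch, not how pytesseract itself is implemented: the helper names parse_list_langs and list_tesseract_languages are made up for illustration, and it assumes the command prints a "List of available languages ..." header line followed by one language code per line.

```python
import subprocess


def parse_list_langs(output: str) -> list[str]:
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.startswith("List of"):
            continue
        langs.append(line)
    return langs


def list_tesseract_languages(binary: str = "tesseract") -> list[str]:
    """Run `<binary> --list-langs`; return [] if it cannot be started or fails."""
    try:
        result = subprocess.run(
            [binary, "--list-langs"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # capture the list whichever stream it is written to
            text=True,
            check=False,
        )
    except OSError:
        # The binary is not installed or not on PATH.
        return []
    if result.returncode != 0:
        return []
    return parse_list_langs(result.stdout)


if __name__ == "__main__":
    print(list_tesseract_languages())
```

An empty list here points at the Tesseract installation or its data directory rather than at pytesseract.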

@Larbo53
Author

Larbo53 commented Jun 15, 2024

I've just reinstalled tesseract with pip, and the problem persists.
How do I find and install the language files, and in which directory should they be stored?
I'm using Python 3.9 and macOS Monterey v 12.75.
Thanks for your help.
Sincerely

@stefan6419846
Contributor

pytesseract is just a wrapper around Tesseract, which needs to be installed separately. Please refer to the Tesseract project for further installation instructions: https://github.com/tesseract-ocr/tesseract?tab=readme-ov-file#installing-tesseract
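
To see whether a Tesseract executable is visible to Python at all, something like the sketch below can help. The helper name find_tesseract is hypothetical; it only wraps the standard-library function shutil.which.

```python
import shutil
from typing import Optional


def find_tesseract(name: str = "tesseract") -> Optional[str]:
    """Return the full path of the Tesseract executable on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        # pip only installs the Python wrapper, not the Tesseract program itself.
        print("No tesseract executable found on PATH.")
    else:
        # If needed, the wrapper can be pointed at this executable via
        # pytesseract.pytesseract.tesseract_cmd = path
        print(f"Found tesseract at {path}")
```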

@Larbo53
Author

Larbo53 commented Jun 15, 2024

I just found the language files in 'usr/local/bin/tesseract-lang/4.1.0/share/tessdata'.
In which file do I have to enter this path for it to work properly?
Thanks for your help.
Sincerely

@stefan6419846
Contributor

In the best case, your Tesseract installation already picks this up. Otherwise, you have to set the environment variable TESSDATA_PREFIX accordingly - either in your global environment or inside your Python script with os.environ["TESSDATA_PREFIX"] = ....
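
Setting the variable from a script could look roughly like this. It is a sketch under assumptions: set_tessdata_prefix is a made-up helper name, and the directory passed in is whatever tessdata directory exists on your machine.

```python
import os
from pathlib import Path


def set_tessdata_prefix(tessdata_dir: str) -> Path:
    """Point TESSDATA_PREFIX at an existing directory and return its absolute path."""
    path = Path(tessdata_dir).expanduser().resolve()
    if not path.is_dir():
        raise FileNotFoundError(f"No such tessdata directory: {path}")
    # Set this before the first pytesseract call, because the Tesseract
    # process inherits the environment at the moment it is started.
    os.environ["TESSDATA_PREFIX"] = str(path)
    return path
```

After calling set_tessdata_prefix with your directory, pytesseract.get_languages(config='') should list the languages found there, provided the directory really contains the language files.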

@Larbo53
Author

Larbo53 commented Jun 15, 2024

I've just seen that the Tesseract version is 5.2. Maybe that's where the problem lies.
Thank you.

@Larbo53
Author

Larbo53 commented Jun 15, 2024

os.environ["TESSDATA_PREFIX"] = 'usr/local/bin/tesseract-lang/4.1.0/share/tessdata/'
Error message:
Failed loading language 'eng' Tesseract couldn't load any languages! Could not initialize tesseract.')

@stefan6419846
Contributor

In this case it seems like the English language data is not available, which AFAIK is always required.
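
A quick way to check this is to look for eng.traineddata in the directory. The sketch below uses a hypothetical helper, diagnose_tessdata, and assumes language files are named <lang>.traineddata. It also flags paths without a leading slash, like the one quoted above, because those are resolved relative to the current working directory; whether that was a factor here is not certain.

```python
from pathlib import Path


def diagnose_tessdata(tessdata_dir: str, lang: str = "eng") -> list[str]:
    """Return a list of human-readable problems found with a tessdata directory."""
    problems = []
    path = Path(tessdata_dir)
    if not path.is_absolute():
        problems.append(
            f"{tessdata_dir!r} is a relative path; it is resolved against "
            "the current working directory"
        )
    if not path.is_dir():
        problems.append(f"{tessdata_dir!r} is not an existing directory")
    elif not (path / f"{lang}.traineddata").is_file():
        problems.append(f"{lang}.traineddata not found in {tessdata_dir!r}")
    return problems


if __name__ == "__main__":
    # Path copied from the thread, used here only as an example input.
    for problem in diagnose_tessdata("usr/local/bin/tesseract-lang/4.1.0/share/tessdata/"):
        print(problem)
```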

@Larbo53
Author

Larbo53 commented Jun 15, 2024

I'll uninstall Tesseract, then reinstall it.
Thank you.

@Larbo53
Author

Larbo53 commented Jun 16, 2024

Hi,

After reinstalling everything, Tesseract is now working.
Thanks a lot for your help.
Best regards.

Larbo53 closed this as completed Jun 16, 2024