Using image_to_osd with PIL Image directly does not return results due to temporary file name clashing #408
Comments
Hi @klavdijS
Hi @int3l.
PIL (Pillow) itself performs mostly lossy conversions, which is why there is no way to avoid such errors entirely.
I have the same problem and did some digging into the tmp files: I set up a watch on the tmp folder and copied the tmp files to a tess_test folder as they arrived:
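A minimal sketch of such a watch, assuming plain polling with the standard library (not necessarily the approach used in this comment). The names `copy_new_tess_files` and `watch` are hypothetical, and the `tess_` prefix is taken from the file names in the error message:

```python
import shutil
import tempfile
import time
from pathlib import Path


def copy_new_tess_files(src_dir, dst_dir, seen):
    """Copy files named tess_* from src_dir to dst_dir, each at most once.

    `seen` is a set of file names already copied; it is updated in place.
    Returns the sorted list of names copied during this call.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in Path(src_dir).glob("tess_*"):
        if not path.is_file() or path.name in seen:
            continue
        try:
            shutil.copy2(path, dst / path.name)
        except OSError:
            # The file was removed before it could be copied; skip it.
            continue
        seen.add(path.name)
        copied.append(path.name)
    return sorted(copied)


def watch(src_dir=None, dst_dir="tess_test", interval=0.05, rounds=100):
    """Poll src_dir (default: the system temp dir) a fixed number of times."""
    src_dir = src_dir or tempfile.gettempdir()
    seen = set()
    for _ in range(rounds):
        copy_new_tess_files(src_dir, dst_dir, seen)
        time.sleep(interval)
```

Calling `watch()` in a second terminal while the failing code runs collects whatever temp files live long enough to be seen. Polling can miss very short-lived files, so a filesystem-event library may be more reliable.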
In my Flask app I get the following error:
But the created PIL image does work when passed directly to tesseract:
In pytesseract, using the generated tmp image directly, I get the same error:
klavdijS's solution worked for me! Thank you. I am using pytesseract==0.3.9, Pillow==9.0.1, Python 3.10.2 and tesseract 5.0.1 on Linux server 5.13.19-2-MANJARO.
Oh, I was looking into the history of pytesseract and it seems to be a regression from 2019. But this also seems like a deeper problem. The other one is hinted at by the actual message:
I tried your suggestion but no difference:
OK, thank you for reporting this issue. You can test the master revision if you want.
Hey, firstly thanks for the great wrapper around tesseract; it makes usage much more convenient.
Operating system: macOS Monterey 12.1
Tesseract version: 5.0.1
Upon trying to execute image_to_osd with an opened PIL Image, I always get the same result:
pytesseract.pytesseract.TesseractError: (1, 'UZN file /var/folders/z7/6mpq4jhn3g96kcp8wd5_fzrm0000gn/T/tess__pdgxjh0 loaded. Estimating resolution as 146 UZN file /var/folders/z7/6mpq4jhn3g96kcp8wd5_fzrm0000gn/T/tess__pdgxjh0 loaded. Warning. Invalid resolution 0 dpi. Using 70 instead. Too few characters. Skipping this page Error during processing.
OSD is never returned.
A short script for reproduction:
out1.png is a random document with text on it.
Upon further investigation it looks like tesseract tries to open and execute the temporary file which is meant to be used for saving the process output (as seen from the pasted error message).
Temporary saved files in Finder:
The problem is the save context manager, which creates the temporary files (starting at line 188):
Currently, I fixed it by changing the input_file_name variable to the following:
input_file_name = f"{f.name}_input" + extsep + extension
This is also my proposed solution; I believe it does not break anything.
I can create a pull request if needed.
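The naming clash described above can be illustrated without tesseract. This is only a sketch, and it rests on an assumption suggested by the error message: that tesseract strips the input image's extension and then picks up whatever file exists at that path. The helpers `candidate_names` and `stem` are hypothetical and are not part of pytesseract:

```python
import os
from os import extsep
from tempfile import NamedTemporaryFile


def candidate_names(extension="PNG"):
    """Return (output_base, original_input, proposed_input) for one temp file."""
    # An empty temp file whose name is used as the output base.
    with NamedTemporaryFile(prefix="tess_", delete=False) as f:
        output_base = f.name
    # Original naming: the output base plus the image extension.
    original_input = output_base + extsep + extension
    # Proposed naming: add an "_input" suffix before the extension.
    proposed_input = f"{output_base}_input" + extsep + extension
    return output_base, original_input, proposed_input


def stem(path):
    """Return the path with its final extension removed."""
    return os.path.splitext(path)[0]


output_base, original_input, proposed_input = candidate_names()
try:
    # A clash means: stripping the extension yields the existing temp file.
    clash_before = stem(original_input) == output_base
    clash_after = stem(proposed_input) == output_base
    print("original naming clashes:", clash_before)
    print("proposed naming clashes:", clash_after)
finally:
    os.remove(output_base)
```

Under that assumption, the original naming reports a clash and the proposed `_input` naming does not, because the stripped path no longer points at the existing temporary file.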