OCR Engine Support: Pororo (`kakaobrain/pororo`, potentially used together with EasyOCR) #2

junhoyeo · 2023-10-29T06:09:11Z

black7375 · 2023-10-29T07:19:01Z

To summarize the tweet again (X currently does not show the list of replies unless you are logged in):

EasyOCR is excellent at text detection.
Pororo is superior to EasyOCR at text recognition.
- However, Pororo only works with English and Korean.
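The split above suggests a two-stage pipeline: one engine finds the text boxes, the other reads each crop. Below is a minimal sketch of that idea. `detect` and `recognize` are hypothetical callables standing in for an EasyOCR-based detector and a Pororo-based recognizer; neither library's real API is shown here, and the `(x_min, y_min, x_max, y_max)` box format is an assumption.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


def two_stage_ocr(
    image: Sequence,
    detect: Callable[[Sequence], List[Box]],
    recognize: Callable[[Sequence], str],
) -> List[Tuple[Box, str]]:
    """Run detection with one engine and recognition with another.

    `detect` and `recognize` are placeholders (hypothetical names) for
    wrappers around the detection and recognition engines.
    """
    results = []
    for x0, y0, x1, y1 in detect(image):
        # Crop the detected region; works for nested lists and NumPy arrays.
        crop = [row[x0:x1] for row in image[y0:y1]]
        results.append(((x0, y0, x1, y1), recognize(crop)))
    return results
```

Keeping the two stages as plain callables makes it easy to swap either engine, for example falling back to a different recognizer for languages Pororo does not support.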
Pre-processing is another way to increase the recognition rate, but how it is applied must vary with the characteristics of the OCR engine.
- For example, Tesseract v3.05 and earlier handle dark backgrounds well, whereas v4.0 and later work better with bright backgrounds.
- Techniques such as normalization, binarization, and skeletonization may work well on document images, but they are not suitable for photographic images. (The shading of small letters is likely to clump together and become indistinguishable.)
- One of the few pre-processing steps that works well with most OCR engines is grayscale conversion.
- If the ROI (region of interest) is too small, it is better to scale it up.
- I am convinced that Tesseract's OSD (orientation and script detection), perspective-transform estimation, or dewarping will improve performance.
  However, detecting when they are appropriate and applying them only in those cases is expected to be difficult.
  The easy way is to apply them to each ROI.
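
As a rough illustration of the two engine-agnostic steps above (grayscale conversion and scaling up small ROIs), here is a minimal NumPy-only sketch applied per ROI. The function names, the RGB channel order, the box format, and the `min_height=32` threshold are all assumptions made for the example, not values from the tweet or from any OCR engine. Nearest-neighbour scaling is used only to keep the sketch dependency-free; a real pipeline would more likely use an interpolating resize from an image library.

```python
import numpy as np


def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 RGB uint8 image to H x W grayscale (BT.601 luma weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb[..., :3].astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


def upscale_if_small(gray: np.ndarray, min_height: int = 32) -> np.ndarray:
    """Upscale by an integer factor (nearest neighbour) so height >= min_height."""
    height = gray.shape[0]
    if height == 0:
        raise ValueError("empty ROI")
    if height >= min_height:
        return gray
    factor = -(-min_height // height)  # ceiling division
    return np.repeat(np.repeat(gray, factor, axis=0), factor, axis=1)


def preprocess_roi(image: np.ndarray, box, min_height: int = 32) -> np.ndarray:
    """Crop one ROI given as (x_min, y_min, x_max, y_max), then apply both steps."""
    x0, y0, x1, y1 = box
    roi = image[y0:y1, x0:x1]
    return upscale_if_small(to_grayscale(roi), min_height)
```

Calling `preprocess_roi` on each detected box before recognition matches the "apply to each ROI" approach; ROIs that are already tall enough are only converted to grayscale.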

junhoyeo mentioned this issue Oct 29, 2023

🔍 List of OCR Engines #6

Open

This was referenced Nov 1, 2023

EasyPororoOCR Integration #8

Merged

Fix typo: cv. -> cv2. black7375/korean_ocr_using_pororo#1

Merged

junhoyeo closed this as completed in #8 Nov 2, 2023