Don't break overall processing on a bad image #10216

UserUnknownFactor · 2023-06-20T08:53:34Z

When processing multiple images, PaddleOCR currently aborts the whole run if even a single image is malformed. This PR prevents that by skipping the bad image and continuing with the rest.
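For illustration, the skip-and-continue behavior can be sketched as below. This is a minimal sketch, not the actual diff in `tools/infer/predict_system.py`: the function `process_images` and its `load` and `run_ocr` parameters are hypothetical names introduced only for this example. It assumes the loader may either raise or return `None` on a bad file (OpenCV's `cv2.imread`, for instance, returns `None` rather than raising).

```python
import logging
from typing import Any, Callable, Iterable, List, Tuple

logger = logging.getLogger(__name__)


def process_images(
    paths: Iterable[str],
    load: Callable[[str], Any],      # hypothetical loader, e.g. a wrapper around cv2.imread
    run_ocr: Callable[[Any], Any],   # hypothetical OCR step applied to a loaded image
) -> List[Tuple[str, Any]]:
    """Run OCR over every path, skipping images that cannot be loaded."""
    results = []
    for path in paths:
        try:
            img = load(path)
        except Exception as exc:
            # A malformed file must not abort the whole batch.
            logger.warning("error loading image %s: %s", path, exc)
            continue
        if img is None:
            # Some loaders signal failure with None instead of raising.
            logger.warning("could not decode image %s, skipping", path)
            continue
        results.append((path, run_ocr(img)))
    return results
```

The caller still gets results for every readable image, and the warnings in the log identify which files were skipped.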

paddle-bot · 2023-06-20T08:53:38Z

Thanks for your contribution!

CLAassistant · 2023-06-20T08:53:40Z

All committers have signed the CLA.

shiyutang

There is unnecessary code that needs to be removed.

tools/infer/predict_system.py

shiyutang

LGTM

shiyutang · 2023-07-19T07:50:00Z

We are currently running a contribution activity for PaddleSeg and PaddleOCR; you are welcome to join: #10223

paddle-bot bot added contributor status: proposed labels Jun 20, 2023

shiyutang requested changes Jul 17, 2023

tools/infer/predict_system.py (resolved)

shiyutang self-assigned this Jul 17, 2023

UserUnknownFactor force-pushed the fix_imageload branch 2 times, most recently from db36b1a to 3860d61 Compare July 17, 2023 08:55

UserUnknownFactor requested a review from shiyutang July 17, 2023 08:58

UserUnknownFactor force-pushed the fix_imageload branch 3 times, most recently from 1674d06 to ef348b3 Compare July 17, 2023 09:31

shiyutang requested changes Jul 18, 2023

tools/infer/predict_system.py (resolved)

tools/infer/predict_system.py (outdated, resolved)

Don't break overall processing on a bad image

72d3851

UserUnknownFactor force-pushed the fix_imageload branch from ef348b3 to 72d3851 Compare July 18, 2023 13:03

shiyutang approved these changes Jul 19, 2023

shiyutang merged commit 1dad0a9 into PaddlePaddle:release/2.6 Jul 19, 2023
1 check passed

shiyutang added the Contributor PR is merged label Jul 19, 2023

paddle-bot bot removed the status: proposed label Jul 19, 2023

UserUnknownFactor mentioned this pull request Aug 16, 2023

Add preprocessing common to OCR tasks #10217

Merged

shiyutang pushed a commit that referenced this pull request Aug 16, 2023

Cherrypicking GH-10217 and GH-10216 to PaddlePaddle:Release/2.7 (#10655)

b17c2f3

* Don't break overall processing on a bad image
* Add preprocessing common to OCR tasks (add preprocessing to options)

shiyutang pushed a commit that referenced this pull request Aug 21, 2023

Cherrypicking GH-10217 and GH-10216 to PaddlePaddle:dygraph (#10654)

b3912fc

* Don't break overall processing on a bad image
* Add preprocessing common to OCR tasks (add preprocessing to options)

shiyutang pushed a commit that referenced this pull request Oct 16, 2023

Don't break overall processing on a bad image (#10216)

535d3b4

UserUnknownFactor deleted the fix_imageload branch May 8, 2024 09:31
