A question about PaddleOCR.page_num #10965

Open
warmpine opened this issue Sep 22, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@warmpine

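This is a question about PaddleOCR.page_num. From reading the source code, my understanding is that page_num is a page-count option set when PaddleOCR is initialized. For example, if page_num is set to 2, only the first two images are recognized.

I noticed that the source code never resets page_num, which causes a problem when I reuse a PaddleOCR object. For example, if my first call to .ocr() passes a 2-page PDF or 2 images, page_num is assigned 2. If the second call then passes a 3-page PDF or 3 images, that page_num takes effect and ocr() recognizes only the first 2 images. Is my understanding correct?

Relevant code: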

    if isinstance(img, list):
        if self.page_num > len(img) or self.page_num == 0:
            self.page_num = len(img)
        imgs = img[:self.page_num]
    else:
        imgs = [img]
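
The behavior described above can be reproduced without PaddleOCR installed. The following is a minimal sketch, not PaddleOCR itself: `PageLimitedReader` and `select` are hypothetical names, and the class copies only the page_num branch quoted in this issue, with plain strings standing in for images.

```python
class PageLimitedReader:
    """Hypothetical stand-in that copies only the page_num logic quoted above."""

    def __init__(self, page_num=0):
        # 0 means "no limit", matching the quoted condition.
        self.page_num = page_num

    def select(self, img):
        if isinstance(img, list):
            if self.page_num > len(img) or self.page_num == 0:
                # Overwrites the instance attribute, so the value
                # persists into the next call on the same object.
                self.page_num = len(img)
            imgs = img[:self.page_num]
        else:
            imgs = [img]
        return imgs


reader = PageLimitedReader()                 # no explicit page limit
first = reader.select(["p1", "p2"])          # page_num becomes 2 here
second = reader.select(["p1", "p2", "p3"])   # still capped at 2
print(len(first), len(second), reader.page_num)
```

In this sketch the second call returns only two of the three items, because the attribute set during the first call is still in effect.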
@BrownTen

Yes, that's right! I ran into the same problem; modifying the source code fixes it.
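
One possible shape for such a modification is to compute the effective limit in a local variable instead of writing it back to the object. This is only a sketch under assumptions, not the actual PaddleOCR patch: `select_pages` is a hypothetical free function, and it covers only the page-selection step from the snippet in the question.

```python
def select_pages(img, page_num=0):
    """Return the pages to process without mutating any shared state.

    Hypothetical helper: page_num == 0 is treated as "no limit",
    as in the condition quoted in the question.
    """
    if not isinstance(img, list):
        return [img]
    # Keep the effective limit local so it cannot leak into later calls.
    limit = page_num
    if limit > len(img) or limit == 0:
        limit = len(img)
    return img[:limit]


two_pages = select_pages(["p1", "p2"])
three_pages = select_pages(["p1", "p2", "p3"])
capped = select_pages(["p1", "p2", "p3"], page_num=2)
print(len(two_pages), len(three_pages), len(capped))
```

Because the configured value is never overwritten, a 3-page input after a 2-page input still yields all three pages, while an explicit page_num of 2 continues to cap the result.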

@SWHL SWHL added the bug Something isn't working label Jun 10, 2024