if (bbox_intersection_area(ba, bb) / bbox_area(ba)) > 0.8: ZeroDivisionError: float division by zero #495

Open
arjungandeeva opened this issue Apr 4, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@arjungandeeva

I'm encountering ZeroDivisionError: float division by zero in camelot-py, raised where bbox_intersection_area(ba, bb) is divided by bbox_area(ba). The error occurs only under certain conditions, most likely when the bounding box ba has zero area, so that bbox_area(ba) returns 0.
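
For illustration, here is a minimal sketch of how a zero-area box can trigger this error. The two helper functions are simplified stand-ins written for this example, not camelot's actual implementations, and the boxes are plain (x0, y0, x1, y1) tuples chosen as an assumption.

```python
# Hypothetical stand-ins for camelot's helpers (not the library's real code).
# Boxes are (x0, y0, x1, y1) tuples, an assumption made for this sketch.
def bbox_area(bb):
    x0, y0, x1, y1 = bb
    return (x1 - x0) * (y1 - y0)


def bbox_intersection_area(ba, bb):
    # Overlap width and height, clamped at zero when the boxes do not overlap.
    w = min(ba[2], bb[2]) - max(ba[0], bb[0])
    h = min(ba[3], bb[3]) - max(ba[1], bb[1])
    return max(w, 0.0) * max(h, 0.0)


ba = (10.0, 20.0, 10.0, 35.0)  # x0 == x1: zero width, so the area is 0.0
bb = (5.0, 15.0, 50.0, 40.0)

try:
    ratio = bbox_intersection_area(ba, bb) / bbox_area(ba)
    error_message = None
except ZeroDivisionError as exc:
    # Dividing by the zero area of ba is what raises here.
    error_message = str(exc)

print(error_message)
```

A box like this could plausibly come from a text element with no width or height, though the exact input that produces it is not identified in this report.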

arjungandeeva added the bug (Something isn't working) label Apr 4, 2024

cktse commented Apr 10, 2024

I did a quick fix/hack to circumvent the error by skipping over the area check if ba is singular (area is zero):

~/.pyenv/versions/3.11.3/lib/python3.11/site-packages/camelot/utils.py: Line 375:

            if bbox_area(ba) > 0 and bbox_intersect(ba, bb):
                # if the intersection is larger than 80% of ba's size, we keep the longest
                if (bbox_intersection_area(ba, bb) / bbox_area(ba)) > 0.8:
                    if bbox_longer(bb, ba):
                        rest.discard(ba)
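
The same guard can also be expressed as a small helper that returns the overlap ratio and treats a zero-area box as having no overlap. This is only a sketch: overlap_ratio is a hypothetical name, the two area functions are simplified stand-ins for camelot's, and boxes are plain (x0, y0, x1, y1) tuples rather than whatever objects camelot actually passes.

```python
# Hypothetical stand-ins for camelot's helpers (not the library's real code).
def bbox_area(bb):
    x0, y0, x1, y1 = bb
    return (x1 - x0) * (y1 - y0)


def bbox_intersection_area(ba, bb):
    w = min(ba[2], bb[2]) - max(ba[0], bb[0])
    h = min(ba[3], bb[3]) - max(ba[1], bb[1])
    return max(w, 0.0) * max(h, 0.0)


def overlap_ratio(ba, bb):
    """Fraction of ba's area covered by bb, or 0.0 when ba has no area."""
    area = bbox_area(ba)
    if area <= 0:
        # Degenerate box: skip the division instead of raising.
        return 0.0
    return bbox_intersection_area(ba, bb) / area


big = (5.0, 15.0, 50.0, 40.0)
degenerate = (10.0, 20.0, 10.0, 35.0)  # zero width
inside = (10.0, 20.0, 20.0, 30.0)      # lies fully inside `big`

degenerate_ratio = overlap_ratio(degenerate, big)
inside_ratio = overlap_ratio(inside, big)
```

Returning 0.0 means a zero-area box never passes the 0.8 threshold, so it is never discarded by this check, which matches the effect of the quick fix above.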


bosd commented Aug 6, 2024

Hey!

As discussed in #343, we are trying to build a maintained fork at pypdf_table_extraction.

Do you want to check that code and open an issue / PR there to include this fix?


cktse commented Aug 12, 2024

I just took another look at the branches. It looks like this has already been fixed as part of "Release camelot-fork 0.20.1", which is already included in your fork.


bosd commented Aug 12, 2024

Thanks for checking 👍


cktse commented Aug 12, 2024

Great to see camelot lives on!

BTW is this fork going to be packaged on pip under a separate name? Think the current package is stale from the main branch.


bosd commented Aug 12, 2024

> BTW is this fork going to be packaged on pip under a separate name? Think the current package is stale from the main branch.

Yes, it is published here https://pypi.org/project/pypdf-table-extraction/

We're currently working on a new release by merging the open PRs from this repo and rebranding the package.
