Division by 0 #10

Closed
demcbs opened this issue Jun 19, 2020 · 4 comments
Labels
bug Something isn't working

Comments

@demcbs

demcbs commented Jun 19, 2020

I am eager to use the SS3 classifier for a text classification task in my master's thesis.
Unfortunately, when I run it I get a division-by-zero error (see the image). My text seems fairly clean to me (although not yet cleaned exactly the right way), so I am not sure what is causing this.

Is there anything you suspect might be going wrong that I could try? Or is there anywhere the data requirements are listed (I've looked, but I may have overlooked it)?

I have included the data structure (a pandas Series), a sample of what my data looks like, and the error.

Many thanks!
[Three screenshots omitted: the data structure, a sample of the data, and the error message]

@sergioburdisso sergioburdisso added the bug Something isn't working label Jun 19, 2020
@IveJ

IveJ commented Jun 19, 2020 via email

@demcbs
Author

demcbs commented Jun 19, 2020

I created no n-grams; I put in the unstructured text, so I suppose it uses the default n-gram size.

sergioburdisso added a commit that referenced this issue Jun 19, 2020
- Fix ZeroDivisionError bug (#10)
@sergioburdisso
Owner

sergioburdisso commented Jun 19, 2020

First of all, thanks for creating this issue and reporting this bug, @demcbs. Also, thanks, @IveJ, for your comments 👍

I've found what the problem was. First, I wrote a small script to replicate the error; the smallest one I could come up with was:

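    from pyss3 import SS3

    # Two tiny documents; note that the first label is the integer 0
    test_x = ["this is the first document", "this is the second document"]
    test_y = [0, 1]
    clf = SS3()
    clf.train(test_x, test_y)  # raised ZeroDivisionError before version 0.6.2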

I found out that the problem was caused by the integer label 0: it made a condition evaluate to True when it shouldn't have (more details are given in the commit message of 236a942). I've already released a new version (0.6.2) with a patch fixing this issue, so updating the package (pip install -U pyss3) should solve the problem 😊
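
For readers unfamiliar with this class of bug, here is a minimal sketch of how an integer label 0 can end up in a division by zero. It is only an illustration, not the actual pyss3 code (see commit 236a942 for the real fix): the helpers `count_labels` and `mean_doc_length` and the `buggy` flag are hypothetical names made up for this example. The assumption is that a "missing label" check was written as a truthiness test, which also matches 0.

```python
def count_labels(labels, buggy=False):
    """Count documents per label, skipping entries whose label is missing.

    Hypothetical helper: with buggy=True the 'missing' check is a truthiness
    test, so the integer label 0 is wrongly treated as missing.
    """
    counts = {}
    for label in labels:
        missing = (not label) if buggy else (label is None)
        if missing:
            continue
        counts[label] = counts.get(label, 0) + 1
    return counts


def mean_doc_length(docs, labels, target, buggy=False):
    """Average number of words per document labeled `target`."""
    n_words = sum(len(doc.split()) for doc, label in zip(docs, labels)
                  if label == target)
    n_docs = count_labels(labels, buggy).get(target, 0)
    return n_words / n_docs  # divides by zero if label 0 was skipped


docs = ["this is the first document", "this is the second document"]
labels = [0, 1]

print(mean_doc_length(docs, labels, target=0))  # explicit None check: 5.0

try:
    mean_doc_length(docs, labels, target=0, buggy=True)
except ZeroDivisionError as err:
    print("truthiness check failed with:", err)
```

The general lesson, under the assumptions above, is to compare against `None` explicitly (`label is None`) whenever 0, an empty string, or another falsy value is a legitimate input.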

@demcbs
Author

demcbs commented Jun 20, 2020

Thanks for the quick fix, Sergio!
