Ratio and Words don't work properly when provided as a variable #81

Open
MarlNox opened this issue Oct 21, 2020 · 2 comments

Comments

@MarlNox

MarlNox commented Oct 21, 2020

I'm building a Flask app where the user sets either the ratio or the number of words for the summarizer's output.
The input text is stripped of text artifacts, then recomposed and fed to the summarizer.
Even though the values are passed to the app from the front page, the ratio and word count do not seem to have any effect.
The output text is always very short, and its length does not change when I change the ratio.
I've tried using gensim as well, with the same results.
Any suggestions?

@fbarrios
Contributor

Hi! Can you share the example text and how you are calling the library? If you cannot share the text here you can also email it personally.

@MarlNox
Author

MarlNox commented Oct 24, 2020

> Hi! Can you share the example text and how you are calling the library? If you cannot share the text here you can also email it personally.

Hi fbarrios,
To be more specific, the issue happens with any type of text I've input: it just returns one or two sentences as a summary. I've also experimented with different methods to clean the text, with the same results. For more info, check the code below:

@app.route("/summarize", methods=["GET", "POST"])
def summarize():
    text1 = request.form['text']
    percent = request.form['percentage']
    numri = request.form['numberOfWords']
    if numri == 0:
        nr1 = int(numri)
        texty = str(text1)
        textu = re.sub(r'\n\s*\n', '\n', texty, flags=re.MULTILINE)
        b_list = textu.split()
        text = " ".join(b_list)
        sent = nltk.sent_tokenize(text)
        if len(sent) < 2:
            summary1 = "please pass more than 3 sentences to summarize the text"
        else:
            summary = summy(text, words=nr1)
            summ = nltk.sent_tokenize(summary)
            summary1 = (" ".join(summ[:2]))
            result = {
                "result": summary1
            }
            result = {str(key): value for key, value in result.items()}
            return jsonify(result=result)
    else:
        nr = float(percent)
        texty = str(text1)
        textu = re.sub(r'\n\s*\n', '\n', texty, flags=re.MULTILINE)
        b_list = textu.split()
        text = " ".join(b_list)
        sent = nltk.sent_tokenize(text)
        if len(sent) < 2:
            summary1 = "please pass more than 3 sentences to summarize the text"
        else:
            print(nr)
            summary = summy(text, ratio=nr)
            summ = nltk.sent_tokenize(summary)
            summary1 = (" ".join(summ[:2]))
            result = {
                "result": summary1
            }
            print(result)
            result = {str(key): value for key, value in result.items()}
            return jsonify(result=result)

Do you think it's related to text formatting, an internal bug that may be caused by invoking the app within Flask, or something else?
I can confirm that the values are sent to the backend accurately, so I'm not sure. If you think it's a formatting issue, any suggestions on effective ways to clean up the text?

Thanks,
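
Two details in the handler above may be worth checking before looking at the summarizer itself. Values read from `request.form` arrive as strings, so `numri == 0` compares a string with an integer and is never true, which means the `words` branch is never taken. In addition, `summ[:2]` keeps only the first two sentences of whatever the summarizer returns, so the response cannot grow beyond two sentences regardless of `ratio`. The snippet below is a minimal sketch of both effects in plain Python, with no Flask or summarizer involved; the `form` dict and the sentence list are hypothetical stand-ins, not data from this issue.

```python
# Hypothetical stand-in for request.form: form fields arrive as strings.
form = {"numberOfWords": "0", "percentage": "0.8"}

numri = form["numberOfWords"]
string_check = numri == 0      # str compared with int: never equal
int_check = int(numri) == 0    # compare only after converting

# Hypothetical stand-in for a summary that already has five sentences.
summary_sentences = ["One.", "Two.", "Three.", "Four.", "Five."]
truncated = " ".join(summary_sentences[:2])  # same slice as summ[:2] above

print(string_check, int_check)
print(truncated)
```

If this is what is happening, converting the form values before comparing them and returning the full summary instead of the `[:2]` slice should make the effect of `ratio` and `words` visible in the output.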
