[TruthfulQA] update rouge-score version or add a way to suppress tokenizer logging #1692

Closed
skramer-dev opened this issue Apr 9, 2024 · 4 comments · Fixed by #2090

Comments

@skramer-dev

When running the TruthfulQA evaluation, I get a lot of unnecessary info/debug output from the ROUGE scorer (usually hundreds, if not thousands, of log lines), each looking like this:

2024-04-09:15:38:18,284 INFO     [rouge_scorer.py:83] Using default tokenizer.

The eval harness depends on the package rouge-score > 0.0.4.
In the latest version on PyPI, initializing the scorer emits an info log each time it is constructed with the default tokenizer:

if tokenizer:
  self._tokenizer = tokenizer
else:
  self._tokenizer = tokenizers.DefaultTokenizer(use_stemmer)
  logging.info("Using default tokenizer.")

TruthfulQA re-initializes the scorer inside a loop each time it scores something. This produces a lot of messages that are supposed to be debug output but are still emitted at info level because of the outdated rouge-score release. The loop, taken from utils.py:

rouge_scores = [rouge([ref], [completion]) for ref in all_refs]

That logging behaviour was changed upstream 8 months ago in commit 6e232b36782739b234922e2a65e92d7c2651758b (see the history of https://github.com/google-research/google-research/blob/master/rouge/rouge_scorer.py).
However, the latest rouge-score package on PyPI is from July 2022 and doesn't include this change yet.
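To illustrate the re-initialization pattern described above, here is a minimal sketch. `CountingScorer` is a hypothetical stand-in and not the real rouge-score API: it only counts how often it is constructed and computes a toy token overlap, so the difference between building a scorer per call and sharing one instance is visible without installing anything.

```python
class CountingScorer:
    """Hypothetical stand-in for a scorer whose constructor has a side effect
    (in the real case, an info log line). This is NOT the rouge-score API."""

    instances = 0  # how many times the constructor has run

    def __init__(self):
        CountingScorer.instances += 1

    def score(self, ref, completion):
        # Toy metric: fraction of reference tokens found in the completion.
        ref_tokens = set(ref.split())
        comp_tokens = set(completion.split())
        if not ref_tokens:
            return 0.0
        return len(ref_tokens & comp_tokens) / len(ref_tokens)


def score_reinit(all_refs, completion):
    # Mirrors the pattern in the issue: a new scorer for every reference.
    return [CountingScorer().score(ref, completion) for ref in all_refs]


def score_shared(all_refs, completion, scorer=None):
    # One scorer is built (or passed in) and reused for every reference.
    scorer = scorer or CountingScorer()
    return [scorer.score(ref, completion) for ref in all_refs]
```

Both functions return the same scores; only the number of constructor calls (and therefore the number of log lines, in the real case) differs.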

Is there a way to get a more recent version of rouge-score into the dependencies, or some other way to suppress those messages or reduce how many are emitted (for example, moving the initialization of the ROUGE scorer so it isn't called on each iteration of the loop)?
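As a possible stopgap on the caller's side, the messages could be filtered with the standard `logging` module. The sketch below assumes the scorer's messages reach Python's standard logging through a logger named `"absl"`; that logger name is an assumption and should be checked against the installed rouge-score version. `DropMessageFilter` and `silence_logger` are hypothetical helper names, not part of any library.

```python
import logging


class DropMessageFilter(logging.Filter):
    """Drops any record whose formatted message contains `needle`."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle

    def filter(self, record):
        # Returning False tells the logger to discard the record.
        return self.needle not in record.getMessage()


def silence_logger(name, level=logging.WARNING):
    """Raise the threshold of the named logger so lower-level records are dropped."""
    logging.getLogger(name).setLevel(level)


# Option 1: hide everything below WARNING from the assumed "absl" logger.
silence_logger("absl")

# Option 2: keep other info messages and drop only the repeated line.
logging.getLogger("absl").addFilter(DropMessageFilter("Using default tokenizer."))
```

Option 1 is the blunter tool; option 2 leaves other info-level messages from the same logger untouched. Neither replaces an updated rouge-score release or a change to how often the scorer is constructed.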

@LSinev
Contributor

LSinev commented Apr 10, 2024

@skramer-dev
Author

[image]
I ran with verbosity set to warning, but the messages still appear:
[image]

@joecummings

I am also finding this logging output cumbersome!

@haileyschoelkopf
Contributor

Adding a fix for this in #2090
