
#85 Hotfix/RobertaTokenizerFast object has no attribute max_len #86

Conversation

kirzharov
Contributor

Link to the issue: #85
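
For context, the error in #85 stems from transformers 4.0.0 removing the tokenizer's max_len attribute in favor of model_max_length. Below is a minimal sketch of a version-agnostic lookup; the helper is hypothetical and for illustration only, not the actual patch in this PR:

```python
from transformers import AutoTokenizer

def get_max_length(tokenizer):
    # transformers >= 3.x exposes model_max_length; 2.x releases used max_len.
    # Fall back between the two so the same code runs on old and new versions.
    # (Hypothetical helper for illustration, not the PR's actual change.)
    return getattr(tokenizer, "model_max_length", None) or getattr(tokenizer, "max_len", None)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
print(get_max_length(tokenizer))  # 512 for roberta-base
```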

@kirzharov force-pushed the hotfix/RobertaTokenizerFast-object-has-no-attribute-max_len branch from e89eaa3 to 37e68da on December 4, 2020 01:05
@kirzharov
Contributor Author

I checked the score function locally and got results for precision, recall, and F1 with the following transformers versions:

  • 4.0.0
  • 3.5.1
  • 2.8.0

Looks like everything works as before with transformers version 4.0.0.
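
A minimal check along these lines, with made-up toy candidate/reference strings:

```python
from bert_score import score

cands = ["The cat sat on the mat."]
refs = ["A cat was sitting on the mat."]

# Returns precision, recall, and F1 tensors, one entry per candidate.
P, R, F1 = score(cands, refs, lang="en", verbose=True)
print(P.mean().item(), R.mean().item(), F1.mean().item())
```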

@felixgwu
Collaborator

felixgwu commented Dec 5, 2020

Hi @kirzharov,
Thank you for this pull request. I've verified that the old versions work as before. However, it doesn't pass our tests with transformers 4.0.0. It's possible that some other arguments have to be changed as well.

@kirzharov
Contributor Author

Hi @felixgwu, thanks! Did you mean only the score function, or others as well? I'll check 👍

@felixgwu
Collaborator

felixgwu commented Dec 5, 2020

When I ran python -m unittest discover, almost all the tests failed. It might be that some default settings changed and the embeddings changed as a result, but I haven't looked into the details.

@kirzharov
Contributor Author

@felixgwu Ok, thanks! I'll check today 👍

@felixgwu
Collaborator

felixgwu commented Dec 5, 2020

Hi @kirzharov, we have fixed the bug and merged your PR. FYI, the mismatch comes from the fact that huggingface's fast tokenizers are inconsistent with the old tokenizers and create different tokens. We currently disable the fast tokenizers. Thanks again for your help!
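
For anyone hitting the same mismatch in their own code, a sketch of opting out of fast tokenizers when loading through transformers; this mirrors the idea, not necessarily the exact change merged here:

```python
from transformers import AutoTokenizer

# use_fast=False forces the original Python ("slow") tokenizer, which
# produces the same tokens that earlier transformers releases produced.
tokenizer = AutoTokenizer.from_pretrained("roberta-base", use_fast=False)
```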

@kirzharov
Contributor Author

@felixgwu, awesome, thanks! It's great that you managed to fix this quickly. 🔥 I also noticed a difference between the new fast tokenizers and use_fast=False 👍
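
A quick way to see such a discrepancy, assuming a roberta-base checkpoint; for many inputs the two tokenizers agree, so the exact sentence matters:

```python
from transformers import AutoTokenizer

slow = AutoTokenizer.from_pretrained("roberta-base", use_fast=False)
fast = AutoTokenizer.from_pretrained("roberta-base", use_fast=True)

text = "Tokenizers don't always agree."
print(slow.tokenize(text))
print(fast.tokenize(text))
# If the token lists differ, the contextual embeddings (and hence
# the BERTScore values) differ too.
```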

@kalyankumarp

Hi,
Do I have to wait for version 0.3.7 of bert_score to be able to use transformers 4.0.0? I tried bert_score 0.3.6 with transformers 4.0.0 and I am still getting the same error.

Thanks,
Kalyan.

@felixgwu
Collaborator

felixgwu commented Dec 7, 2020

Hi @kalyankumarp,
We have released version 0.3.7. Can you please share the error messages if it still doesn't work?
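
(For reference: if pip still resolves an older release, an explicit upgrade such as pip install --upgrade bert-score should pick up 0.3.7.)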

@kalyankumarp

Hi Felix,

It's working. Thanks for the fix.
