
#85 Hotfix/RobertaTokenizerFast object has no attribute max_len #86

Conversation

kirzharov
Contributor

Link to the issue: #85
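
For context, the error in #85 stems from transformers 4.0.0 removing the tokenizer's max_len attribute in favor of model_max_length. Below is a minimal sketch of a version-agnostic lookup; the helper is hypothetical and for illustration only, not the actual patch in this PR:

```python
from transformers import AutoTokenizer

def get_max_length(tokenizer):
    # transformers >= 3.x exposes model_max_length; 2.x releases used max_len.
    # Fall back between the two so the same code runs on old and new versions.
    # (Hypothetical helper for illustration, not the PR's actual change.)
    return getattr(tokenizer, "model_max_length", None) or getattr(tokenizer, "max_len", None)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
print(get_max_length(tokenizer))  # 512 for roberta-base
```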

@kirzharov force-pushed the hotfix/RobertaTokenizerFast-object-has-no-attribute-max_len branch from e89eaa3 to 37e68da on December 4, 2020 01:05
@kirzharov
Contributor Author

I checked the score function locally and got results for precision, recall, and F1 with the following transformers versions:

  • 4.0.0
  • 3.5.1
  • 2.8.0

Looks like everything works as before with transformers version 4.0.0.
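
A minimal check along these lines, with made-up toy candidate/reference strings:

```python
from bert_score import score

cands = ["The cat sat on the mat."]
refs = ["A cat was sitting on the mat."]

# Returns precision, recall, and F1 tensors, one entry per candidate.
P, R, F1 = score(cands, refs, lang="en", verbose=True)
print(P.mean().item(), R.mean().item(), F1.mean().item())
```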

@felixgwu
Collaborator

felixgwu commented Dec 5, 2020

Hi @kirzharov,
Thank you for this pull request. I've verified that the old versions work as before. However, it doesn't pass our tests with transformers 4.0.0. It's possible that some other arguments have to be changed as well.

@kirzharov
Contributor Author

Hi @felixgwu, thanks! Did you mean only the score function, or others as well? I'll check 👍

@felixgwu
Collaborator

felixgwu commented Dec 5, 2020

When I ran python -m unittest discover, almost all the tests failed. It might be that some default settings changed and the embeddings changed as a result, but I haven't looked into the details.

@kirzharov
Contributor Author

@felixgwu Ok, thanks! I'll check today 👍

@felixgwu
Collaborator

felixgwu commented Dec 5, 2020

Hi @kirzharov, we have fixed the bug and merged your PR. FYI, the mismatch comes from the fact that huggingface's fast tokenizers are inconsistent with the old tokenizers and create different tokens. We currently disable the fast tokenizers. Thanks again for your help!
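
For anyone hitting the same mismatch in their own code, a sketch of opting out of fast tokenizers when loading through transformers; this mirrors the idea, not necessarily the exact change merged here:

```python
from transformers import AutoTokenizer

# use_fast=False forces the original Python ("slow") tokenizer, which
# produces the same tokens that earlier transformers releases produced.
tokenizer = AutoTokenizer.from_pretrained("roberta-base", use_fast=False)
```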

@kirzharov
Contributor Author

@felixgwu, awesome, thanks! It's great that you managed to fix this quickly. 🔥 I also noticed a difference between the new fast tokenizers and use_fast=False 👍
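
A quick way to see such a discrepancy, assuming a roberta-base checkpoint; for many inputs the two tokenizers agree, so the exact sentence matters:

```python
from transformers import AutoTokenizer

slow = AutoTokenizer.from_pretrained("roberta-base", use_fast=False)
fast = AutoTokenizer.from_pretrained("roberta-base", use_fast=True)

text = "Tokenizers don't always agree."
print(slow.tokenize(text))
print(fast.tokenize(text))
# If the token lists differ, the contextual embeddings (and hence
# the BERTScore values) differ too.
```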

@kalyankumarp

Hi,
Do I have to wait for version 0.3.7 of bert_score to be able to use transformers 4.0.0? I tried bert_score 0.3.6 with transformers 4.0.0 and I am still getting the same error.

Thanks,
Kalyan.

@felixgwu
Collaborator

felixgwu commented Dec 7, 2020

Hi @kalyankumarp,
We have released version 0.3.7. Can you please share the error messages if it still doesn't work?
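
(For reference: if pip still resolves an older release, an explicit upgrade such as pip install --upgrade bert-score should pick up 0.3.7.)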

@kalyankumarp

Hi Felix,

It's working. Thanks for the fix.
