fix: Set model_max_length in the Tokenizer of `DefaultPromptHandler` #5596

bogdankostic · 2023-08-18T13:15:54Z

Related Issues

fixes "Token indices sequence length is longer than the specified maximum sequence length for this model" with Cohere Command #5589

Proposed Changes:

This PR sets the model_max_length parameter of the tokenizer of DefaultPromptHandler.

How did you test it?

I added a unit test and added a test case to an integration test.

Notes for the reviewer

Without this change, users whose model supports sequence lengths larger than 1024 get a warning from the transformers library saying that the sequence is too long for the model.
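
To illustrate the mechanism, here is a minimal, self-contained sketch. It does not use Haystack or transformers code: `ToyTokenizer`, `ToyPromptHandler`, and the `sync_tokenizer` flag are hypothetical stand-ins, and whitespace splitting replaces real tokenization. It assumes the tokenizer starts with a default limit of 1024 and that the fix amounts to copying the handler's `model_max_length` onto the tokenizer.

```python
import warnings


class ToyTokenizer:
    """Hypothetical stand-in for a Hugging Face tokenizer (not the real API)."""

    def __init__(self, model_max_length=1024):
        self.model_max_length = model_max_length

    def tokenize(self, text):
        # Whitespace splitting stands in for real subword tokenization.
        tokens = text.split()
        if len(tokens) > self.model_max_length:
            warnings.warn(
                "Token indices sequence length is longer than the specified "
                "maximum sequence length for this model "
                f"({len(tokens)} > {self.model_max_length})."
            )
        return tokens


class ToyPromptHandler:
    """Hypothetical stand-in for DefaultPromptHandler."""

    def __init__(self, model_max_length, sync_tokenizer=True):
        self.model_max_length = model_max_length
        self.tokenizer = ToyTokenizer()  # starts with the default limit of 1024
        if sync_tokenizer:
            # Assumed shape of the fix: propagate the handler's limit to the tokenizer.
            self.tokenizer.model_max_length = model_max_length

    def count_tokens(self, prompt):
        return len(self.tokenizer.tokenize(prompt))


def warnings_for(handler, prompt):
    """Return the warning messages emitted while counting tokens in the prompt."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        handler.count_tokens(prompt)
    return [str(w.message) for w in caught]


prompt = "word " * 2000  # 2000 toy tokens: above 1024, below 4096
before = warnings_for(ToyPromptHandler(4096, sync_tokenizer=False), prompt)
after = warnings_for(ToyPromptHandler(4096), prompt)
```

In this sketch, `before` holds one warning because the tokenizer still uses its own default limit, while `after` is empty because the tokenizer's limit matches the model's.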

Checklist

I have read the contributors guidelines and the code of conduct
I have updated the related issue with new insights and changes
I added unit tests and updated the docstrings
I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:.
I documented my code
I ran pre-commit hooks and fixed any issue

coveralls · 2023-08-18T13:32:15Z

Pull Request Test Coverage Report for Build 5903109928

0 of 0 changed or added relevant lines in 0 files are covered.
1 unchanged line in 1 file lost coverage.
Overall coverage decreased (-0.002%) to 48.257%

Files with Coverage Reduction	New Missed Lines	%
utils/context_matching.py	1	95.7%

Totals
Change from base Build 5901803206:	-0.002%
Covered Lines:	11460
Relevant Lines:	23748

💛 - Coveralls

bogdankostic added 2 commits August 18, 2023 15:09

Set model_max_length in tokenizer in prompt handler

9a03223

Add release note

4e904e9

bogdankostic requested review from a team as code owners August 18, 2023 13:15

bogdankostic requested review from dfokina and masci and removed request for a team August 18, 2023 13:15

github-actions bot added topic:tests topic:promptnode labels Aug 18, 2023

masci approved these changes Sep 1, 2023

bogdankostic merged commit 1144039 into main Sep 1, 2023
55 checks passed

bogdankostic deleted the model_max_length_prompthandler branch September 1, 2023 09:48
