
Fix token usage with jump forward #174

Merged
merged 3 commits into from
Feb 10, 2024
Conversation

comaniac
Copy link
Collaborator

@comaniac comaniac commented Feb 9, 2024

close #173

This PR fixes the incorrect token usage reported when jump forward is enabled. Specifically, we introduce a new field, orig_prompt_tokens, which is set when the first jump forward happens so that we can recover the original number of prompt tokens. When returning a response (a chunk in streaming, or a complete response), we use the following equations to correct the token usage:

completion_tokens = curr_prompt_tokens - orig_prompt_tokens + completion_tokens
prompt_tokens = orig_prompt_tokens
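The correction above can be sketched as a small helper. This is a hypothetical illustration of the equations, not the actual sglang code; the function and variable names are assumptions.

```python
def corrected_usage(curr_prompt_tokens: int,
                    orig_prompt_tokens: int,
                    completion_tokens: int) -> dict:
    """Fold jump-forward tokens back into the completion count.

    Jump forward appends constrained-decoding text to the prompt, so the
    reported prompt token count grows past the user's original prompt.
    Those extra tokens are really generated output, so we reassign them
    to completion_tokens and restore the original prompt_tokens.
    """
    # Tokens inserted by jump forward show up as extra "prompt" tokens.
    jump_forward_tokens = curr_prompt_tokens - orig_prompt_tokens
    return {
        "prompt_tokens": orig_prompt_tokens,
        "completion_tokens": jump_forward_tokens + completion_tokens,
        "total_tokens": orig_prompt_tokens
                        + jump_forward_tokens
                        + completion_tokens,
    }
```

For example, if the user sent 100 prompt tokens, jump forward grew the prompt to 120, and 30 tokens were decoded, the corrected usage reports 100 prompt tokens and 50 completion tokens.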

@comaniac comaniac requested review from merrymercy and hnyls2002 and removed request for merrymercy February 9, 2024 18:59
@hnyls2002
Copy link
Collaborator

@comaniac Hi, I suggest initializing the orig_prompt_tokens when constructing the Req so that we can simplify the code.
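The suggestion above could look something like the following sketch. The Req shape here is a simplified stand-in, not the real sglang class; only the idea of fixing orig_prompt_tokens at construction time is from the comment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Req:
    """Simplified request object (hypothetical stand-in for sglang's Req)."""
    prompt_token_ids: List[int]
    # Set once at construction, so there is no "first jump forward"
    # special case scattered through the serving code.
    orig_prompt_tokens: int = field(init=False)

    def __post_init__(self) -> None:
        self.orig_prompt_tokens = len(self.prompt_token_ids)
```

Initializing the field eagerly means later code can always rely on orig_prompt_tokens being present, instead of checking whether a jump forward has already occurred.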

@comaniac
Copy link
Collaborator Author

Thanks, that makes sense. Meanwhile, can you add some comments back to explain why we need to calculate the token usage in this way?

@comaniac comaniac merged commit 4d303c4 into main Feb 10, 2024
@comaniac comaniac deleted the cody/usage branch February 10, 2024 04:06
Successfully merging this pull request may close these issues.

Incorrect token usage with jump forward