
Added the ability to Modify the Context Length #210

Merged
2 commits merged into sgl-project:main on Feb 21, 2024

Conversation

@psych0v0yager (Contributor) commented Feb 20, 2024

Fixes issue #159

You can now specify how much context you want the model to have.

For example, launching Mixtral 8x7B AWQ without the flag:

python -m sglang.launch_server --model-path /path/to/bagel_mixtral_AWQ --port 30000 --tp 2

Rank 1: max_total_num_token=135505, max_prefill_num_token=32768, context_len=32768, model_mode=[]
Rank 0: max_total_num_token=135505, max_prefill_num_token=32768, context_len=32768, model_mode=[]

With the context length adjusted to 8192:

python -m sglang.launch_server --model-path /path/to/bagel_mixtral_AWQ --port 30000 --tp 2 --context-length 8192

Rank 0: max_total_num_token=135505, max_prefill_num_token=22584, context_len=8192, model_mode=[]
Rank 1: max_total_num_token=135505, max_prefill_num_token=22584, context_len=8192, model_mode=[]
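The behavior above can be sketched roughly as follows: an optional CLI argument that, when set, overrides the context length taken from the model's configuration. This is a minimal illustrative sketch, not sglang's actual implementation; the names `resolve_context_len` and `DEFAULT_CONTEXT_LEN` are hypothetical.

```python
import argparse

# Hypothetical default, e.g. what the model's config.json would report
# for Mixtral 8x7B (32768 in the logs above).
DEFAULT_CONTEXT_LEN = 32768

def resolve_context_len(cli_value, model_default=DEFAULT_CONTEXT_LEN):
    """Use the CLI override when given, otherwise fall back to the
    context length declared by the model config."""
    return cli_value if cli_value is not None else model_default

parser = argparse.ArgumentParser()
parser.add_argument("--context-length", type=int, default=None,
                    help="Override the model's default context length")

# Simulate launching with and without the flag.
args = parser.parse_args(["--context-length", "8192"])
print(resolve_context_len(args.context_length))  # 8192

args_default = parser.parse_args([])
print(resolve_context_len(args_default.context_length))  # 32768
```

Defaulting the argument to `None` (rather than to a number) is what lets the server distinguish "user did not pass the flag" from "user explicitly asked for the model's default length".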

@comaniac (Collaborator) left a comment:

LGTM

@comaniac comaniac linked an issue Feb 21, 2024 that may be closed by this pull request
@comaniac comaniac merged commit 9de9a46 into sgl-project:main Feb 21, 2024

Successfully merging this pull request may close these issues.

initialise model with max_model_len