llama : minor sampling refactor (2) #9386

Merged: 3 commits merged into master from sl/sampling-re-2 on Sep 9, 2024

@slaren (Collaborator) commented Sep 9, 2024

  • Avoid copy in llama_sample_dist
  • Remove lambdas from llama_sampler_chain
  • Reduce overhead in logit bias sampler when there are no biases
  • Include call to llama_sampler_accept in llama_sampler_sample
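
To illustrate the last item, here is a minimal sketch of a sample function that also performs the accept step, so the caller makes one call instead of two. The `toy_sampler` type and both functions are hypothetical stand-ins written for this example; they are not the real `llama_sampler` API and the greedy argmax is only a placeholder for actual sampling.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy sampler (assumption: not the real llama_sampler type).
struct toy_sampler {
    std::vector<int> accepted; // history of accepted token ids
};

// Records a token so that stateful samplers can take it into account later.
void toy_sampler_accept(toy_sampler & s, int token) {
    s.accepted.push_back(token);
}

// Picks the highest-logit token (greedy placeholder) and accepts it
// in the same call, so the caller does not need a separate accept step.
int toy_sampler_sample(toy_sampler & s, const std::vector<float> & logits) {
    assert(!logits.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < logits.size(); ++i) {
        if (logits[i] > logits[best]) {
            best = i;
        }
    }
    const int token = static_cast<int>(best);
    toy_sampler_accept(s, token);
    return token;
}
```

In this sketch, a caller that still calls `toy_sampler_accept` after `toy_sampler_sample` would record the token twice, which is why call sites likely need to drop the explicit accept once it is folded into sampling.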

@@ -613,7 +613,7 @@ struct server_context {
     gpt_params params;

-    llama_batch batch;
+    llama_batch batch = {};
@slaren (Collaborator, Author) commented on this change:

This also fixes a crash in the server when loading a model fails and llama_batch_free is called on an uninitialized batch.
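
A minimal sketch of why value-initializing the member helps, assuming the free function releases pointer fields unconditionally. `toy_batch`, `toy_batch_init`, `toy_batch_free`, and `toy_server` are hypothetical names for illustration and do not mirror the real `llama_batch` layout or the server code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for a batch that owns heap buffers
// (assumption: field names are illustrative, not llama_batch's).
struct toy_batch {
    int32_t   n_tokens;
    int32_t * token;
    float   * embd;
};

// Allocates the token buffer; only reached after a successful model load.
toy_batch toy_batch_init(int32_t n) {
    toy_batch b = {};
    b.token = static_cast<int32_t *>(std::malloc(sizeof(int32_t) * n));
    return b;
}

// Frees whatever the batch owns. std::free(nullptr) is a no-op, so this is
// safe on a value-initialized batch but undefined on indeterminate pointers.
void toy_batch_free(toy_batch & b) {
    std::free(b.token);
    std::free(b.embd);
    b = {};
}

struct toy_server {
    toy_batch batch = {}; // without "= {}", the pointers would be indeterminate

    bool load_model(bool ok) {
        if (!ok) {
            return false; // batch is never initialized on this path
        }
        batch = toy_batch_init(8);
        return true;
    }

    ~toy_server() { toy_batch_free(batch); } // runs even if loading failed
};
```

With the `= {}` initializer, the failed-load path frees only null pointers; without it, the destructor would pass indeterminate pointers to `std::free`.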

@github-actions bot added the android, examples, and server labels Sep 9, 2024
@github-actions bot added the testing label Sep 9, 2024
@slaren merged commit 5fb5e24 into master Sep 9, 2024
52 checks passed
@slaren deleted the sl/sampling-re-2 branch September 9, 2024 15:10