Tags: a-h/llama.cpp

b1448

samplers : Min-P sampler implementation [alternative to Top P/Top K] (ggerganov#3841)

* Introduce the new Min-P sampler by @kalomaze
   The Min-P sampling method was designed as an alternative to Top-P and aims to balance quality and variety. The parameter *p* is the minimum probability for a token to be considered, relative to the probability of the most likely token (see the sketch after this commit message).

* Min-P enabled and set to 0.05 default

---------

Co-authored-by: Georgi Gerganov <[email protected]>
Co-authored-by: cebtenzzre <[email protected]>
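
A minimal sketch of the Min-P rule in C, for illustration only; the `token_prob` type and `min_p_filter` name are assumptions, not code from this PR:

```c
// Illustrative Min-P filter: keep every token whose probability is at
// least p times that of the most likely token. Assumes `candidates`
// already holds normalized probabilities sorted in descending order.
#include <stddef.h>

typedef struct {
    int   id; // token id
    float p;  // normalized probability
} token_prob;

// Returns how many leading candidates survive the filter.
static size_t min_p_filter(const token_prob * candidates, size_t n,
                           float p, size_t min_keep) {
    const float threshold = p * candidates[0].p; // relative to the top token
    size_t keep = 0;
    while (keep < n && (candidates[keep].p >= threshold || keep < min_keep)) {
        keep++;
    }
    return keep; // renormalize the survivors before sampling
}
```

With the default p = 0.05, a token survives only if it is at least 1/20th as likely as the top candidate, which prunes the long low-probability tail while leaving plausible alternatives intact.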

b1446

ggml : move FP16 <-> FP32 code to ggml-impl.h (ggerganov#3861)

* ggml : move FP16 <-> FP32 stuff to ggml-impl.h

ggml-ci

* tests : fix ARM build

* ggml : explicitly initialize deprecated type traits

* ggml : add math.h to ggml-impl.h

* ggml : remove duplicate static assert macros

* ggml : prefix lookup tables with ggml_

ggml-ci

* ggml-impl : move extern "C" to start of file

b1445

Extend llama_kv_cache_seq_rm to allow matching any sequence (ggerganov#3843)

* Extend llama_kv_cache_seq_rm to allow matching any sequence

* Replace llama_kv_cache_tokens_rm with llama_kv_cache_clear

Use llama_kv_cache_clear for cache clearing, and change callers of llama_kv_cache_tokens_rm that delete by position to use llama_kv_cache_seq_rm instead.
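
A minimal usage sketch, assuming the llama.h C API at this revision; `prune_prefix` and `reset_cache` are hypothetical helpers, not part of llama.cpp:

```c
#include "llama.h"

// With this change, seq_id == -1 matches any sequence, so this removes
// cache cells in positions [0, n) regardless of which sequence owns them.
static void prune_prefix(struct llama_context * ctx, llama_pos n) {
    llama_kv_cache_seq_rm(ctx, -1, 0, n);
}

// Replacement for clearing the cache via the removed llama_kv_cache_tokens_rm.
static void reset_cache(struct llama_context * ctx) {
    llama_kv_cache_clear(ctx);
}
```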

b1444

make : remove unnecessary dependency on build-info.h (ggerganov#3842)

b1443

llama : fix kv shift bug (ggerganov#3835)

ggml-ci

b1442

ggml : quantization refactoring (ggerganov#3833)

* ggml : factor all quantization code in ggml-quants

ggml-ci

* ggml-quants : fix Zig and Swift builds + quantize tool

ggml-ci

* quantize : --pure option for disabling k-quant mixtures

---------

Co-authored-by: cebtenzzre <[email protected]>

b1440

metal : try cwd for ggml-metal.metal if bundle lookup fails (ggerganov#3793)

* Try cwd for ggml-metal if bundle lookup fails

When building with `-DBUILD_SHARED_LIBS=ON -DLLAMA_METAL=ON -DLLAMA_BUILD_SERVER=ON`,
`server` would fail to load `ggml-metal.metal` because `[bundle pathForResource:...]`
returns `nil`.  In that case, fall back to `ggml-metal.metal` in the cwd instead of
passing `null` as a path.

Follows up on ggerganov#1782

* Update ggml-metal.m

---------

Co-authored-by: Georgi Gerganov <[email protected]>

b1437

llama : allow quantizing k-quants to fall back when tensor size incompatible (ggerganov#3747)

* Allow quantizing k-quants to fall back when tensor size incompatible

* quantizing: Add warning when tensors were incompatible with k-quants

Clean up k-quants state passing a bit
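
A sketch of the fallback idea in C; the function name and the choice of Q4_0 as the fallback type are assumptions, not the PR's actual code. k-quants pack weights into 256-element super-blocks, so a tensor whose row size is not a multiple of that block size cannot use them:

```c
#include <stdint.h>
#include <stdio.h>
#include "ggml.h"

// Pick a quantization type, falling back when rows cannot be split into
// k-quant super-blocks (illustrative; not the PR's actual logic).
static enum ggml_type pick_quant_type(enum ggml_type wanted, int64_t n_per_row) {
    const int64_t QK_K = 256; // k-quant super-block size in ggml
    if (n_per_row % QK_K != 0) {
        fprintf(stderr, "warning: tensor incompatible with k-quants, "
                        "falling back to Q4_0\n");
        return GGML_TYPE_Q4_0; // hypothetical fallback choice
    }
    return wanted;
}
```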

b1436

llama : add option for greedy sampling with probs (ggerganov#3813)

* llama : add option for greedy sampling with probs

* llama : add comment about llama_sample_token_greedy() missing probs

* sampling : temp == 0.0 -> no probs, temp < 0.0 -> probs
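
A sketch of how a caller might honor that temperature convention, assuming the llama.h sampling helpers of this era (`llama_sample_softmax`, `llama_sample_token_greedy`, `llama_sample_temp`, `llama_sample_token`); `sample_with_temp` itself is an illustrative helper:

```c
#include "llama.h"

static llama_token sample_with_temp(struct llama_context * ctx,
                                    llama_token_data_array * cands,
                                    float temp) {
    if (temp < 0.0f) {
        // negative temp: greedy, but compute probabilities first so the
        // caller can inspect them (softmax sorts by probability, descending)
        llama_sample_softmax(ctx, cands);
        return cands->data[0].id;
    }
    if (temp == 0.0f) {
        // zero temp: plain greedy, probabilities are not computed
        return llama_sample_token_greedy(ctx, cands);
    }
    // positive temp: the usual stochastic sampling path
    llama_sample_temp(ctx, cands, temp);
    return llama_sample_token(ctx, cands);
}
```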

b1435

common : print that one line of the syntax help *also* to standard output (ggerganov#3823)