Tags · a-h/llama.cpp

b1448

samplers : Min-P sampler implementation [alternative to Top P/Top K] (ggerganov#3841)

* Introduce the new Min-P sampler by @kalomaze
   The Min-P sampling method was designed as an alternative to Top-P, and aims to ensure a balance of quality and variety. The parameter *p* represents the minimum probability for a token to be considered, relative to the probability of the most likely token.

* Min-P enabled and set to 0.05 default
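
The description above can be read as a filtering rule: keep every token whose probability is at least *p* times the probability of the most likely token. The following is a minimal Python sketch of that rule, not the code from the commit. The function name `min_p_filter` and the `min_keep` parameter are hypothetical names chosen for illustration.

```python
import math

def min_p_filter(logits, p=0.05, min_keep=1):
    """Return (kept_indices, probs); kept indices are sorted most likely first.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Softmax with max-subtraction for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # The cutoff is relative: it scales with the top token's probability.
    threshold = p * max(probs)

    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = [i for i in order if probs[i] >= threshold]

    # Assumption: always leave at least min_keep candidates to sample from.
    if len(kept) < min_keep:
        kept = order[:min_keep]
    return kept, probs
```

For example, with token probabilities 0.5, 0.3, 0.15, 0.04 and 0.01 and `p=0.1`, the cutoff is 0.05, so only the first three tokens survive. When the model is confident, the cutoff is high and few tokens remain. When the distribution is flat, the cutoff drops and more tokens stay eligible.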

---------

Co-authored-by: Georgi Gerganov <[email protected]>
Co-authored-by: cebtenzzre <[email protected]>

Oct 31, 2023
238657d
zip
tar.gz

b1446

ggml : move FP16 <-> FP32 code to ggml-impl.h (ggerganov#3861)

* ggml : move FP16 <-> FP32 stuff to ggml-impl.h

ggml-ci

* tests : fix ARM build

* ggml : explicitly initialize deprecated type traits

* ggml : add math.h to ggml-impl.h

* ggml : remove duplicate static assert macros

* ggml : prefix lookup tables with ggml_

ggml-ci

* ggml-impl : move extern "C" to start of file

Oct 30, 2023
207b519
zip
tar.gz

b1445

Extend llama_kv_cache_seq_rm to allow matching any sequence (ggerganov#3843)

* Extend llama_kv_cache_seq_rm to allow matching any sequence

* Replace llama_kv_cache_tokens_rm with llama_kv_cache_clear

Use llama_kv_cache_clear for cache clearing

Change calls to llama_kv_cache_tokens_rm that want to delete by position to use llama_kv_cache_seq_rm functionality
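
To illustrate the idea of one removal routine covering both per-sequence and any-sequence deletion, here is a toy Python model of a KV cache. It is a sketch under stated assumptions, not the llama.cpp implementation: the `ToyKVCache` class, its `put` helper, and the conventions that a negative `seq_id` matches any sequence and that negative `p0`/`p1` mean "unbounded" are assumptions made for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Cell:
    pos: int = -1                                # -1 marks an empty cell
    seq_ids: set = field(default_factory=set)    # sequences that share this cell

class ToyKVCache:
    """Toy stand-in for a KV cache; names and conventions are hypothetical."""

    def __init__(self, size):
        self.cells = [Cell() for _ in range(size)]

    def put(self, idx, pos, seq_ids):
        # Helper for the example: occupy a cell with a position and its sequences.
        self.cells[idx] = Cell(pos=pos, seq_ids=set(seq_ids))

    def seq_rm(self, seq_id, p0, p1):
        """Remove positions in [p0, p1) for seq_id.

        Assumed conventions: seq_id < 0 matches any sequence,
        p0 < 0 means "from the start", p1 < 0 means "to the end".
        """
        if p0 < 0:
            p0 = 0
        if p1 < 0:
            p1 = float("inf")
        for cell in self.cells:
            if not (p0 <= cell.pos < p1):
                continue
            if seq_id < 0:
                cell.seq_ids.clear()
            else:
                cell.seq_ids.discard(seq_id)
            # A cell that no sequence references any more becomes free.
            if not cell.seq_ids:
                cell.pos = -1

    def clear(self):
        # Dedicated full reset, separate from position-based removal.
        for cell in self.cells:
            cell.pos = -1
            cell.seq_ids.clear()
```

In this model, a caller that used to delete by position regardless of sequence would call `seq_rm(-1, p0, p1)`, while a caller that wants an empty cache would call `clear()`.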

Oct 29, 2023
6e08281
zip
tar.gz

b1444

make : remove unnecessary dependency on build-info.h (ggerganov#3842)

Oct 29, 2023
2046eb4
zip
tar.gz

b1443

llama : fix kv shift bug (ggerganov#3835)

ggml-ci

Oct 29, 2023
71a09da
zip
tar.gz

b1442

ggml : quantization refactoring (ggerganov#3833)

* ggml : factor all quantization code in ggml-quants

ggml-ci

* ggml-quants : fix Zig and Swift builds + quantize tool

ggml-ci

* quantize : --pure option for disabling k-quant mixtures

---------

Co-authored-by: cebtenzzre <[email protected]>

Oct 29, 2023
d69d777
zip
tar.gz

b1440

metal : try cwd for ggml-metal.metal if bundle lookup fails (ggerganov#3793)

* Try cwd for ggml-metal if bundle lookup fails

When building with `-DBUILD_SHARED_LIBS=ON -DLLAMA_METAL=ON -DLLAMA_BUILD_SERVER=ON`,
`server` would fail to load `ggml-metal.metal` because `[bundle pathForResource:...]`
returns `nil`.  In that case, fall back to `ggml-metal.metal` in the cwd instead of
passing `null` as a path.

Follows up on ggerganov#1782

* Update ggml-metal.m
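
The fallback described above amounts to "use the bundle path if there is one, otherwise use a relative path in the current working directory". The real change is in Objective-C; the Python sketch below only mirrors the control flow. The function `resolve_metal_source` and the `bundle_lookup` callable are hypothetical stand-ins for the bundle resource lookup.

```python
def resolve_metal_source(bundle_lookup):
    """Return the path of the Metal shader source to load.

    bundle_lookup is a hypothetical callable standing in for the bundle
    resource lookup; it returns a path string, or None when nothing is found.
    """
    path = bundle_lookup("ggml-metal", "metal")
    if path is None:
        # Fall back to the file in the current working directory
        # instead of passing a null path on to the loader.
        path = "ggml-metal.metal"
    return path
```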

---------

Co-authored-by: Georgi Gerganov <[email protected]>

Oct 28, 2023
82a6646
zip
tar.gz

b1437

llama : allow quantizing k-quants to fall back when tensor size incompatible (ggerganov#3747)

* Allow quantizing k-quants to fall back when tensor size incompatible

* quantizing: Add warning when tensors were incompatible with k-quants

Clean up k-quants state passing a bit
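
A rough Python sketch of the fallback logic follows. It assumes that k-quants operate on super-blocks of 256 values, so a tensor whose row length is not a multiple of 256 cannot use them. The `FALLBACK` table is purely illustrative and is not taken from the commit, and the names `pick_type` and `plan_quantization` are hypothetical.

```python
QK_K = 256  # assumed k-quant super-block size

# Illustrative mapping only; the actual fallback choices may differ.
FALLBACK = {
    "Q4_K": "Q5_0",
    "Q5_K": "Q5_1",
    "Q6_K": "Q8_0",
}

def pick_type(n_cols, wanted):
    """Return (type_to_use, fell_back) for a tensor with n_cols values per row."""
    if wanted in FALLBACK and n_cols % QK_K != 0:
        return FALLBACK[wanted], True
    return wanted, False

def plan_quantization(tensors, wanted):
    """tensors maps a tensor name to its row length.

    Returns (plan, warnings), where plan maps each name to the chosen type.
    """
    plan, warnings = {}, []
    for name, n_cols in tensors.items():
        chosen, fell_back = pick_type(n_cols, wanted)
        plan[name] = chosen
        if fell_back:
            warnings.append(
                f"{name}: {n_cols} columns not divisible by {QK_K}, "
                f"using {chosen} instead of {wanted}"
            )
    return plan, warnings
```

Collecting the warnings in one place lets the tool report at the end how many tensors could not use the requested k-quant type, rather than failing on the first incompatible tensor.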

Oct 28, 2023
bd6d9e2
zip
tar.gz

b1436

llama : add option for greedy sampling with probs (ggerganov#3813)

* llama : add option for greedy sampling with probs

* llama : add comment about llama_sample_token_greedy() missing probs

* sampling : temp == 0.0 -> no probs, temp < 0.0 -> probs
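
The last bullet describes a convention on the temperature value. A minimal Python sketch of that convention is shown below, assuming that "probs" means a softmax over the logits. The function name `sample_greedy` is hypothetical and this is not the library's API.

```python
import math

def sample_greedy(logits, temp):
    """Pick the most likely token.

    temp == 0.0: return the token only, without computing probabilities.
    temp <  0.0: return the token together with the softmax probabilities.
    """
    best = max(range(len(logits)), key=lambda i: logits[i])
    if temp == 0.0:
        return best, None
    if temp < 0.0:
        # Softmax with max-subtraction for numerical stability.
        m = logits[best]
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        return best, [e / total for e in exps]
    raise ValueError("temp > 0 means stochastic sampling, which this sketch does not cover")
```

In both cases the chosen token is the same. The negative-temperature path only adds the cost of computing probabilities for callers that want to inspect them.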

Oct 28, 2023
ee1a0ec
zip
tar.gz

b1435

common : print that one line of the syntax help *also* to standard output (ggerganov#3823)

Oct 28, 2023
1774611
zip
tar.gz
