Releases: thomasantony/llamacpp-python

v0.1.14

10 Apr 22:06
  • Fixes #19
  • Updates llama.cpp submodule

v0.1.13

08 Apr 03:38
  • Adds support for "infinite text generation" using context swapping (similar to the main example in llama.cpp)
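
The idea mirrors the main example in llama.cpp: once the context window is full, the first n_keep tokens (typically the prompt) are retained, the oldest half of the remaining tokens is discarded, and the most recent half is re-evaluated so generation can continue indefinitely. Below is a conceptual sketch of that loop; the model object and its reset_and_ingest, eval and sample methods are placeholders, not the exact API of these bindings.

```python
# Conceptual sketch of "infinite generation" via context swapping.
# `model` and its methods are hypothetical stand-ins for the bindings' API.
def generate_forever(model, prompt_tokens, n_ctx=512, n_keep=48):
    tokens = list(prompt_tokens)  # tokens currently held in the context
    while True:
        if len(tokens) >= n_ctx:
            # Context full: keep the first n_keep tokens, drop the older half
            # of the rest, and re-ingest the most recent half.
            n_left = len(tokens) - n_keep
            tokens = tokens[:n_keep] + tokens[n_keep + n_left // 2:]
            model.reset_and_ingest(tokens)  # hypothetical helper
        model.eval()              # evaluate any pending input
        token = model.sample()    # sample the next token
        tokens.append(token)
        yield token
```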

v0.1.12

07 Apr 04:55
  • Makes unit tests more consistent and usable (still not run in CI workflows, as the model weights are too large)
  • Updates llama.cpp submodule

v0.1.11

31 Mar 16:28
  • Breaking change: model loading is now practically instantaneous thanks to memory-mapped I/O
  • Requires re-generating the weight files with the new convert script (or using the migration script from llama.cpp)

v0.1.10

30 Mar 11:01
  • Adds back get_tokenizer() and add_bos(), which were broken in the previous release
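
A minimal sketch of how the restored calls might be used, purely for illustration: only the two method names come from this release note, while the surrounding setup (InferenceParams, path_model) and the assumption that both methods live on the inference object are guesses about the API, not a verbatim excerpt.

```python
import llamacpp

# Hedged sketch: only get_tokenizer() and add_bos() are named in this release;
# the setup below is an assumed way to obtain a model object.
params = llamacpp.InferenceParams()
params.path_model = "./models/7B/ggml-model-q4_0.bin"
model = llamacpp.LlamaInference(params)

tokenizer = model.get_tokenizer()  # tokenizer object, restored in v0.1.10
model.add_bos()                    # restored in v0.1.10; presumably queues the BOS token
```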

v0.1.9

28 Mar 23:58
  • Updates the bindings to work with the new llama.cpp API from ggerganov/llama.cpp#370
  • Adds two separate interfaces: LlamaInference, which is similar to the bindings in v0.1.8, and the lower-level LlamaContext (currently untested); see the sketch after this list
  • The old bindings are still present in PyLlama.cpp but are currently not compiled and will be removed at a later date
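
The sketch below shows roughly what the higher-level LlamaInference interface looks like. Only the class names LlamaInference and LlamaContext come from this release note; the parameter object and the method names (tokenize, update_input, ingest_all_pending_input, eval, sample, token_to_str) follow the project README of the time and should be treated as assumptions rather than a stable, documented API.

```python
import llamacpp

# Hedged sketch of the high-level LlamaInference interface; method names are
# assumptions based on the README, not a verbatim excerpt of the bindings.
params = llamacpp.InferenceParams()
params.path_model = "./models/7B/ggml-model-q4_0.bin"
model = llamacpp.LlamaInference(params)

prompt_tokens = model.tokenize("Llamas are", True)  # True: prepend BOS token
model.update_input(prompt_tokens)
model.ingest_all_pending_input()

for _ in range(32):
    model.eval()                     # run the model on pending input
    token = model.sample()           # sample the next token id
    print(model.token_to_str(token), end="", flush=True)
```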

v0.1.8

20 Mar 02:57
  • Adds a "tokenizer" object for use with oobabooga/text-generation-webui

v0.1.7

19 Mar 23:08
  • Switches from Poetry to scikit-build as the build tool due to problems with cross-compiling on CI
  • Adds CI builds for macOS arm64 wheels
  • Adds Windows wheels built on CI to PyPI

v0.1.6

19 Mar 03:23
  • Fixes Windows builds on CI (hopefully)
  • Removes torch and sentencepiece as dependencies; they now have to be installed manually if you want to use llamacpp-convert

v0.1.5

18 Mar 22:12

  • Includes new llamacpp-cli and llamacpp-chat entrypoints
  • There is possibly still a bug that makes llamacpp-chat perform a bit worse than passing the same arguments directly to llamacpp-cli