Tags: RWKV/rwkv.cpp

master-39ed572

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
Various improvements (#131)

* Implement model head offloading

* Guess the tokenizer from n_vocab

* Make PyTorch optional for inference

* Add function to offload layers

* Add rwkv_eval_sequence_in_chunks
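
The chunked evaluation named in the last item can be pictured with a small sketch. This is not the library's implementation: `eval_sequence_in_chunks`, `toy_eval_sequence`, and the `(tokens, state, want_logits)` evaluator signature are hypothetical stand-ins chosen for illustration. The idea shown is only that a long token sequence is split into fixed-size chunks, the state is carried from one chunk to the next, and logits are requested for the final chunk alone.

```python
from typing import Callable, List, Optional, Tuple

State = List[float]
Logits = List[float]
# Hypothetical evaluator: (tokens, state, want_logits) -> (logits or None, new_state)
EvalFn = Callable[[List[int], Optional[State], bool], Tuple[Optional[Logits], State]]


def eval_sequence_in_chunks(
    eval_sequence: EvalFn,
    tokens: List[int],
    chunk_size: int,
    state: Optional[State] = None,
) -> Tuple[Logits, State]:
    """Feed `tokens` through `eval_sequence`, at most `chunk_size` tokens per call."""
    if not tokens:
        raise ValueError('tokens must not be empty')
    if chunk_size <= 0:
        raise ValueError('chunk_size must be positive')

    logits = None
    for start in range(0, len(tokens), chunk_size):
        chunk = tokens[start:start + chunk_size]
        is_last = start + chunk_size >= len(tokens)
        # Only the final chunk needs logits; earlier chunks just advance the state.
        logits, state = eval_sequence(chunk, state, is_last)
    assert logits is not None
    return logits, state


def toy_eval_sequence(tokens, state, want_logits):
    """Stand-in model (assumption): the state is a running sum, the logits echo it."""
    total = (state[0] if state else 0.0) + float(sum(tokens))
    new_state = [total]
    return ([total, -total] if want_logits else None), new_state


# 100 tokens with a chunk size of 64 become two calls: 64 tokens, then 36.
logits, state = eval_sequence_in_chunks(toy_eval_sequence, list(range(100)), chunk_size=64)
```

With a real model the chunked and unchunked results should agree up to floating-point differences; the toy evaluator above agrees exactly because its state is a plain sum.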

master-0df970a

Decrease memory padding for serial and sequential contexts (#132)

master-6caa45e

Python API restructurization & code style improvements (#130)

* Replace tabs with 4 spaces

* Refactor tests

* Rename Python scripts directory to "python"

* Create a separate package for the official Python API

* Move Python inference example to a separate file

* Add missing const

* Refactor extras

* Split rwkv.cpp into smaller files

* Clean up cpp code

* Rename rwkv package to rwkv_cpp

* Add missing type hints

* Rewrite automatic library lookup

* Add compatibility warning

* Fix MacOS build

* Fix MacOS build

master-8db73b1

Update ggml (#128)

* Fix quantize.py doc

* Add Q5 format compatibility test

* Update ggml

* Add documentation about limitations of sequence mode

* Fix most compiler warnings

* Clean up CMakeLists.txt

* Assert contiguity instead of assuming it

* Update README.md

* Fix warnings

* Try to fix compilation error

* Attempt to fix Ubuntu build

* Attempt to fix Ubuntu build

* Restore all build jobs

* Allow sequence lengths of up to 64 out of the box by forking ggml

master-d6c691e

add other language bindings (#126)

* add other language bindings

* Update README.md

---------

Co-authored-by: Alex

master-2d3cdd7

only append to cpu string if not initialized (#125)

* only append to cpu string if not initialized

* Fix code style

---------

Co-authored-by: Alex

master-84f34c5

Implement basic CLBlast support (#110)

* Get this thing building

Unzip the OpenCL SDK and CLBlast distribution into the repo root,
then enable RWKV_CLBLAST and regenerate makefiles to pick them up.

Currently builds and runs.

* Really offload tensors to OpenCL rather than cuBLAS

* Fix CLBlast builds in CMake release mode

Somehow the path handling is different here which requires me to
be quite a bit more annoying about it.

* Remove `brew update`

* Try building without sanitizer (maybe it would work this time?)

---------

Co-authored-by: saharNooby

master-f685aa4

Fix "'NoneType' object has no attribute 'cast'" error when model is freed (#117)

master-25ee75e

Expose n_vocab, n_embed, n_layer to the Python interface (#118)

master-84634c0

Elide logits if the logits pointer parameter is NULL (#107)

* Completely skip calculation of logits if nobody cares

This speeds up sequence mode evaluations by up to 20% if you ingest
a large prompt and then only retrieve the logits at the very end.

Note that you must pass a NULL pointer to the logits parameter in
order to take advantage of this optimization.

* logits_out=NULL documentation
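
To illustrate why skipping the logits helps, here is a toy sketch rather than the library's code. The functions `project_to_logits`, `eval_token`, and `ingest_prompt` are hypothetical, the recurrence is a placeholder for the model layers, and `want_logits=False` plays the role of passing NULL for the logits pointer. The point is that the output projection, which costs one dot product per vocabulary entry, runs only when the caller asks for logits.

```python
from typing import List, Optional, Tuple

head_calls: List[int] = []  # records each run of the output projection


def project_to_logits(hidden: List[float], head: List[List[float]]) -> List[float]:
    """Output head: one dot product per vocabulary entry."""
    head_calls.append(1)
    return [sum(w * h for w, h in zip(row, hidden)) for row in head]


def eval_token(
    token: int, state: List[float], head: List[List[float]], want_logits: bool
) -> Tuple[Optional[List[float]], List[float]]:
    # Toy recurrence standing in for the model layers (assumption).
    new_state = [0.5 * s + token for s in state]
    # The head is skipped entirely when nobody will read the logits.
    logits = project_to_logits(new_state, head) if want_logits else None
    return logits, new_state


def ingest_prompt(
    tokens: List[int], state: List[float], head: List[List[float]]
) -> Tuple[Optional[List[float]], List[float]]:
    """Advance the state over the whole prompt, asking for logits only at the end."""
    logits = None
    for i, token in enumerate(tokens):
        logits, state = eval_token(token, state, head, want_logits=(i == len(tokens) - 1))
    return logits, state


head = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 "vocabulary" rows, 2 hidden units
final_logits, final_state = ingest_prompt([1, 2, 3, 4, 5], [0.0, 0.0], head)
```

For a five-token prompt the projection runs once instead of five times, while the final state and final logits are unchanged.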