
Commit

doc: add references to Hugging Face GGUF-my-repo quantisation web tool. (ggerganov#7288)

* chore: add references to the quantisation space.

* fix grammar lol.

* Update README.md

Co-authored-by: Julien Chaumond <[email protected]>

* Update README.md

Co-authored-by: Georgi Gerganov <[email protected]>

---------

Co-authored-by: Julien Chaumond <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
3 people committed May 16, 2024
1 parent 172b782 commit ad52d5c
Showing 2 changed files with 6 additions and 1 deletion.
3 changes: 3 additions & 0 deletions README.md
@@ -712,6 +712,9 @@ Building the program with BLAS support may lead to some performance improvements

### Prepare and Quantize

> [!NOTE]
> You can also use the [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space on Hugging Face to quantise your model weights without any setup. It is synced with `llama.cpp` `main` every 6 hours.

To obtain the official LLaMA 2 weights please see the <a href="#obtaining-and-using-the-facebook-llama-2-model">Obtaining and using the Facebook LLaMA 2 model</a> section. There is also a large selection of pre-quantized `gguf` models available on Hugging Face.

Note: `convert.py` does not support LLaMA 3; use `convert-hf-to-gguf.py` with LLaMA 3 models downloaded from Hugging Face.
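The conversion-then-quantisation workflow described above can be sketched as a pair of shell commands. This is a minimal sketch, not the project's documented invocation: the model directory, output paths, and quant type (`Q4_K_M`) are illustrative assumptions, and it presumes the `quantize` binary has already been built.

```shell
# Convert a Hugging Face LLaMA 3 checkpoint (convert.py does not support
# LLaMA 3, so convert-hf-to-gguf.py is used) into a GGUF file, then
# quantise it. Paths and the quant type are illustrative assumptions.
python convert-hf-to-gguf.py models/Meta-Llama-3-8B/ --outfile models/llama-3-8b-f16.gguf

# Produce a 4-bit K-quant from the f16 GGUF file.
./quantize models/llama-3-8b-f16.gguf models/llama-3-8b-Q4_K_M.gguf Q4_K_M
```

Alternatively, the GGUF-my-repo space mentioned above performs both steps in the browser with no local setup.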
4 changes: 3 additions & 1 deletion examples/quantize/README.md
@@ -1,6 +1,8 @@
# quantize

TODO
You can also use the [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space on Hugging Face to build your own quants without any setup.

Note: the space is synced with `llama.cpp` `main` every 6 hours.

## Llama 2 7B
