Yo! I really have no clue what I'm doing here, but here's me learning Rust by turning Candle's quantised LLM examples into their own package.
None of the work here is original; all attribution should go to Laurent & Nicolas, who made this gem of a library along with ready-to-use examples.
It lets you run popular GGUF checkpoints from the Hugging Face Hub via Candle. Works on Macs with Metal or on CPU (although CPU is much, much slower).
This is an alpha release and I expect quite a lot of this to change in the short term.
Step 1: `git clone https://github.com/Vaibhavs10/fast-llm.rs/`
Step 2: `cd fast-llm.rs`
Step 3: `cargo run --features metal --release -- --which 7b-mistral-instruct-v0.2 --prompt "What is the meaning of life according to a dog?" --sample-len 100`
Note: you can remove the `--features metal` flag to run inference on CPU.
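For example, this is the same command as step 3, just without the Metal feature (nothing else changes):

```bash
# CPU-only inference: identical to step 3, minus --features metal (slower).
cargo run --release -- --which 7b-mistral-instruct-v0.2 --prompt "What is the meaning of life according to a dog?" --sample-len 100
```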
See below for how to install Rust and how to use the CLI if you need to.
- Mistral 7B
- Llama 7B/13B/70B
- CodeLlama 7B/13B/34B
- Mixtral 8x7B
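Step 3 above selects Mistral 7B Instruct via `--which 7b-mistral-instruct-v0.2`; the other models have their own `--which` identifiers. The name below is only a guess to show the shape of the command, so check `cargo run --release -- --help` for the real list:

```bash
# Sketch: run a CodeLlama checkpoint instead of Mistral.
# The `7b-code` identifier is an assumption; see --help for the exact --which values.
cargo run --features metal --release -- \
  --which 7b-code \
  --prompt "Write a Rust function that reverses a string." \
  --sample-len 200
```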
You can also bring your own GGUF checkpoint by passing the `--model` flag.
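A minimal sketch of what that might look like, assuming `--model` takes a path to a local GGUF file (the path and filename here are made up; check `--help` for the exact usage):

```bash
# Hypothetical: load a local GGUF file instead of downloading one from the Hub.
cargo run --features metal --release -- \
  --model ./models/my-model.Q4_K_M.gguf \
  --prompt "Hello there!" \
  --sample-len 100
```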
To install Rust, just follow the official instructions.
When you use `cargo run`, command-line arguments go to `cargo`. Use `--` to send them to the `fast-llm` binary. The following will compile the code in release mode (a `cargo` option), and then list all the options `fast-llm` supports.
`cargo run --release -- --help`
By default, `fast-llm` sends your prompt to the LLM, prints the response, and quits. You can use interactive or chat mode too:
- `cargo run --release -- --prompt interactive`. Runs in interactive mode. You can ask multiple independent queries; previous context is not retained.
- `cargo run --release -- --prompt chat`. Runs in chat mode. Carries conversation history, just like when using ChatGPT or HuggingChat. In this mode you'll get the best results with one of the Instruct versions of the models (Mistral, Zephyr, or OpenChat), as all these models are designed for chat.
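For example, here's a sketch of chat mode with the Mistral Instruct checkpoint from step 3 (assuming `--which` combines with `--prompt chat` the same way it does with a one-shot prompt):

```bash
# Chat mode with an instruct-tuned model; drop --features metal to run on CPU.
cargo run --features metal --release -- --which 7b-mistral-instruct-v0.2 --prompt chat
```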