HF-LLM.rs is a CLI tool for accessing Large Language Models (LLMs) hosted on Hugging Face, such as Llama 3.1, Mistral, Gemma 2, Cohere, and many more. It lets you select a model, provide input, and receive responses in a terminal environment.
You can find the list of models supported by this CLI here; it is updated as new models are released on the Hugging Face Hub. Some models may require a Pro subscription.
- Model Selection: Choose from a variety of models available and deployed on Hugging Face infrastructure.
- Input Prompt: Provide an input prompt to generate responses.
- Streaming Output: Receive responses in real-time as the model generates output.
- Chat Mode: Start an interactive chat session with the LLM.
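To make the Streaming Output and Chat Mode features more concrete, here is a minimal sketch of how a chat session could accumulate streamed output into a conversation history. This is an illustration only, not the project's actual implementation: the `ChatSession` type, the `Role` enum, and the `push_user` and `push_chunk` methods are hypothetical names, and the sketch assumes the model's reply arrives as a sequence of text chunks.

```rust
// Hypothetical sketch: these types are NOT taken from the hf-llm.rs source.
// Assumption: the model's reply arrives as a series of text chunks.

#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Assistant,
}

/// Conversation history for one interactive chat session.
#[derive(Debug, Default)]
struct ChatSession {
    turns: Vec<(Role, String)>,
}

impl ChatSession {
    /// Record a complete user message as a new turn.
    fn push_user(&mut self, text: &str) {
        self.turns.push((Role::User, text.to_string()));
    }

    /// Append a streamed chunk to the assistant reply in progress,
    /// starting a new assistant turn if the last turn is not one.
    fn push_chunk(&mut self, chunk: &str) {
        let continuing = matches!(self.turns.last(), Some((Role::Assistant, _)));
        if continuing {
            if let Some((_, reply)) = self.turns.last_mut() {
                reply.push_str(chunk);
            }
        } else {
            self.turns.push((Role::Assistant, chunk.to_string()));
        }
    }
}
```

In a design like this, each chunk can be printed as soon as it arrives while the full reply is still kept in the history, so that later prompts in the same session can include the earlier turns.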
- Clone the repository:
    git clone https://github.com/vaibhavs10/hf-llm.rs.git
- Navigate to the project directory:
    cd hf-llm.rs
- Build the project:
    cargo build --release
- Verify the installation:
    cargo run --release -- --help
To generate a response, pass a model with -m and a prompt with -p:
cargo run --release -- -m "meta-llama/Meta-Llama-3.1-70B-Instruct" -p "How to make a dangerously spicy ramen?"
You can also pass -c instead of a prompt to start an interactive chat session with the LLM:
cargo run --release -- -m "meta-llama/Meta-Llama-3.1-70B-Instruct" -c
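The two invocations above use three flags: `-m` for the model, `-p` for a one-shot prompt, and `-c` for chat mode. As a rough sketch of how such flags could be interpreted, the following uses only the standard library. It is not the project's actual parser: the `Args` struct and `parse_args` function are hypothetical, and the rule that either `-p` or `-c` must be present is an assumption made for illustration.

```rust
// Hypothetical sketch: NOT the argument parser used by hf-llm.rs.
// Only the flag names -m, -p and -c come from the usage examples.

#[derive(Debug, PartialEq)]
struct Args {
    model: String,
    prompt: Option<String>,
    chat: bool,
}

/// Parse the arguments that follow the program name.
fn parse_args<I: IntoIterator<Item = String>>(argv: I) -> Result<Args, String> {
    let mut model = None;
    let mut prompt = None;
    let mut chat = false;
    let mut it = argv.into_iter();

    while let Some(arg) = it.next() {
        match arg.as_str() {
            // -m and -p each consume the next argument as their value.
            "-m" => model = Some(it.next().ok_or("-m needs a model id")?),
            "-p" => prompt = Some(it.next().ok_or("-p needs a prompt")?),
            "-c" => chat = true,
            other => return Err(format!("unknown argument: {}", other)),
        }
    }

    let model = model.ok_or("missing -m <model>")?;
    // Assumption: a run needs either a one-shot prompt or chat mode.
    if !chat && prompt.is_none() {
        return Err("provide -p <prompt> or -c for chat mode".to_string());
    }
    Ok(Args { model, prompt, chat })
}
```

A caller would typically pass `std::env::args().skip(1)` to `parse_args` and then branch on the `chat` field to choose between a single response and an interactive session.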
Note: Some models require authentication. Run huggingface-cli login to log in to your Hugging Face account before accessing them.
This project is licensed under the MIT License. See the LICENSE file for more details.