ImportError: Using bitsandbytes 8-bit quantization requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i https://pypi.org/simple/ bitsandbytes` #13569

Comments
Hey @AnandUgale, great to run into you again on here! 🚀 It looks like you've stumbled upon an intriguing challenge. I'm diving into the details now and will circle back with a more comprehensive response soon. Stay tuned!
This isn't really a LlamaIndex issue; it's a Hugging Face issue 😅 If you are in a notebook, you might have to restart the kernel after installing, so that the new packages are picked up.
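After (re)installing and restarting the kernel, a quick sanity check helps confirm the packages actually landed in the active environment. A minimal sketch (the helper name `missing_packages` is hypothetical, not part of any library): if either package still shows up in the result, the install went to a different Python environment than the one the notebook is using.

```python
import importlib.util

def missing_packages(names):
    # Return the subset of names that cannot be imported in this environment
    return [n for n in names if importlib.util.find_spec(n) is None]

# The ImportError in this issue means one of these is missing (or the kernel
# predates the install):
#   pip install accelerate
#   pip install -i https://pypi.org/simple/ bitsandbytes
print(missing_packages(["accelerate", "bitsandbytes"]))
```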
Bug Description

```
ImportError: Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i https://pypi.org/simple/ bitsandbytes
```
Environment

Packages installed with CUDA 11.8.
Version: 0.10.37
Steps to Reproduce

```python
import torch
from llama_index.llms.huggingface import HuggingFaceLLM
from transformers import BitsAndBytesConfig

# hf_token and stopping_ids are defined elsewhere in the original script

# Optional quantization to 4-bit
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

llm = HuggingFaceLLM(
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",
    model_kwargs={
        "token": hf_token,
        "torch_dtype": torch.bfloat16,  # comment this line and uncomment below to use 4bit
        # "quantization_config": quantization_config,
    },
    generate_kwargs={
        "do_sample": True,
        "temperature": 0.6,
        "top_p": 0.9,
    },
    tokenizer_name="meta-llama/Meta-Llama-3-8B-Instruct",
    tokenizer_kwargs={"token": hf_token},
    stopping_ids=stopping_ids,
)
```
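The comments in the snippet above describe two mutually exclusive `model_kwargs` variants: pass `torch_dtype` for the plain bf16 path, or `quantization_config` for the 4-bit path (which is what requires `accelerate` and `bitsandbytes` and triggers the ImportError when they are absent). A small sketch of that switch — `build_model_kwargs` is a hypothetical helper for illustration, not a llama-index API:

```python
def build_model_kwargs(hf_token, quantization_config=None, dtype="bfloat16"):
    # Hypothetical helper mirroring the inline comment in the snippet above:
    # uncomment quantization_config (and drop torch_dtype) to use 4-bit.
    kwargs = {"token": hf_token}
    if quantization_config is not None:
        # 4-bit path: needs both accelerate and bitsandbytes installed,
        # otherwise transformers raises the ImportError from this issue
        kwargs["quantization_config"] = quantization_config
    else:
        kwargs["torch_dtype"] = dtype
    return kwargs

print(build_model_kwargs("hf_xxx"))  # → {'token': 'hf_xxx', 'torch_dtype': 'bfloat16'}
```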
Relevant Logs/Tracebacks