RuntimeError: Expected is_sm80 || is_sm90 to be true, but got false. #161

Closed
RakshitAralimatti opened this issue Jun 21, 2024 · 4 comments

@RakshitAralimatti

Getting this error when trying to quantize llama3-8b-instruct on a T4 GPU.
torch version - 2.1.1

@wenhuach21
Contributor

Thank you for trying AutoRound. Could you kindly attach more of the log?

@RakshitAralimatti
Author

RakshitAralimatti commented Jun 21, 2024

Thanks for your response.

Code I am running:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

from auto_round import AutoRound

bits, group_size, sym = 4, 128, False

autoround = AutoRound(model, tokenizer, bits=bits, group_size=group_size, sym=sym, device=None)
autoround.quantize()
output_dir = "4bit_autoRound"
autoround.save_quantized(output_dir)
```

Error I am getting:

```
Loading checkpoint shards: 100%|██████████| 4/4 [00:03<00:00, 1.13it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2024-06-21 05:22:46 INFO utils.py L527: Using GPU device
2024-06-21 05:22:46 INFO autoround.py L464: using torch.float16 for quantization tuning
2024-06-21 05:22:52 INFO autoround.py L851: switch to cpu to cache inputs
2024-06-21 05:23:09 INFO autoround.py L1306: quantizing 1/32, model.layers.0
Traceback (most recent call last):
  File "/home/sandlogic/LINGO/LINGO_PROJECTS/ModelQuantization/Intel_AutoRound/llama3_4bit_Autoround.py", line 12, in <module>
    autoround.quantize()
  File "/home/sandlogic/LINGO/LINGO_ENV/ModelQuantization/lib/python3.10/site-packages/auto_round/autoround.py", line 575, in quantize
    self.quant_blocks(
  File "/home/sandlogic/LINGO/LINGO_ENV/ModelQuantization/lib/python3.10/site-packages/auto_round/autoround.py", line 1316, in quant_blocks
    q_input, input_ids = self.quant_block(
  File "/home/sandlogic/LINGO/LINGO_ENV/ModelQuantization/lib/python3.10/site-packages/auto_round/autoround.py", line 1208, in quant_block
    self.scale_loss_and_backward(scaler, loss)
  File "/home/sandlogic/LINGO/LINGO_ENV/ModelQuantization/lib/python3.10/site-packages/auto_round/autoround.py", line 1470, in scale_loss_and_backward
    scale_loss.backward()
  File "/home/sandlogic/LINGO/LINGO_ENV/ModelQuantization/lib/python3.10/site-packages/torch/_tensor.py", line 492, in backward
    torch.autograd.backward(
  File "/home/sandlogic/LINGO/LINGO_ENV/ModelQuantization/lib/python3.10/site-packages/torch/autograd/__init__.py", line 251, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: Expected is_sm80 || is_sm90 to be true, but got false. (Could this error message be improved? If so, please report an enhancement request to PyTorch.)
```

CUDA version - 11.8
torch version - 2.1.1
Python version - 3.10

@wenhuach21
Contributor

wenhuach21 commented Jun 21, 2024

Thanks for the information. AutoRound requires a gradient backward pass; however, the root cause appears to lie in torch or SDPA attention. Please refer to pytorch/pytorch#98140 or try other solutions online.
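
For reference, one possible workaround (a sketch, not verified on a T4 here) is to disable the flash and memory-efficient SDPA backends so the backward pass falls back to the math kernel; the T4 is sm_75, and those fused kernels only implement backward on sm_80/sm_90. The calls below are standard torch 2.x APIs:

```python
import torch

# Assumption: the failing path is the flash / memory-efficient SDPA backward,
# which requires sm_80/sm_90. Force the math fallback globally before calling
# autoround.quantize() so backward runs on the plain (slower) kernel.
torch.backends.cuda.enable_flash_sdp(False)
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_math_sdp(True)
```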

@wenhuach21
Contributor

Or see huggingface/accelerate#2799.
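
Another option (a sketch, assuming the installed transformers version supports the `attn_implementation` argument) is to load the model with eager attention so SDPA is bypassed entirely:

```python
from transformers import AutoModelForCausalLM
import torch

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
# Use the eager attention implementation instead of SDPA so the backward
# pass never reaches the kernels that require sm_80/sm_90.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    attn_implementation="eager",
    trust_remote_code=True,
)
```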
