
[Question] Have you finished LoRA/QLoRA training? #206

Open
Iceland-Leo opened this issue Jun 1, 2023 · 12 comments

Comments

@Iceland-Leo

Question

Nice work!
I'm just wondering when you will finish LoRA/QLoRA training. It would help a lot with fine-tuning.

@Iceland-Leo
Author

Has anyone completed LoRA/QLoRA training on LLaVA?

@haotian-liu
Owner

Hi, thank you for your interest in our work. We are implementing LoRA/QLoRA training and are now verifying its correctness by training some model checkpoints. We will update soon once the performance is verified. Thanks!

@XipengY

XipengY commented Jun 7, 2023

@haotian-liu Looking forward to the implementation of LoRA/QLoRA training. Could you put the code in a development branch so we can try to develop and verify it together?

@haotian-liu
Owner

Hi @XipengY @Iceland-Leo

LoRA support (preview) and an initial checkpoint I trained are released here. Please let me know if anything in the instructions is unclear.

Also, if any of you are interested in contributing to the hyperparameter search, please let me know.

Note that QLoRA support is partially finished, but distributed training is not supported yet.
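For anyone new to the technique: LoRA's memory/compute saving comes from training only a low-rank update instead of the full weight matrix. A minimal, framework-independent sketch of that parameter-count arithmetic (the 4096x4096 projection size and rank 64 are illustrative assumptions, not LLaVA's actual configuration):

```python
def lora_param_counts(d_in, d_out, r):
    # Full fine-tuning updates the entire d_out x d_in weight matrix.
    full = d_out * d_in
    # LoRA freezes that matrix and trains only two low-rank factors:
    # B (d_out x r) and A (r x d_in), so the update is W + B @ A.
    lora = d_out * r + r * d_in
    return full, lora

# Example: one hypothetical 4096x4096 projection, LoRA rank 64.
full, lora = lora_param_counts(4096, 4096, 64)
ratio = lora / full  # LoRA trains ~3% of this layer's parameters
```

Note that the rank r (and the number of layers LoRA is attached to) is exactly the kind of hyperparameter the search above is about.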

@devrituraj

Hi...

I find that LoRA fine-tuning is slower (5.72 s/it, about half the speed) compared to full fine-tuning (2.77 s/it). The same LR (2e-5), batch size (4), and grad-accum (1) were used for both setups. The script was run using torchrun with a DeepSpeed config (zero3.json) as an argument. Has anyone made similar observations? Intuitively, LoRA should be faster!

btw, wonderful work!

@haotian-liu
Owner

Hi @devrituraj

Thank you for trying this out and for providing the feedback. Which GPUs (and how many) are you using? Can you provide your commands? And do you notice a reduction in GPU memory consumption?

For smaller models I noticed a smaller performance benefit. It would help to know more about the specific configuration on your side. Thanks!

@aprilehannibal

aprilehannibal commented Jun 15, 2023

@haotian-liu How many hours does it take to fine-tune the 13B model with LoRA?

@XipengY

XipengY commented Jun 16, 2023

@haotian-liu It's great that this work supports DeepSpeed and LoRA; it significantly reduces GPU memory. However, when I change tf32 and bf16 to False, training fails. Do you have the same problem?

@YerongLi

YerongLi commented Jul 21, 2023

Does anyone complete the lora/qlora training on llava?

I think LoRA works now, but quantization does not work for me. Has anyone completed QLoRA training?

Quantization reports this error:

  File "/scratch/yerong/LLaVA/llava/train/train.py", line 657, in train
    model = LlavaLlamaForCausalLM.from_pretrained(
  File "/scratch/yerong/.conda/envs/llava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2694, in from_pretrained
    dispatch_model(model, device_map=device_map, offload_dir=offload_folder, offload_index=offload_index)
  File "/scratch/yerong/.conda/envs/llava/lib/python3.10/site-packages/accelerate/big_modeling.py", line 371, in dispatch_model
    attach_align_device_hook_on_blocks(
  File "/scratch/yerong/.conda/envs/llava/lib/python3.10/site-packages/accelerate/hooks.py", line 506, in attach_align_device_hook_on_blocks
    add_hook_to_module(module, hook)
  File "/scratch/yerong/.conda/envs/llava/lib/python3.10/site-packages/accelerate/hooks.py", line 155, in add_hook_to_module
    module = hook.init_hook(module)
  File "/scratch/yerong/.conda/envs/llava/lib/python3.10/site-packages/accelerate/hooks.py", line 253, in init_hook
    set_module_tensor_to_device(module, name, self.execution_device)
  File "/scratch/yerong/.conda/envs/llava/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 267, in set_module_tensor_to_device
    raise ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.")
ValueError: weight is on the meta device, we need a `value` to put in on cuda:0.
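Setting the loading error above aside, the "quantization" QLoRA performs on the frozen base weights can be illustrated with a toy round-trip. This is a plain absmax scheme for intuition only, not the NF4 data type bitsandbytes actually uses:

```python
def quantize_absmax(xs, bits=4):
    # Symmetric absmax quantization: scale values so the largest
    # magnitude maps to the top of the signed integer range.
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit
    scale = max(abs(x) for x in xs) / qmax
    q = [round(x / scale) for x in xs]  # small ints, ~4 bits each
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.21, -0.07, 0.14, -0.28]
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)
# Each restored value is within half a quantization step of the
# original; the LoRA adapters then train in full precision on top.
```

The memory saving is that the frozen weights are stored as 4-bit integers plus one scale per block, while gradients only flow through the small LoRA factors.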

@haotian-liu
Owner

@aprilehannibal I do not see much of a difference between LoRA and full fine-tuning; both take around 1.5-2 hours on the lightning fine-tuning.

@haotian-liu
Owner

@YerongLi

I have updated the QLoRA support. Please pull the latest repo and reinstall the packages using pip install -e .

See example script here.

@lilia147852

lilia147852 commented Jun 18, 2024

@haotian-liu
Excuse me, can I implement QLoRA for LLaVA v1.5?
