
full-parameter or lora? #3

Open

Nastu-Ho opened this issue Jun 17, 2024 · 3 comments

Comments

@Nastu-Ho

Will full-parameter fine-tuning be better?

@mmaaz60
Member

mmaaz60 commented Jun 17, 2024

Hi @Nastu-Ho,

In our experiments, full-parameter fine-tuning did not bring any accuracy gains over LoRA with the Phi-3-mini-4K or Vicuna LLMs. However, in the case of LLaMA-3, full fine-tuning was better than LoRA.
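
For concreteness, toggling between the two regimes typically looks like the following with HuggingFace PEFT. This is a minimal, generic sketch; the model name and LoRA hyperparameters are illustrative and not the exact configuration of this repo's training scripts:

```python
# Generic sketch: LoRA vs. full-parameter fine-tuning via HuggingFace PEFT.
# Model name and LoRA hyperparameters below are illustrative assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

use_lora = True  # set False for full-parameter fine-tuning

if use_lora:
    lora_config = LoraConfig(
        r=16,                # low-rank dimension of the adapter matrices
        lora_alpha=32,       # scaling factor applied to the adapter output
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # only adapter weights remain trainable
# else: every parameter stays trainable, i.e. full fine-tuning
```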

@Nastu-Ho
Author

Nastu-Ho commented Jun 18, 2024

> Hi @Nastu-Ho,
>
> In our experiments, full-parameter fine-tuning did not bring any accuracy gains over LoRA with the Phi-3-mini-4K or Vicuna LLMs. However, in the case of LLaMA-3, full fine-tuning was better than LoRA.

Thank you for your reply.
One thing I'm curious about is whether a stronger LLM continues to bring improvements on MVBench.
The problem I am currently encountering is that after replacing the LLM in my model with a stronger one (Qwen2, Mistral), I did not see significant improvements, unlike VideoChat2.
So I'm curious how your method performs on MVBench with different LLMs (such as LLaMA-3 or Mistral)?

@mmaaz60
Member

mmaaz60 commented Jun 18, 2024

Hi @Nastu-Ho,

We do not have MVBench results with the LLaMA-3 LLM; however, I can share the numbers with Vicuna 7B and 13B that we observed during our experiments, which may give some clues about the trend.

With Vicuna 7B and 13B, we obtain average scores of 53.10 and 58.67 on MVBench, respectively. These experiments suggest that using a stronger LLM improves MVBench performance. However, we do not have any ablations with LLaMA-3 or Mistral.
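
For reference, the MVBench average is, to my knowledge, the unweighted mean of the per-task accuracies over its 20 tasks. A tiny sketch with hypothetical numbers:

```python
# MVBench reports accuracy per task; the overall score is (to my knowledge)
# the unweighted mean over all 20 tasks. Numbers below are hypothetical.
task_accuracies = {
    "Action Sequence": 56.0,
    "Object Existence": 51.5,
    "Scene Transition": 52.0,
    # ... in practice all 20 MVBench tasks would appear here
}
average = sum(task_accuracies.values()) / len(task_accuracies)
print(f"MVBench average: {average:.2f}")
```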

If you have any findings, please do share. Thank you.
