
Fix Q3_K_XS for MoE models #5113

Merged
ikawrakow merged 1 commit into master from ik/fix_q3k_xs on Jan 25, 2024

Conversation

ikawrakow
Contributor

I made the very same mistake here as when I was restoring the k-quants quantization mixture for MoE models.
This PR fixes it, and MoE models should now work with Q3_K_XS.
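For context, the Q3_K_XS file type picks a per-tensor quantization mix based on the layer a tensor belongs to, and in MoE models each FFN layer contributes one tensor per expert, so a running tensor counter has to be scaled by the number of experts. The sketch below only illustrates that kind of counter logic; all names (`quant_ctx`, `pick_ffn_down_type`, the `QUANT_*` constants) and the layer threshold are assumptions for illustration, not the actual llama.cpp code or the exact change in this PR.

```cpp
// Minimal sketch, NOT the actual llama.cpp implementation: shows why an
// ffn_down counter must be divided by n_expert for MoE models before it is
// compared against a layer threshold.
#include <string>

enum quant_type_sketch { QUANT_Q2_K, QUANT_Q3_K };

struct quant_ctx {
    int n_layer;         // number of transformer layers in the model
    int n_expert;        // 1 for dense models, >1 for MoE models
    int i_ffn_down = 0;  // running count of ffn_down tensors seen so far
};

// Choose the quant type for an ffn_down tensor under a Q3_K_XS-style mix
// (hypothetical rule: keep Q3_K for the first eighth of the layers).
quant_type_sketch pick_ffn_down_type(quant_ctx & qs, const std::string & name) {
    quant_type_sketch type = QUANT_Q2_K;
    if (name.find("ffn_down") != std::string::npos) {
        // MoE models visit the same layer n_expert times, so the counter has
        // to be scaled down; skipping this division is the kind of mistake
        // the PR description refers to.
        const int i_layer = qs.n_expert > 1 ? qs.i_ffn_down / qs.n_expert
                                            : qs.i_ffn_down;
        if (i_layer < qs.n_layer / 8) {
            type = QUANT_Q3_K;
        }
        ++qs.i_ffn_down;
    }
    return type;
}
```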

ikawrakow mentioned this pull request on Jan 24, 2024
ikawrakow merged commit faa3526 into master on Jan 25, 2024
48 checks passed
ikawrakow deleted the ik/fix_q3k_xs branch on January 25, 2024 at 15:58
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Feb 3, 2024
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
3 participants