Remove Q4_3 which is no better than Q5 #1218
Merged
I hope this isn't too controversial...
Q4_3 turns out to be equal to or worse than the Q5 types on every criterion we have: perplexity, file size, and token generation speed.
In the interest of reducing code base complexity, remove the Q4_3 type.
It was only introduced last week, I think, so I doubt many people use it. Of course, I'm ready to be proven wrong on this...
Notes:
`GGML_TYPE_COUNT` is now somewhat incorrect. I didn't want to change the enum values that are used in model files, but we might move `GGML_TYPE_I8` to the now unused value 5.
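To illustrate why a trailing count enumerator becomes "somewhat incorrect" once a value is removed from the middle, here is a minimal sketch. It uses a hypothetical `demo_type` enum, not the actual ggml definitions; all names and values are assumptions chosen to mirror the situation described above. With a hole at value 5, the count is still a safe upper bound for array sizes and range checks, but it no longer equals the number of valid types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical enum for illustration only; the real ggml enum may differ.
// Values are pinned explicitly because they are stored in model files.
enum demo_type {
    DEMO_TYPE_F32  = 0,
    DEMO_TYPE_F16  = 1,
    DEMO_TYPE_Q4_0 = 2,
    DEMO_TYPE_Q4_1 = 3,
    DEMO_TYPE_Q4_2 = 4,
    // 5 belonged to the removed type and is intentionally left unused
    DEMO_TYPE_Q5_0 = 6,
    DEMO_TYPE_Q5_1 = 7,
    DEMO_TYPE_COUNT, // evaluates to 8, although only 7 types exist
};

// The count still works as an array bound; the hole simply stays NULL.
static const char *const demo_type_name[DEMO_TYPE_COUNT] = {
    [DEMO_TYPE_F32]  = "f32",
    [DEMO_TYPE_F16]  = "f16",
    [DEMO_TYPE_Q4_0] = "q4_0",
    [DEMO_TYPE_Q4_1] = "q4_1",
    [DEMO_TYPE_Q4_2] = "q4_2",
    [DEMO_TYPE_Q5_0] = "q5_0",
    [DEMO_TYPE_Q5_1] = "q5_1",
};

static_assert(DEMO_TYPE_COUNT == 8, "count is one past the highest value, not the number of types");

// A type read from a file is valid only if it is in range and not the hole.
static bool demo_type_is_valid(int t) {
    return t >= 0 && t < DEMO_TYPE_COUNT && demo_type_name[t] != NULL;
}

// Counts the types that actually exist by skipping unused slots.
static int demo_type_count_valid(void) {
    int n = 0;
    for (int t = 0; t < DEMO_TYPE_COUNT; t++) {
        if (demo_type_is_valid(t)) {
            n++;
        }
    }
    return n;
}
```

Any code that loops up to the count has to tolerate the unused slot, as `demo_type_is_valid` does here. Moving another enumerator into the free value would close the gap, at the cost of changing that enumerator's numeric value.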