question about bits_pattern feature #1839
Comments
Could you give further references for this feature? I haven't seen it described anywhere. It sounds like you are talking about weight quantization. Note that, in general, only the weights of the base model are quantized; the adapter weights used by PEFT are not quantized, since they are intended to be trained. Therefore, the method you describe would need to be implemented at the level of the base model, i.e. in libraries such as transformers.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.
Feature request
A bits_pattern function would make it possible to assign a different quantization level to each layer of a neural network. This flexibility is important for optimizing the performance and efficiency of models, especially in resource-constrained environments. I would like to know how you plan to achieve such a feature.
Motivation
I propose implementing the bits_pattern function to allow users to specify different quantization bits for each layer in a neural network.
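To make the proposal concrete, here is a minimal sketch of what such a per-layer bit assignment could look like. Everything here is hypothetical: `bits_pattern` and `resolve_bits` are names invented for this illustration and are not part of PEFT or transformers; the sketch only shows how glob-style layer-name patterns could map to bit widths.

```python
from fnmatch import fnmatch

# Hypothetical spec: map glob-style layer-name patterns to quantization bits.
# First matching pattern wins; unmatched layers fall back to a default.
bits_pattern = {
    "*.self_attn.*": 8,  # keep attention projections at 8-bit
    "*.mlp.*": 4,        # quantize MLP layers more aggressively
}

def resolve_bits(layer_name: str, pattern_map: dict, default: int = 16) -> int:
    """Return the bit width for a layer; first matching pattern wins."""
    for pattern, bits in pattern_map.items():
        if fnmatch(layer_name, pattern):
            return bits
    return default

print(resolve_bits("model.layers.0.self_attn.q_proj", bits_pattern))  # 8
print(resolve_bits("model.layers.0.mlp.up_proj", bits_pattern))       # 4
print(resolve_bits("model.embed_tokens", bits_pattern))               # 16
```

A resolver like this could be consulted when each layer is quantized; actually applying the per-layer bit widths would have to happen wherever the base model's weights are quantized, as noted in the comment above.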
Your contribution
I could help with implementing the function.