Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization

Nrusimha, Aniruddha; Mishra, Mayank; Wang, Naigang; Alistarh, Dan; Panda, Rameswar; Kim, Yoon

arXiv:2404.03605 (cs)

[Submitted on 4 Apr 2024 (v1), last revised 26 Aug 2024 (this version, v2)]

Abstract: We consider the problem of accurate quantization for language models, where both the weights and activations are uniformly quantized to 4 bits per parameter, the lowest bitwidth format natively supported by GPU hardware. In this context, the key challenge is activation quantization: language models are known to contain outlier channels whose values are, on average, orders of magnitude higher than those of other channels, which prevents accurate low-bitwidth quantization with known techniques. We systematically study this phenomenon and find that these outlier channels emerge early in training and that they occur more frequently in layers with residual streams. We then propose a simple strategy that regularizes a layer's inputs via quantization-aware training (QAT) and its outputs via activation kurtosis regularization. We show that regularizing both the inputs and outputs is crucial for preventing a model from "migrating" the difficulty of input quantization to the weights, which would make post-training quantization (PTQ) of weights more difficult. When combined with weight PTQ, our approach can obtain a W4A4 model that performs competitively with the standard-precision W16A16 baseline.
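The effect of a single outlier channel on uniform 4-bit activation quantization, and the way kurtosis picks up on it, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the symmetric per-tensor round-to-nearest quantizer, the synthetic Gaussian activations, the choice of channel 3 as the outlier, and the helper names (`quantize_uniform`, `kurtosis`, `normal_channel_error`) are all assumptions made for illustration. The kurtosis shown is the standard fourth standardized moment; the abstract does not specify the exact form of the regularizer.

```python
import numpy as np

def quantize_uniform(x, bits=4):
    """Symmetric per-tensor uniform quantization (assumed scheme), returned dequantized."""
    qmax = 2 ** (bits - 1) - 1                  # 7 for signed 4-bit integers
    scale = np.max(np.abs(x)) / qmax            # one scale shared by the whole tensor
    if scale == 0:
        return x.copy()
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

def kurtosis(x, eps=1e-8):
    """Fourth standardized moment over all entries of x (about 3 for Gaussian data)."""
    mu = x.mean()
    var = x.var()
    return np.mean((x - mu) ** 4) / (var ** 2 + eps)

def normal_channel_error(x, outlier_idx=3):
    """Relative quantization error measured only on the non-outlier channels."""
    xq = quantize_uniform(x)
    keep = np.arange(x.shape[1]) != outlier_idx
    return np.linalg.norm(x[:, keep] - xq[:, keep]) / np.linalg.norm(x[:, keep])

# Synthetic activations: 256 tokens x 64 channels (hypothetical shapes).
rng = np.random.default_rng(0)
acts = rng.standard_normal((256, 64))

# Same activations with one channel scaled up to mimic an outlier channel.
acts_outlier = acts.copy()
acts_outlier[:, 3] *= 100.0

err_clean = normal_channel_error(acts)
err_outlier = normal_channel_error(acts_outlier)
kurt_clean = kurtosis(acts)
kurt_outlier = kurtosis(acts_outlier)

print(f"relative error on normal channels: clean={err_clean:.3f}, outlier={err_outlier:.3f}")
print(f"kurtosis: clean={kurt_clean:.2f}, outlier={kurt_outlier:.2f}")
```

In this toy setting the outlier channel stretches the shared quantization scale, so the remaining channels are represented much more coarsely, and the heavier tail shows up as a larger kurtosis. A penalty on that quantity during training is one plausible reading of "activation kurtosis regularization", under the assumptions stated above.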

Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL)
Cite as:	arXiv:2404.03605 [cs.LG]
	(or arXiv:2404.03605v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2404.03605

Submission history

From: Aniruddha Nrusimha
[v1] Thu, 4 Apr 2024 17:25:30 UTC (7,569 KB)
[v2] Mon, 26 Aug 2024 20:48:19 UTC (7,575 KB)
