
Popular repositories

  1. llm-awq

    Forked from mit-han-lab/llm-awq

    AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

    Python 1
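
A rough, generic illustration of the activation-aware idea behind this fork: scale up the weight channels that see large activations, then fake-quantize group-wise to 4 bits. This is a plain PyTorch sketch, not the llm-awq API; the function name, the 0.5 exponent, and the group size are illustrative assumptions.

```python
import torch

def awq_style_quantize(weight, act_absmax, alpha=0.5, n_bits=4, group_size=128):
    # weight: [out, in]; act_absmax: per-input-channel activation magnitudes, [in].
    # Amplify salient input channels so they lose less precision when quantized.
    s = act_absmax.clamp(min=1e-5).pow(alpha)
    s = s / (s.max() * s.min()).sqrt()            # keep the scales in a moderate range
    w = weight * s                                # per-input-channel scaling
    out_f, in_f = w.shape                         # assumes in_f % group_size == 0
    wg = w.reshape(out_f, in_f // group_size, group_size)
    qmax = 2 ** (n_bits - 1) - 1
    step = wg.abs().amax(dim=-1, keepdim=True).clamp(min=1e-5) / qmax
    wq = (wg / step).round().clamp(-qmax - 1, qmax) * step   # group-wise fake quant
    return wq.reshape(out_f, in_f) / s            # fold scales back: drop-in weight
```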

  2. neural-compressor

    Forked from intel/neural-compressor

    Intel® Neural Compressor (formerly known as Intel® Low Precision Optimization Tool), which aims to provide unified APIs for network compression technologies such as low-precision quantization, spar…

    Python
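
As a hedged illustration of the kind of low-precision post-training quantization this tool targets, the snippet below deliberately uses PyTorch's built-in dynamic INT8 quantization rather than the Neural Compressor API itself (whose entry points vary across versions); the toy model is an assumption.

```python
import torch
import torch.nn as nn

# Toy FP32 model standing in for a real network.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Dynamic INT8 post-training quantization of the Linear layers: weights are
# stored in int8 and activations are quantized on the fly at inference time.
qmodel = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(qmodel)
```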

  3. qlora

    Forked from artidoro/qlora

    QLoRA: Efficient Finetuning of Quantized LLMs

    Jupyter Notebook
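
QLoRA freezes the base model in 4-bit NF4 and trains low-rank adapters on top. A minimal sketch of the 4-bit loading step using the Hugging Face transformers + bitsandbytes integration; the checkpoint name is a placeholder, and attaching the adapters is shown under the peft entry below.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit base weights with double quantization and bf16 compute,
# mirroring the QLoRA recipe.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",              # placeholder checkpoint
    quantization_config=bnb_config,
)
```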

  4. peft

    Forked from huggingface/peft

    🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

    Python
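
A minimal LoRA sketch with the peft API; the checkpoint, rank, alpha, and dropout are illustrative defaults, not values taken from this fork.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # placeholder

# Attach trainable low-rank adapters; the base weights stay frozen.
lora = LoraConfig(task_type=TaskType.CAUSAL_LM, r=16, lora_alpha=32, lora_dropout=0.05)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
```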

  5. OmniQuant

    Forked from OpenGVLab/OmniQuant

    OmniQuant is a simple and powerful quantization technique for LLMs.

    Python
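
A central ingredient of OmniQuant is making the clipping range of weight quantization learnable. Below is a rough plain-PyTorch sketch of learnable weight clipping; the sigmoid parameterization follows the paper's formulation, but the function name, parameter shapes, and bit width are assumptions.

```python
import torch

def lwc_fake_quant(weight, gamma, beta, n_bits=4):
    # Learnable weight clipping: the per-row max/min of the weights are shrunk
    # by sigmoid(gamma) and sigmoid(beta), which are optimized on calibration
    # data instead of being fixed heuristics. gamma, beta: [out, 1] (assumed).
    wmax = torch.sigmoid(gamma) * weight.amax(dim=1, keepdim=True)
    wmin = torch.sigmoid(beta) * weight.amin(dim=1, keepdim=True)
    scale = (wmax - wmin).clamp(min=1e-5) / (2 ** n_bits - 1)
    zero = (-wmin / scale).round()
    q = (weight / scale + zero).round().clamp(0, 2 ** n_bits - 1)
    return (q - zero) * scale  # fake-quantized weights for calibration
```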

  6. smoothquant

    Forked from mit-han-lab/smoothquant

    [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

    Python
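
SmoothQuant migrates quantization difficulty from activations to weights via per-input-channel smoothing factors s_j = max|X_j|^α / max|W_j|^(1−α). A short plain-PyTorch sketch of that factor, not the repo's API; the calibration statistics are assumed to be collected beforehand.

```python
import torch

def smoothquant_scales(act_absmax, weight, alpha=0.5):
    # s_j = max|X_j|**alpha / max|W_j|**(1 - alpha). At inference, x is divided
    # by s and the matching weight columns are multiplied by s, so x @ W.T is
    # unchanged but the activation distribution is flatter and easier to quantize.
    w_absmax = weight.abs().amax(dim=0).clamp(min=1e-5)   # per input channel, [in]
    a_absmax = act_absmax.clamp(min=1e-5)                 # calibration stats, [in]
    return (a_absmax.pow(alpha) / w_absmax.pow(1 - alpha)).clamp(min=1e-5)
```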
