# post-training-quantization

33 public repositories match this topic.
- micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT) with high-bit (>2b) methods (DoReFa; "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b)/ternary and binary methods (TWN/BNN/XNOR-Net), plus post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, reg… *(Python · updated Oct 6, 2021)*
- SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) and sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime. *(Python · updated Aug 16, 2024)*
- TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework. *(Python · updated Aug 12, 2024)*
- [IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer. *(Python · updated Apr 11, 2023)*
- [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization. *(Python · updated Aug 13, 2024)*
- A model compression and acceleration toolbox based on PyTorch. *(Python · updated Jan 12, 2024)*
- Notebooks demonstrating how to quantize deep neural networks with TensorFlow Lite. *(Jupyter Notebook · updated Jan 23, 2023)*
- The official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit". *(Python · updated Aug 16, 2024)*
- [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models. *(Python · updated Mar 21, 2024)*
- Notes on quantization in neural networks. *(Jupyter Notebook · updated Dec 14, 2023)*
- Post-training static quantization using the ResNet18 architecture. *(Jupyter Notebook · updated Aug 1, 2020)*
- [CVPR 2024 Highlight] The official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models". *(Jupyter Notebook · updated Aug 1, 2024)*
- Quantization examples for PTQ and QAT. *(Python · updated Jul 24, 2023)*
- Model quantization with PyTorch, TensorFlow, and Larq.
- Generating a TensorRT model from ONNX.
- PyTorch implementation of the ECCV 2022 paper "Fine-grained Data Distribution Alignment for Post-Training Quantization". *(Python · updated Sep 13, 2022)*
- [ICLR 2024] The official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models". *(Python · updated Mar 11, 2024)*
- A framework to train a ResUNet architecture, then quantize, compile, and execute it on an FPGA. *(Jupyter Notebook · updated Jun 23, 2023)*
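Most of the PTQ toolkits listed above share the same core arithmetic: map an observed float range onto an integer grid via a scale and zero-point, chosen during a calibration pass. A minimal, dependency-free sketch of int8 affine (asymmetric) quantization is below; the function names and the calibration range are illustrative, not taken from any listed repository, and real toolkits add per-channel scales and careful rounding modes.

```python
def quantize_params(xmin, xmax, qmin=-128, qmax=127):
    """Compute scale and zero-point mapping the observed float range
    [xmin, xmax] onto the integer range [qmin, qmax]."""
    xmin, xmax = min(xmin, 0.0), max(xmax, 0.0)  # range must include 0.0
    scale = (xmax - xmin) / (qmax - qmin)
    zero_point = round(qmin - xmin / scale)
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Quantize one float to a clamped integer."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """Map an integer back to its approximate float value."""
    return (q - zero_point) * scale

# "Calibrate" on an observed activation range, then round-trip a value.
scale, zp = quantize_params(-1.0, 3.0)
q = quantize(0.5, scale, zp)
x_hat = dequantize(q, scale, zp)  # reconstructs 0.5 within one quantization step
```

The round-trip error is bounded by half the scale, which is why tight calibration ranges (the focus of many PTQ papers above) matter: a wider range means a coarser grid.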