This is the official PyTorch implementation of "LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models"
Brevitas: neural network quantization in PyTorch
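A minimal sketch of defining a low-bit layer with Brevitas's quantized modules; the bit widths and layer sizes here are illustrative assumptions, not taken from the library's examples:

```python
import torch
import torch.nn as nn
from brevitas.nn import QuantConv2d, QuantReLU

# Toy block with 4-bit weights and activations (illustrative bit widths).
class QuantBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = QuantConv2d(3, 16, kernel_size=3, padding=1, weight_bit_width=4)
        self.act = QuantReLU(bit_width=4)

    def forward(self, x):
        return self.act(self.conv(x))

block = QuantBlock()
out = block(torch.randn(1, 3, 32, 32))  # trains like any nn.Module (QAT-style fake quantization)
print(out.shape)
```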
Quantization of models: Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT)
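For context, a minimal post-training static quantization sketch using PyTorch's eager-mode API; the toy model and random calibration data are assumptions for illustration only:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig, prepare, convert, QuantStub, DeQuantStub

# Toy float model wrapped with quant/dequant stubs for eager-mode PTQ.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()
        self.fc = nn.Linear(16, 4)
        self.dequant = DeQuantStub()

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyNet().eval()
model.qconfig = get_default_qconfig("fbgemm")   # x86 backend
prepared = prepare(model)                        # insert observers
for _ in range(8):                               # calibrate on representative data
    prepared(torch.randn(1, 16))
quantized = convert(prepared)                    # swap to int8 modules
print(quantized)
```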
Model Compression Toolkit (MCT) is an open-source project for optimizing neural network models for efficient deployment on constrained hardware. It provides researchers, developers, and engineers with advanced quantization and compression tools for deploying state-of-the-art neural networks.
[ICML 2024] Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
A more readable and flexible YOLOv5 with additional backbones (GCN, ResNet, ShuffleNet, MobileNet, EfficientNet, HRNet, Swin Transformer, etc.), extra modules (CBAM, DCN, and so on), and TensorRT support
EfficientNetV2 (EfficientNetV2-B2) with INT8 and FP32 quantization (QAT and PTQ) on the CK+ dataset: fine-tuning, augmentation, handling an imbalanced dataset, etc.
Post-training quantization (PTQ) method for improving LLMs. Unofficial implementation of https://arxiv.org/abs/2309.02784
Inference with structured sparsity and quantization
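A rough sketch of combining structured pruning with dynamic INT8 quantization using standard PyTorch utilities; the layer sizes and pruning amount are illustrative assumptions, not the repository's actual pipeline:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torch.ao.quantization import quantize_dynamic

model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

# Structured pruning: zero out 50% of the output channels (rows) of the first Linear by L2 norm.
prune.ln_structured(model[0], name="weight", amount=0.5, n=2, dim=0)
prune.remove(model[0], "weight")  # make the pruning permanent

# Dynamic quantization: weights stored as int8, activations quantized at runtime.
qmodel = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(qmodel(torch.randn(1, 64)).shape)
```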
Build an AI model to classify beverages for blind individuals
Quantization example for PTQ & QAT
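As a QAT counterpart to the PTQ sketch above, a minimal eager-mode quantization-aware training loop; the model, optimizer, and random training data are assumptions for illustration:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qat_qconfig, prepare_qat, convert, QuantStub, DeQuantStub

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant, self.dequant = QuantStub(), DeQuantStub()
        self.fc = nn.Linear(16, 4)

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyNet()
model.qconfig = get_default_qat_qconfig("fbgemm")
model.train()
qat_model = prepare_qat(model)                   # insert fake-quant modules
opt = torch.optim.SGD(qat_model.parameters(), lr=1e-2)
for _ in range(10):                              # short fine-tuning loop on random data
    loss = qat_model(torch.randn(8, 16)).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
int8_model = convert(qat_model.eval())           # final int8 model
```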
Generating a TensorRT model using ONNX
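A minimal sketch of the ONNX-export step (the model choice and input shape are assumptions); the exported file can then be handed to TensorRT's trtexec tool to build an engine:

```python
import torch
import torchvision

# Export a model to ONNX; TensorRT can then build an engine from it, e.g.:
#   trtexec --onnx=resnet18.onnx --saveEngine=resnet18.engine --fp16
model = torchvision.models.resnet18(weights=None).eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model, dummy, "resnet18.onnx",
    input_names=["input"], output_names=["output"],
    opset_version=13,
)
```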
Add a description, image, and links to the ptq topic page so that developers can more easily learn about it.
To associate your repository with the ptq topic, visit your repo's landing page and select "manage topics."