Stars
Code for studying the super weight in LLMs
For releasing code related to compression methods for transformers, accompanying our publications
⏰ Collaboratively track deadlines of conferences recommended by CCF (website, Python CLI, WeChat applet)
Applied AI experiments and examples for PyTorch
A repository dedicated to evaluating the performance of quantized LLaMA3 using various quantization methods.
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously improved. PRs adding relevant work are welcome (p…
[ICLR 2024 Spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
Code for the NeurIPS 2024 paper QuaRot: end-to-end 4-bit inference for large language models.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs (a minimal pipeline sketch follows this list).
This includes the original implementation of Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
An Open-source Toolkit for LLM Development
Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.
A framework for few-shot evaluation of language models (see the evaluation sketch after this list).
Awesome LLM compression research papers and tools.
PyHessian is a PyTorch library for second-order-based analysis and training of neural networks (see the Hessian sketch after this list).
Reorder-based post-training quantization for large language models
Torchreid: Deep learning person re-identification in PyTorch.
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models (its per-channel smoothing is sketched after this list)
BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).
OpenMMLab Pre-training Toolbox and Benchmark
A collection of premium software across various categories; the project has grown far beyond its original idea.
[ECCV 2022] R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis
ALSO: Automotive Lidar Self-supervision by Occupancy estimation
Plenoxels: Radiance Fields without Neural Networks
Code for our paper at ECCV 2020: Post-Training Piecewise Linear Quantization for Deep Neural Networks
FLOPs counter for convolutional networks in the PyTorch framework (see the ptflops sketch after this list)
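For the LMDeploy entry above, a minimal sketch of its offline pipeline API; the model name is a placeholder.

```python
# Minimal sketch of LMDeploy's offline pipeline API; the model name is a
# placeholder and assumes `lmdeploy` is installed.
from lmdeploy import pipeline

pipe = pipeline("internlm/internlm2-chat-7b")
responses = pipe(["Hi, please introduce yourself."])
print(responses)
```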
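For the few-shot evaluation framework (lm-evaluation-harness), a minimal sketch assuming a recent release that exposes `lm_eval.simple_evaluate`; the checkpoint and task names are placeholders.

```python
# Minimal sketch of lm-evaluation-harness's Python entry point; the
# checkpoint and task names below are placeholders.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                        # HuggingFace backend
    model_args="pretrained=meta-llama/Llama-2-7b-hf",  # placeholder checkpoint
    tasks=["hellaswag"],
    batch_size=8,
)
print(results["results"])
```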
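For PyHessian, a minimal sketch of its top-eigenvalue and trace estimation; the toy two-layer network and random batch are placeholders.

```python
# Minimal sketch of PyHessian on a toy model: power iteration for the top
# Hessian eigenvalue and a Hutchinson estimate of the trace.
import torch
from pyhessian import hessian

model = torch.nn.Sequential(
    torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2)
)
criterion = torch.nn.CrossEntropyLoss()
inputs, targets = torch.randn(64, 10), torch.randint(0, 2, (64,))

hessian_comp = hessian(model, criterion, data=(inputs, targets), cuda=False)
top_eigenvalues, _ = hessian_comp.eigenvalues(top_n=1)  # power iteration
trace_estimates = hessian_comp.trace()                  # Hutchinson estimator
print(top_eigenvalues, sum(trace_estimates) / len(trace_estimates))
```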
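For SmoothQuant, a minimal sketch of the paper's per-channel smoothing factor s_j = max|X_j|^α / max|W_j|^(1−α), which migrates quantization difficulty from activations to weights; the tensors are random placeholders, not the released implementation.

```python
# Minimal sketch of SmoothQuant's per-channel smoothing; calibration
# activations and the linear weight are random placeholders.
import torch

def smooth_scales(acts, weight, alpha=0.5):
    act_max = acts.abs().amax(dim=0)   # per-input-channel activation max
    w_max = weight.abs().amax(dim=0)   # per-input-channel weight max
    return (act_max.pow(alpha) / w_max.pow(1 - alpha)).clamp(min=1e-5)

acts = torch.randn(128, 512)           # [tokens, in_features] calibration batch
weight = torch.randn(1024, 512)        # [out_features, in_features]
s = smooth_scales(acts, weight)
# Difficulty migration: X' = X / s, W' = W * s, so (X / s) @ (W * s).T == X @ W.T.
assert torch.allclose((acts / s) @ (weight * s).t(), acts @ weight.t(), atol=1e-2)
```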
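For the FLOPs counter (ptflops), a minimal sketch; ResNet-18 and the 224×224 input shape are placeholders.

```python
# Minimal sketch of ptflops' complexity report for a torchvision model.
import torchvision.models as models
from ptflops import get_model_complexity_info

model = models.resnet18()
macs, params = get_model_complexity_info(
    model, (3, 224, 224), as_strings=True, print_per_layer_stat=False
)
print(f"MACs: {macs}, params: {params}")
```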