XJTU
- Xi'an, China
Starred repositories
This is the official PyTorch implementation of "LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models", and also an efficient LLM compression too…
FinQwen: a project dedicated to building an open, stable, high-quality financial large language model. It uses large models to build an intelligent question-answering system for financial scenarios, leveraging open source to advance "AI + Finance".
Fine-tuning ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B on specific downstream tasks, covering Freeze, LoRA, P-tuning, full-parameter fine-tuning, and more.
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, continuous batching, FlashAttention, PagedAttention, etc.
Awesome LLM compression research papers and tools.
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
mbehm / transformers
Forked from huggingface/transformers. 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
ChatGLM2-6B: An Open Bilingual Chat LLM | Open-source bilingual dialogue language model
PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity
A latent text-to-image diffusion model
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
Pretrained models on CIFAR10/100 in PyTorch
MobileNetV1, MobileNetV2, and VGG-based SSD/SSD-Lite implementation in PyTorch 1.0 / PyTorch 0.4. Out-of-the-box support for retraining on the Open Images dataset. ONNX and Caffe2 support. Experiment ideas lik…
A PyTorch implementation of SSDLite on COCO
PyTorch implementation of RetinaNet object detection.
An Efficient Framework for Fast UAV Exploration
Prune DNN using Alternating Direction Method of Multipliers (ADMM)
[ICLR 2020] Contrastive Representation Distillation (CRD), and benchmark of recent knowledge distillation methods
PyTorch implementation of various Knowledge Distillation (KD) methods.