Starred repositories

A tool for model sparsification based on torch.fx

Python 5 1 Updated Jun 3, 2024

CUDA Library Samples

Cuda 1,397 291 Updated Jun 26, 2024

This is the official PyTorch implementation of "LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models", and also an efficient LLM compression too…

Python 117 9 Updated Jul 1, 2024

This project aims to share the technical principles and hands-on experience related to large language models.

HTML 7,518 733 Updated Jun 30, 2024

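FinQwen: a project dedicated to building an open, stable, high-quality large model for finance. It builds an intelligent question-answering system for financial scenarios on top of large models and uses open source to advance "AI + finance".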

Jupyter Notebook 184 20 Updated Jun 11, 2024

Must-read Papers on LLM Agents.

1,406 70 Updated Jun 30, 2024

Fine-tuning of the ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models for specific downstream tasks, covering Freeze, LoRA, P-tuning, full-parameter fine-tuning, and more.

Python 2,559 284 Updated Dec 12, 2023

Solve puzzles. Learn CUDA.

Jupyter Notebook 5,310 303 Updated Jun 26, 2024

📖 A curated list of Awesome LLM Inference Papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

1,854 130 Updated Jun 30, 2024

Awesome LLM compression research papers and tools.

848 48 Updated Jun 25, 2024

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Python 614 48 Updated Mar 19, 2024

Offline Quantization Tools for Deploy.

Python 108 14 Updated Dec 28, 2023

GPTQ inference Triton kernel

Jupyter Notebook 267 20 Updated May 18, 2023

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python 13 14 Updated Mar 16, 2023

ChatGLM2-6B: An Open Bilingual Chat LLM (open-source bilingual conversational language model)

Python 15,602 1,852 Updated Jun 27, 2024

TensorRT 7 C++ (almost) minimal examples

C++ 71 6 Updated Nov 11, 2023

PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity

Cuda 94 24 Updated Jun 24, 2024

A latent text-to-image diffusion model

Jupyter Notebook 66,482 9,964 Updated Jun 18, 2024

PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.

Python 1,438 222 Updated Mar 28, 2024

Pretrained models on CIFAR10/100 in PyTorch

Python 282 50 Updated Mar 3, 2023

Classification with PyTorch.

Python 1,666 563 Updated Jun 18, 2024

MobileNetV1, MobileNetV2, VGG based SSD/SSD-lite implementation in Pytorch 1.0 / Pytorch 0.4. Out-of-box support for retraining on Open Images dataset. ONNX and Caffe2 support. Experiment Ideas lik…

Python 1,380 524 Updated Mar 11, 2023

A PyTorch implementation of SSDLite on COCO

Python 83 21 Updated Nov 18, 2020

Pytorch implementation of RetinaNet object detection.

Python 2,123 667 Updated Apr 29, 2023

An Efficient Framework for Fast UAV Exploration

C++ 861 186 Updated Aug 2, 2023

Prune DNN using Alternating Direction Method of Multipliers (ADMM)

Python 96 18 Updated Oct 15, 2019

[ICLR 2020] Contrastive Representation Distillation (CRD), and benchmark of recent knowledge distillation methods

Python 2,109 388 Updated Oct 16, 2023

Pytorch implementation of various Knowledge Distillation (KD) methods.

Python 1,546 262 Updated Nov 25, 2021