- Beijing
- https://ryantd.github.io/
Starred repositories
Vector Search Engine based on BRPC + FAISS
The official PyTorch implementation of Google's Gemma models
A simple program to calculate and visualize the FLOPs and parameters of PyTorch models, with a handy CLI and an easy-to-use Python API.
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Make huge neural nets fit in memory
Provide Python access to the NVML library for GPU diagnostics
Building blocks for foundation models.
Code repository for the paper - "Matryoshka Representation Learning"
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
A curated list of reinforcement learning with human feedback resources (continually updated)
depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
A collection of memory efficient attention operators implemented in the Triton language.
[TMLR 2024] Efficient Large Language Models: A Survey
Awesome machine learning model compression research papers, tools, and learning material.
[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
FinOps and cloud cost optimization tool. Supports AWS, Azure, GCP, Alibaba Cloud and Kubernetes.