ryantd 🏎️

Organizations

@kubeflow

Starred repositories

Vector search engine based on BRPC + FAISS

C++ 143 50 Updated Oct 21, 2019

LLM training in simple, raw C/CUDA

Cuda 20,976 2,265 Updated Jun 21, 2024

Grok open release

Python 49,106 8,320 Updated May 29, 2024

The official PyTorch implementation of Google's Gemma models

Python 5,093 481 Updated Jun 3, 2024

A simple program to calculate and visualize the FLOPs and parameters of PyTorch models, with a handy CLI and an easy-to-use Python API.

Python 115 9 Updated Dec 8, 2023

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 8,596 777 Updated Jun 17, 2024

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 5,526 490 Updated May 31, 2024

Make huge neural nets fit in memory

Python 2,648 271 Updated Apr 26, 2020

Provide Python access to the NVML library for GPU diagnostics

Python 202 31 Updated Jul 27, 2023

Building blocks for foundation models.

262 10 Updated Jan 3, 2024

Code repository for the paper - "Matryoshka Representation Learning"

Jupyter Notebook 349 16 Updated Feb 19, 2024

pytorch-profiler

Python 45 8 Updated Jun 1, 2023

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 416 31 Updated Apr 22, 2024

Efficient AI Inference & Serving

Python 448 25 Updated Jan 8, 2024

Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Python 174 15 Updated Apr 24, 2024

Python 830 80 Updated Jun 14, 2024

A curated list of reinforcement learning with human feedback resources (continually updated)

2,956 189 Updated May 26, 2024

depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.

Python 372 8 Updated Jun 18, 2024

Python 8,164 475 Updated Jan 27, 2024

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

C++ 7,532 395 Updated Jun 11, 2024

A collection of memory efficient attention operators implemented in the Triton language.

Python 171 13 Updated Jun 5, 2024

[TMLR 2024] Efficient Large Language Models: A Survey

797 66 Updated Jun 18, 2024

Python 1,093 154 Updated May 28, 2024

Awesome machine learning model compression research papers, tools, and learning material.

450 59 Updated May 8, 2024

[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

Python 10,063 1,026 Updated Jun 21, 2024

MLX: An array framework for Apple silicon

C++ 15,514 880 Updated Jun 21, 2024

Mamba SSM architecture

Python 11,254 904 Updated Jun 19, 2024

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python 2,637 193 Updated May 21, 2024

FinOps and cloud cost optimization tool. Supports AWS, Azure, GCP, Alibaba Cloud and Kubernetes.

Python 1,065 151 Updated Jun 21, 2024