Wei Fu garrett4wade

Ph.D. student at Tsinghua

20 followers · 5 following

Tsinghua University
Beijing, China

Stars

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda 764 64 Updated Jul 6, 2024

mistralai / mistral-inference

Official inference library for Mistral models

Jupyter Notebook 9,146 801 Updated Jun 22, 2024

databricks / megablocks

Python 1,116 154 Updated May 28, 2024

FlagOpen / TACO

Python 127 5 Updated Jun 25, 2024

astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.

Rust 28,766 933 Updated Jul 6, 2024

facebookresearch / chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,430 88 Updated Jun 21, 2024

facebookresearch / HolisticTraceAnalysis

A library to analyze PyTorch traces.

Python 249 33 Updated Jul 5, 2024

openpsi-project / ReaLHF

Super-Efficient RLHF Training of LLMs with Parameter Reallocation

Python 24 1 Updated Jul 4, 2024

huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Python 23,951 4,929 Updated Jul 7, 2024

pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 5,352 484 Updated Jul 3, 2024

CompVis / taming-transformers

Taming Transformers for High-Resolution Image Synthesis

Jupyter Notebook 5,547 1,107 Updated Apr 25, 2024

valeoai / Maskgit-pytorch

Jupyter Notebook 121 13 Updated Feb 2, 2024

google-research / maskgit

Official Jax Implementation of MaskGIT

Jupyter Notebook 402 47 Updated Nov 18, 2022

maitrix-org / Pandora

Pandora: Towards General World Model with Natural Language Actions and Video States

Python 428 26 Updated May 27, 2024

Stonesjtu / pytorch_memlab

Profiling and inspecting memory in pytorch

Python 995 35 Updated Sep 12, 2023

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda 21,431 2,327 Updated Jul 4, 2024

HPDL-Group / Merak

Python 68 9 Updated Dec 18, 2023

alibaba / EasyParallelLibrary

Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.

Python 253 49 Updated Mar 31, 2023

LLNL / Umpire

An application-focused API for memory management on NUMA & GPU architectures

C++ 306 52 Updated Jul 1, 2024

xai-org / grok-1

Grok open release

Python 49,149 8,310 Updated May 29, 2024

google / seqio

Task-based datasets, preprocessing, and evaluation for sequence models.

Python 544 58 Updated Jul 6, 2024

usyd-fsalab / fp6_llm

An efficient GPU support for LLM inference with x-bit quantization (e.g. FP6,FP5).

Cuda 154 12 Updated May 28, 2024

iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

C++ 2,481 553 Updated Jul 6, 2024

catie-aq / flashT5

A fast implementation of T5/UL2 in PyTorch using Flash Attention

Python 53 6 Updated Jun 26, 2024

google-deepmind / alphageometry

Python 3,809 419 Updated Jul 6, 2024

leptonai / search_with_lepton

Building a quick conversation-based search demo with Lepton AI.

TypeScript 7,513 964 Updated Jun 22, 2024

onnx / onnx

Open standard for machine learning interoperability

Python 17,228 3,624 Updated Jul 5, 2024

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 7,402 800 Updated Jul 5, 2024

cumulo-autumn / StreamDiffusion

StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

Python 9,251 657 Updated Jun 16, 2024

state-spaces / mamba

Mamba SSM architecture

Python 11,528 943 Updated Jul 3, 2024