Repositories starred by leeeizhang

A large-scale simulation framework for LLM inference

Python 243 28 Updated Oct 1, 2024

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

TypeScript 5,809 544 Updated Oct 3, 2024

Open-source observability for your LLM application, based on OpenTelemetry

Python 1,895 176 Updated Oct 3, 2024

Efficient Triton Kernels for LLM Training

Python 3,114 159 Updated Oct 3, 2024

BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.

C++ 802 160 Updated Aug 28, 2024

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 1,848 149 Updated Sep 25, 2024

Pipeline Parallelism for PyTorch

Python 715 86 Updated Aug 21, 2024

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

Python 9,344 493 Updated Oct 2, 2024

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

3,470 143 Updated Sep 25, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 5,390 393 Updated Oct 3, 2024

PyTorch native quantization and sparsity for training and inference

Python 1,117 113 Updated Oct 3, 2024

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

1,067 22 Updated Jul 31, 2024

A model compilation solution for various hardware

MLIR 362 38 Updated Sep 30, 2024

A survey of Code Agents / Foundation Models for improving development productivity. Become 10x SWE, MLE, etc.

10 Updated Aug 20, 2024

Run PyTorch LLMs locally on servers, desktop and mobile

Python 3,238 205 Updated Oct 3, 2024

TensorDict is a PyTorch-dedicated tensor container.

Python 816 66 Updated Oct 2, 2024

Material for gpu-mode lectures

Jupyter Notebook 2,609 261 Updated Oct 1, 2024

Development repository for the Triton language and compiler

C++ 1 1 Updated Apr 4, 2024

Shared Middle-Layer for Triton Compilation

MLIR 163 34 Updated Oct 2, 2024

Agentic components of the Llama Stack APIs

Python 3,625 468 Updated Oct 3, 2024

CUDA Templates for Linear Algebra Subroutines

C++ 5,448 923 Updated Sep 25, 2024

Applied AI experiments and examples for PyTorch

Python 138 12 Updated Sep 30, 2024

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

C++ 1,176 486 Updated Oct 3, 2024

Python 151 17 Updated Oct 1, 2024

This repository contains the experimental PyTorch native float8 training UX

Python 212 20 Updated Aug 1, 2024

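A collection of articles on industry practice in search, recommendation, advertising, user growth, and more (sources: Zhihu, Datafuntalk, technical WeChat official accounts)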

Python 2,258 288 Updated Oct 3, 2024

A fast communication-overlapping library for tensor parallelism on GPUs.

C++ 198 13 Updated Sep 18, 2024

A list of AI autonomous agents

9,938 725 Updated Sep 28, 2024

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Python 406 19 Updated Sep 5, 2024