- UC Berkeley
- San Francisco Bay Area
- https://zhuohan.li
- @zhuohan123
- in/zhuohan-li
Stars
Dynamic Memory Management for Serving LLMs without PagedAttention
A framework for few-shot evaluation of language models.
A fast communication-overlapping library for tensor parallelism on GPUs.
HabanaAI / vllm-fork
Forked from vllm-project/vllm. A high-throughput and memory-efficient inference and serving engine for LLMs
A visual no-code/code-free web crawler/spider (易采集): visual browser automation, data-collection, and crawler software that lets you design and run crawling tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.
A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
Arena-Hard-Auto: An automatic LLM benchmark.
DSPy: The framework for programming—not prompting—foundation models
A parallel framework for training deep neural networks
[ICML 2024] CLLMs: Consistency Large Language Models
Universal LLM Deployment Engine with ML Compilation
Standardized Serverless ML Inference Platform on Kubernetes
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization.
Building a quick conversation-based search demo with Lepton AI.
LlamaIndex is a data framework for your LLM applications
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
SGLang is yet another fast serving framework for large language models and vision language models.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.