Stars
An acceleration library that supports arbitrary bit-width combinatorial quantization operations
The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
A library of algorithms for approximate nearest neighbor search in high dimensions, along with a set of useful tools for designing such algorithms.
CUDA implementation of the Hierarchical Navigable Small World (HNSW) graph algorithm
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters
Course web site
Experimental Code for "Unleashing Graph Partitioning for Large-Scale Nearest Neighbor Search"
A cloud native embedded storage engine built on object storage.
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
A curated list of graph-based fraud, anomaly, and outlier detection papers & resources
A throughput-oriented high-performance serving framework for LLMs
The simplest, fastest way to get business intelligence and analytics to everyone in your company 😋
A curated list of Zero Knowledge links, mostly focusing on blockchain.
A high-performance causal inference (statistical model) computing library based on OLAP, which solves the performance bottleneck of existing statistical model libraries (R/Python) under big…
SGLang is a fast serving framework for large language models and vision language models.
Design of OpenMP-based Parallel Dynamic Louvain algorithm for community detection.
Summary of some awesome work for optimizing LLM inference
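This project aims to share the technical principles behind large models along with hands-on experience (large model engineering and deploying large model applications in practice)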
A large-scale simulation framework for LLM inference
Efficient and Online Dataset Growth Algorithm (with cleanness and diversity awareness) to deal with growing web data
A collection of resources on wait-free and lock-free programming
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.