shixuansun
🎯 Focusing

Highlights

  • Pro

Organizations

@RapidsAtHKUST


An acceleration library that supports arbitrary bit-width combinatorial quantization operations

C++ 210 22 Updated Sep 30, 2024

Python library for data stream learning

Python 28 Updated Sep 11, 2024

The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

2 Updated Sep 16, 2024

A library of algorithms for approximate nearest neighbor search in high dimensions, along with a set of useful tools for designing such algorithms.

C++ 109 23 Updated Oct 8, 2024

CUDA implementation of Hierarchical Navigable Small World Graph algorithm

Cuda 137 20 Updated Apr 19, 2021

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters

Python 581 50 Updated Oct 15, 2024

Experimental Code for "Unleashing Graph Partitioning for Large-Scale Nearest Neighbor Search"

C++ 18 3 Updated Oct 2, 2024

A cloud native embedded storage engine built on object storage.

Rust 1,313 59 Updated Oct 15, 2024

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.

JavaScript 24,575 2,470 Updated Oct 15, 2024

Jupyter Notebook 34 2 Updated Jun 13, 2024

A curated list of graph-based fraud, anomaly, and outlier detection papers & resources

1 Updated Sep 4, 2024

A throughput-oriented high-performance serving framework for LLMs

Cuda 591 24 Updated Sep 21, 2024

ANN search

Cuda 2 Updated Sep 24, 2024

The simplest, fastest way to get business intelligence and analytics to everyone in your company 😋

Clojure 38,417 5,096 Updated Oct 15, 2024

Low-bit LLM inference on CPU with lookup table

C++ 482 35 Updated Oct 14, 2024

A curated list of Zero Knowledge links, mostly focusing on blockchain.

287 38 Updated Oct 9, 2024

It is a high-performance causal inference (statistical model) computing library based on OLAP, which solves the performance bottleneck of the existing statistical model library (R/Python) under big…

Java 105 20 Updated Oct 15, 2024

Makefile 2 Updated Aug 1, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 5,601 428 Updated Oct 15, 2024

Design of OpenMP-based Parallel Dynamic Louvain algorithm for community detection.

C++ 4 Updated Sep 16, 2024

C++ lockless queue.

C++ 1,496 180 Updated Oct 1, 2024

Summary of some awesome work for optimizing LLM inference

28 1 Updated Oct 12, 2024

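This project shares the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications in practice).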

HTML 9,594 942 Updated Oct 13, 2024

CUDA Python Low-level Bindings

Python 867 71 Updated Oct 15, 2024

A large-scale simulation framework for LLM inference

Python 252 33 Updated Oct 10, 2024

Efficient and Online Dataset Growth Algorithm (with cleanness and diversity awareness) to deal with growing web data

Python 19 Updated Aug 6, 2024

My personal research experience

5,683 341 Updated Sep 28, 2024

A collection of resources on wait-free and lock-free programming

1,783 171 Updated Feb 25, 2024

📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

2,650 182 Updated Oct 15, 2024