Language: C++
Sort by: Most stars
Starred repositories
Port of OpenAI's Whisper model in C/C++
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
Development repository for the Triton language and compiler
Conversion between Traditional and Simplified Chinese
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Transformer-related optimization, including BERT and GPT
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4
An easy-to-understand TensorOp Matmul tutorial
MSCCL++: A GPU-driven communication stack for scalable AI applications
A fast communication-overlapping library for tensor parallelism on GPUs.
Standalone Flash Attention v2 kernel without libtorch dependency
High performance Transformer implementation in C++.