Starred repositories
5 stars written in Cuda
🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated irregularly: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
A series of GPU optimization topics that explains in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including: elementwise, reduce, s…
This is the first fully GPU-optimized IPC framework
Numerical experiments for the paper: "MPCGPU: Real-Time Nonlinear Model Predictive Control through Preconditioned Conjugate Gradient on the GPU"