Starred repositories
Research prototype tool for modular formal verification of C and Java programs
Next-gen language engineering / DSL framework
An educational compiler intermediate representation
The book "Performance Analysis and Tuning on Modern CPUs"
🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.
⚡️HivisionIDPhotos: a lightweight and efficient AI tool and algorithm for producing ID photos.
A collection of out-of-tree Clang plugins for teaching and learning
A Comprehensive Toolkit for High-Quality PDF Content Extraction
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
Exact inference for discrete probabilistic programs. (Research code, more documentation and ergonomics to come)
Neural Turing Machines (NTM) - PyTorch Implementation
An educational resource to help anyone learn deep reinforcement learning.
🎆Interactive Online Platform that Visualizes Algorithms from Code
Deep Learning papers reading roadmap for anyone who is eager to learn this amazing tech!
Development repository for the Triton language and compiler
A framework for testing compilers' type checkers
Code generator and generated types for Language Server Protocol.
VAST is an experimental compiler pipeline designed for program analysis of C and C++. It provides a tower of IRs as MLIR dialects to choose the best fit representations for a program analysis or fu…
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…
Implementation of a Transformer, but completely in Triton
A minimal GPU design in Verilog to learn how GPUs work from the ground up
Building a modern alternative to Salesforce, powered by the community.
Implementation of Nougat Neural Optical Understanding for Academic Documents