Starred repositories
Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure
This is a simple 2D convolution written in CUDA C that uses shared memory for better performance
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
This is a Chinese translation of the CUDA programming guide
Full support for 32-bit RISC-V in LLVM and Clang for the Vector Extension
Assured confidential execution (ACE) implements VM-based trusted execution environment (TEE) for RISC-V with focus on a formally verified and auditable security monitor.
PLCT Lab's implementation of the RISC-V V Spec, based on llvm/llvm-project, rkruppe/rvv-llvm, and https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi-0.8
hikonga / self-llm-core
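Forked from datawhalechina/self-llm
"Open-Source LLM User Guide": a tutorial for quickly deploying open-source large models in a Linux environment, tailored to beginners in China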
Working draft of the proposed RISC-V V vector extension
A clone of POCL that includes RISC-V newlib device support and Vortex
LLVM based assembler for x86, Arm, Mips, PowerPC, Sparc and SystemZ (Rust API)
Implementing SPMD control flow in LLVM using reconverging CFGs - Vectorizing Divergent Control-Flow for SIMD Applications
Attention in SRAM on Tenstorrent Grayskull 🤘
The TT-Forge FE is a graph compiler designed to optimize and transform computational graphs for deep learning models, enhancing their performance and efficiency.
LLVM OpenCL C compiler suite for the Ventus GPGPU
GPGPU processor supporting the RISC-V V extension, developed with Chisel HDL
ericauld / flash-attention
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention