Stars
A DAG processor and compiler for a tree-based spatial datapath.
A CMake toolchain file for iOS/iPadOS, visionOS, macOS, watchOS & tvOS C/C++/Obj-C++ development
Google Test integration with Xcode
Build solutions and precompiled libraries for iOS/Android/Mac/Windows.
A Python CLI app for downloading Apple Music songs/music videos/posts.
An introduction to ARM64 assembly on Apple Silicon Macs
Exploring the scalable matrix extension of the Apple M4 processor
An extension library of WMMA API (Tensor Core API)
Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators
Automatic Schedule Exploration and Optimization Framework for Tensor Computations
TileFlow is a performance analysis tool based on Timeloop for fusion dataflows
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
Convert CAJ (China Academic Journals) files from CNKI to PDF. Conversion is best-effort; whether it succeeds is a matter of luck.
📕 Chinese translation of Parsing Techniques (《解析技术》)
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
Assignments of High Performance Computing Labs course, Tsinghua University, Fall 2020
OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.