The University of Bristol HPC Simulation Engine

C++ 86 15 Updated Sep 18, 2024

A C-like compiler written in C++, supporting arrays and procedure calls, with DAG-based intermediate code optimization.

C++ 4 Updated Jun 1, 2022

A DAG processor and compiler for a tree-based spatial datapath.

Python 12 3 Updated Aug 24, 2022

FFTX Project

C++ 18 11 Updated May 8, 2024

A DSL for Stencil Codes

CMake 11 5 Updated Feb 28, 2024

A CMake toolchain file for iOS/iPadOS, visionOS, macOS, watchOS & tvOS C/C++/Obj-C++ development

CMake 1,872 447 Updated Jul 19, 2024

Google Test integration with Xcode

Objective-C++ 136 40 Updated Feb 28, 2024

Build solutions and precompiled libraries for ios/android/mac/windows.

C 28 7 Updated Jul 31, 2019

A Python CLI app for downloading Apple Music songs/music videos/posts.

Python 736 84 Updated Sep 14, 2024

An introduction to ARM64 assembly on Apple Silicon Macs

Assembly 4,318 284 Updated Jul 11, 2024

The ultimate Vim configuration (vimrc)

Vim Script 30,576 7,283 Updated Aug 18, 2024

Apple AMX Instruction Set

C 978 47 Updated Jun 3, 2024

Exploring the scalable matrix extension of the Apple M4 processor

C 91 4 Updated May 21, 2024

UCAS HPC course code

C 13 2 Updated May 24, 2023
Assembly 10 5 Updated Sep 14, 2024

An extension library of WMMA API (Tensor Core API)

Cuda 81 14 Updated Jul 12, 2024

Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators

Python 100 10 Updated Oct 26, 2022
C 28 6 Updated Jun 15, 2022

Automatic Schedule Exploration and Optimization Framework for Tensor Computations

Python 175 30 Updated Apr 25, 2022

TileFlow is a performance analysis tool based on Timeloop for fusion dataflows

C++ 53 5 Updated Apr 12, 2024

Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.

Cuda 266 61 Updated Sep 8, 2024
Cuda 3 2 Updated Oct 20, 2023

Convert CAJ (China Academic Journals, CNKI) files to PDF. Conversion is best-effort; whether it succeeds is a matter of luck.

Python 2,928 619 Updated Mar 20, 2024

CSCD70 Compiler Optimization

C++ 237 59 Updated Apr 17, 2023

📕 Chinese translation of Parsing Techniques (《解析技术》)

Shell 1,523 145 Updated Oct 10, 2023

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

LLVM 28,014 11,565 Updated Sep 18, 2024

An amateur personal translation of Learn LLVM 12

TeX 586 79 Updated Dec 29, 2021

Assignments of High Performance Computing Labs course, Tsinghua University, Fall 2020

C++ 6 1 Updated Apr 4, 2021
Python 3 Updated Oct 10, 2022

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.

C 6,286 1,485 Updated Sep 16, 2024