
Vision Transformer (ViT) inference in plain C/C++ with ggml

C++ 225 18 Updated Apr 11, 2024

Try to track the available stencil implementations

1 Updated Jul 17, 2024

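Deep learning systems notes, covering the mathematical foundations of deep learning, detailed explanations of basic neural network components, deep learning training and tuning strategies, and detailed explanations of model compression algorithms.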

Python 366 52 Updated Oct 16, 2024
C++ 1 Updated May 14, 2024

Fast SGEMM emulation on Tensor Cores

Cuda 7 Updated Aug 19, 2024

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Cuda 584 222 Updated Aug 19, 2024

Tensors and Dynamic neural networks in Python with strong GPU acceleration

C++ 10 5 Updated Jun 2, 2024

Main repo to keep scripts, dockerfiles, wiki, etc

Shell 15 1 Updated Mar 14, 2023

Instruction latency & throughput profiler for AArch64

C++ 31 8 Updated Jan 26, 2024

Tencent Distribution of TVM

Python 15 5 Updated Apr 7, 2023
Python 4 Updated Sep 3, 2024

TensorRT Plugin Autogen Tool

Python 366 42 Updated Apr 7, 2023

Paragraph-by-paragraph close readings of classic and new deep learning papers

26,740 2,419 Updated Aug 8, 2024

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Python 11,712 3,460 Updated Oct 18, 2024

Minimal PyTorch implementation of YOLOv3

Python 7,318 2,627 Updated Oct 18, 2023