
Puzzles for learning Triton

Jupyter Notebook 1,019 66 Updated Sep 25, 2024

Development repository for the Triton language and compiler

C++ 12,970 1,582 Updated Oct 9, 2024

A baseline repository of Auto-Parallelism in Training Neural Networks

Python 139 19 Updated Jun 25, 2022

A fast communication-overlapping library for tensor parallelism on GPUs.

C++ 200 14 Updated Sep 18, 2024

Enabling PyTorch on XLA Devices (e.g. Google TPU)

C++ 2,465 469 Updated Oct 9, 2024
C++ 16 6 Updated Aug 5, 2020

NCCL Tests

Cuda 847 233 Updated Jul 30, 2024

Best practices of Modern C++

258 18 Updated Oct 6, 2020

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

LLVM 28,385 11,726 Updated Oct 9, 2024

DLRover: An Automatic Distributed Deep Learning System

Python 1,230 155 Updated Oct 8, 2024

Memory Optimizations for Deep Learning (ICML 2023)

Python 58 12 Updated Mar 13, 2024

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 2,623 412 Updated Oct 9, 2024

Code samples for C++ Concurrency in Action

C++ 683 212 Updated Jun 13, 2024

Experience sharing for computer science majors at Xidian University :lollipop:

Python 323 42 Updated Nov 12, 2023

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.

C 6,328 1,488 Updated Oct 9, 2024

NeoVim configuration bundle for personal use by 小彭老师 (Teacher Xiao Peng)

Lua 320 52 Updated Oct 2, 2024

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 5,875 666 Updated Sep 6, 2024

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 82,791 22,311 Updated Oct 9, 2024

Production-Grade Container Scheduling and Management

Go 110,330 39,464 Updated Oct 9, 2024

The Prometheus monitoring system and time series database.

Go 55,184 9,082 Updated Oct 9, 2024

mlpack: a fast, header-only C++ machine learning library

C++ 5,059 1,601 Updated Oct 3, 2024

The official SALIENT++ system described in the paper "Communication-Efficient Graph Neural Networks with Probabilistic Neighborhood Expansion Analysis and Caching".

C++ 6 3 Updated Jul 28, 2023

Artifact for the MLSys 2023 paper "Communication-Efficient Graph Neural Networks with Probabilistic Neighborhood Expansion Analysis and Caching"

C++ 2 Updated May 3, 2023

A distributed, fast open-source graph database featuring horizontal scalability and high availability

C++ 10,699 1,199 Updated Oct 9, 2024

oar is a simple software renderer

C++ 2 1 Updated Oct 4, 2024

oar is a simple software renderer

C++ 1 Updated Oct 26, 2023

A simple C++11 Thread Pool implementation

C++ 7,867 2,236 Updated Jul 20, 2024

The next-generation api backend server for bgm.tv

Go 582 62 Updated Oct 9, 2024

Curated list of project-based tutorials

200,085 26,103 Updated Aug 15, 2024

List of Computer Science courses with video lectures.

66,893 9,089 Updated Sep 13, 2024