botbw

19 followers · 100 following

Singapore
https://botbw.github.io/

Stars

google-deepmind / alphafold3

AlphaFold 3 inference pipeline.

Python 1,792 177 Updated Nov 11, 2024

SiriusNEO / Triton-Puzzles-Lite

Puzzles for learning Triton, play it with minimal environment configuration!

Python 63 Updated Nov 9, 2024

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes/codes for ML SYS.

Python 35 Updated Nov 12, 2024

spack / spack

A flexible package manager that supports multiple versions, configurations, platforms, and compilers.

Python 4,296 2,274 Updated Nov 12, 2024

pmodels / mpich

Official MPICH Repository

C 555 281 Updated Nov 11, 2024

open-mpi / ompi

Open MPI main development repository

C 2,163 859 Updated Nov 7, 2024

XuehaiPan / nvitop

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

Python 4,804 150 Updated Oct 27, 2024

bdusell / semiring-einsum

Generic PyTorch implementation of einsum that supports different semirings

Python 46 7 Updated Jul 17, 2024

TheGejr / SpringShell

Spring4Shell - Spring Core RCE - CVE-2022-22965

Python 127 85 Updated Apr 4, 2022

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,621 981 Updated Nov 6, 2024

NVIDIA / DCGM

NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs

C++ 410 53 Updated Sep 5, 2024

HuaizhengZhang / AI-System-School

🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSy…

2,687 307 Updated Aug 14, 2024

DeepWok / mase

Machine-Learning Accelerator System Exploration Tools

Python 121 54 Updated Nov 11, 2024

DD-DuDa / Cute-Learning

Examples of CUDA implementations by Cutlass CuTe

Makefile 90 12 Updated Nov 11, 2024

genmoai / models

The best OSS video generation models

Python 1,898 192 Updated Nov 12, 2024

pybind / pybind11

Seamless operability between C++11 and Python

C++ 15,739 2,111 Updated Nov 12, 2024

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)

Python 2,562 246 Updated Nov 12, 2024

reznok / Spring4Shell-POC

Dockerized Spring4Shell (CVE-2022-22965) PoC application and exploit

Python 312 235 Updated Aug 4, 2022

asterictnl-lvdw / CVE-2024-6387

Remote Unauthenticated Code Execution Vulnerability in OpenSSH server (CVE-2024-6387)

Python 45 18 Updated Aug 22, 2024

zjin-lcf / HeCBench

C++ 215 78 Updated Nov 7, 2024

openucx / ucx

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)

C 1,153 427 Updated Nov 10, 2024

alibaba / EasyParallelLibrary

Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.

Python 264 49 Updated Mar 31, 2023

mirage-project / mirage

Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA

C++ 602 35 Updated Nov 5, 2024

mobiusml / gemlite

Simple and fast low-bit matmul kernels in CUDA / Triton

Python 139 10 Updated Nov 10, 2024

BBuf / how-to-optim-algorithm-in-cuda

how to optimize some algorithm in cuda.

Cuda 1,576 130 Updated Nov 12, 2024

te42kyfo / gpu-benches

collection of benchmarks to measure basic GPU capabilities

Jupyter Notebook 264 41 Updated Jun 21, 2024

srush / GPU-Puzzles

Solve puzzles. Learn CUDA.

Jupyter Notebook 9,867 855 Updated Sep 1, 2024

bshoshany / thread-pool

BS::thread_pool: a fast, lightweight, and easy-to-use C++17 thread pool library

C++ 2,202 253 Updated May 11, 2024

S-Lab-System-Group / Awesome-DL-Scheduling-Papers

253 31 Updated Jan 22, 2024

hpcaitech / TensorNVMe

A Python library that transfers PyTorch tensors between CPU and NVMe

C++ 96 19 Updated Nov 12, 2024