-
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
C++ MIT License Updated Oct 10, 2024 -
FBGEMM-1 Public
Forked from pytorch/FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
C++ Other Updated Oct 1, 2024 -
benchmark Public
Forked from pytorch/benchmark
Python BSD 3-Clause "New" or "Revised" License Updated Sep 27, 2024 -
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Python BSD 3-Clause "New" or "Revised" License Updated Sep 6, 2024 -
midwit-matmul Public
A simplistic approach to high-performance GPU matmul
-
pytorch Public
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
-
tf32_gemm Public
Example of binding a TF32 CUTLASS GEMM kernel to PyTorch
-
llama2.so Public
Forked from karpathy/llama2.c
Inference Llama 2 with a model compiled to native code by TorchInductor
-
multipy Public
Forked from pytorch/multipy
torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters in a single C++ process.
C++ Other Updated Nov 1, 2022 -
torchdynamo Public
Forked from pytorch/torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
Python BSD 3-Clause "New" or "Revised" License Updated Oct 7, 2022 -
pyhpc-benchmarks Public
Forked from dionhaefner/pyhpc-benchmarks
A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python 🚀
-
functorch Public
Forked from pytorch/functorch
functorch is a prototype of JAX-like composable function transforms for PyTorch.
C++ BSD 3-Clause "New" or "Revised" License Updated Oct 18, 2021 -
builder Public
Forked from pytorch/builder
Continuous builder and binary build scripts for pytorch
Shell BSD 2-Clause "Simplified" License Updated Sep 7, 2021 -
tvm Public
Forked from apache/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Python Apache License 2.0 Updated Mar 30, 2021 -
deepstream-video-pipeline Public
Forked from pbridger/deepstream-video-pipeline -
nvprof2json Public
Forked from ezyang/nvprof2json
Convert nvprof profiles into about:tracing compatible JSON files
Python Updated Sep 13, 2020 -
hub Public
Forked from pytorch/hub
Submission to https://pytorch.org/hub/
Python Updated Sep 1, 2020 -
BERT-pytorch Public
Forked from SplitInfinity/BERT-pytorch
Google AI 2018 BERT pytorch implementation
Python Apache License 2.0 Updated Aug 28, 2020 -
Background-Matting Public
Forked from asuhan/Background-Matting
Background Matting: The World is Your Green Screen
Python Updated Aug 28, 2020 -
fastNLP Public
Forked from zhangguanheng66/fastNLP
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Python Apache License 2.0 Updated Aug 25, 2020 -
maskrcnn-benchmark Public
Forked from facebookresearch/maskrcnn-benchmark
Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.
Python MIT License Updated Aug 20, 2020 -
glow Public
Forked from pytorch/glow
Compiler for Neural Network hardware accelerators
C++ Apache License 2.0 Updated Feb 14, 2020 -
bitserial Public
Hacking around with ultra-low precision GEMM using TVM