dmarx

Stars

ML Performance

638 repositories

Memory mapped numpy arrays of varying shapes

Python 278 11 Updated Jun 19, 2024

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

Python 3,186 247 Updated Oct 10, 2024

Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning Framework on a GPU (JMLR 2022)

Python 458 79 Updated Aug 2, 2024

A plug-and-play library for parameter-efficient tuning (Delta Tuning)

Python 990 79 Updated Sep 19, 2024

Accessible large language models via k-bit quantization for PyTorch.

Python 6,146 617 Updated Oct 9, 2024
Jupyter Notebook 103 11 Updated Sep 20, 2023

Fast NumPy array functions written in C

Python 1,056 101 Updated Sep 10, 2024

Minimization of the filesystem for containers

Shell 80 11 Updated Dec 9, 2020

Transformer related optimization, including BERT, GPT

C++ 5,815 888 Updated Mar 27, 2024

Export Hugging Face models to Core ML and TensorFlow Lite

Python 613 44 Updated Jul 23, 2024

PyTorch library for fast transformer implementations

Python 1,625 175 Updated Mar 23, 2023

This is a seed project for distributed PyTorch training, which was built to customize your network quickly

Python 138 14 Updated Jun 22, 2022

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools

Python 2,511 448 Updated Oct 10, 2024

A library for distributed ML training with PyTorch

C++ 366 20 Updated Dec 12, 2022

High-performance automatic differentiation of LLVM and MLIR.

LLVM 1,272 108 Updated Oct 9, 2024

Direct voxel grid optimization for fast radiance field reconstruction.

Python 1,040 109 Updated May 15, 2023

Minimal library to train LLMs on TPU in JAX with pjit().

Python 274 36 Updated Dec 20, 2023
C 600 42 Updated Jul 18, 2024

Maximal update parametrization (µP)

Jupyter Notebook 1,374 94 Updated Jul 17, 2024

OSLO: Open Source for Large-scale Optimization

Python 172 29 Updated Sep 9, 2023

Simple package that makes your generator work in background thread

Python 272 22 Updated Jun 9, 2022

A powerful set of Python debugging tools, based on PySnooper

Python 1,252 35 Updated Oct 6, 2024

Train very large language models in JAX.

Python 192 17 Updated Oct 21, 2023

Compression schema for gradients of activations in backward pass

Python 43 4 Updated Jul 26, 2023

A Data Streaming Library for Efficient Neural Network Training

Python 1,099 137 Updated Oct 8, 2024

Test PyTorch code with minimal computational overhead

Python 25 1 Updated Jun 8, 2023

Library for reading and writing large multi-dimensional arrays.

C++ 1,338 120 Updated Oct 9, 2024

AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,542 363 Updated Sep 13, 2024

A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind.

Python 146 41 Updated Oct 4, 2024