dmarx

Organizations

@pytti-tools

Starred repositories

27 stars written in Cuda

LLM training in simple, raw C/CUDA

Cuda 23,194 2,572 Updated Aug 26, 2024

Instant neural graphics primitives: lightning fast NeRF and more

Cuda 15,788 1,898 Updated Apr 18, 2024

A massively parallel, optimal functional runtime in Rust

Cuda 10,417 393 Updated Sep 4, 2024

Squeeze-and-Excitation Networks

Cuda 3,363 835 Updated Feb 25, 2019

GPU Accelerated t-SNE for CUDA with Python bindings

Cuda 1,777 126 Updated Apr 5, 2024

Tile primitives for speedy kernels

Cuda 1,474 55 Updated Sep 4, 2024

FlashInfer: Kernel Library for LLM Serving

Cuda 1,106 100 Updated Sep 5, 2024

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing …

Cuda 729 187 Updated Sep 6, 2024
Cuda 670 50 Updated Oct 20, 2023

UNet diffusion model in pure CUDA

Cuda 560 28 Updated Jun 28, 2024

Flash Attention in ~100 lines of CUDA (forward pass only)

Cuda 549 48 Updated Apr 7, 2024

Code for KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs

Cuda 471 51 Updated Jun 16, 2021

NeRFshop: Interactive Editing of Neural Radiance Fields

Cuda 444 23 Updated Mar 27, 2023

Fast CUDA matrix multiplication from scratch

Cuda 411 53 Updated Dec 28, 2023

Neighborhood Attention Extension. Bringing attention to a neighborhood near you!

Cuda 339 25 Updated Aug 20, 2024

[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Cuda 253 20 Updated Jul 2, 2024

Code for "Representing Volumetric Videos as Dynamic MLP Maps" CVPR 2023

Cuda 233 10 Updated Dec 6, 2023

The CUDA version of the RWKV language model ( https://github.com/BlinkDL/RWKV-LM )

Cuda 208 34 Updated May 12, 2024

CUDA Learning guide

Cuda 198 19 Updated Jun 20, 2024

Learning Deformable Tetrahedral Meshes for 3D Reconstruction (NeurIPS 2020)

Cuda 165 11 Updated Oct 23, 2023

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Cuda 150 6 Updated Jul 3, 2024

Blazingly fast encoding for neural networks based on permutohedral lattices

Cuda 94 10 Updated May 10, 2023

LLM training in simple, raw C/CUDA

Cuda 77 6 Updated May 1, 2024

3D Gaussian Splatting in JAX

Cuda 52 2 Updated May 30, 2024

Simple and fast low-bit matmul kernels in CUDA

Cuda 46 4 Updated Aug 21, 2024

A faster implementation of OpenCV-CUDA that uses OpenCV objects, and more!

Cuda 34 5 Updated Sep 6, 2024