Starred repositories
Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Entropy Based Sampling and Parallel CoT Decoding
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
A golang-based data loader which can be used from Python. Focused on a VectorDB stack at the moment, fetching and processing data per sample at GB/s speeds.
Official software repository of S. Bruch, F. M. Nardini, C. Rulli, and R. Venturini, "Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations". Long Paper @ ACM SIG…
Materials for the Ultimate Hybrid Search Workshop
A Rust HTTP server for Python applications
Intro to leetcodes. Basic techniques, quicksort and hash structures implementation, space and time complexities.
Einsum-like high-level array sharding API for JAX
Fast lexical search implementing BM25 in Python using NumPy, Numba, and SciPy
A self-paced course to learn Rust, one exercise at a time.
depyf is a tool to help you understand and adapt to the PyTorch compiler, torch.compile.
A modern model graph visualizer and debugger
FastKAN: Very Fast Implementation of Kolmogorov-Arnold Networks (KAN)
An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).
Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"