hypnopump

Organizations

@EleutherAI @RWKV

Starred repositories

A MAD laboratory to improve AI architecture designs 🧪

Python 95 6 Updated May 2, 2024

Stick-breaking attention

Python 32 Updated Oct 30, 2024

📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) o…

TypeScript 3,521 158 Updated Nov 5, 2024

High performance AI inference stack. Built for production. @ziglang / @openxla / MLIR / @bazelbuild

Zig 1,633 57 Updated Nov 7, 2024

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 9,361 575 Updated Nov 6, 2024

Official implementation for Yuan & Liu & Zhong et al., KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches. EMNLP Findings 2024

Python 46 2 Updated Oct 16, 2024

Simple and fast low-bit matmul kernels in CUDA / Triton

Python 132 10 Updated Nov 7, 2024

Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton

Python 12 Updated Oct 27, 2024

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform.

Python 14,051 3,424 Updated Nov 7, 2024

Using FlexAttention to compute attention with different masking patterns

Python 40 Updated Sep 22, 2024

Experimental paper writing linter.

TeX 30 Updated Sep 2, 2024

`dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms.

Python 30 9 Updated Nov 2, 2024

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

Python 4,804 150 Updated Oct 27, 2024

Python 82 21 Updated Nov 1, 2024

Efficient Triton Kernels for LLM Training

Python 3,390 193 Updated Nov 8, 2024

Inference RWKV with multiple supported backends.

C++ 26 1 Updated Aug 21, 2024

Influence Functions with (Eigenvalue-corrected) Kronecker-Factored Approximate Curvature

Python 99 8 Updated Jul 31, 2024

Scalable neural net training via automatic normalization in the modular norm.

Jupyter Notebook 118 8 Updated Aug 20, 2024

FP8 flash attention implemented on the Ada architecture using the cutlass repository

Cuda 51 3 Updated Aug 12, 2024

RWKV6 in native pytorch and triton:)

Python 10 Updated Aug 4, 2024

WIP

Python 88 1 Updated Aug 13, 2024

lina-speech : linear attention based text-to-speech

Jupyter Notebook 133 11 Updated Nov 7, 2024

Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793

Python 321 10 Updated Oct 30, 2024

Interaction Fingerprints for protein-ligand complexes and more

Python 369 70 Updated Oct 26, 2024

Solve puzzles. Improve your pytorch.

Jupyter Notebook 3,260 277 Updated Jul 15, 2024

Run PyTorch LLMs locally on servers, desktop and mobile

Python 3,352 219 Updated Nov 7, 2024

Triton implementation of FlashAttention2 that adds Custom Masks.

Python 72 6 Updated Aug 14, 2024

Python 79 7 Updated Sep 9, 2024

Python 62 4 Updated Oct 8, 2024

GoldFinch and other hybrid transformer components

Python 39 3 Updated Jul 20, 2024