Stars
Efficient Triton Kernels for LLM Training
A modern runtime for JavaScript and TypeScript.
📊 A minimalist, self-hosted WakaTime-compatible backend for coding statistics
Free, source-available, fair-code-licensed workflow automation tool. Easily automate tasks across different services.
Train transformer language models with reinforcement learning.
Huly — All-in-One Project Management Platform (alternative to Linear, Jira, Slack, Notion, Motion)
A simple screen-parsing tool for pure-vision-based GUI agents
Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models (see the first sketch after this list).
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by the OpenAI Solutions team.
A self-paced course to learn Rust, one exercise at a time.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal AI, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
FlashInfer: Kernel Library for LLM Serving
Universal LLM Deployment Engine with ML Compilation
Running large language models on a single GPU for throughput-oriented scenarios.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
The GitButler version control client, backed by Git, powered by Tauri/Rust/Svelte
A curated list of resources on efficient Large Language Models
SGLang is a fast serving framework for large language models and vision language models.
Make your first pull request during Hacktoberfest 2024. Don't forget to spread the love, and if you like the project, give us a ⭐️
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
A high-throughput and memory-efficient inference and serving engine for LLMs (see the second sketch after this list).
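
A minimal sketch of querying Ollama from its official Python client. It assumes the Ollama server is running locally and that the `llama3.2` model has already been pulled; the prompt text is purely illustrative:

```python
# Minimal Ollama chat example using the official `ollama` Python client
# (pip install ollama). Assumes `ollama serve` is running locally and
# `ollama pull llama3.2` has been executed beforehand.
import ollama

response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
```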
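And a minimal offline-inference sketch using vLLM's Python API; the model ID and sampling settings below are illustrative placeholders, not recommendations:

```python
# Minimal vLLM offline-inference example. The model ID and sampling
# parameters are placeholders; swap in any model vLLM supports.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# generate() batches prompts and returns one RequestOutput per prompt.
outputs = llm.generate(["Hello, my name is"], sampling)
for out in outputs:
    print(out.outputs[0].text)
```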