Efficient Triton Kernels for LLM Training

Python 3,400 192 Updated Nov 9, 2024
C++ 25 1 Updated Nov 7, 2024

LLM inference in C/C++

C++ 67,575 9,705 Updated Nov 11, 2024

A modern runtime for JavaScript and TypeScript.

Rust 97,459 5,373 Updated Nov 10, 2024

📊 A minimalist, self-hosted WakaTime-compatible backend for coding statistics

Go 2,728 169 Updated Nov 5, 2024

Free and source-available fair-code licensed workflow automation tool. Easily automate tasks across different services.

TypeScript 48,543 7,614 Updated Nov 11, 2024

Train transformer language models with reinforcement learning.

Python 10,001 1,265 Updated Nov 11, 2024

Huly — All-in-One Project Management Platform (alternative to Linear, Jira, Slack, Notion, Motion)

TypeScript 17,144 1,031 Updated Nov 11, 2024

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 4,476 332 Updated Nov 5, 2024

Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.

Python 370 16 Updated Nov 11, 2024

CUDA Templates for Linear Algebra Subroutines

C++ 5,637 961 Updated Nov 8, 2024

Official inference framework for 1-bit LLMs

C++ 10,961 739 Updated Nov 8, 2024

NumPy aware dynamic Python compiler using LLVM

Python 9,959 1,127 Updated Nov 7, 2024

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

Go 97,182 7,735 Updated Nov 11, 2024

Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

Python 15,806 1,542 Updated Oct 15, 2024

A self-paced course to learn Rust, one exercise at a time.

Rust 6,150 1,050 Updated Nov 8, 2024

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 12,073 2,516 Updated Nov 11, 2024

FlashInfer: Kernel Library for LLM Serving

Cuda 1,407 129 Updated Nov 11, 2024

Materials for learning SGLang

78 5 Updated Nov 10, 2024

Universal LLM Deployment Engine with ML Compilation

Python 19,153 1,573 Updated Nov 7, 2024

Running large language models on a single GPU for throughput-oriented scenarios.

Python 9,188 549 Updated Oct 28, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,617 979 Updated Nov 6, 2024

Benchmarking library for RAG

Jupyter Notebook 112 10 Updated Nov 8, 2024

The GitButler version control client, backed by Git, powered by Tauri/Rust/Svelte

Rust 13,228 528 Updated Nov 10, 2024

A curated list for Efficient Large Language Models

Python 1,247 93 Updated Oct 30, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 5,958 491 Updated Nov 11, 2024

Make your first Pull Request on Hacktoberfest 2024. Don't forget to spread love and if you like give us a ⭐️

JavaScript 2,530 8,022 Updated Oct 24, 2024

MLIR For Beginners tutorial

C++ 815 69 Updated Sep 30, 2024

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 663 54 Updated Nov 9, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 29,909 4,517 Updated Nov 11, 2024