Starred repositories
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
Run PyTorch LLMs locally on servers, desktop and mobile
Multilingual large voice generation model with full-stack inference, training, and deployment capabilities.
The official repo of INF-34B models trained by INF Technology.
A generative speech model for daily dialogue.
Collection of training data management explorations for large language models
You like pytorch? You like micrograd? You love tinygrad! ❤️
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
Ring attention implementation with flash attention
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
A benchmark to evaluate language models on questions I've previously asked them to solve.
Transformers with Arbitrarily Large Context
A natural language interface for computers
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Official implementation for the paper: "Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering"
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Official inference library for Mistral models
A high-throughput and memory-efficient inference and serving engine for LLMs