-
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python Apache License 2.0 Updated Sep 30, 2024
-
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
-
SillyTavern Public
Forked from SillyTavern/SillyTavern
LLM Frontend for Power Users.
JavaScript GNU Affero General Public License v3.0 Updated Sep 21, 2024
-
exllamav2 Public
Forked from turboderp/exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs
Python MIT License Updated Sep 7, 2024
-
misc-scripts Public
Miscellaneous scripts for various stuff.
-
flash-attention Public
Forked from vllm-project/flash-attention
Fast and memory-efficient exact attention
Python BSD 3-Clause "New" or "Revised" License Updated Sep 1, 2024
-
tabbyAPI Public
Forked from theroyallab/tabbyAPI
An OAI compatible exllamav2 API that's both lightweight and fast
-
Liger-Kernel Public
Forked from linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
Python BSD 2-Clause "Simplified" License Updated Aug 27, 2024
-
tplib Public
A torch-based, universal tensor-parallel library.
-
huggingface.js Public
Forked from huggingface/huggingface.js
Utilities to use the Hugging Face Hub API
TypeScript MIT License Updated May 30, 2024
-
SimpleTuner Public
Forked from bghira/SimpleTuner
A general fine-tuning kit geared toward Stable Diffusion 2.1, DeepFloyd, and SDXL.
Python GNU Affero General Public License v3.0 Updated May 19, 2024
-
KoboldAI Public
Forked from henk717/KoboldAI
C++ GNU Affero General Public License v3.0 Updated May 14, 2024
-
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
C++ MIT License Updated Apr 29, 2024
-
AutoQuarot Public
Forked from sgsdxzy/AutoQuarot
Auto convert transformers models to QuaRot.
-
grok-1 Public
Forked from xai-org/grok-1
Grok open release
Python Apache License 2.0 Updated Mar 17, 2024
-
marlin Public
Forked from IST-DASLab/marlin
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.
Python Apache License 2.0 Updated Mar 5, 2024
-
llama.cpp Public
Forked from ggerganov/llama.cpp
Port of Facebook's LLaMA model in C/C++
-
rentry_actions Public
Automatically create rentries from markdown files in your GitHub repository.
Python MIT License Updated Feb 13, 2024
-
AutoQuIP Public
Forked from chu-tianxiang/QuIP-for-all
Easy to use LLM quantization via the QuIP# technique
-
LLM-Shearing Public
Forked from princeton-nlp/LLM-Shearing
Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
-
TalOS-Reborn Public
Forked from talos-bots/TalOS-Reborn
LLM Powered discord bot, Character Card enabled Chat page, Stable Diffusion discord bot, and overall AI tool. All from one app, TalOS: Reborn.
TypeScript Updated Jan 24, 2024
-
STMP Public
Forked from RossAscends/STMP
SillyTavern MultiPlayer is an LLM chat interface, created by RossAscends, that allows multiple users to chat together with each other and an AI.
JavaScript GNU Affero General Public License v3.0 Updated Dec 25, 2023
-
langchain Public
Forked from langchain-ai/langchain
⚡ Building applications with LLMs through composability ⚡
Python MIT License Updated Dec 16, 2023
-
litellm Public
Forked from BerriAI/litellm
Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
-
mergekit Public
Forked from arcee-ai/mergekit
Tools for merging pretrained large language models.
Python GNU Lesser General Public License v3.0 Updated Dec 3, 2023
-
prompt-tuner Public
Soft Prompt training library for Causal Transformer Language Models