A collection of available inference and serving solutions for LLMs; a minimal usage sketch with one of them (vLLM) follows the table.
Name | Org | Description |
---|---|---|
vLLM | UC Berkeley | A high-throughput and memory-efficient inference and serving engine for LLMs |
Text-Generation-Inference | Hugging Face 🤗 | Large Language Model Text Generation Inference |
llm-engine | Scale AI | Scale's open-source engine for fine-tuning and serving large language models |
DeepSpeed | Microsoft | DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective |
OpenLLM | BentoML | Operating LLMs in production |
LMDeploy | InternLM Team | LMDeploy is a toolkit for compressing, deploying, and serving LLMs |
FlexFlow | CMU, Stanford, UCSD | A distributed deep learning framework; FlexFlow Serve provides low-latency, high-performance LLM serving |
CTranslate2 | OpenNMT | Fast inference engine for Transformer models |
FastChat | lm-sys | An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. |
Triton-Inference-Server | Nvidia | The Triton Inference Server provides an optimized cloud and edge inferencing solution. |
Lepton.AI | lepton.ai | A Pythonic framework to simplify AI service building |
ScaleLLM | Vectorch | A high-performance inference system for large language models, designed for production environments |
LoRAX | Predibase | Serve hundreds of fine-tuned LLMs in production for the cost of one |
TensorRT-LLM | Nvidia | TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines |
mistral.rs | mistral.rs | Blazingly fast LLM inference. |
NanoFlow | NanoFlow | A throughput-oriented high-performance serving framework for LLMs |
LMCache | LMCache | Fast and cost-efficient LLM inference via KV-cache reuse |
LitServe | Lightning AI | Lightning-fast serving engine for AI models. Flexible. Easy. Enterprise-scale. |
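
For orientation, here is a minimal offline batch-inference sketch using vLLM's Python API (`LLM` and `SamplingParams`). The model name `facebook/opt-125m`, the prompts, and the sampling values are placeholder choices for illustration, not recommendations; the other engines in the table each expose their own APIs.

```python
# Minimal offline batch inference with vLLM (install with: pip install vllm).
# The model and sampling settings below are illustrative placeholders.
from vllm import LLM, SamplingParams

prompts = [
    "Explain KV caching in one sentence.",
    "What is continuous batching?",
]

# Sampling configuration: temperature, top_p, and max_tokens are standard SamplingParams fields.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# Load a small model; any Hugging Face model supported by vLLM can be used here.
llm = LLM(model="facebook/opt-125m")

# generate() runs the prompts as a batch and returns one result per prompt.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)
```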