Making large AI models cheaper, faster and more accessible
-
Updated
Jun 17, 2024 - Python
Making large AI models cheaper, faster and more accessible
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A high-throughput and memory-efficient inference and serving engine for LLMs
Faster Whisper transcription with CTranslate2
Large Language Model Text Generation Inference
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
💎1MB lightweight face detection model (1MB轻量级人脸检测模型)
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.
An easy to use PyTorch to TensorRT converter
Pre-trained Deep Learning models and demos (high quality and extremely fast)
An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
Sparsity-aware deep learning inference runtime for CPUs
Python Library for learning (Structure and Parameter), inference (Probabilistic and Causal), and simulations in Bayesian Networks.
🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
TensorFlow template application for deep learning
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
DELTA is a deep learning based natural language and speech processing platform.
Add a description, image, and links to the inference topic page so that developers can more easily learn about it.
To associate your repository with the inference topic, visit your repo's landing page and select "manage topics."