Making large AI models cheaper, faster and more accessible
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
High-efficiency floating-point neural network inference operators for mobile, server, and Web
A high-throughput and memory-efficient inference and serving engine for LLMs
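Much of the memory efficiency in serving engines like the one above comes from paging the attention KV cache: fixed-size blocks are handed out as a sequence grows instead of reserving a padded maximum up front. A toy sketch of that block-table idea (not the engine's actual code; all names are illustrative):

```python
# Toy sketch of paged KV-cache allocation. Not a real serving engine's
# API; class and method names here are illustrative assumptions.

class PagedKVCache:
    """Hands out fixed-size cache blocks to sequences on demand,
    so memory grows with actual tokens instead of a padded maximum."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))   # physical block ids
        self.block_tables = {}                       # seq id -> block ids

    def grow_to(self, seq_id: int, num_tokens: int) -> None:
        """Ensure seq_id owns enough blocks to hold num_tokens tokens."""
        table = self.block_tables.setdefault(seq_id, [])
        needed = -(-num_tokens // self.block_size)   # ceiling division
        while len(table) < needed:
            table.append(self.free_blocks.pop())     # allocate lazily

    def release(self, seq_id: int) -> None:
        """Return a finished sequence's blocks to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))

cache = PagedKVCache(num_blocks=8, block_size=4)
cache.grow_to(seq_id=0, num_tokens=5)    # 5 tokens need 2 blocks of 4
print(len(cache.block_tables[0]))        # -> 2
cache.release(0)
print(len(cache.free_blocks))            # -> 8
```

Because blocks are allocated only when tokens actually arrive and are returned immediately on completion, many more concurrent sequences fit in the same GPU memory than with contiguous per-sequence buffers.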
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
An open-source NLP-as-a-service project focused on providing state-of-the-art systems with ease. Training and inference via simple Docker commands.
Large Language Model Text Generation Inference
🔮 SuperDuperDB: Bring AI to your database! Build, deploy, and manage any AI application directly on your existing data infrastructure, without moving your data. Includes streaming inference, scalable model training, and vector search.
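The vector search mentioned above reduces, at its core, to ranking stored embeddings by similarity to a query embedding. A toy brute-force version by cosine similarity (real systems, including the project above, use approximate indexes rather than a full scan):

```python
# Toy brute-force vector search by cosine similarity. Illustrative
# only; production vector search uses approximate-nearest-neighbor
# indexes instead of scanning every vector.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query, corpus, k=2):
    """Return indices of the k corpus vectors most similar to query."""
    ranked = sorted(range(len(corpus)),
                    key=lambda i: cosine(query, corpus[i]),
                    reverse=True)
    return ranked[:k]

corpus = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(search([1.0, 0.1], corpus, k=2))  # -> [0, 2]
```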
A scalable inference server for models optimized with OpenVINO™
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need: run inference with any open-source language model, speech recognition model, or multimodal model, whether in the cloud, on-premises, or on your laptop.
📚 Jupyter notebook tutorials for OpenVINO™
Utilities to use the Hugging Face Hub API
Cross-platform, customizable ML solutions for live and streaming media.
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
Port of OpenAI's Whisper model in C/C++
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
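To make the blurb above concrete: quantization maps floating-point weights to small integers plus a scale factor, shrinking memory and speeding inference. The sketch below is plain round-to-nearest quantization, shown only as a simplified illustration; GPTQ itself additionally uses second-order (Hessian) information to compensate for rounding error, which this sketch omits.

```python
# Simplified round-to-nearest weight quantization. Illustrative only:
# GPTQ proper compensates rounding error using Hessian information,
# which this sketch deliberately leaves out.

def quantize_rtn(weights, bits=4):
    """Map floats to signed integers in [-2**(bits-1), 2**(bits-1)-1],
    returning the integer codes and the shared scale factor."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax or 1.0  # avoid scale 0
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from integer codes."""
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 0.07]
q, s = quantize_rtn(w, bits=4)
print(q)                 # small signed integers in the 4-bit range
print(dequantize(q, s))  # approximations of the original weights
```

Storing 4-bit codes plus one scale per group is what cuts a model's memory footprint to a fraction of its FP16 size; the engineering work in packages like the one above is doing this per-layer with minimal accuracy loss.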
ncnn is a high-performance neural network inference framework optimized for the mobile platform
Friendli: the fastest serving engine for generative AI