Making large AI models cheaper, faster and more accessible
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
High-efficiency floating-point neural network inference operators for mobile, server, and Web
A high-throughput and memory-efficient inference and serving engine for LLMs
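Much of the memory efficiency in serving engines like the one above comes from paging the attention KV cache: fixed-size blocks are handed out as a sequence grows instead of reserving a padded maximum up front. A toy sketch of that block-table idea (not the engine's actual code; all names are illustrative):

```python
# Toy sketch of paged KV-cache allocation. Not a real serving engine's
# API; class and method names here are illustrative assumptions.

class PagedKVCache:
    """Hands out fixed-size cache blocks to sequences on demand,
    so memory grows with actual tokens instead of a padded maximum."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))   # physical block ids
        self.block_tables = {}                       # seq id -> block ids

    def grow_to(self, seq_id: int, num_tokens: int) -> None:
        """Ensure seq_id owns enough blocks to hold num_tokens tokens."""
        table = self.block_tables.setdefault(seq_id, [])
        needed = -(-num_tokens // self.block_size)   # ceiling division
        while len(table) < needed:
            table.append(self.free_blocks.pop())     # allocate lazily

    def release(self, seq_id: int) -> None:
        """Return a finished sequence's blocks to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))

cache = PagedKVCache(num_blocks=8, block_size=4)
cache.grow_to(seq_id=0, num_tokens=5)    # 5 tokens need 2 blocks of 4
print(len(cache.block_tables[0]))        # -> 2
cache.release(0)
print(len(cache.free_blocks))            # -> 8
```

Because blocks are allocated only when tokens actually arrive and are returned immediately on completion, many more concurrent sequences fit in the same GPU memory than with contiguous per-sequence buffers.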
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
An open-source NLP-as-a-service project focused on providing state-of-the-art systems with ease. Training and inference via simple Docker commands.
Large Language Model Text Generation Inference
🔮 SuperDuperDB: Bring AI to your database! Build, deploy, and manage any AI application directly on your existing data infrastructure, without moving your data. Includes streaming inference, scalable model training, and vector search.
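The vector search mentioned above reduces, at its core, to ranking stored embeddings by similarity to a query embedding. A toy brute-force version by cosine similarity (real systems, including the project above, use approximate indexes rather than a full scan):

```python
# Toy brute-force vector search by cosine similarity. Illustrative
# only; production vector search uses approximate-nearest-neighbor
# indexes instead of scanning every vector.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query, corpus, k=2):
    """Return indices of the k corpus vectors most similar to query."""
    ranked = sorted(range(len(corpus)),
                    key=lambda i: cosine(query, corpus[i]),
                    reverse=True)
    return ranked[:k]

corpus = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(search([1.0, 0.1], corpus, k=2))  # -> [0, 2]
```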
A scalable inference server for models optimized with OpenVINO™
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need: run inference with any open-source language model, speech recognition model, or multimodal model, whether in the cloud, on-premises, or on your laptop.
📚 Jupyter notebook tutorials for OpenVINO™
Utilities to use the Hugging Face Hub API
Cross-platform, customizable ML solutions for live and streaming media.
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
Port of OpenAI's Whisper model in C/C++
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
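To make the blurb above concrete: quantization maps floating-point weights to small integers plus a scale factor, shrinking memory and speeding inference. The sketch below is plain round-to-nearest quantization, shown only as a simplified illustration; GPTQ itself additionally uses second-order (Hessian) information to compensate for rounding error, which this sketch omits.

```python
# Simplified round-to-nearest weight quantization. Illustrative only:
# GPTQ proper compensates rounding error using Hessian information,
# which this sketch deliberately leaves out.

def quantize_rtn(weights, bits=4):
    """Map floats to signed integers in [-2**(bits-1), 2**(bits-1)-1],
    returning the integer codes and the shared scale factor."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax or 1.0  # avoid scale 0
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from integer codes."""
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 0.07]
q, s = quantize_rtn(w, bits=4)
print(q)                 # small signed integers in the 4-bit range
print(dequantize(q, s))  # approximations of the original weights
```

Storing 4-bit codes plus one scale per group is what cuts a model's memory footprint to a fraction of its FP16 size; the engineering work in packages like the one above is doing this per-layer with minimal accuracy loss.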
ncnn is a high-performance neural network inference framework optimized for the mobile platform
Friendli: the fastest serving engine for generative AI