Stars
Language: C++
Sort by: Most stars
An Open Source Machine Learning Framework for Everyone
The new Windows Terminal and the original Windows console host, all in the same place!
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, JavaScript and more
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …
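The library above implements gradient boosting at scale; the core idea can be sketched in a few lines of plain Python. This is a toy illustration (depth-1 "stumps", squared-error loss, made-up data), not how the library itself is implemented:

```python
# Minimal gradient-boosting sketch: each round fits a depth-1 "stump" to
# the residuals of the current ensemble (with squared-error loss, the
# negative gradient is simply y - prediction).

def fit_stump(x, residual):
    """Find the split on x that best reduces squared error of the residual."""
    best = None
    for threshold in sorted(set(x)):
        left = [r for xi, r in zip(x, residual) if xi <= threshold]
        right = [r for xi, r in zip(x, residual) if xi > threshold]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda xi: lmean if xi <= threshold else rmean

def gradient_boost(x, y, rounds=20, learning_rate=0.3):
    pred = [0.0] * len(x)
    stumps = []
    for _ in range(rounds):
        residual = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residual)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: sum(learning_rate * s(xi) for s in stumps)

# Toy data: a step function, which boosting recovers from stumps.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
model = gradient_boost(x, y)
print(round(model(0.5), 2), round(model(4.5), 2))  # prints "1.0 5.0"
```

Real GBT frameworks replace the stump with full decision trees, use histogram-based split finding, and support many losses, but the fit-the-residuals loop is the same.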
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
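The library above (Annoy) builds forests of random-projection trees; a minimal stdlib-only sketch of the underlying idea, random-hyperplane hashing, looks like this. Nothing here reflects the library's actual API:

```python
import random

# Sketch of random-hyperplane hashing, the idea behind this style of
# approximate nearest-neighbor search: points whose dot products with a
# fixed set of random hyperplanes share the same signs land in the same
# bucket, so a query only scans its bucket instead of the whole dataset.

random.seed(0)

def hash_point(point, hyperplanes):
    # One bit per hyperplane: which side of it the point falls on.
    return tuple(sum(p * h for p, h in zip(point, plane)) >= 0 for plane in hyperplanes)

dim, n_planes = 4, 6
hyperplanes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]

points = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(200)]
buckets = {}
for i, p in enumerate(points):
    buckets.setdefault(hash_point(p, hyperplanes), []).append(i)

# Query: scan only the query's bucket (the candidate set), then rank by
# true Euclidean distance within it.
query = points[0]
candidates = buckets[hash_point(query, hyperplanes)]
nearest = min(candidates, key=lambda i: sum((a - b) ** 2 for a, b in zip(points[i], query)))
print(nearest)  # the query is in its own bucket, so nearest == 0
```

Annoy improves on a single hash table by building many trees that recursively split the space and by merging candidates across trees, trading memory for recall.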
Development repository for the Triton language and compiler
Bringing Characters to Life with Computer Brains in Unity
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently
Fit interpretable models. Explain blackbox machine learning.
A flexible, high-performance serving system for machine learning models
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
Head tracking software for MS Windows, Linux, and Apple OSX
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
A lightweight process isolation tool that utilizes Linux namespaces, cgroups, rlimits and seccomp-bpf syscall filters, leveraging the Kafel BPF language for enhanced security.
Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.
Tensor parallelism is all you need. Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.
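The "divide the RAM usage" claim above rests on sharding weight matrices across workers. A stdlib-only sketch of column-wise tensor parallelism for a single matmul, with illustrative names that are not the project's API:

```python
# Sketch of the core idea behind tensor parallelism: split a weight
# matrix column-wise across workers, let each compute its slice of the
# matmul, then concatenate the slices. Each worker holds (and needs RAM
# for) only its shard of the weights.

def matmul(x, w):
    # x: n x k as a list of rows; w: k x m as a list of rows.
    return [[sum(xi * wij for xi, wij in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, parts):
    cols = list(zip(*w))
    size = len(cols) // parts
    shards = [cols[i * size:(i + 1) * size] for i in range(parts)]
    # Transpose each shard back to row-major layout.
    return [[list(r) for r in zip(*s)] for s in shards]

x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]

# Each "worker" computes its column slice; the gather step concatenates
# the partial rows, reproducing the full product.
shards = split_columns(w, parts=2)
partials = [matmul(x, shard) for shard in shards]
gathered = [sum(rows, []) for rows in zip(*partials)]

assert gathered == matmul(x, w)
print(gathered)
```

In a real deployment the shards live on different machines and `gathered` is produced by an all-gather over the network; the arithmetic is unchanged.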
Puffer is a free live TV streaming website and a research study at Stanford using machine learning to improve video streaming
Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure
Dataset, streaming, and file system extensions maintained by TensorFlow SIG-IO
TinyChatEngine: On-Device LLM Inference Library
torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters in a single C++ process.