Set priorities and quotas, and the Run:ai Scheduler continuously optimizes resource allocations accordingly.
Prevent resource contention with over-quota priorities, automatic job preemption, and fair-share resource allocation
Reduce GPU idleness and increase cluster utilization with job queueing and opportunistic batch-job scheduling
Prevent GPU hogging and guarantee access with always-available GPU quotas per user
Optimize cluster utilization and mitigate fragmentation with automatic bin packing and workload consolidation
Schedule distributed workloads reliably across multiple nodes
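As a rough sketch of how guaranteed and over-quota allocation might be declared, consider the fragment below. The `Project` resource kind and the `deservedGpus` field are assumptions for illustration, not a confirmed Run:ai API:

```yaml
# Hypothetical quota declaration; the resource kind and field names
# are assumptions for illustration, not confirmed Run:ai API.
apiVersion: run.ai/v1
kind: Project
metadata:
  name: team-research
spec:
  deservedGpus: 8   # guaranteed (always-available) GPU quota for the team
  # Jobs beyond this quota would run opportunistically on idle GPUs
  # and be preempted when the owning team reclaims its guaranteed share.
```

The idea is that guaranteed quotas prevent GPU hogging, while over-quota jobs soak up idle capacity until it is needed.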
Increase efficiency and reduce costs with fractional GPUs.
Perfect for Jupyter notebook farms; ideal for inference.
GPU Sharing
Run multiple notebooks or host multiple inference servers on the same GPU to increase efficiency and reduce cost
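As an illustrative sketch, a fractional-GPU request could take the form of a pod annotation like the one below; the exact annotation key is an assumption, not confirmed API:

```yaml
# Hypothetical pod spec requesting half a GPU; the gpu-fraction
# annotation key is an assumption for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: notebook-1
  annotations:
    gpu-fraction: "0.5"   # share one physical GPU between two notebooks
spec:
  containers:
    - name: jupyter
      image: jupyter/base-notebook
```

Two such pods could then be co-scheduled onto the same physical GPU, doubling notebook density without extra hardware.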
Memory Isolation
Prevent collisions between workloads running on the same GPU with Run:ai Software Isolation. No code change is required.
Compute Time Slicing
Control how GPU compute is shared between multiple workloads with advanced time-slicing methods such as Strict and Fairshare
Dynamic MIG
Provision Multi-Instance GPU (MIG) slices on the fly without manual configuration, draining workloads, or rebooting your GPUs
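For context, with the standard NVIDIA device plugin a MIG slice is consumed as an extended Kubernetes resource, as in the sketch below; dynamic MIG means the slice would be carved out on demand rather than pre-partitioned by an administrator. The image tag is an example, not a recommendation:

```yaml
# Pod requesting a single 1g.5gb MIG slice via the standard NVIDIA
# device plugin resource name (e.g. on an A100). With dynamic MIG,
# the slice is provisioned when the pod is scheduled.
apiVersion: v1
kind: Pod
metadata:
  name: inference-server
spec:
  containers:
    - name: triton
      image: nvcr.io/nvidia/tritonserver:23.10-py3
      resources:
        limits:
          nvidia.com/mig-1g.5gb: 1
```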
Node Pools
Set priorities, quotas, and policies for each node pool to keep resource allocations and security controls aligned with business goals, even in the most heterogeneous clusters mixing T4s and H100s
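As a minimal sketch of steering a workload to a specific pool in a mixed T4/H100 cluster, a node selector on the GPU model could be used. The `nvidia.com/gpu.product` label comes from NVIDIA GPU Feature Discovery; the exact label value shown is an example, and how Run:ai names node pools is not confirmed here:

```yaml
# Hypothetical sketch: pinning a training job to H100 nodes via a
# node selector. The nvidia.com/gpu.product label is set by NVIDIA
# GPU Feature Discovery; the value below is an example.
apiVersion: v1
kind: Pod
metadata:
  name: training-job
spec:
  nodeSelector:
    nvidia.com/gpu.product: NVIDIA-H100-80GB-HBM3
  containers:
    - name: trainer
      image: pytorch/pytorch:latest
```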