This project provides a workbench for testing the performance of the Kubernetes scheduler and of other batch schedulers or queue managers, such as Kueue. It focuses on how their performance scales with the number of Nodes and batch Jobs, and with Job parallelism, i.e. the number of Pods per Job.
The workbench includes these core components:
It also includes the following components as part of the SUT (System Under Test):
- Kueue
- scheduler-plugins (for the coscheduling plugin)
Performance is observed via the Prometheus metrics exposed by each component, which are made accessible via the dashboard_scheduling.yaml Grafana dashboard.
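The same metrics can also be inspected outside Grafana by querying Prometheus directly. The sketch below is illustrative only: it assumes Prometheus is reachable at `http://localhost:9090` (for example through a port-forward), which this README does not state, and the helper names `scheduling_rate_query` and `query_prometheus` are hypothetical. It uses the kube-scheduler `scheduler_schedule_attempts_total` counter and the standard Prometheus `/api/v1/query` endpoint.

```shell
#!/bin/sh
# Assumption: Prometheus is reachable at this address (e.g. via a port-forward).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Print a PromQL expression for the rate of successful scheduling attempts
# over the given window (defaults to 1m).
scheduling_rate_query() {
    window="${1:-1m}"
    printf 'sum(rate(scheduler_schedule_attempts_total{result="scheduled"}[%s]))\n' "$window"
}

# Send an instant query to the Prometheus HTTP API and print the raw JSON response.
query_prometheus() {
    curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}
```

For instance, `query_prometheus "$(scheduling_rate_query 5m)"` would return the scheduling throughput averaged over the last five minutes, provided the assumed address is correct for your deployment.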
You can set the workbench up by running the following command:
$ make setup
Note that this currently also creates a KinD cluster, though it should be possible to deploy the workbench on any existing Kubernetes cluster.
You can run the test for each individual component by running the commands listed below.
$ make test-kube-scheduler
$ make test-coscheduling
$ make test-kueue
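To run all three tests back to back, the targets above can be wrapped in a small shell function. This is a minimal sketch, not part of the project: the `run_all_tests` name and the `MAKE` override (useful for a dry run) are assumptions introduced here, and it presumes `make setup` has already completed.

```shell
#!/bin/sh
# Run each test target in turn, stopping at the first failure.
# MAKE may be overridden (e.g. MAKE=echo) to preview the targets without running them.
run_all_tests() {
    for target in test-kube-scheduler test-coscheduling test-kueue; do
        echo "==> ${target}"
        "${MAKE:-make}" "${target}" || return 1
    done
}
```

Calling `run_all_tests` from the repository root runs the targets in the listed order; `MAKE=echo run_all_tests` only prints them.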