Scheduler for sub-node tasks for HPC systems with batch scheduling


HyperQueue (HQ) lets you build a computation plan consisting of a large number of tasks and then execute it transparently over a system like SLURM/PBS. It dynamically groups tasks into SLURM/PBS jobs and distributes them to fully utilize allocated nodes. You therefore do not have to manually aggregate your tasks into SLURM/PBS jobs.
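
For many-task workloads, a whole batch of tasks is usually submitted as a single task array. The sketch below is hedged: it assumes the `--array` option of `hq submit` as described in the HQ documentation, and `./process-chunk.sh` is a hypothetical user script that selects its input based on the task id:

    $ # Submit the same command as 1000 tasks (ids 1-1000) within one HQ job
    $ hq submit --array=1-1000 ./process-chunk.sh

HQ then schedules these tasks onto whatever workers are available, packing them into the SLURM/PBS allocations described in the Getting started section below.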

Documentation

If you find a bug or a problem with HyperQueue, please create an issue. For more general discussion or feature requests, please use our discussion forum. If you want to chat with the HyperQueue developers, you can use our Zulip server.

Features

  • Performance

    • The inner scheduler can scale to hundreds of nodes
    • The overhead per task is below 0.1 ms.
    • HQ allows streaming of task outputs to avoid creating many small files on a distributed filesystem (see the sketch after this list)
  • Easy deployment

    • HQ is provided as a single, statically linked binary without any dependencies
    • No admin access to a cluster is needed
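
The output streaming mentioned in the feature list above is configured when a job is submitted. The following is a hedged sketch assuming the `--log` option and the companion `hq log` command used by earlier HQ releases; newer releases may configure streaming differently, so consult the documentation for your version. `out.log` is just an illustrative file name:

    $ # Stream stdout/stderr of all tasks into a single HQ-managed log file
    $ hq submit --log=out.log --array=1-100 echo hello
    $ # Inspect the streamed output afterwards
    $ hq log out.log show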

Getting started

Installation

  • Download the latest binary distribution from this link.

  • Unpack the downloaded archive:

    $ tar -xvzf hq-<version>-linux-x64.tar.gz

If you want to try the newest features, you can also download a nightly build.
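
After unpacking, you may want to put the hq binary on your PATH. This is a minimal sketch assuming the archive extracts the hq executable into the current directory (adjust the path if your version unpacks into a subdirectory):

    $ # Make hq available in the current shell and check that it runs
    $ export PATH=$PWD:$PATH
    $ hq --version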

Submitting a simple task

  • Start a server (e.g. on a login node or in a cluster partition)

    $ hq server start &
  • Submit a job (command echo 'Hello world' in this case)

    $ hq submit echo 'Hello world'
  • Ask for computing resources

    • Start worker manually

      $ hq worker start &
    • Automatic submission of workers into PBS/SLURM

      • PBS: