Releases: why-in-Shanghaitech/sapp
Release v0.4.4
Bug Fix
- bug: sinfo -N
This bug led to an underestimate of available cards. The output of sinfo may list multiple nodes on a single line. Add -N to enforce one node per line.
- bug: starting from an existing config leads to the wrong gpu type
This bug came from the npyscreen UI. When the partition name was updated, the gpu type was reset to the default (any type).
- bug: cpu count
The previous max CPU count was too small. Users can now set at most 128 CPU cores per task.
- bug: change clash download link to raw github
Change the clash download link to raw.githubusercontent.com. This may help users who cannot connect to github.com but can connect to raw.githubusercontent.com.
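The sinfo -N fix can be illustrated with a small sketch. The two sample outputs below are hypothetical and assume a two-column "node state" format (similar to what `sinfo -h -o "%N %t"` prints); the node names and the helper `count_usable_lines` are made up for illustration and are not taken from the sapp source.

```python
# Hypothetical sinfo output WITHOUT -N: nodes in the same state share one line.
GROUPED = """\
ai-gpu[01-03] idle
ai-gpu04 mix
"""

# Hypothetical sinfo output WITH -N: exactly one node per line.
PER_NODE = """\
ai-gpu01 idle
ai-gpu02 idle
ai-gpu03 idle
ai-gpu04 mix
"""


def count_usable_lines(sinfo_output):
    """Count lines whose state is idle or mix, treating each line as one node."""
    count = 0
    for line in sinfo_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1] in ("idle", "mix"):
            count += 1
    return count


grouped_count = count_usable_lines(GROUPED)    # three idle nodes collapse into one line
per_node_count = count_usable_lines(PER_NODE)  # every node is counted
print(grouped_count, per_node_count)
```

A line-based parser sees only two usable nodes in the grouped output but four in the per-node output, which is the kind of underestimate described above.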
New Features
- feat: support avail
The previous implementation did not consider multi-gpu settings. The 'available' hints now take the number of required gpus into account. That is, 'Available: x' means that x tasks can be submitted to idle or mix nodes.
- feat: support email
Add email settings to config.
- feat: main page avail
Show the available numbers for the last execution.
- feat: default value for walltime and email
Global settings for the default values of walltime and email.
- feat: set default values for submission
Users can set the default values for job submission in the global configs. This may help speed up the job submission process.
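As a rough illustration of the 'Available: x' hint for multi-gpu tasks, the sketch below counts how many tasks fit on the free cards. It assumes that all gpus of one task must come from the same node; the function `available_tasks` and the node names are hypothetical, and the actual computation in sapp may differ.

```python
def available_tasks(free_gpus_per_node, gpus_per_task):
    """Return how many tasks fit, assuming a task cannot span several nodes."""
    if gpus_per_task < 1:
        raise ValueError("gpus_per_task must be at least 1")
    # Each node contributes as many whole tasks as its free cards allow.
    return sum(free // gpus_per_task for free in free_gpus_per_node.values())


# Hypothetical free-card counts on idle or mix nodes.
free = {"node1": 8, "node2": 3, "node3": 1}

single_gpu = available_tasks(free, 1)  # every free card hosts one task
dual_gpu = available_tasks(free, 2)    # leftover odd cards cannot be used
print(single_gpu, dual_gpu)
```

With 12 free cards in total, a naive count would suggest 6 two-gpu tasks, while the per-node count gives 5 because leftover cards on different nodes cannot be combined.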
What's Changed
- Bugfix 0.4.4 by @why-in-Shanghaitech in #7
Full Changelog: v0.4.3...v0.4.4
Release v0.4.3
Bug Fix
Since clash has been removed from github, this bugfix downloads the clash binary from another source on github.
New Features
- In menu page, the command to execute will be highlighted.
- Sapp now supports direct use of clash. The clash command will link to the clash binary file.
What's Changed
- Fix clash installation by @why-in-Shanghaitech in #6
Full Changelog: v0.4.2...v0.4.3
Release v0.4.2
Bug Fix
The previous implementation crashed if the slurm system had only 1 partition or 1 type of GPU card. This release fixes the problem with a better regex. It also lets users modify global settings, which helps them make better use of nodes with NVLINK support.
What's Changed
- Bugfix on other slurm systems by @why-in-Shanghaitech in #5
Full Changelog: v0.4.1...v0.4.2
Release v0.4.1
Bug Fix
This release fixes some critical bugs in sapp 0.4.0 that may affect functionality for most users.
Fix some bugs and enhance features, including:
- Release resources for sbatch jobs. The previous implementation raised an error.
- Multiple login nodes. The previous implementation did not consider the race condition between multiple login nodes.
- An unknown host in an ssh connection could raise the error: Host key verification failed.
- The previous implementation caused a second user to fail to create the lock file.
- Automatically generate key pairs and register authorized keys.
- Filter out squeue lines with status CG.
- Automatically kill the clash service on the login node at the end of the last sbatch job.
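The squeue filtering mentioned above can be sketched as follows. The sample text imitates the default squeue column layout, where ST is the job state and CG means the job is completing; the job data and the helper `active_job_lines` are hypothetical, and the sketch assumes no column value contains spaces.

```python
# Hypothetical squeue output in the default column layout.
SAMPLE = """\
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
    101       gpu    train    alice  R      10:02      1 gpu01
    102       gpu    clash    alice CG       0:05      1 gpu02
    103       gpu     eval    alice PD       0:00      1 (Priority)
"""


def active_job_lines(squeue_output):
    """Return job lines whose state column is not CG (completing)."""
    lines = squeue_output.splitlines()
    if not lines:
        return []
    # Locate the state column from the header instead of hard-coding its index.
    state_index = lines[0].split().index("ST")
    kept = []
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[state_index] != "CG":
            kept.append(line)
    return kept


active = active_job_lines(SAMPLE)
active_ids = [line.split()[0] for line in active]
print(active_ids)
```

Jobs in the CG state are already shutting down, so dropping them keeps a finished job from being counted as one that still needs the shared service.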
Improve accessibility
- Timestamp filename pattern. This makes quick job submission friendlier for sbatch users.
- Improve the help message in the interactive form.
What's Changed
- sbatch Bugfix by @why-in-Shanghaitech in #2
- SSH and mutli-user Bugfix by @why-in-Shanghaitech in #3
- Release sapp 0.4.1 by @why-in-Shanghaitech in #4
Full Changelog: v0.4.0...v0.4.1
Release v0.4.0
This release substantially extends the functionality of sapp.
New Features
- Better UI. Use npyscreen to create a more user-friendly UI.
  - The options are shown with the help message.
  - The user now has full access to all the data. The previous version did not allow users to delete settings through the console. Users can now create, edit and remove settings. Setting names can include special characters.
- Support sbatch submission.
- Sapp now allows users to execute srun and sbatch without worrying about file changes. Sapp copies the files detected in the command line to a temporary location and replaces the corresponding command line arguments with these copies. Users can change the script or config files after submitting the job, which greatly simplifies the work when the job cannot be executed immediately.
- Sapp can do automatic port forwarding. By default, sapp creates a clash service on the login node and connects it to the compute node through port forwarding. Jobs submitted with sapp can now connect to the Internet. You can also manually establish a proxy service on the login node and tell sapp which port to forward.
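The file-copy behaviour for srun and sbatch can be pictured with a minimal sketch. It assumes that any argument naming an existing file is copied and that the executable itself (the first argument) is left alone; the function `snapshot_command` and the file names are hypothetical and do not reflect how sapp actually detects files.

```python
import os
import shutil
import tempfile


def snapshot_command(args, snapshot_dir):
    """Copy file arguments into snapshot_dir and return the rewritten command."""
    new_args = []
    for index, arg in enumerate(args):
        # Skip the executable itself; only later arguments are candidates.
        if index > 0 and os.path.isfile(arg):
            # Prefix with the argument index so equal basenames do not collide.
            target = os.path.join(snapshot_dir, f"{index}_{os.path.basename(arg)}")
            shutil.copy2(arg, target)
            new_args.append(target)
        else:
            new_args.append(arg)
    return new_args


with tempfile.TemporaryDirectory() as workdir:
    script = os.path.join(workdir, "train.sh")
    with open(script, "w") as handle:
        handle.write("echo v1\n")
    snapshots = os.path.join(workdir, "snapshots")
    os.makedirs(snapshots)

    command = snapshot_command(["sbatch", "--gres=gpu:1", script], snapshots)

    # Editing the original after submission does not touch the snapshot.
    with open(script, "w") as handle:
        handle.write("echo v2\n")
    with open(command[2]) as handle:
        snapshot_text = handle.read()

print(command[:2], os.path.basename(command[2]), repr(snapshot_text))
```

Because the queued job reads the copy, later edits to the original script only affect future submissions.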
What's Changed
- Release sapp 0.4.0 by @why-in-Shanghaitech in #1
Full Changelog: v0.3.1...v0.4.0
Release v0.3.1
sapp is now switching to pip installation!
Change
- Switch to pip installation. The .bashrc installation is no longer supported.
- Support LAST-RUN. The default option when a user enters sapp is now the one the user submitted last time.