Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for controlling the number of jobs used by some cmake tasks #107

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

KazNX
Copy link
Contributor

@KazNX KazNX commented Jul 26, 2021

This change adds support for build arg --cmake-jobs to set the number of jobs to use to preserve CPU for the OS. It also renames _get_make_arguments() to _get_make_jobs_arguments() to better reflect the responsibilities of the function.

There is currently support to set the number of job threads to the maximum available, unless the MAKEFLAGS environment variable is set. The assumption is that MAKEFLAGS will be used by the build framework to control the number of jobs. However, this is only true for make - as far as I can tell, ninja does not have any such environment variable based control (related issues reported against ninja 922, 1482).

When MAKEFLAGS is not set, colcon will maxout the -j argument for ninja. Since ninja is so efficient, this can make the computer unresponsive for the OS and other applications. This change has been made to allow the OS to remain somewhat responsive.

As part of the change, I noted that _get_make_arguments() appears to be misnamed as the only thing it does is set the number of jobs to use, as evidenced by the use of the returned values. I've renamed it _get_make_jobs_arguments() to better reflect this. Happy to revert this change if required.

The --cmake-jobs args supports the following usage patterns:

  • Positive number - use that many threads.
  • Zero - use all available threads (current behaviour)
  • Negative - use all but the absolute value of this number of threads (minimum use 1 thread).

Kazys Stepanas added 2 commits August 30, 2021 13:25
- Add support for `build` arg `--cmake-jobs` to set the number of jobs to use.
- Rename `_get_make_arguments()` to `_get_make_jobs_arguments()` for clarity
@KazNX
Copy link
Contributor Author

KazNX commented Aug 30, 2021

PR has been rebased onto master fixing the CI check failures.

Copy link
Contributor

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this LGTM. @dirk-thomas ?

colcon_cmake/task/cmake/build.py Outdated Show resolved Hide resolved
colcon_cmake/task/cmake/build.py Outdated Show resolved Hide resolved
colcon_cmake/task/cmake/build.py Outdated Show resolved Hide resolved
colcon_cmake/task/cmake/__init__.py Outdated Show resolved Hide resolved
KazNX and others added 3 commits September 14, 2021 09:29
Slight reorganisation to change jobs variable initialisation.

Co-authored-by: Michel Hidalgo <[email protected]>
Rename `is_job_base_generator` to `is_job_based_generator` to be more correct
This better reflects the nature of the function.
Copy link
Contributor

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me, but I'd appreciate @dirk-thomas or @cottsay input as well.

colcon_cmake/task/cmake/build.py Outdated Show resolved Hide resolved
Copy link
Member

@cottsay cottsay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't simply specifying the CMAKE_BUILD_PARALLEL_LEVEL environment variable work for Ninja?

type=int,
help='Number of jobs to use for supported generators (e.g., Ninja '
'Makefiles). Negative values subtract from the maximum '
'available, so --jobs=-1 uses all bar 1 available threads.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
'available, so --jobs=-1 uses all bar 1 available threads.')
'available, so --jobs=-1 uses all but 1 available threads.')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a typo

Copy link
Contributor Author

@KazNX KazNX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't simply specifying the CMAKE_BUILD_PARALLEL_LEVEL environment variable work for Ninja?

@cottsay Not sure, but it was introduced in CMake 3.12. I'm working with baseline support for 3.10 as ships with Ubuntu 18.04.

:rtype: bool
"""
known_jobs_base_multi_configuration_generators = (
'Ninja Multi-Config',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally made this change on top of PR 106 then pulled this out. You can view this as a defensive or pre-emptive change. It should have little to no impact until then as the ninja multi-config won't really work yet.

@@ -240,8 +249,8 @@ async def _build(self, args, env, *, additional_targets=None):
cmd += ['--clean-first']
if multi_configuration_generator:
cmd += ['--config', self._get_configuration(args)]
else:
job_args = self._get_make_arguments(env)
if jobs_base_generator:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, the removal of the else and replacing it with this separate condition check relates to ninja multi-config support, but should have no impact in isolation.

# the number of cores can't be determined
return []
# Finalise jobs as as CPU count deducting the limit specified.
jobs = max(jobs + jobs_args, 1)
return [
'-j{jobs}'.format_map(locals()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rename of the function is based on this result. It can only return jobs. Will revert if appropriate.

@@ -343,8 +362,8 @@ async def _install(self, args, env):
args.build_base, args.cmake_args)
if multi_configuration_generator:
cmd += ['--config', self._get_configuration(args)]
elif allow_job_args:
job_args = self._get_make_arguments(env)
if allow_job_args:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This elif to if change also supports ninja multi-config.

colcon_cmake/task/cmake/build.py Outdated Show resolved Hide resolved
colcon_cmake/task/cmake/__init__.py Outdated Show resolved Hide resolved
colcon_cmake/task/cmake/__init__.py Outdated Show resolved Hide resolved
colcon_cmake/task/cmake/build.py Outdated Show resolved Hide resolved
colcon_cmake/task/cmake/build.py Outdated Show resolved Hide resolved
type=int,
help='Number of jobs to use for supported generators (e.g., Ninja '
'Makefiles). Negative values subtract from the maximum '
'available, so --jobs=-1 uses all bar 1 available threads.')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a typo

@KazNX
Copy link
Contributor Author

KazNX commented Sep 15, 2021

@cottsay Looking at the CMake 3.12 release notes, it looks like they added --parallel and --jobs arguments as well as CMAKE_BUILD_PARALLEL_LEVEL. This is great, but I need to target a minimum version of CMake 3.10 for now (the default CMake version for Ubuntu 18.04).

This changes the context of this PR a bit and I see the need to revise the original code I modified to deal with that as well. I propose the ideal solution is to use the changes I've made with CMake < 3.12. For CMake 3.12+, change this function to return --parallel and --jobs instead, which special provisions not to append these after -- is added to the command line. _is_jobs_based_generator() can then return True in call cases.

My question is, are the changes better made in this PR or a to push this through and start a new PR?

@cottsay
Copy link
Member

cottsay commented Sep 15, 2021

are the changes better made in this PR or a to push this through and start a new PR?

I see two conceptual changes:

  1. We should to stop mucking with the MAKEFLAGS if CMAKE_BUILD_PARALLEL_LEVEL is specified in the environment, which would support this use case for any system with CMake 3.12 or newer. This should be a trivial change. We should not remove the existing MAKEFLAGS arguments outright in favor of CMAKE_BUILD_PARALLEL_LEVEL because it uses --load-average, which achieves different behavior than --jobs alone.
  2. We should introduce a mechanism to support your use case with CMake older than 3.12. I'm not sure if a full-fledged command line option is the right move or something a little more "hidden" like an environment variable. Maybe a deprecated command line option would be good - or one that shows a deprecation warning if you're already using 3.12 and should switch to the supported environment variable for doing this.

The latter can be done in this PR, I think. The former should probably be done in a new PR. Thoughts?

@KazNX
Copy link
Contributor Author

KazNX commented Sep 15, 2021

I see two conceptual changes:

I fully agree with point 1, but would add that it's not just using CMAKE_BUILD_PARALLEL_LEVEL. It needs to be either/or with passing cmake args for --parallel and/or --jobs. I forsee some annoying permutations and need to learn more about how these interact.

On point 2, the deprecation warning only makes sense if CMAKE_BUILD_PARALLEL_LEVEL or --parallel and/or --jobs are present. Something like:

if cmake_ver >= 3.12 and (have_env_var('CMAKE_BUILD_PARALLEL_LEVEL') or have_cmake_arg('--parallel') or have_cmake_arg('--jobs')):
  print('deprecation warning')

Again, I need more information aboug how the command line arguments interact with CMAKE_BUILD_PARALLEL_LEVEL. Will get back to you on that.

@KazNX
Copy link
Contributor Author

KazNX commented Sep 16, 2021

FYI, -j --parallel and CMAKE_BUILD_PARALLEL_LEVEL are all the same. Command line help from CMake 3.12;

-j [] --parallel [] = Build in parallel using
the given number of jobs. If is omitted
the native build tool's default number is used.
The CMAKE_BUILD_PARALLEL_LEVEL environment variable
specifies a default parallel level when this option
is not given.

Revised how jobs are specified.

For cmake 3.12+, we use `cmake --build -jN`.

For cmake < 3.12, we use `cmake --build [args] -- -jN -lN`, but only for Ninja and Makefiles generators.

Note we no longer check if we have a jobs based generator. This is now managed internally to `_get_jobs_arguments()`.

Also note we now add `--` inside `_get_jobs_arguments()` as it is not needed for CMake 3.12+
@KazNX
Copy link
Contributor Author

KazNX commented Sep 17, 2021

@cottsay @hidmic
I've made a significant revision of this PR based on CMake 3.12+ supporting --parallel or -jN. Note that it sill uses a new colcon build argument --cmake-jobs in all cases. I can't use --cmake-args because that only affects CMake configuration and we need to inject arguments into cmake --build.

The changes are summaried as:

  • For cmake 3.12+, we use cmake --build -jN.
  • For cmake < 3.12, we use cmake --build [args] -- -jN -lN, but only for Ninja and Makefiles generators.

Note we no longer check if we have a jobs based generator. This is now managed internally to _get_jobs_arguments().

Also note we now add -- inside _get_jobs_arguments() as it is not needed for CMake 3.12+

@dirk-thomas
Copy link
Member

Since I don't have the spare time to review this change I leave it up to the co-maintainers to review and test this.

A general - fairly significant - concern I have: the Travis CI integration for all colcon repositories has been broken for a while. Merging any changes has an increased risk to introduce regressions. I strongly suggest to first "get the house in order" and reestablish CI coverage for the lost platforms - e.g. by adding GitHub action support to all repos.

@KazNX
Copy link
Contributor Author

KazNX commented Sep 27, 2021

@cottsay @hidmic Correct me if I'm wrong, but I read Dirk's comments as concerns over the current state of master rather than a applying to this PR. In that light, are there any further changes required to get this PR merged?

@cottsay
Copy link
Member

cottsay commented Sep 27, 2021

...over the current state of master rather than a applying to this PR.

Neither this PR nor master are getting test coverage on Linux or macOS because of the missing Travis builds.

are there any further changes required to get this PR merged?

I'm still not feeling great about merging a PR that adds a command line option to support such a narrow use case, especially one that has an official mechanism available in newer versions of CMake.

Here's a completely different suggestion - could we expose a free-form --cmake-build-args option to correspond to the existing free-form --cmake-args option? That would be useful outside of the Ninja use case, and wouldn't be obsoleted by CMake 3.12+.

@KazNX
Copy link
Contributor Author

KazNX commented Sep 27, 2021

Here's a completely different suggestion - could we expose a free-form --cmake-build-args option to correspond to the existing free-form --cmake-args option? That would be useful outside of the Ninja use case, and wouldn't be obsoleted by CMake 3.12+.

Yes, I like this option. It does mean removing the _get_jobs_arguments() method (previously _get_make_arguments()) completely because to prevent it from adding implied -j arguments.

@KazNX
Copy link
Contributor Author

KazNX commented Sep 28, 2021

I've created a new PR for the --cmake-build-args approach. See here #110

I managed to maintain _get_make_arguments(), though I think that functionality is questionable.

We can deprecate this PR if the 110 is the preferred solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants