Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[runtime env] Async pip runtime env #22381

Merged
merged 18 commits into from
Feb 24, 2022

Conversation

fyrestone
Copy link
Contributor

@fyrestone fyrestone commented Feb 15, 2022

Why are these changes needed?

In order to initialize runtime env concurrently, this PR makes pip runtime env asynchronous. It includes,

  • New check_output_cmd in runtime env utils.
  • Async PipProcessor.
  • The asynccontextmanager from https://github.com/python-trio/async_generator for Python 3.6
  • Remove pip runtime env lock.
  • Disable pip cache.

Related issue number

#21950

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@fyrestone fyrestone self-assigned this Feb 15, 2022
Copy link
Contributor

@SongGuyang SongGuyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great for me! Only left some little comments.

python/ray/_private/runtime_env/utils.py Show resolved Hide resolved
str_list = [super().__str__().strip(",.")]
out = {
"stdout": self.stdout,
"stderr": self.stderr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very similar to this part, maybe you can modify the implementation in conda_utils.py to avoid splitting the implementation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part

Yes. But we can't assume that the stderr is always redirected to the stdout. The implementation of SubprocessCalledProcessError should cover the interface of the parent subprocess.CalledProcessError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very similar to this part, maybe you can modify the implementation in conda_utils.py to avoid splitting the implementation

The exec_cmd and exec_cmd_stream_to_logger in the conda_utils.py are used by many runtime envs, not only by the conda. I thinks it's better to put the exec utils into the utils.py instead of conda_utils.py.

But, we are going to make all the runtime env async, so exec_cmd and exec_cmd_stream_to_logger will be removed after they are not referenced.

python/ray/_private/runtime_env/utils.py Outdated Show resolved Hide resolved
@fyrestone fyrestone marked this pull request as ready for review February 16, 2022 07:18
@fyrestone
Copy link
Contributor Author

@architkulkarni @edoakes @rkooo567 Hi, can you review this PR?

Copy link
Contributor

@rkooo567 rkooo567 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice change btw :)! I will leave the further review to @architkulkarni who should have more context in this code path

python/ray/_private/async_compat.py Show resolved Hide resolved
@edoakes edoakes removed their assignment Feb 18, 2022
@edoakes
Copy link
Contributor

edoakes commented Feb 18, 2022

Deferring to @architkulkarni to review. Let's make sure there's no regression in logging/exception messages due to the change (hopefully our existing tests are enough here)

from contextlib import asynccontextmanager
except ImportError:
# Copy from https://github.com/python-trio/async_generator
# for compatible with Python 3.6
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rkooo567 The Python 3.6 does not support asynccontextmanager. This code is copied from async_generator.

Copy link
Contributor

@architkulkarni architkulkarni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! Just have one question:

Are we sure that it's safe to remove the lock for concurrent pip installs? I think concurrently installing pip environments can cause some unexpected errors, e.g. #17086. Or is it safe now because we added --no-cache-dir?

If we can add a test to cover the case of concurrent installs, I'd feel better about this. Something like installing two runtime envs, pip: ["A", "B"] concurrently with pip: ["A", "C"]. Is it possible to add such a test to this PR?

@fyrestone
Copy link
Contributor Author

This looks good to me! Just have one question:

Are we sure that it's safe to remove the lock for concurrent pip installs? I think concurrently installing pip environments can cause some unexpected errors, e.g. #17086. Or is it safe now because we added --no-cache-dir?

If we can add a test to cover the case of concurrent installs, I'd feel better about this. Something like installing two runtime envs, pip: ["A", "B"] concurrently with pip: ["A", "C"]. Is it possible to add such a test to this PR?

Concurrent installs in different virtualenvs is OK because,

  • Creating a virtualenv with a clean and dedicated directory for the app data. Related options: --app-data, --reset-app-data.
  • Disable the pip cache (Set a dedicated cache dir for pip also works, but the cache is useless because pip runtime env only install once to the virtualenv). Related options: --no-cache-dir.

I will add a test to cover the current installs.

@fyrestone
Copy link
Contributor Author

fyrestone commented Feb 23, 2022

This looks good to me! Just have one question:

Are we sure that it's safe to remove the lock for concurrent pip installs? I think concurrently installing pip environments can cause some unexpected errors, e.g. #17086. Or is it safe now because we added --no-cache-dir?

If we can add a test to cover the case of concurrent installs, I'd feel better about this. Something like installing two runtime envs, pip: ["A", "B"] concurrently with pip: ["A", "C"]. Is it possible to add such a test to this PR?

I noticed that the test_runtime_env_complicated.py::test_simultaneous_install already covers the concurrent pip installs. Please refer to: https://github.com/ray-project/ray/blob/master/python/ray/tests/test_runtime_env_complicated.py#L706

@fyrestone fyrestone added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Feb 23, 2022
@fyrestone
Copy link
Contributor Author

@architkulkarni Hi, all the comments have been fixed. Can this PR be merged?

Copy link
Contributor

@architkulkarni architkulkarni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! There were a handful of failed tests in CI but I don't think they're related to this PR; I restarted them just to have more confidence.

@fyrestone
Copy link
Contributor Author

Looks good to me! There were a handful of failed tests in CI but I don't think they're related to this PR; I restarted them just to have more confidence.

Thanks.

@jovany-wang jovany-wang merged commit 6a9a286 into ray-project:master Feb 24, 2022
@jovany-wang jovany-wang deleted the async_pip_runtime_env branch February 24, 2022 03:03
simonsays1980 pushed a commit to simonsays1980/ray that referenced this pull request Feb 27, 2022
In order to initialize runtime env concurrently, this PR makes pip runtime env asynchronous. It includes,

- [x] New `check_output_cmd` in runtime env utils.
- [x] Async PipProcessor.
- [x] The `asynccontextmanager` from `https://github.com/python-trio/async_generator` for Python 3.6
- [x] Remove pip runtime env lock.
- [x] Disable pip cache.

Co-authored-by: 刘宝 <[email protected]>
raulchen pushed a commit that referenced this pull request Feb 28, 2022
…n in threads. (#22588)

I recently realized that during a runtime_env creation process, a plugin/manager that is very slow to setup may block the creation of other runtime_env, so I make plugin/manager setup run in threads.

[The refactor of `PipManager`](#22381) is about to be completed, so I ignore it in this PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants