Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[runtime env] Allow __MACOS folder and multiple top-level directories for remote URI zip #29010

Merged

Conversation

architkulkarni
Copy link
Contributor

Signed-off-by: Archit Kulkarni [email protected]

Why are these changes needed?

Implements the changes proposed in #28859, please see the issue for full details.

Copied from the issue:

If there is a single top level directory and possibly a __MACOS folder, stick to the current behavior and use the top level directory in the zip file as the working directory (this is the current behavior, but also fixes the __MACOS folder problem)
If there are multiple files or folders in the .zip file, take the contents of the zip file itself as the working directory.
This behavior will be fully backwards compatible and supports a common format of .zip files (i.e. put all the files into a top level directory), yet it also allows users to put files directly into the .zip file and removes the footgun.

Related issue number

Closes #28859

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Archit Kulkarni <[email protected]>
Signed-off-by: Archit Kulkarni <[email protected]>
Copy link
Contributor

@simon-mo simon-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stamp

"directory of example_dir when creating the zip file. "
"You can check the contents with `zipinfo -1 example.zip`."
)
if top_level_directory is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is always true right? because top_level_directory is a boolean from the args

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be thinking of remove_top_level_directory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_top_level_dir_from_compressed_package returns None if there isn't a unique top-level directory (not counting __MACOS

Copy link
Contributor

@shrekris-anyscale shrekris-anyscale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, these changes look good. I left a couple comments. Could you update the documentation to remove (e.g. __MACOSX/)?

Screen Shot 2022-10-06 at 10 23 37 AM

)
if top_level_directory is not None:
# Remove __MACOSX directory if it exists
macos_dir = os.path.join(target_dir, MAC_OS_ZIP_HIDDEN_DIR_NAME)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also remove the __MACOSX directory even if there's no top-level directory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong preference, but I would prefer not to, just to change as little possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. to only manually delete things if we really have to

python/ray/_private/runtime_env/packaging.py Outdated Show resolved Hide resolved
Co-authored-by: shrekris-anyscale <[email protected]>
Signed-off-by: Archit Kulkarni <[email protected]>
@architkulkarni
Copy link
Contributor Author

Oh great point, I forgot to update the documentation

@architkulkarni architkulkarni added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Oct 6, 2022
@architkulkarni architkulkarni requested a review from a team as a code owner October 6, 2022 18:29
@architkulkarni
Copy link
Contributor Author

architkulkarni commented Oct 6, 2022

@shrekris-anyscale, I updated the docs.

Do you think we should actually remove the entire part in the docs about requiring a top-level directory, or leave it the way it is? It depends whether we want to treat the multiple top-level directory as a recommended path, or treat it as an edge case / user error that we're handling gracefully.

@architkulkarni
Copy link
Contributor Author

cc @pcmoritz for docs codeowner review, and in case you want to weigh in on the docs question above

@architkulkarni architkulkarni removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Oct 6, 2022
You can avoid this by using the ``zip -r`` command directly on the directory you want to compress. Make sure to run the command from that directory's parent.
If Ray detects more than a single directory at the top level, it will use the entire zip file instead of the top-level directory, which may lead to unexpected behavior.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid this, use the zip -r command directly on the directory you want to compress from its parent's directory. For example, if you have a directory structure such as: a/b and you what to compress b, issue the zip -r b from the directory a.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that's what you mean with zip -r?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct and the way you've stated it is cleaner. I'll update this.

Copy link
Contributor

@dmatrix dmatrix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left minor comments. But should be good.

@architkulkarni
Copy link
Contributor Author

Of the tests which failed in both this commit (merge master) and the previous commit:
linkcheck: ( rllib/rllib-env: line 371) broken https://github.com/Farama-Foundation/PettingZoo/blob/master/tutorials/rllib_pistonball.py - 404 Client Error: Not Found for url: https://github.com/Farama-Foundation/PettingZoo/blob/master/tutorials/rllib_pistonball.py
unrelated
Windows serve:tutorial_rllib: was broken on master, unrelated
Windows tests:test_client_library_integration: was broken on master, unrelated

@architkulkarni architkulkarni added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Oct 7, 2022
@architkulkarni architkulkarni merged commit 5282ab4 into ray-project:master Oct 7, 2022
@architkulkarni architkulkarni deleted the runtime-env-zip-requirements branch October 7, 2022 22:47
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
… for remote URI zip (ray-project#29010)

Implements the changes proposed in ray-project#28859, please see the issue for full details.

Copied from the issue:

If there is a single top level directory and possibly a __MACOS folder, stick to the current behavior and use the top level directory in the zip file as the working directory (this is the current behavior, but also fixes the __MACOS folder problem)
If there are multiple files or folders in the .zip file, take the contents of the zip file itself as the working directory.
This behavior will be fully backwards compatible and supports a common format of .zip files (i.e. put all the files into a top level directory), yet it also allows users to put files directly into the .zip file and removes the footgun.

Signed-off-by: Weichen Xu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ray 2.1 tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[runtime env] Broaden acceptable directory structure for remote zip working_dir
5 participants