[Doc] [runtime env] Clarify zip requirements for remote URIs (#28026)
The requirements that a remote URI have a top-level directory are specified fully in the documentation; this PR makes the instructions more foolproof. This PR:
- specifies that the `zip` command must be run from the parent directory
- gives an example of the correct `zipinfo` output
- adds instructions to the error message when there's no top-level directory

Co-authored-by: shrekris-anyscale <[email protected]>
architkulkarni and shrekris-anyscale committed Sep 7, 2022
1 parent 27511ad commit 01d92fd
Showing 2 changed files with 20 additions and 11 deletions.
24 changes: 15 additions & 9 deletions doc/source/ray-core/handling-dependencies.rst
@@ -475,22 +475,28 @@ If you want to specify this directory as a local path, your ``runtime_env`` dict
Suppose instead you want to host your files in your ``/some_path/example_dir`` directory remotely and provide a remote URI.
You would need to first compress the ``example_dir`` directory into a zip file.
-You can use the following command in the Terminal to do so:

+There should be no other files or directories at the top level of the zip file, other than ``example_dir``.
+You can use the following command in the Terminal to do this:

.. code-block:: bash
-    zip -r example.zip /some_path/example_dir
+    cd /some_path
+    zip -r zip_file_name.zip example_dir
+Note that this command must be run from the *parent directory* of the desired ``working_dir`` to ensure that the resulting zip file contains a single top-level directory.
+In general, the zip file's name and the top-level directory's name can be anything.
+The top-level directory's contents will be used as the ``working_dir`` (or ``py_module``).

-In general, to compress a directory called ``directory_to_zip`` into a zip file called ``zip_file_name.zip``, the command is:
+You can check that the zip file contains a single top-level directory by running the following command in the Terminal:

.. code-block:: bash
-    # General command
-    zip -r zip_file_name.zip directory_to_zip
+    zipinfo -1 zip_file_name.zip
+    # example_dir/
+    # example_dir/my_file_1.txt
+    # example_dir/subdir/my_file_2.txt
-There should be no other files or directories at the top level of the zip file, other than ``example_dir``.
-In general, the zip file's name and the top-level directory's name can be anything.
-The top-level directory's contents will be used as the ``working_dir`` (or ``py_module``).
Suppose you upload the compressed ``example_dir`` directory to AWS S3 at the S3 URI ``s3://example_bucket/example.zip``.
Your ``runtime_env`` dictionary should contain:

@@ -504,7 +510,7 @@ Your ``runtime_env`` dictionary should contain:
You can inspect a zip file's contents by running the ``zipinfo -1 zip_file_name.zip`` command in the Terminal.
Some zipping methods can cause hidden files or metadata directories to appear in the zip file at the top level.
This will cause Ray to throw an error because the structure of the zip file is invalid since there is more than a single directory at the top level.
-You can avoid this by using the ``zip -r`` command directly on the directory you want to compress.
+You can avoid this by using the ``zip -r`` command directly on the directory you want to compress. Make sure to run the command from that directory's parent.
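
For readers who build the archive from a script rather than the Terminal, the same layout can be approximated in Python. This is a minimal sketch, not part of the commit: the helper name ``zip_with_top_level_dir`` is hypothetical, and it relies on the standard library's ``shutil.make_archive``, where ``root_dir`` plays the role of the parent directory you would ``cd`` into and ``base_dir`` is the folder name stored at the top of the archive.

```python
import shutil
from pathlib import Path


def zip_with_top_level_dir(directory: str, zip_stem: str) -> str:
    """Zip `directory` so the archive holds exactly one top-level folder.

    Hypothetical helper; roughly mirrors running
    `zip -r <zip_stem>.zip <dir_name>` from the directory's parent.
    """
    src = Path(directory).resolve()
    return shutil.make_archive(
        zip_stem,                   # output path without the ".zip" suffix
        "zip",
        root_dir=str(src.parent),   # the parent directory of the folder
        base_dir=src.name,          # the single top-level folder in the archive
    )
```

Calling ``zip_with_top_level_dir("/some_path/example_dir", "zip_file_name")`` would write ``zip_file_name.zip`` in the current directory, with every entry stored under ``example_dir/``.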

Currently, three types of remote URIs are supported for hosting ``working_dir`` and ``py_modules`` packages:

7 changes: 5 additions & 2 deletions python/ray/_private/runtime_env/packaging.py
@@ -801,10 +801,13 @@ def unzip_package(
     top_level_directory = get_top_level_dir_from_compressed_package(package_path)
     if top_level_directory is None:
         raise ValueError(
-            "The package at package_path must contain "
+            f"The zip package at {package_path} must contain "
             "a single top level directory. Make sure there "
             "are no hidden files at the same level as the "
-            "top level directory."
+            "top level directory. You can ensure this by running "
+            "`zip -r example.zip example_dir` from the parent "
+            "directory of example_dir when creating the zip file. "
+            "You can check the contents with `zipinfo -1 example.zip`."
         )

     remove_dir_from_filepaths(target_dir, top_level_directory)
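
To check an archive locally before uploading it, the condition that this error message describes can be approximated with the standard library. The sketch below is an illustration only: ``find_single_top_level_dir`` is a hypothetical helper, not Ray's ``get_top_level_dir_from_compressed_package``, whose implementation is not shown in this diff. It assumes the rule is "every entry lives under one top-level directory".

```python
import zipfile
from typing import Optional


def find_single_top_level_dir(zip_path: str) -> Optional[str]:
    """Return the sole top-level directory name, or None if there isn't one.

    Hypothetical helper; assumes entries use "/" separators, as zip
    archives do.
    """
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    # First path component of every entry, e.g. "example_dir" or ".DS_Store".
    top_level = {name.split("/", 1)[0] for name in names}
    if len(top_level) != 1:
        return None  # empty archive, or extra files/directories at the top
    candidate = next(iter(top_level))
    # A lone top-level file is not a directory: require an entry beneath it.
    if not any(name.startswith(candidate + "/") for name in names):
        return None
    return candidate
```

An archive created with ``zip -r zip_file_name.zip example_dir`` from the parent directory should yield ``"example_dir"``, while one that also contains a hidden file such as ``.DS_Store`` at the top level yields ``None``.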
