[Data] Document DataContext #43578

Merged on Mar 12, 2024 (11 commits)

bveeramani (Member) commented Mar 1, 2024

Why are these changes needed?

To configure Ray Data globally, you set attributes on the DataContext object. However, none of the DataContext settings are documented. This PR documents them and also simplifies the DataContext module.
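As a rough illustration of the pattern described above (one shared settings object whose attributes are read by the rest of the library), here is a minimal stand-in written with only the standard library. The class below is hypothetical and is not Ray's implementation. The attribute names are taken from this discussion, while the default values are assumptions. In Ray itself, the shared object is obtained with ray.data.DataContext.get_current().

```python
from dataclasses import dataclass
from typing import ClassVar, Optional


@dataclass
class DataContext:
    """Hypothetical stand-in for a process-wide settings object (not Ray's class)."""

    # Attribute names come from the PR discussion; the defaults are assumptions.
    enable_progress_bars: bool = True
    use_ray_tqdm: bool = True

    # Holds the single shared instance; ClassVar keeps it out of the dataclass fields.
    _current: ClassVar[Optional["DataContext"]] = None

    @classmethod
    def get_current(cls) -> "DataContext":
        # Create the shared instance lazily, then always return the same one.
        if cls._current is None:
            cls._current = cls()
        return cls._current


def progress_bars_enabled() -> bool:
    # Library code reads the shared context instead of taking a parameter.
    return DataContext.get_current().enable_progress_bars


# Setting an attribute on the shared object changes behavior everywhere.
ctx = DataContext.get_current()
ctx.enable_progress_bars = False
```

Because every caller receives the same object, a single attribute assignment acts as a global setting, which is also why undocumented attributes are hard for users to discover.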

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Balaji Veeramani <[email protected]>
python/ray/data/context.py (two review threads, outdated and resolved)
pcmoritz (Contributor) commented Mar 4, 2024

Can we call out which of these settings users are likely to want to configure and which ones are advanced? The best way would be to group them; if that's not possible, put an (advanced) marker next to the advanced ones.
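One possible way to express the grouping suggested above is to tag fields with metadata and derive the groups from it. This is only a sketch: the ExampleContext class is hypothetical, the defaults are made up, and which settings count as advanced is an assumption made for illustration, not a decision from this PR.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class ExampleContext:
    """Hypothetical settings class used only to illustrate grouping."""

    # Settings most users might touch (classification is an assumption).
    enable_progress_bars: bool = True
    use_ray_tqdm: bool = True
    # Settings tagged as advanced through dataclass field metadata.
    actor_prefetcher_enabled: bool = field(
        default=False, metadata={"advanced": True}
    )
    enable_get_object_locations_for_metrics: bool = field(
        default=False, metadata={"advanced": True}
    )


def group_settings(cls) -> Dict[str, List[str]]:
    """Split field names into 'common' and 'advanced' based on metadata."""
    groups: Dict[str, List[str]] = {"common": [], "advanced": []}
    for f in fields(cls):
        key = "advanced" if f.metadata.get("advanced", False) else "common"
        groups[key].append(f.name)
    return groups


groups = group_settings(ExampleContext)
```

A documentation generator could then render the two lists under separate headings, or append "(advanced)" to names in the second list.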

Also some more detailed questions: What's the difference between enable_progress_bars and use_ray_tqdm? That's currently not clear from the docs.

For many of these, users have no way to know what the setting does, for example actor_prefetcher_enabled, pipeline_push_based_shuffle_reduce_tasks (not documented), enable_get_object_locations_for_metrics, and write_file_retry_on_errors (not documented).

The reference to retry_exceptions should be a link.

bveeramani (Member, Author) commented

Can we call out which of these settings users are likely to want to configure and which ones are advanced? The best way would be to group them; if that's not possible, put an (advanced) marker next to the advanced ones.

@pcmoritz would you be opposed to us addressing this in a subsequent PR? I agree that the descriptions aren't very helpful right now. That said, before we invest time into improving the documentation, I was thinking we should go through the list of settings and remove unnecessary ones.

The reference to retry_exceptions should be a link.

Fixed.

bveeramani merged commit 56f142c into ray-project:master on Mar 12, 2024 (9 checks passed)
bveeramani deleted the document-context branch on March 12, 2024 20:13
hongchaodeng pushed a commit to hongchaodeng/ray that referenced this pull request Mar 13, 2024
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Jun 7, 2024

3 participants