[train] Setup xgboost `CommunicatorContext` automatically #44883

justinvyu · 2024-04-20T01:01:24Z

Why are these changes needed?

This PR uses the recently added Backend(train_func_context) configuration to set up the xgboost CommunicatorContext automatically, so users no longer have to enter it manually in their training code. As a result, single-worker training code needs fewer changes to run in a distributed setting.
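The mechanism described above can be sketched in plain Python. This is only an illustration under stated assumptions, not Ray Train's or xgboost's actual implementation: `SketchBackend`, `communicator_context`, `run_on_worker`, and the `rank` parameter are hypothetical stand-ins. The sketch also assumes that the backend hook returns a callable that builds the context manager. The point is that the framework, rather than the user's training function, enters the context.

```python
from contextlib import contextmanager

# Hypothetical stand-ins for illustration; NOT Ray Train or xgboost APIs.
events = []


@contextmanager
def communicator_context(**params):
    """Stand-in for a collective-communication context manager."""
    events.append(("enter", params))
    try:
        yield
    finally:
        events.append(("exit", params))


class SketchBackend:
    """A backend that supplies a context to wrap each worker's train function."""

    def __init__(self, **communicator_params):
        self.communicator_params = communicator_params

    def train_func_context(self):
        # Assumption: the hook returns a zero-argument callable that
        # builds a fresh context manager for the worker.
        return lambda: communicator_context(**self.communicator_params)


def run_on_worker(backend, train_fn, config):
    # The framework enters the context around the user's function.
    make_context = backend.train_func_context()
    with make_context():
        return train_fn(config)


def train_fn_per_worker(config):
    # User code: no explicit context manager is needed here.
    events.append(("train", config))
    return config["max_depth"]


result = run_on_worker(SketchBackend(rank=0), train_fn_per_worker, {"max_depth": 2})
```

In this sketch the training function has the same shape it would have for single-worker training, which is the usability gain this PR is after.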

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
  - I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

Signed-off-by: Justin Yu <[email protected]>

woshiyyya · 2024-04-22T21:52:40Z

python/ray/train/xgboost/v2.py

@@ -53,17 +51,16 @@ def train_fn_per_worker(config: dict):
 "max_depth": 2,
 }

- # 2. Do distributed data-parallel training with the `CommunicatorContext`.
+ # 2. Do distributed data-parallel training.
 # Ray Train sets up the necessary coordinator processes and


Shall we mention CommunicatorContext here?

justinvyu · 2024-04-22

I don't think this example needs to explain it, since users shouldn't have to think about it. I could add it to the XGBoostConfig docstring instead, since that is how users would customize the parameters passed into the CommunicatorContext.

woshiyyya · 2024-04-22T21:54:09Z

python/ray/train/xgboost/v2.py

 # Ray Train sets up the necessary coordinator processes and
 # environment variables for your workers to communicate with each other.
- with CommunicatorContext():


What would happen when using a single worker for training? Will the context manager here be a no-op?

justinvyu · 2024-04-22

Yes, it works with a single worker.

…t#44883) Automatically set up the xgboost `CommunicatorContext` for users so they don't have to call it manually in their training code. --------- Signed-off-by: Justin Yu <[email protected]>

justinvyu added 2 commits April 19, 2024 17:55

automatically setup CommunicatorContext

d65510a

Signed-off-by: Justin Yu <[email protected]>

return the context

2c2dd5e

Signed-off-by: Justin Yu <[email protected]>

justinvyu requested review from matthewdeng and woshiyyya as code owners April 20, 2024 01:01

matthewdeng approved these changes Apr 22, 2024

woshiyyya approved these changes Apr 22, 2024

justinvyu merged commit 8fe3ac4 into ray-project:master Apr 25, 2024
5 checks passed

justinvyu deleted the xgb_context branch April 25, 2024 18:00
