[Train] Restructure Ray Train Example Page #38814
Signed-off-by: woshiyyya <[email protected]>
I tried two other ways of organizing it. LMK what you think @woshiyyya, @angelinalg. I can push a commit if one of these makes sense to you!
[Screenshots of the two proposed layouts: "Table" and "Tabs + Table"]
Thanks Matt, that looks nice 👍 For me, the first one (table only) seems better because people can see all the examples at once without an extra click.
Signed-off-by: Matthew Deng <[email protected]>
@woshiyyya should we update the ToC to match these examples?
Lines 85 to 109 in 0562409
sections:
  - file: train/examples/pytorch/torch_fashion_mnist_example
    title: "PyTorch Fashion MNIST Example"
  - file: train/examples/transformers/transformers_torch_trainer_basic
    title: "Hugging Face Transformers Basic Example"
  - file: train/examples/lightning/lightning_mnist_example
    title: "PyTorch Lightning Basic Example"
  - file: train/examples/lightning/lightning_cola_advanced
    title: "PyTorch Lightning Advanced Example"
  - file: train/examples/lightning/lightning_exp_tracking
    title: "PyTorch Lightning with Experiment Tracking Tools"
  - file: train/examples/tf/tensorflow_mnist_example
    title: "TensorFlow MNIST Example"
  - file: train/examples/horovod/horovod_example
    title: "Horovod Example"
  - file: train/examples/tf/tune_tensorflow_mnist_example
    title: "Tune & TensorFlow Example"
  - file: train/examples/pytorch/tune_cifar_torch_pbt_example
    title: "Tune & PyTorch Example"
  - file: train/examples/pytorch/torch_data_prefetch_benchmark/benchmark_example
    title: "Torch Data Prefetching Benchmark"
  - file: train/examples/pytorch/pytorch_resnet_finetune
    title: "PyTorch Finetuning ResNet Example"
  - file: train/examples/lightning/vicuna_13b_lightning_deepspeed_finetune
    title: "Fine-tune Vicuna-13B with DeepSpeed and PyTorch Lightning"
I have an even more aggressive idea. Should we delete this secondary level directly? Putting the full name of each example in the side menu would make it too long. Additionally, the narrow width of the sidebar would result in poor readability.
Signed-off-by: matthewdeng <[email protected]>
I like the non-table version as well. Thanks for doing this. This is such an improvement. It's actually beautiful!
Signed-off-by: woshiyyya <[email protected]>
…into restructure_train_example_page
* [train] enable new persistence mode for core and serve tests (#38938)
  Signed-off-by: Matthew Deng <[email protected]>
* [train] New persistence mode: Update 🐠 `ML Libraries w/ Ray Client Examples (Python 3.7)` (#38923)
  Signed-off-by: Justin Yu <[email protected]>
* [train] remove non-URI assertion (#38944)
  Signed-off-by: Matthew Deng <[email protected]>
* [train] New persistence mode: Update 📖 `Doc tests and examples (excluding Ray AIR examples)` (#38940)
  Signed-off-by: Justin Yu <[email protected]>
  Signed-off-by: Matthew Deng <[email protected]>
  Co-authored-by: Matthew Deng <[email protected]>
* disable legacy sync config logic in trainable (#38952)
  Signed-off-by: Justin Yu <[email protected]>
* [2.7 CI][New Persistent Mode][6/n] 📖✈️ Ray AIR examples (#38918)
  Signed-off-by: woshiyyya <[email protected]>
* [2.7 CI][New Persistent Mode][2/n] 📺 📖 Doc GPU tests and examples (#38905)
  Signed-off-by: woshiyyya <[email protected]>
* [2.7 CI][New Persistent Mode][4/n] 📺 🚂 Train GPU tests & 🚂 Datasets Train Integration GPU Tests and Examples (#38910)
  Signed-off-by: woshiyyya <[email protected]>
  Signed-off-by: Justin Yu <[email protected]>
  Co-authored-by: Justin Yu <[email protected]>
* [2.7 CI][New Persistent Mode][1/n] 📺✈️ AIR GPU tests (ray/air) & ⚡ :python: Lightning 2.0 Train GPU tests (#38903)
  Signed-off-by: woshiyyya <[email protected]>
  Signed-off-by: Yunxuan Xiao <[email protected]>
* [train] Fix broken tune tests and support ray storage (#38950)
  This PR re-introduces support for ray storage ray.init(storage="s3:https://...") and fixes a broken tune controller test.
  Signed-off-by: Justin Yu <[email protected]>
* [train] New persistence mode: Finish migrating `xgb`, `lgbm` and `sklearn` trainers, checkpoints + tests (#38959)
  Signed-off-by: Justin Yu <[email protected]>
* [2.7 CI][New Persistent Mode][5/n] 📖 Doc examples for external code (#38915)
  Signed-off-by: woshiyyya <[email protected]>
* [train][rllib] temporarily disable new persistence mode for rllib tests (#38965)
  Signed-off-by: Matthew Deng <[email protected]>
* [2.7 CI][New Persistent Mode][8/n]✈️ AIR tests (ray/air) (#38932)
  Signed-off-by: woshiyyya <[email protected]>
* [tune] Storage: 🐙 🧠 Tune tests and examples {using RLlib} migration (#38895)
  Signed-off-by: Kai Fricke <[email protected]>
  Co-authored-by: matthewdeng <[email protected]>
* [train] Fix MosaicTrainer example and unit test (#38970)
  Signed-off-by: Justin Yu <[email protected]>
* [air/release] Fix dreambooth example image preprocessing logic (#39020)
  Signed-off-by: Justin Yu <[email protected]>
* [train] clean up ray.train._checkpoint imports (#38951)
  Signed-off-by: Matthew Deng <[email protected]>
* [train] high level cleanup of Ray Train docs (#38971)
  Signed-off-by: Matthew Deng <[email protected]>
* [wip][docs] update FrameworkPredictor examples (#38634)
  Signed-off-by: Matthew Deng <[email protected]>
  Signed-off-by: matthewdeng <[email protected]>
* [train] Add documentation for using metadata argument to save preprocessors (#38701)
* [Train] Restructure Ray Train Example Page (#38814)
  Signed-off-by: woshiyyya <[email protected]>
* [air] Deprecate some fields/classes that are supposed to be gone in 2.6. (#38794)
  Signed-off-by: xwjiang2010 <[email protected]>
* [tune/storage] Fix Tune multinode tests (#39050)
  Fixes multinode tests by using the new train.report() API.
  Signed-off-by: Kai Fricke <[email protected]>
* [tune] Fix BOHB example for new storage (#38983)
  The new storage path does not create "empty" checkpoints per default anymore.
  Previously, when no checkpoint is saved, PAUSEing a trial would create a dummy checkpoint that only contains trial metadata (such as the iteration number). This is not the case anymore. Examples now have to implement checkpointing to properly restore previous state.
  This was also true previously - but some of our simple examples (e.g. the one in this PR) didn't implement it and still "worked". I think it's fine to keep the functionality as is and require our examples to show checkpointing implementations. This will ensure that users don't shoot their feet trying to use e.g. BOHB.
  Separately, BOHB was malfunctioning as trials were repeatedly PAUSED and restarted as they've never been removed from `bracket.trials_to_unpause`. @justinvyu mentioned this in the review where it was introduced and I believed at the time it wasn't necessary - turns out it is, as we can end up in a situation where a bracket is never finished because trials are constantly running.
  This was not caught by any tests. We should add one in a follow-up - for now we can proceed with this PR to pick onto Ray 2.7.
  Signed-off-by: Kai Fricke <[email protected]>
* [Release Test] Fix `long_running_horovod_tune_test`. (#39012)
  Signed-off-by: Yunxuan Xiao <[email protected]>
  Signed-off-by: Yunxuan Xiao <[email protected]>
* [train] New persistence mode: `StorageContext` unit tests (#39023)
  Signed-off-by: Justin Yu <[email protected]>
* [train] enable train + tune tests and examples (#39021)
  Signed-off-by: Matthew Deng <[email protected]>
* [rllib] Fix storage-path related tests (#38947)
  This PR fixes rllib-related tests that didn't pass changes related to the new storage context.
  Signed-off-by: Kai Fricke <[email protected]>
  Signed-off-by: matthewdeng <[email protected]>
  Co-authored-by: matthewdeng <[email protected]>
* [train] New persistence mode: Migrate 🐙 `Tune tests and examples (medium)` (#39081)
  Signed-off-by: Justin Yu <[email protected]>
---------
Signed-off-by: Matthew Deng <[email protected]>
Signed-off-by: Justin Yu <[email protected]>
Signed-off-by: woshiyyya <[email protected]>
Signed-off-by: Yunxuan Xiao <[email protected]>
Signed-off-by: Kai Fricke <[email protected]>
Signed-off-by: matthewdeng <[email protected]>
Signed-off-by: xwjiang2010 <[email protected]>
Signed-off-by: Yunxuan Xiao <[email protected]>
Co-authored-by: Justin Yu <[email protected]>
Co-authored-by: Yunxuan Xiao <[email protected]>
Co-authored-by: Kai Fricke <[email protected]>
Co-authored-by: Eric Liang <[email protected]>
Co-authored-by: xwjiang2010 <[email protected]>
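The BOHB note in the commit message above (#38983) says that pausing a trial no longer creates a dummy checkpoint, so examples must save and restore their own state. The sketch below illustrates that idea with the standard library only. It is not the Ray Tune API and not the code from that PR: the `train` function, the `state.json` file name, and the `pause_after` parameter are all hypothetical names chosen for illustration.

```python
import json
import os
import tempfile


def train(num_iters, checkpoint_dir, pause_after=None):
    """Toy training loop that persists and restores its own state.

    All names here are hypothetical; a real Ray Tune trainable would use
    the library's own reporting and checkpoint calls instead.
    """
    path = os.path.join(checkpoint_dir, "state.json")

    # Restore the previous state if a checkpoint exists, else start fresh.
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
    else:
        state = {"iteration": 0, "loss": 1.0}

    steps_this_run = 0
    while state["iteration"] < num_iters:
        # Simulate the scheduler pausing the trial after a few steps.
        if pause_after is not None and steps_this_run >= pause_after:
            break
        state["iteration"] += 1
        state["loss"] *= 0.5  # stand-in for a real training step
        # Save after every step so a pause never loses progress.
        with open(path, "w") as f:
            json.dump(state, f)
        steps_this_run += 1
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as ckpt_dir:
        paused = train(4, ckpt_dir, pause_after=2)  # pauses at iteration 2
        resumed = train(4, ckpt_dir)                # resumes from iteration 2
        print(paused, resumed)
```

Without the restore step at the top of `train`, the second call would restart from iteration 0, which is the failure mode the commit message warns about for examples that never implemented checkpointing.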
Signed-off-by: woshiyyya <[email protected]> Signed-off-by: e428265 <[email protected]>
Signed-off-by: woshiyyya <[email protected]> Signed-off-by: Jim Thompson <[email protected]>
Signed-off-by: woshiyyya <[email protected]> Signed-off-by: Victor <[email protected]>
Why are these changes needed?
Rendered doc: https://anyscale-ray--38814.com.readthedocs.build/en/38814/train/examples.html
Related issue number
Checks
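- I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.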