forked from ray-project/ray
[AIR/train] Use new Train API (ray-project#25735)
Uses the new AIR Train API for examples and tests. The `Result` object gets a new attribute, `log_dir`, which points to the Trial's `logdir` so that users can access TensorBoard logs and the artifacts of other loggers. This PR only deals with the "low-hanging fruit": tests that need substantial rewriting and the Train user guide are not touched, and will be updated in follow-up PRs. Tests and examples that concern deprecated features, or that are duplicated in AIR, have been removed or disabled. Requires ray-project#25943 to be merged first.
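As a rough illustration of how the new `log_dir` attribute might be used, the sketch below mimics the `result.log_dir.parent` pattern from the updated logging example with plain `pathlib`, without running a real trainer. The trial directory name, the event file name, and the helper `find_event_files` are hypothetical stand-ins; where exactly the TensorBoard files end up is an assumption, so the helper searches the whole run directory.

```python
import tempfile
from pathlib import Path
from typing import List


def find_event_files(run_dir: Path) -> List[Path]:
    """Return TensorBoard event files found anywhere under run_dir."""
    return sorted(run_dir.rglob("events.out.tfevents.*"))


# Build a stand-in directory tree instead of running a real trainer.
# The trial directory and event file names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    experiment_dir = Path(tmp) / "TorchTrainer_2022-06-13_20-31-06"
    trial_dir = experiment_dir / "TorchTrainer_00000_example"
    trial_dir.mkdir(parents=True)
    (trial_dir / "events.out.tfevents.0").touch()

    # In real code this would be `result.log_dir` from `trainer.fit()`.
    log_dir = trial_dir

    # The parent of the trial's log directory is the run directory
    # that the example passes to `tensorboard --logdir`.
    run_directory = log_dir.parent
    event_files = find_event_files(run_directory)

    print("Run directory:", run_directory.name)
    print("Event files:", [p.name for p in event_files])
```

With a real `Result`, only the assignment to `log_dir` would change; the rest is ordinary path handling.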
Showing 38 changed files with 666 additions and 622 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
:orphan: | ||
|
||
torch_fashion_mnist_example | ||
=========================== | ||
|
||
.. literalinclude:: /../../python/ray/train/examples/torch_fashion_mnist_example.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
:orphan: | ||
|
||
torch_linear_dataset_example | ||
============================ | ||
|
||
.. literalinclude:: /../../python/ray/train/examples/torch_linear_dataset_example.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
:orphan: | ||
|
||
torch_linear_example | ||
==================== | ||
|
||
.. literalinclude:: /../../python/ray/train/examples/torch_linear_example.py |
This file was deleted.
This file was deleted.
This file was deleted.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
:orphan: | ||
|
||
tune_cifar_torch_pbt_example | ||
============================ | ||
|
||
.. literalinclude:: /../../python/ray/train/examples/tune_cifar_torch_pbt_example.py |
This file was deleted.
6 changes: 6 additions & 0 deletions
doc/source/train/examples/tune_torch_linear_dataset_example.rst
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
:orphan: | ||
|
||
tune_torch_linear_dataset_example | ||
================================= | ||
|
||
.. literalinclude:: /../../python/ray/air/examples/pytorch/tune_torch_linear_dataset_example.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,40 +1,53 @@ | ||
  from ray import train | ||
- from ray.train import Trainer | ||
- from ray.train.callbacks import MLflowLoggerCallback, TBXLoggerCallback | ||
+ from ray.air import RunConfig | ||
+ from ray.train.torch import TorchTrainer | ||
+ from ray.tune.integration.mlflow import MLflowLoggerCallback | ||
+ from ray.tune.logger import TBXLoggerCallback | ||
|
||
|
||
  def train_func(): | ||
      for i in range(3): | ||
          train.report(epoch=i) | ||
|
||
|
||
- trainer = Trainer(backend="torch", num_workers=2) | ||
- trainer.start() | ||
+ trainer = TorchTrainer( | ||
+     train_func, | ||
+     scaling_config={"num_workers": 2}, | ||
+     run_config=RunConfig( | ||
+         callbacks=[ | ||
+             MLflowLoggerCallback(experiment_name="train_experiment"), | ||
+             TBXLoggerCallback(), | ||
+         ], | ||
+     ), | ||
+ ) | ||
|
||
  # Run the training function, logging all the intermediate results | ||
  # to MLflow and Tensorboard. | ||
- result = trainer.run( | ||
-     train_func, | ||
-     callbacks=[ | ||
-         MLflowLoggerCallback(experiment_name="train_experiment"), | ||
-         TBXLoggerCallback(), | ||
-     ], | ||
- ) | ||
+ result = trainer.fit() | ||
|
||
- # Print the latest run directory and keep note of it. | ||
- # For example: /home/ray_results/train_2021-09-01_12-00-00/run_001 | ||
- print("Run directory:", trainer.latest_run_dir) | ||
+ # For MLFLow logs: | ||
|
||
+ # MLFlow logs will by default be saved in an `mlflow` directory | ||
+ # in the current working directory. | ||
+ |
||
- trainer.shutdown() | ||
+ # $ cd mlflow | ||
+ # # View the MLflow UI. | ||
+ # $ mlflow ui | ||
|
||
+ # You can change the directory by setting the `tracking_uri` argument | ||
+ # in `MLflowLoggerCallback`. | ||
+ |
||
+ # For TensorBoard logs: | ||
+ |
||
+ # Print the latest run directory and keep note of it. | ||
+ # For example: /home/ubuntu/ray_results/TorchTrainer_2022-06-13_20-31-06 | ||
+ print("Run directory:", result.log_dir.parent) # TensorBoard is saved in parent dir | ||
+ |
||
  # How to visualize the logs | ||
|
||
  # Navigate to the run directory of the trainer. | ||
- # For example `cd /home/ray_results/train_2021-09-01_12-00-00/run_001` | ||
+ # For example `cd /home/ubuntu/ray_results/TorchTrainer_2022-06-13_20-31-06` | ||
  # $ cd <TRAINER_RUN_DIR> | ||
  # | ||
- # # View the MLflow UI. | ||
- # $ mlflow ui | ||
- # | ||
  # # View the tensorboard UI. | ||
  # $ tensorboard --logdir . |