[docs][train]Make Train example titles, heading more consistent #39606

Merged
merged 18 commits on Sep 14, 2023
Changes from 1 commit
adding some links from guides to the overview pages; fix typos
Signed-off-by: angelinalg <[email protected]>
angelinalg committed Sep 13, 2023
commit 87097bab94ac5a257aa9878dda243186e1d0235a
3 changes: 3 additions & 0 deletions doc/source/train/deepspeed.rst
@@ -5,6 +5,9 @@ Get Started with DeepSpeed

The :class:`~ray.train.torch.TorchTrainer` can help you easily launch your `DeepSpeed <https://www.deepspeed.ai/>`_ training across a distributed Ray cluster.

Code example
------------

You only need to run your existing training code with a TorchTrainer. You can expect the final code to look like this:

.. code-block:: python
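The example that follows the ``code-block`` directive above is collapsed in this view. A minimal, non-authoritative sketch of the pattern the paragraph describes, assuming your existing DeepSpeed code is wrapped in a ``train_func``, might look like this:

.. code-block:: python

    # Sketch only: train_func is assumed to contain your existing DeepSpeed code.
    from ray.train import ScalingConfig
    from ray.train.torch import TorchTrainer


    def train_func():
        # Existing DeepSpeed setup and training loop, e.g.
        # model_engine, optimizer, _, _ = deepspeed.initialize(...)
        # followed by the usual forward/backward/step loop.
        ...


    trainer = TorchTrainer(
        train_func,
        scaling_config=ScalingConfig(num_workers=4, use_gpu=True),
    )
    result = trainer.fit()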
4 changes: 2 additions & 2 deletions doc/source/train/distributed-tensorflow-keras.rst
@@ -110,8 +110,8 @@ To customize the backend setup, you can pass a
For more configurability, see the :py:class:`~ray.train.data_parallel_trainer.DataParallelTrainer` API.


Run your training function
--------------------------
Run a training function
-----------------------

With a distributed training function and a Ray Train ``Trainer``, you are now
ready to start training.
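A minimal sketch of that launch step, assuming ``train_func`` is a Keras training function already written for ``tf.distribute.MultiWorkerMirroredStrategy``, might look like this:

.. code-block:: python

    # Sketch only: train_func is assumed to contain your distributed Keras code.
    from ray.train import ScalingConfig
    from ray.train.tensorflow import TensorflowTrainer


    def train_func():
        # Build and fit the Keras model inside a
        # tf.distribute.MultiWorkerMirroredStrategy scope here.
        ...


    trainer = TensorflowTrainer(
        train_func,
        scaling_config=ScalingConfig(num_workers=2),
    )
    result = trainer.fit()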
12 changes: 6 additions & 6 deletions doc/source/train/distributed-xgboost-lightgbm.rst
@@ -25,7 +25,7 @@ Quickstart
:end-before: __lightgbm_end__


Basic Training with Tree-Based Models in Train
Basic training with tree-based models in Train
----------------------------------------------

Just as in the original `xgboost.train() <https://xgboost.readthedocs.io/en/stable/parameter.html>`__ and
@@ -53,12 +53,12 @@ training parameters are passed as the ``params`` dictionary.
:end-before: __lightgbm_end__


Ray-specific params are passed in through the trainer constructors.
Trainer constructors pass Ray-specific parameters.
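To illustrate that split, a hedged sketch (with a small synthetic dataset standing in for real training data) might separate the two kinds of arguments like this:

.. code-block:: python

    # Sketch only: the dataset below is synthetic and stands in for real data.
    import ray
    from ray.train import ScalingConfig
    from ray.train.xgboost import XGBoostTrainer

    train_ds = ray.data.from_items(
        [{"x": float(i), "target": i % 2} for i in range(100)]
    )

    trainer = XGBoostTrainer(
        # Arguments handled by Ray Train go to the trainer constructor.
        scaling_config=ScalingConfig(num_workers=2),
        datasets={"train": train_ds},
        label_column="target",
        num_boost_round=10,
        # Native XGBoost training parameters go in the params dictionary.
        params={"objective": "binary:logistic", "eval_metric": ["logloss"]},
    )
    result = trainer.fit()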


.. _train-gbdt-checkpoints:

Save and Load XGBoost and LightGBM Checkpoints
Save and load XGBoost and LightGBM checkpoints
----------------------------------------------

When you train a new tree on every boosting round,
@@ -209,13 +209,13 @@ How to optimize XGBoost memory usage?
XGBoost uses a compute-optimized datastructure, the ``DMatrix``,
to hold training data. When converting a dataset to a ``DMatrix``,
XGBoost creates intermediate copies and ends up
holding a complete copy of the full data. The data will be converted
into the local dataformat (on a 64 bit system these are 64 bit floats.)
holding a complete copy of the full data. XGBoost converts the data
into the local data format. On a 64-bit system the format is 64-bit floats.
Depending on the system and original dataset dtype, this matrix can
thus occupy more memory than the original dataset.

The **peak memory usage** for CPU-based training is at least
**3x** the dataset size (assuming dtype ``float32`` on a 64bit system)
**3x** the dataset size, assuming dtype ``float32`` on a 64-bit system,
plus about **400,000 KiB** for other resources,
like operating system requirements and storing of intermediate
results.
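To make the rule of thumb concrete, a back-of-the-envelope estimate for a hypothetical 10 GiB ``float32`` dataset looks like this:

.. code-block:: python

    # Sketch only: rough estimate using the rule above; the 10 GiB figure is made up.
    dataset_gib = 10
    overhead_kib = 400_000  # OS requirements, intermediate results, etc.
    peak_gib = 3 * dataset_gib + overhead_kib / (1024 * 1024)
    print(f"Estimated peak CPU memory: ~{peak_gib:.1f} GiB")  # ~30.4 GiB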
6 changes: 3 additions & 3 deletions doc/source/train/getting-started-pytorch-lightning.rst
@@ -8,8 +8,8 @@ This tutorial walks through the process of converting an existing PyTorch Lightn
Learn how to:

1. Configure the Lightning Trainer so that it runs distributed with Ray and on the correct CPU or GPU device.
2. Configure the training function to report metrics and save checkpoints.
3. Configure scale and CPU or GPU resource requirements for a training job.
2. Configure a :ref:`training function <train-overview-training-function>` to report metrics and save checkpoints.
3. Configure :ref:`scaling <train-overview-scaling-config>` and CPU or GPU resource requirements for a training job.
4. Launch a distributed training job with a :class:`~ray.train.torch.TorchTrainer`.

Quickstart
@@ -29,7 +29,7 @@ For reference, the final code is as follows:
trainer = TorchTrainer(train_func, scaling_config=scaling_config)
result = trainer.fit()

1. Your `train_func` is the Python code that each distributed training worker executes.
1. Your `train_func` is the Python code that each distributed training :ref:`worker <train-overview-worker>` executes.
2. Your `ScalingConfig` defines the number of distributed training workers and whether to use GPUs.
3. Your `TorchTrainer` launches the distributed training job.

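For step 1 in the list above, a hedged sketch of the Lightning ``Trainer`` configuration inside ``train_func``, assuming a user-defined ``MyLightningModule`` and ``train_loader`` (both hypothetical placeholders), might look like this:

.. code-block:: python

    # Sketch only: MyLightningModule and train_loader are hypothetical placeholders
    # for your own LightningModule and dataloader.
    import lightning.pytorch as pl
    from ray.train.lightning import (
        RayDDPStrategy,
        RayLightningEnvironment,
        RayTrainReportCallback,
        prepare_trainer,
    )


    def train_func():
        model = MyLightningModule()
        trainer = pl.Trainer(
            devices="auto",
            accelerator="auto",
            strategy=RayDDPStrategy(),
            plugins=[RayLightningEnvironment()],
            callbacks=[RayTrainReportCallback()],
            enable_checkpointing=False,
        )
        trainer = prepare_trainer(trainer)
        trainer.fit(model, train_dataloaders=train_loader)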
6 changes: 3 additions & 3 deletions doc/source/train/getting-started-pytorch.rst
@@ -8,9 +8,9 @@ This tutorial walks through the process of converting an existing PyTorch script
Learn how to:

1. Configure a model to run distributed and on the correct CPU/GPU device.
2. Configure a dataloader to shard data across the workers and place data on the correct CPU or GPU device.
3. Configure a training function to report metrics and save checkpoints.
4. Configure scale and CPU or GPU resource requirements for a training job.
2. Configure a dataloader to shard data across the :ref:`workers <train-overview-worker>` and place data on the correct CPU or GPU device.
3. Configure a :ref:`training function <train-overview-training-function>` to report metrics and save checkpoints.
4. Configure :ref:`scaling <train-overview-scaling-config>` and CPU or GPU resource requirements for a training job.
5. Launch a distributed training job with a :class:`~ray.train.torch.TorchTrainer` class.

Quickstart
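A hedged, end-to-end sketch of steps 1 through 5 above, using a toy model and random data, might look like the following:

.. code-block:: python

    # Sketch only: a toy model and random data illustrate the steps listed above.
    import torch
    import ray.train.torch
    from ray.train import ScalingConfig
    from ray.train.torch import TorchTrainer


    def train_func():
        # Step 1: prepare the model for distributed training and device placement.
        model = ray.train.torch.prepare_model(torch.nn.Linear(4, 1))

        # Step 2: shard the dataloader across workers and move batches to the device.
        dataset = torch.utils.data.TensorDataset(torch.randn(64, 4), torch.randn(64, 1))
        loader = ray.train.torch.prepare_data_loader(
            torch.utils.data.DataLoader(dataset, batch_size=8)
        )

        loss_fn = torch.nn.MSELoss()
        optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
        for X, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(X), y)
            loss.backward()
            optimizer.step()

        # Step 3: report metrics (and optionally a checkpoint) back to Ray Train.
        ray.train.report({"loss": loss.item()})


    # Steps 4-5: configure scaling and launch the distributed job.
    trainer = TorchTrainer(train_func, scaling_config=ScalingConfig(num_workers=2))
    result = trainer.fit()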
8 changes: 4 additions & 4 deletions doc/source/train/getting-started-transformers.rst
@@ -7,8 +7,8 @@ This tutorial walks through the process of converting an existing Hugging Face T

Learn how to:

1. Configure your training function to report metrics and save checkpoints.
2. Configure scale and CPU/GPU resource requirements for your training job.
1. Configure a :ref:`training function <train-overview-training-function>` to report metrics and save checkpoints.
2. Configure :ref:`scaling <train-overview-scaling-config>` and CPU or GPU resource requirements for your training job.
3. Launch your distributed training job with a :class:`~ray.train.torch.TorchTrainer`.

Quickstart
@@ -28,7 +28,7 @@ For reference, the final code follows:
trainer = TorchTrainer(train_func, scaling_config=scaling_config)
result = trainer.fit()

1. `train_func` is the Python code that executes on each distributed training worker.
1. `train_func` is the Python code that executes on each distributed training :ref:`worker <train-overview-worker>`.
2. :class:`~ray.train.ScalingConfig` defines the number of distributed training workers and computing resources (e.g. GPUs).
3. :class:`~ray.train.torch.TorchTrainer` launches the distributed training job.

@@ -175,7 +175,7 @@ Set up a training function
--------------------------

First, update your training code to support distributed training.
You can begin by wrapping your code in a function:
You can begin by wrapping your code in a :ref:`training function <train-overview-training-function>`:

.. code-block:: python

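The collapsed block above wraps existing Transformers code in a training function. For the metrics-and-checkpoints part of step 1 in the earlier list, one possible, non-authoritative reporting pattern, assuming a PyTorch ``model`` object from your training loop, is:

.. code-block:: python

    # Sketch only: model, epoch, and loss are assumed to come from your training loop.
    import os
    import tempfile

    import torch
    import ray.train
    from ray.train import Checkpoint


    def report_metrics_and_checkpoint(model, epoch, loss):
        with tempfile.TemporaryDirectory() as tmpdir:
            torch.save(model.state_dict(), os.path.join(tmpdir, "model.pt"))
            ray.train.report(
                {"epoch": epoch, "loss": loss},
                checkpoint=Checkpoint.from_directory(tmpdir),
            )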
2 changes: 1 addition & 1 deletion doc/source/train/huggingface-accelerate.rst
@@ -3,7 +3,7 @@
Get Started with Hugging Face Accelerate
========================================

The :class:`~ray.train.torch.TorchTrainer` can help you easily launch your `Accelelate <https://huggingface.co/docs/accelerate>`_ training across a distributed Ray cluster.
The :class:`~ray.train.torch.TorchTrainer` can help you easily launch your `Accelerate <https://huggingface.co/docs/accelerate>`_ training across a distributed Ray cluster.

You only need to run your existing training code with a TorchTrainer. You can expect the final code to look like this:

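As with DeepSpeed above, a minimal sketch of the pattern, assuming an existing Accelerate training loop inside ``train_func``, might look like this:

.. code-block:: python

    # Sketch only: the Accelerate-specific code is assumed to already exist.
    from accelerate import Accelerator
    from ray.train import ScalingConfig
    from ray.train.torch import TorchTrainer


    def train_func():
        accelerator = Accelerator()
        # Existing Accelerate code: wrap the model, optimizer, and dataloaders with
        # accelerator.prepare(...), then run the usual training loop with
        # accelerator.backward(loss) in place of loss.backward().
        ...


    trainer = TorchTrainer(
        train_func,
        scaling_config=ScalingConfig(num_workers=4, use_gpu=True),
    )
    result = trainer.fit()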