
Commit

[docs] fixing broken references, links, note (ray-project#35694)
angelinalg committed May 25, 2023
1 parent 55315e8 commit 98a446b
Showing 11 changed files with 76 additions and 70 deletions.
1 change: 0 additions & 1 deletion doc/source/_toc.yml
@@ -407,7 +407,6 @@ parts:
- file: ray-observability/user-guides/debug-apps/debug-failures
- file: ray-observability/user-guides/debug-apps/optimize-performance
- file: ray-observability/user-guides/debug-apps/ray-debugging
- file: ray-observability/user-guides/debug-apps/ray-core-profiling
- file: ray-observability/user-guides/cli-sdk
- file: ray-observability/user-guides/configure-logging
- file: ray-observability/user-guides/add-app-metrics
42 changes: 0 additions & 42 deletions doc/source/cluster/kubernetes/user-guides/logging.md
@@ -144,48 +144,6 @@ kubectl logs raycluster-complete-logs-head-xxxxx -c fluentbit
[KubDoc]: https://kubernetes.io/docs/concepts/cluster-administration/logging/
[ConfigLink]: https://raw.githubusercontent.com/ray-project/ray/releases/2.4.0/doc/source/cluster/kubernetes/configs/ray-cluster.log.yaml

## Customizing Worker Loggers

When using Ray, all tasks and actors are executed remotely in Ray's worker processes.

:::{note}
To stream logs to a driver, they should be flushed to stdout and stderr.
:::

```python
import ray
import logging
# Initiate a driver.
ray.init()

@ray.remote
class Actor:
    def __init__(self):
        # Basic config automatically configures logs to
        # be streamed to stdout and stderr.
        # Set the severity to INFO so that info logs are printed to stdout.
        logging.basicConfig(level=logging.INFO)

    def log(self, msg):
        logger = logging.getLogger(__name__)
        logger.info(msg)

actor = Actor.remote()
ray.get(actor.log.remote("A log message for an actor."))

@ray.remote
def f(msg):
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger(__name__)
    logger.info(msg)

ray.get(f.remote("A log message for a task."))
```

```bash
(Actor pid=179641) INFO:__main__:A log message for an actor.
(f pid=177572) INFO:__main__:A log message for a task.
```
## Using structured logging

The metadata of tasks or actors may be obtained by Ray's :ref:`runtime_context APIs <runtime-context-apis>`.
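
As a rough sketch of that pattern (the `get_job_id` and `get_task_id` accessors are assumptions about the runtime-context API, not taken from this commit), a task can fold the metadata into its log lines:

```python
import logging

import ray

ray.init()

@ray.remote
def structured_log_task():
    # Query metadata about the currently running task from the runtime context.
    ctx = ray.get_runtime_context()
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger(__name__)
    # Emit the metadata alongside the message so each line can be traced
    # back to the job and task that produced it.
    logger.info(
        "job_id=%s task_id=%s msg=%s",
        ctx.get_job_id(),
        ctx.get_task_id(),
        "processing batch",
    )

ray.get(structured_log_task.remote())
```
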
2 changes: 1 addition & 1 deletion doc/source/ray-air/getting-started.rst
@@ -216,4 +216,4 @@ Next Steps
- :ref:`air-examples-ref`
- :ref:`API reference <air-api-ref>`
- :ref:`Technical whitepaper <whitepaper>`
- To check how your application is doing, you can use the :ref:`Ray dashboard<robservability-getting-started>`.
- To check how your application is doing, you can use the :ref:`Ray dashboard<observability-getting-started>`.
13 changes: 8 additions & 5 deletions doc/source/ray-observability/key-concepts.rst
@@ -66,7 +66,7 @@ internal stats (e.g., number of actors in the cluster, number of worker failures
and custom metrics (e.g., metrics defined by users). All stats can be exported as time series data (to Prometheus by default) and used
to monitor the cluster over time.

See :ref:`Ray Metrics <ray-metrics>` for more details.
See :ref:`Ray Metrics <dash-metrics-view>` for more details.

Exceptions
----------
@@ -93,9 +93,9 @@ See :ref:`Ray Debugger <ray-debugger>` for more details.

Profiling
---------
Ray is compatible with Python profiling tools such as ``CProfile``. It also supports its built-in profiling tool such as :ref:```ray timeline`` <ray-timeline-doc>`.
Ray is compatible with Python profiling tools such as ``CProfile``. It also supports its built-in profiling tool such as :ref:`ray timeline <ray-timeline-doc>`.

See :ref:`Profiling <ray-core-profiling>` for more details.
See :ref:`Profiling <dashboard-cprofile>` for more details.

Tracing
-------
@@ -166,13 +166,16 @@ Actor log messages look like the following by default.
(MyActor pid=480956) actor log message
.. _logging-directory-structure:

Logging directory structure
~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, Ray logs are stored in a ``/tmp/ray/session_*/logs`` directory.

..{note}:
The default temp directory is ``/tmp/ray`` (for Linux and MacOS). To change the temp directory, specify it when you call ``ray start`` or ``ray.init()``.
.. note::

The default temp directory is ``/tmp/ray`` (for Linux and MacOS). To change the temp directory, specify it when you call ``ray start`` or ``ray.init()``.

A new Ray instance creates a new session ID in the temp directory. The latest session ID is symlinked to ``/tmp/ray/session_latest``.
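
A minimal sketch of redirecting the session directory (the ``_temp_dir`` argument is an assumption based on current ``ray.init`` options, not part of this commit):

.. code-block:: python

    import ray

    # Store session folders, and therefore the logs, under a custom root
    # instead of the default /tmp/ray.
    ray.init(_temp_dir="/data/ray_tmp")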

@@ -8,7 +8,7 @@ There are currently three metrics supported: Counter, Gauge, and Histogram.
These metrics correspond to the same `Prometheus metric types <https://prometheus.io/docs/concepts/metric_types/>`_.
Below is a simple example of an actor that exports metrics using these APIs:

.. literalinclude:: doc_code/metrics_example.py
.. literalinclude:: ../doc_code/metrics_example.py
:language: python

While the script is running, the metrics are exported to ``localhost:8080`` (this is the endpoint that Prometheus would be configured to scrape).
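
The included ``metrics_example.py`` is not reproduced in this diff; a minimal sketch of such an actor might look like the following (metric names, tags, and the export-port argument are illustrative assumptions):

.. code-block:: python

    import time

    import ray
    from ray.util.metrics import Counter, Gauge, Histogram

    # _metrics_export_port pins the endpoint Prometheus scrapes (assumed knob).
    ray.init(_metrics_export_port=8080)

    @ray.remote
    class RequestHandler:
        def __init__(self):
            self.num_requests = Counter(
                "num_requests_total",
                description="Total number of requests handled.",
                tag_keys=("handler",),
            )
            self.queue_size = Gauge(
                "queue_size",
                description="Current size of the request queue.",
                tag_keys=("handler",),
            )
            self.latency = Histogram(
                "request_latency_s",
                description="Request latency in seconds.",
                boundaries=[0.01, 0.1, 1.0],
                tag_keys=("handler",),
            )
            for metric in (self.num_requests, self.queue_size, self.latency):
                metric.set_default_tags({"handler": "example"})

        def handle(self):
            start = time.time()
            time.sleep(0.05)  # stand-in for real work
            self.num_requests.inc()
            self.queue_size.set(0)
            self.latency.observe(time.time() - start)

    handler = RequestHandler.remote()
    ray.get(handler.handle.remote())
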
52 changes: 49 additions & 3 deletions doc/source/ray-observability/user-guides/configure-logging.rst
@@ -6,7 +6,7 @@ Configuring Logging
This guide helps you modify the default configuration of Ray's logging system.


Internal Ray Logging Configuration
Internal Ray logging configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When ``import ray`` is executed, Ray's logger is initialized, generating a sensible configuration given in ``python/ray/_private/log.py``. The default logging level is ``logging.INFO``.
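
For example, a sketch of lowering the verbosity of Ray's own logger after import (the ``"ray"`` logger name is an assumption about Ray's internal logger hierarchy):

.. code-block:: python

    import logging

    import ray  # importing ray installs its default logging configuration

    # Quiet Ray's internal log output without touching the root logger.
    logging.getLogger("ray").setLevel(logging.WARNING)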

@@ -40,7 +40,7 @@ Similarly, to modify the logging configuration for any Ray subcomponent, specify
# Here's how to add an additional file handler for ray tune:
ray_tune_logger.addHandler(logging.FileHandler("extra_ray_tune_log.log"))
For more information about logging in workers, see :ref:`Customizing worker loggers`.
For more information about logging in workers, see :ref:`Customizing worker loggers <customize-worker-loggers>`.

Disabling logging to the driver
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -106,7 +106,7 @@ By default Ray prints Actor logs prefixes in light blue:
Users may instead activate multi-color prefixes by setting the environment variable ``RAY_COLOR_PREFIX=1``.
This will index into an array of colors modulo the PID of each process.

.. image:: ./images/coloring-actor-log-prefixes.png
.. image:: ../images/coloring-actor-log-prefixes.png
:align: center

Distributed progress bars (tqdm)
@@ -129,3 +129,49 @@ Limitations:

By default, the builtin print will also be patched to use `ray.experimental.tqdm_ray.safe_print` when `tqdm_ray` is used.
This avoids progress bar corruption on driver print statements. To disable this, set `RAY_TQDM_PATCH_PRINT=0`.
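
A minimal sketch of the pattern (assuming ``ray.experimental.tqdm_ray.tqdm`` mirrors the usual ``tqdm`` constructor and ``update``/``close`` methods):

.. code-block:: python

    import time

    import ray
    from ray.experimental.tqdm_ray import tqdm

    @ray.remote
    def process(chunk_id, n=100):
        # Each remote task gets its own bar; the driver renders them without
        # corrupting one another.
        bar = tqdm(total=n, desc=f"chunk {chunk_id}")
        for _ in range(n):
            time.sleep(0.01)
            bar.update(1)
        bar.close()

    ray.get([process.remote(i) for i in range(4)])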

.. _customize-worker-loggers:

Customizing worker loggers
~~~~~~~~~~~~~~~~~~~~~~~~~~

When using Ray, all tasks and actors are executed remotely in Ray's worker processes.

.. note::

To stream logs to a driver, they should be flushed to stdout and stderr.

.. code-block:: python

    import ray
    import logging

    # Initiate a driver.
    ray.init()

    @ray.remote
    class Actor:
        def __init__(self):
            # Basic config automatically configures logs to
            # be streamed to stdout and stderr.
            # Set the severity to INFO so that info logs are printed to stdout.
            logging.basicConfig(level=logging.INFO)

        def log(self, msg):
            logger = logging.getLogger(__name__)
            logger.info(msg)

    actor = Actor.remote()
    ray.get(actor.log.remote("A log message for an actor."))

    @ray.remote
    def f(msg):
        logging.basicConfig(level=logging.INFO)
        logger = logging.getLogger(__name__)
        logger.info(msg)

    ray.get(f.remote("A log message for a task."))

.. code-block:: bash

    (Actor pid=179641) INFO:__main__:A log message for an actor.
    (f pid=177572) INFO:__main__:A log message for a task.
@@ -96,7 +96,7 @@ it will raise an exception with one of the following error messages (which indic
Also, you can use the `dmesg <https://phoenixnap.com/kb/dmesg-linux#:~:text=The%20dmesg%20command%20is%20a,take%20place%20during%20system%20startup.>`_ CLI command to verify the processes are killed by the Linux out-of-memory killer.

.. image:: ../images/dmsg.png
.. image:: ../../images/dmsg.png
:align: center

If a worker is killed by Ray's memory monitor, it is automatically retried (see the :ref:`link <ray-oom-retry-policy>` for details).
@@ -130,10 +130,10 @@ Ray memory monitor also periodically prints the aggregated out-of-memory killer
Ray Dashboard's :ref:`metrics page <dash-metrics-view>` and :ref:`event page <dash-event>` also provides the out-of-memory killer-specific events and metrics.

.. image:: ../images/oom-metrics.png
.. image:: ../../images/oom-metrics.png
:align: center

.. image:: ../images/oom-events.png
.. image:: ../../images/oom-events.png
:align: center

.. _troubleshooting-out-of-memory-task-actor-mem-usage:
@@ -150,7 +150,7 @@ The memory usage from the per component graph uses RSS - SHR. See the below for

Alternatively, you can also use the CLI command `htop <https://htop.dev/>`_.

.. image:: ../images/htop.png
.. image:: ../../images/htop.png
:align: center

See the ``allocate_memory`` row. See two columns, RSS and SHR.
@@ -173,12 +173,12 @@ Head Node Out-of-Memory Error

First, check the head node memory usage from the metrics page. Find the head node address from the cluster page.

.. image:: ../images/head-node-addr.png
.. image:: ../../images/head-node-addr.png
:align: center

And then check the memory usage from the head node from the node memory usage view inside the Dashboard :ref:`metrics view <dash-metrics-view>`.

.. image:: ../images/metrics-node-view.png
.. image:: ../../images/metrics-node-view.png
:align: center

Ray head node has more memory-demanding system components such as GCS or the dashboard.
@@ -201,10 +201,10 @@ You can verify it by looking at the :ref:`per task and actor memory usage graph
First, see the memory usage of the ``allocate_memory`` task. It totals 18 GB.
At the same time, you can verify that 15 concurrent tasks are running.

.. image:: ../images/component-memory.png
.. image:: ../../images/component-memory.png
:align: center

.. image:: ../images/tasks-graph.png
.. image:: ../../images/tasks-graph.png
:align: center

It means each task uses about 18GB / 15 == 1.2 GB. To reduce the parallelism,
@@ -1,5 +1,3 @@
.. _ray-core-profiling:

.. _ray-core-mem-profiling:

Debugging Memory Issues
@@ -22,7 +20,7 @@ This will allow you to download profiling files from other nodes.

.. tab-item:: Actors

.. literalinclude:: ../doc_code/memray_profiling.py
.. literalinclude:: ../../doc_code/memray_profiling.py
:language: python
:start-after: __memray_profiling_start__
:end-before: __memray_profiling_end__
@@ -31,19 +29,19 @@

Note that tasks have a shorter lifetime, so there could be lots of memory profiling files.

.. literalinclude:: ../doc_code/memray_profiling.py
.. literalinclude:: ../../doc_code/memray_profiling.py
:language: python
:start-after: __memray_profiling_task_start__
:end-before: __memray_profiling_task_end__
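
The included ``memray_profiling.py`` is not reproduced in this diff; a rough sketch of the task variant might look like this (the output location and the ``memray.Tracker`` usage are assumptions, not taken from the commit):

.. code-block:: python

    import uuid

    import memray
    import ray

    @ray.remote
    def allocate_lots():
        # One profile per invocation; a unique file name keeps repeated runs
        # of the task from clobbering each other's results.
        profile = f"/tmp/ray/session_latest/logs/memray_task_{uuid.uuid4().hex}.bin"
        with memray.Tracker(profile):
            return len([b"x" * 1024 for _ in range(10_000)])

    ray.get(allocate_lots.remote())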

Once the task or actor runs, go to the :ref:`Logs View <dash-logs-view>` of the dashboard. Find and click the log file name.

.. image:: ../images/memory-profiling-files.png
.. image:: ../../images/memory-profiling-files.png
:align: center

Click the download button.

.. image:: ../images/download-memory-profiling-files.png
.. image:: ../../images/download-memory-profiling-files.png
:align: center

Now, you have the memory profiling file. Running
@@ -1,4 +1,4 @@
(observability-user-guides)=
(observability-debug-apps)=

# Troubleshooting Applications

@@ -83,7 +83,7 @@ Then open `chrome://tracing`_ in the Chrome web browser, and load
Python CPU Profiling in the Dashboard
-------------------------------------

The :ref:`ray-dashboard` lets you profile Ray worker processes by clicking on the "Stack Trace" or "CPU Flame Graph"
The :ref:`Ray dashboard <observability-getting-started>` lets you profile Ray worker processes by clicking on the "Stack Trace" or "CPU Flame Graph"
actions for active workers, actors, and jobs.

.. image:: /images/profile.png
@@ -119,6 +119,8 @@ not have root permissions, the dashboard will prompt with instructions on how to
Alternatively, you can start Ray with passwordless sudo / root permissions.
.. _dashboard-cprofile:

Profiling Using Python's CProfile
---------------------------------
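
The body of this section is collapsed in this view; as a generic illustration of driving ``cProfile`` from Python (not necessarily the snippet in the source file):

.. code-block:: python

    import cProfile
    import pstats

    def work():
        # Stand-in for the code being profiled.
        return sum(i * i for i in range(1_000_000))

    cProfile.run("work()", "work.prof")
    pstats.Stats("work.prof").sort_stats("cumulative").print_stats(10)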

2 changes: 1 addition & 1 deletion doc/source/ray-observability/user-guides/index.md
Expand Up @@ -6,7 +6,7 @@ These guides help you monitor and debug your Ray applications and clusters.

The guides include:
* {ref}`observability-general-troubleshoot`
* {ref}`observability-user-guides`
* {ref}`observability-debug-apps`
* {ref}`observability-programmatic`
* {ref}`configure-logging`
* {ref}`application-level-metrics`
