[Serve][Docs] Mark metrics served for HTTP vs Python calls (ray-project#27858)

Different metrics are collected in Ray Serve depending on whether deployments are called via HTTP or via a Python `ServeHandle`. This needs to be mentioned in the documentation, and each metric marked accordingly.

Signed-off-by: Stefan van der Kleij <[email protected]>
zoltan-fedor authored and Stefan van der Kleij committed Aug 18, 2022
1 parent 933407f commit e62c8f5
Showing 2 changed files with 21 additions and 12 deletions.
2 changes: 1 addition & 1 deletion doc/source/serve/handle-guide.md
@@ -58,4 +58,4 @@ In both types of ServeHandle, you can call a specific method by using the `.meth
 :start-after: __begin_handle_method__
 :end-before: __end_handle_method__
 :language: python
-```
+```
31 changes: 20 additions & 11 deletions doc/source/serve/monitoring.md
@@ -211,6 +211,13 @@ You can leverage built-in Ray Serve metrics to get a closer look at your applica
 Ray Serve exposes important system metrics like the number of successful and
 failed requests through the [Ray metrics monitoring infrastructure](ray-metrics). By default, the metrics are exposed in Prometheus format on each node.
 
+:::{note}
+Different metrics are collected when Deployments are called
+via Python `ServeHandle` and when they are called via HTTP.
+
+See the list of metrics below marked for each.
+:::
+
 The following metrics are exposed by Ray Serve:
 
 ```{eval-rst}
@@ -219,29 +226,31 @@ The following metrics are exposed by Ray Serve:
 * - Name
   - Description
-* - ``serve_deployment_request_counter``
+* - ``serve_deployment_request_counter`` [**]
   - The number of queries that have been processed in this replica.
-* - ``serve_deployment_error_counter``
+* - ``serve_deployment_error_counter`` [**]
   - The number of exceptions that have occurred in the deployment.
-* - ``serve_deployment_replica_starts``
+* - ``serve_deployment_replica_starts`` [**]
   - The number of times this replica has been restarted due to failure.
-* - ``serve_deployment_processing_latency_ms``
+* - ``serve_deployment_processing_latency_ms`` [**]
   - The latency for queries to be processed.
-* - ``serve_replica_processing_queries``
+* - ``serve_replica_processing_queries`` [**]
   - The current number of queries being processed.
-* - ``serve_num_http_requests``
+* - ``serve_num_http_requests`` [*]
   - The number of HTTP requests processed.
-* - ``serve_num_http_error_requests``
+* - ``serve_num_http_error_requests`` [*]
   - The number of non-200 HTTP responses.
-* - ``serve_num_router_requests``
+* - ``serve_num_router_requests`` [*]
   - The number of requests processed by the router.
-* - ``serve_handle_request_counter``
+* - ``serve_handle_request_counter`` [**]
   - The number of requests processed by this ServeHandle.
-* - ``serve_deployment_queued_queries``
+* - ``serve_deployment_queued_queries`` [*]
   - The number of queries for this deployment waiting to be assigned to a replica.
-* - ``serve_num_deployment_http_error_requests``
+* - ``serve_num_deployment_http_error_requests`` [*]
   - The number of non-200 HTTP responses returned by each deployment.
 ```
+[*] - only available when using HTTP calls
+[**] - only available when using Python `ServeHandle` calls
 
 To see this in action, first run the following command to start Ray and set up the metrics export port:
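The `[*]`/`[**]` markers partition the Serve metrics by call path, and since the metrics are exposed in Prometheus format, the split can be checked mechanically. A hedged, self-contained sketch (the grouping sets mirror the table in this diff; the sample scrape text is made up for illustration, not real Serve output):

```python
# Metrics only recorded for HTTP calls ([*] in the table).
HTTP_ONLY = {
    "serve_num_http_requests",
    "serve_num_http_error_requests",
    "serve_num_router_requests",
    "serve_deployment_queued_queries",
    "serve_num_deployment_http_error_requests",
}

# Metrics only recorded for Python ServeHandle calls ([**] in the table).
HANDLE_ONLY = {
    "serve_deployment_request_counter",
    "serve_deployment_error_counter",
    "serve_deployment_replica_starts",
    "serve_deployment_processing_latency_ms",
    "serve_replica_processing_queries",
    "serve_handle_request_counter",
}


def classify(exposition: str) -> dict:
    """Bucket each sample line of Prometheus exposition text by call path."""
    buckets = {"http": [], "handle": [], "other": []}
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Metric name is everything before the label braces or the value.
        name = line.split("{")[0].split(" ")[0]
        if name in HTTP_ONLY:
            buckets["http"].append(name)
        elif name in HANDLE_ONLY:
            buckets["handle"].append(name)
        else:
            buckets["other"].append(name)
    return buckets


# Illustrative scrape fragment (labels and values are invented).
sample = """\
# HELP serve_num_http_requests The number of HTTP requests processed.
serve_num_http_requests{route="/"} 5
serve_handle_request_counter{deployment="Echo"} 3
"""
print(classify(sample))
```

Running `classify` on a real scrape of the node's metrics endpoint would show which group of counters a given workload is actually exercising.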
