[Serve] create internal request id to track request objects #45761

GeneDer · 2024-06-05T18:51:18Z

Why are these changes needed?

A user found that when multiple requests reuse the same request id at around the same time, Serve's proxy cannot track the request objects correctly. We should not rely on the client's request id being unique during a request's lifecycle. This PR adds a new internal_request_id to RequestMetadata, randomly generated by the proxy, to track the request object. The client-passed request_id continues to be set on RequestMetadata and is still used for logging. This PR also updates the request id docs to clarify what happens when an id is reused.
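To illustrate the failure mode described above, here is a minimal sketch that does not use Ray at all. The `RequestMetadata` class below is a simplified, hypothetical stand-in with only the two fields discussed in this PR, and `track` is an invented helper; neither reflects Serve's actual proxy code.

```python
import uuid
from dataclasses import dataclass


@dataclass
class RequestMetadata:
    # Simplified stand-in: the real class has more fields.
    request_id: str           # client-supplied, may repeat; kept for logging
    internal_request_id: str  # generated server-side, unique per request


def track(requests, key):
    """Store each request body in a dict under the chosen key (hypothetical helper)."""
    in_flight = {}
    for meta, body in requests:
        in_flight[key(meta)] = body
    return in_flight


# Three concurrent requests that all reuse the same client request id.
requests = [
    (RequestMetadata("same-id", str(uuid.uuid4())), {"n": i}) for i in range(3)
]

by_client_id = track(requests, key=lambda m: m.request_id)
by_internal_id = track(requests, key=lambda m: m.internal_request_id)

print(len(by_client_id))    # entries overwrite each other
print(len(by_internal_id))  # one entry per request
```

Keying by the client id collapses the three requests into one entry, while keying by the generated id keeps them apart.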

Related issue number

Closes #45723

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
  - I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

Signed-off-by: Gene Su <[email protected]>

GeneDer · 2024-06-05T23:48:54Z

python/ray/serve/_private/proxy.py

@@ -396,15 +396,18 @@ def _get_response_handler_info(
 if version.parse(starlette.__version__) < version.parse("0.33.0"):
 proxy_request.set_path(route_path.replace(route_prefix, "", 1))

+ internal_request_id = generate_request_id()


internal_request_id is always generated here
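As a rough sketch of the idea in this hunk: the client-facing id may come from the caller, but the internal id is generated unconditionally. The `generate_request_id` below is a local stand-in that returns a UUID string (an assumption, not a copy of Serve's helper), and `build_ids` and the `x-request-id` header lookup are hypothetical names used only for illustration.

```python
import uuid


def generate_request_id() -> str:
    # Local stand-in; assumed to behave like a random unique id generator.
    return str(uuid.uuid4())


def build_ids(headers: dict) -> tuple:
    """Return (request_id, internal_request_id) for one incoming request."""
    # The client may supply its own id; fall back to a generated one.
    request_id = headers.get("x-request-id") or generate_request_id()
    # The internal id never comes from the client, so it cannot collide
    # just because two clients (or one client) reuse a request id.
    internal_request_id = generate_request_id()
    return request_id, internal_request_id


first = build_ids({"x-request-id": "abc"})
second = build_ids({"x-request-id": "abc"})
print(first[0] == second[0])  # same client-facing id
print(first[1] != second[1])  # distinct internal ids
```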

JoshKarpel

Thanks for jumping on this so quickly! Much appreciated!

doc/source/serve/monitoring.md

jugalshah291 · 2024-06-06T15:22:13Z

python/ray/serve/_private/common.py

@@ -677,6 +677,7 @@ class DeploymentHandleSource(str, Enum):
 @dataclass
 class RequestMetadata:
 request_id: str
+ internal_request_id: str


qq: is the internal_request_id sent as a header to the app behind the Ray Serve proxy? (just a thought)

No, it's not. It appears in the Serve context object during the request in case the deployment needs to know about it, but the request client won't have this info. Do you think there is a need to pass this back to the client? I can make a new header for it if there is a need.

Oh, actually, reading it again: since it's in the context object, the "app" can get it by calling ray.serve.context._serve_request_context.get().internal_request_id. It's not passed as a header to the "app" or to the request "client".

^ not a public API

this internal request ID is purely an implementation detail, shouldn't be exposed to end users

oh ok, let me take it out👍

GeneDer · 2024-06-06T16:04:27Z

Thanks for jumping on this so quickly! Much appreciated!

You got it! Also thanks for the detailed issue, that made writing tests so much easier 😄

Signed-off-by: Gene Su <[email protected]>

edoakes

@GeneDer I believe request_id is used in some internal data structures inside the replica scheduler logic -- does that part need to be updated?

doc/source/serve/monitoring.md

edoakes · 2024-06-06T17:05:08Z


src/ray/protobuf/serve.proto

python/ray/serve/_private/common.py

Co-authored-by: Edward Oakes <[email protected]> Signed-off-by: Gene Der Su <[email protected]>

Signed-off-by: Gene Su <[email protected]>

edoakes · 2024-06-06T20:41:28Z

@GeneDer I believe request_id is used in some internal data structures inside the replica scheduler logic -- does that part need to be updated?

I think at a minimum we are using request_id in some logs in the replica scheduler -- want to use the external one to avoid this causing confusion.

Signed-off-by: Gene Su <[email protected]>

edoakes

LGTM

edoakes · 2024-06-10T16:16:30Z

python/ray/serve/tests/test_http_headers.py

+ """Sending 20 requests in parallel all with the same request id, but with
+ different request body.
+ """
+ bodies = [{"app_name": f"an_{uuid.uuid4()}"} for _ in range(20)]


did this fail consistently w/o the code changes?

could do many more concurrent requests here, like 500, to increase confidence

Yes, on master this test consistently fails because the requests are tracked incorrectly.
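For readers without a Ray cluster at hand, the shape of this test can be imitated with a toy asyncio example. This is not the test added in the PR: `ToyProxy` and `run` are hypothetical, and the proxy here simply stores each body under a key, yields to the event loop, and then reads the body back.

```python
import asyncio
import uuid


class ToyProxy:
    """Hypothetical proxy that tracks in-flight request bodies in a dict."""

    def __init__(self, use_internal_id: bool):
        self.use_internal_id = use_internal_id
        self.in_flight = {}

    async def handle(self, request_id: str, body: dict):
        # Track by a freshly generated id, or by the client-supplied id.
        key = str(uuid.uuid4()) if self.use_internal_id else request_id
        self.in_flight[key] = body
        await asyncio.sleep(0)  # yield so the other requests interleave
        # Returns None if another request already consumed this key.
        return self.in_flight.pop(key, None)


async def run(use_internal_id: bool) -> bool:
    proxy = ToyProxy(use_internal_id)
    # 20 parallel requests, same request id, different bodies.
    bodies = [{"app_name": f"an_{uuid.uuid4()}"} for _ in range(20)]
    echoed = await asyncio.gather(
        *(proxy.handle("shared-id", body) for body in bodies)
    )
    return echoed == bodies


print(asyncio.run(run(use_internal_id=False)))  # tracking by client id
print(asyncio.run(run(use_internal_id=True)))   # tracking by internal id
```

When every request shares one key, later writes overwrite earlier ones before any request reads its body back, so the echoed bodies no longer match what was sent.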


[Serve] create internal request id to track request objects

a6ee6d1

Signed-off-by: Gene Su <[email protected]>

GeneDer added the "serve" (Ray Serve Related Issue) and "go" (add ONLY when ready to merge, run all tests) labels Jun 5, 2024

GeneDer added 2 commits June 5, 2024 13:27

fix some tests

2858889

Signed-off-by: Gene Su <[email protected]>

add new test and doc change

8e566d7

Signed-off-by: Gene Su <[email protected]>

GeneDer marked this pull request as ready for review June 5, 2024 23:48

GeneDer requested review from edoakes, shrekris-anyscale, zcin, akshay-anyscale and a team as code owners June 5, 2024 23:48

GeneDer commented Jun 5, 2024


JoshKarpel reviewed Jun 6, 2024


doc/source/serve/monitoring.md (outdated)

jugalshah291 reviewed Jun 6, 2024


update docs about reusing request id

cf8a71d

Signed-off-by: Gene Su <[email protected]>

edoakes reviewed Jun 6, 2024


python/ray/serve/_private/common.py

GeneDer and others added 2 commits June 6, 2024 10:10

Update doc/source/serve/monitoring.md

8648921

Co-authored-by: Edward Oakes <[email protected]> Signed-off-by: Gene Der Su <[email protected]>

address comments

7bde588

Signed-off-by: Gene Su <[email protected]>

GeneDer added 2 commits June 6, 2024 16:02

fix

dee81ea

Signed-off-by: Gene Su <[email protected]>

fix proxy tests

3462362

Signed-off-by: Gene Su <[email protected]>

edoakes approved these changes Jun 10, 2024


edoakes merged commit 5452d0b into ray-project:master Jun 10, 2024
6 checks passed
