Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add worker num paused tasks api #41422

Merged
merged 1 commit into from
Nov 30, 2023

Conversation

sofianhnaide
Copy link
Contributor

@sofianhnaide sofianhnaide commented Nov 28, 2023

Why are these changes needed?

This is added to give the debugger another API to signal paused workloads without reliably depending on Ray having a valid task Id, this is needed for cases where manual threads are created by the user.

This work also solidify the debugger_port reading and writing to be thread safe.

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Copy link
Contributor

@rkooo567 rkooo567 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally lgtm

Actually, I'd like to discuss a little bit more about num_paused_tasks

Is it currently used to be displayed for vs code debugger terminal? If not, I feel like it is better removing this.

  1. Dashboard doesn't display worker state, and we don't have any plan to expose it in a while, so it won't be that useful
  2. it will simplify a lot of code in this PR

RAY_CHECK_OK(gcs_client_->Workers().AsyncUpdateWorkerNumPausedTasks(
worker_id, num_paused_tasks_delta, [&promise](const Status &status) {
RAY_CHECK_OK(status);
promise.set_value(status.ok());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will deadlock if this API is called within the same thread as the callback thread (thread_io_service_).

Can you add a check to the beginning of the function that the current thread is not same as thread_io_service_?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@sofianhnaide
Copy link
Contributor Author

sofianhnaide commented Nov 28, 2023

Generally lgtm

Actually, I'd like to discuss a little bit more about num_paused_tasks

Is it currently used to be displayed for vs code debugger terminal? If not, I feel like it is better removing this.

  1. Dashboard doesn't display worker state, and we don't have any plan to expose it in a while, so it won't be that useful

  2. it will simplify a lot of code in this PR

What we need here is a flag to signal a process being paused in the debugger. That flag can not be the port since you might have a port and be either paused or not.

I am using num_paused_tasks counter essentially as a flag, but then I can capture threading paused and unpaused safely.

The current behavior is all threads are unpaused on connect so technically an unpaused call should trump all paused but I would rather not build with that assumption giving that it won't technically simplify things for me.

If you have any suggestions to simply that I am happy to consider.

Copy link
Contributor

@rickyyx rickyyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code lgtm!

So is this orthogonal to what Sang's been working on with the threaded task id thing?

python/ray/tests/test_worker_state.py Show resolved Hide resolved
@sofianhnaide
Copy link
Contributor Author

I have also renamed the newely added property to num_paused_threads to be explicit and not mix it with tasks

@rkooo567 rkooo567 merged commit c913358 into ray-project:master Nov 30, 2023
16 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants