
[core] Fix bugs in data locality #24698

Merged 7 commits into ray-project:master on May 20, 2022

Conversation

stephanie-wang (Contributor) commented May 11, 2022

Why are these changes needed?

This fixes two bugs in data locality:

  1. When a dependent task is already in the CoreWorker's queue, we ran the data locality policy to choose a raylet before we added the first location for the dependency, so it would appear as if the dependency was not available anywhere.
  2. The locality policy did not take into account spilled locations.

Added C++ unit tests and Python tests for the above.
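
For illustration only, below is a rough Python sketch of the kind of locality check the new tests exercise: submit a task that depends on an object and compare the node it ran on with the nodes Ray reports as holding the object. This is not the PR's actual test; it assumes an already-running multi-node cluster and relies on ray.experimental.get_object_locations and ray.get_runtime_context().get_node_id(), which are available in recent Ray releases.

import ray

ray.init(address="auto")  # assumes an already-running multi-node cluster

@ray.remote
def produce():
    # Large enough to live in the object store rather than being inlined.
    return b"x" * (10 * 1024 * 1024)

@ray.remote
def check_locality(x):
    # Report which node this dependent task actually ran on.
    return ray.get_runtime_context().get_node_id()

obj = produce.remote()
ray.wait([obj], fetch_local=False)  # wait without pulling the object to the driver

# Nodes that Ray believes hold a copy of the object.
locations = ray.experimental.get_object_locations([obj])[obj]["node_ids"]

# With the locality fix, the dependent task should usually be scheduled on
# one of those nodes instead of pulling the object over the network.
ran_on = ray.get(check_locality.remote(obj))
print("object on:", locations, "task ran on:", ran_on)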

Related issue number

Fixes #24267.

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

clarkzinzow (Contributor) left a comment:

LGTM if tests pass and nightly test looks good

// it as local in the in-memory store so that the data locality policy
// will choose the right raylet for any queued dependent tasks.
const auto pinned_at_raylet_id = NodeID::FromBinary(worker_addr.raylet_id());
reference_counter_->UpdateObjectPinnedAtRaylet(object_id, pinned_at_raylet_id);

Ah nice ordering catch!
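
For readers skimming the diff, here is a hedged Python-style paraphrase of the ordering this hunk enforces; the names below are illustrative, not actual Ray internals:

def on_task_output_received(object_id, raylet_id, reference_counter, task_manager):
    # 1. Record where the primary copy is pinned first...
    reference_counter.update_object_pinned_at_raylet(object_id, raylet_id)
    # 2. ...so that when already-queued dependent tasks run the data locality
    #    policy, they see this location instead of "no locations known".
    task_manager.resolve_queued_dependents(object_id)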

auto node_ids = it->second.locations;
if (!it->second.spilled_node_id.IsNil()) {
  node_ids.emplace(it->second.spilled_node_id);
}

Nice!
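
As a paraphrase of the hunk above, here is a minimal Python sketch of the candidate-node set the locality policy considers once spilled copies are included; the real code is C++ inside the CoreWorker, and this is only illustrative:

def candidate_nodes(in_memory_locations, spilled_node_id=None):
    # Nodes holding an in-memory copy of the object...
    nodes = set(in_memory_locations)
    # ...plus, if the object was spilled to a node's local storage, that node too.
    if spilled_node_id is not None:
        nodes.add(spilled_node_id)
    return nodes

# Example: the object is in memory on node "a" and was spilled on node "b".
assert candidate_nodes({"a"}, spilled_node_id="b") == {"a", "b"}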

ray.wait(xs, num_returns=len(xs), fetch_local=False)
for i, x in enumerate(xs):
    task = check_locality.remote(x)
    print(i, x, task)

Nit: Should remove debug print.

stephanie-wang (Contributor, Author):

Still seeing unexpected transfers when there is heavy task load, but this does seem to have fixed the problem in the 10TB sort, where the head node raylet was coming under heavy load.

Before PR: [screenshot from 2022-05-09]

After PR: [screenshot from 2022-05-12]

Note that task throughput (green line) is very low toward the right-hand side of the graph in the before-PR version.

@stephanie-wang stephanie-wang merged commit eaec96d into ray-project:master May 20, 2022
@stephanie-wang stephanie-wang deleted the fix-data-locality branch May 20, 2022 00:36
mwtian (Member) commented May 20, 2022

Can this affect //python/ray/tests:test_reconstruction? It seems to be more flaky after this change: [screenshot of recent test failures]

stephanie-wang (Contributor, Author):

Hmm possibly, but it might just be that the test suite is too large, @mwtian?

mwtian (Member) commented May 20, 2022

That is possible. I guess the test needs to be split.

mwtian (Member) commented May 20, 2022

Or is it possible that reconstruction becomes slower? The 900s timeouts seem to happen at or after test_reconstruction_stress or test_reconstruction_hangs. I will revert this for now.

mwtian added a commit that referenced this pull request May 20, 2022
stephanie-wang (Contributor, Author):

OK, I'll try to split test_reconstruction after the revert is merged.

mwtian (Member) commented May 20, 2022

Sounds good. Once you've verified that the slowdown is harmless, I can split the tests too.

stephanie-wang pushed a commit that referenced this pull request May 20, 2022
stephanie-wang added a commit to stephanie-wang/ray that referenced this pull request May 23, 2022
stephanie-wang added a commit that referenced this pull request May 25, 2022
Redo for PR #24698:

This fixes two bugs in data locality:

  1. When a dependent task is already in the CoreWorker's queue, we ran the data locality policy to choose a raylet before we added the first location for the dependency, so it would appear as if the dependency was not available anywhere.
  2. The locality policy did not take into account spilled locations.

Added C++ unit tests and Python tests for the above.

Split test_reconstruction to avoid test timeout. I believe this was happening because the data locality fix was causing extra scheduler load in a couple of the reconstruction stress tests.
Linked issue: [Datasets][core] Data locality not working as expected during map stages