FIX-#4449 Drain the call queue before waiting on result in benchmark mode. #4472
Conversation
… in benchmark mode. Signed-off-by: mvashishtha <[email protected]>
Signed-off-by: mvashishtha <[email protected]>
Force-pushed from 8e03165 to 936b268.
Codecov Report
@@            Coverage Diff             @@
##           master    #4472      +/-   ##
==========================================
+ Coverage   86.82%   89.03%     +2.21%
==========================================
  Files         222      222
  Lines       18038    18494       +456
==========================================
+ Hits        15662    16467       +805
+ Misses       2376     2027       -349
Co-authored-by: Anatoly Myachev <[email protected]>
…ager.py Co-authored-by: Anatoly Myachev <[email protected]>
@YarShev that won't fix the bug. e.g. if we have a transpose on the call queue,
@mvashishtha, yes, I understand. I am talking about waiting on partition IDs after we drain the call queue.
@YarShev what's wrong with waiting the way we are doing now? We are not waiting in parallel, but any remote functions should be executing in parallel anyway. Doing a
@mvashishtha, I would like to check whether there is a difference between waiting on each partition separately and waiting on all partitions at once, both on a single node and in a cluster. @modin-project/modin-ray, could you shed some light on this?
@YarShev Isn't that question separate from the problem this PR is addressing? Draining the call queue is something we have to do serially anyway.
@mvashishtha, I think my question may be related to the problem this PR is addressing. I am wondering if there is a parallel wait on object references in the case of a cluster.
If I'm understanding this conversation correctly,
@wuisawesome, that is what I expected. Could you tell us about how the parallel wait works? Does the parallelization happen per worker or per node (plasma)?
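To make the two waiting strategies concrete, here is a minimal sketch (assuming Ray, with `refs` a list of `ray.ObjectRef`s, one per partition); it is illustrative only, not Modin code:

```python
import ray

def wait_sequentially(refs):
    # Block on each partition's object ref one at a time, in order.
    for ref in refs:
        ray.wait([ref])

def wait_in_parallel(refs):
    # Block once until every ref is ready; Ray tracks completion of all
    # objects at once, regardless of the order the tasks finish in.
    ray.wait(refs, num_returns=len(refs))
```

Either way, the remote tasks themselves still run in parallel; the difference is only whether the driver blocks on the results one by one or on all of them in a single call.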
Ideally, we should just fix `partition.wait` being called in a loop here. It is the main source of the issue.
@devin-petersohn I think this particular issue just has to do with the call queue. Even if we
@mvashishtha Makes sense, I thought this bug was an issue with synchronous execution.
LGTM
Still, I think we should use a parallel wait from Ray (and probably from Dask) to check the performance correctly. We likely want to wait in parallel for all partition computations to complete. I guess we could add a method named
@YarShev I still think this is a separate issue, independent of this one. I have filed #4491 for it. This issue is about deferred tasks that are executed in sequence in benchmark mode, whereas #4491 is about more precisely waiting for all async computations in benchmark mode in general. Let's please keep this PR focused on #4449.
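For reference, the kind of helper being deferred to #4491 might look roughly like the sketch below. The class, method, and attribute names are illustrative only (the name proposed in the truncated comment above is not reproduced here), and this is not Modin's actual partition manager:

```python
import ray

class PartitionManagerSketch:
    """Illustrative only; not Modin's actual partition manager."""

    @classmethod
    def wait_all(cls, partitions):
        # Drain each partition's deferred call queue so its object ref
        # reflects all submitted work, then block on every ref at once.
        refs = []
        for part in partitions:
            part.drain_call_queue()
            refs.append(part.future)  # hypothetical attribute holding the ray.ObjectRef
        ray.wait(refs, num_returns=len(refs))
```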
Signed-off-by: mvashishtha <[email protected]>
…shishtha/modin into 4449-benchmark-mode-drain-call-queue
adbf2ff
@mvashishtha, ok
Co-authored-by: Yaroslav Igoshev <[email protected]>
What do these changes do?
Drain the call queue before waiting on the result in benchmark mode (resolves #4449).

Checklist:
- flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
- black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
- git commit -s
- docs/development/architecture.rst is up-to-date
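A minimal sketch of the idea behind this PR (a hypothetical, simplified partition class, not Modin's actual implementation): in benchmark mode, a partition flushes its deferred call queue before blocking on its result, so the timing includes the deferred work instead of returning as soon as the stale object is ready.

```python
import ray


@ray.remote
def _apply(data, func):
    # Run one deferred call on the partition's data in a Ray worker.
    return func(data)


class PartitionSketch:
    def __init__(self, object_ref):
        self._data = object_ref   # ray.ObjectRef holding the partition data
        self.call_queue = []      # deferred calls not yet submitted to Ray

    def add_to_apply_calls(self, func):
        # Defer a call instead of submitting it immediately (lazy execution).
        self.call_queue.append(func)

    def drain_call_queue(self):
        # Submit every deferred call so self._data points at the final result.
        for func in self.call_queue:
            self._data = _apply.remote(self._data, func)
        self.call_queue = []

    def wait(self):
        # The fix: flush the call queue first. Otherwise ray.wait returns as
        # soon as the *old* object is ready and the deferred work goes untimed.
        self.drain_call_queue()
        ray.wait([self._data])
```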