
FEA Implement generator unordered parameter #1463

Merged (27 commits into joblib:master, Aug 29, 2023)

Conversation

fcharras (Contributor)

Closes #1449

The PR builds upon #1458

It looks straightforward: the diff in parallel.py, outside of documentation, is about 10 lines added/removed.

More testing is still to be added, and maybe an extension to the tutorial that highlights how RAM usage can be further decreased when tasks have significantly different processing times?
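The memory benefit mentioned here comes from consuming each result as soon as its task completes, instead of holding finished results in memory behind a slow earlier task. A minimal stdlib sketch of the same idea, using `concurrent.futures.as_completed` rather than the `return_as="generator_unordered"` API this PR adds to joblib:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def task(i):
    # earlier-submitted tasks sleep longer, so later submissions finish first
    time.sleep((5 - i) * 0.05)
    return i

with ThreadPoolExecutor(max_workers=5) as pool:
    futures = [pool.submit(task, i) for i in range(5)]
    # as_completed yields futures in completion order, not submission order;
    # each finished result can be consumed (and freed) immediately instead of
    # waiting in memory behind a slower, earlier task
    done_order = [f.result() for f in as_completed(futures)]

print(done_order)  # completion order, likely [4, 3, 2, 1, 0] given the sleeps
```

joblib's unordered generator brings this completion-order yielding to the `Parallel(...)` call itself, with the same lazy-consumption memory profile as `return_as="generator"`.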

codecov bot commented Jun 26, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.08% ⚠️

Comparison is base (abd3af7) 94.93% compared to head (5b7f73c) 94.85%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1463      +/-   ##
==========================================
- Coverage   94.93%   94.85%   -0.08%     
==========================================
  Files          45       45              
  Lines        7515     7546      +31     
==========================================
+ Hits         7134     7158      +24     
- Misses        381      388       +7     
Files Changed                   Coverage                Δ
joblib/parallel.py              96.81% <100.00%>        (+0.98%) ⬆️
joblib/test/test_parallel.py    96.18% <100.00%>        (+<0.01%) ⬆️

... and 5 files with indirect coverage changes


@fcharras fcharras changed the title WIP: Implement generator unordered parameter FEA Implement generator unordered parameter Jun 28, 2023
fcharras (Contributor, Author)

fcharras commented Jul 6, 2023

I don't think the failure on PyPy is related to this PR, and it looks benign.

ogrisel (Contributor)

ogrisel commented Jul 7, 2023

There are two warnings making the HTML rendering fail on the CI:

/home/docs/checkouts/readthedocs.org/user_builds/joblib/envs/1463/lib/python3.7/site-packages/joblib/parallel.py:docstring of joblib.parallel.Parallel:46: WARNING: Block quote ends without a blank line; unexpected unindent.
/home/docs/checkouts/readthedocs.org/user_builds/joblib/envs/1463/lib/python3.7/site-packages/joblib/parallel.py:docstring of joblib.parallel.Parallel:45: WARNING: Block quote ends without a blank line; unexpected unindent.

Could you please fix those, to make it possible to review the rendered HTML for the examples expanded in this PR?
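For context, docutils emits this warning when an indented (block-quote-like) section of a docstring is followed by a less-indented line with no separating blank line. A hypothetical minimal illustration (not the actual parallel.py docstring text):

```python
# BAD: the dedent right after the indented block triggers
# "Block quote ends without a blank line; unexpected unindent."
BAD = """\
return_as: str
    Indented description of the parameter.
back to the left margin with no blank line: docutils warns
"""

# GOOD: a blank line before dedenting keeps docutils happy
GOOD = """\
return_as: str
    Indented description of the parameter.

Back to the left margin after a blank line: no warning.
"""

# the fix is purely a matter of inserting a blank line before the dedent
print("blank line present:", "\n\n" in GOOD and "\n\n" not in BAD)
```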

@ogrisel (Contributor) left a comment

This looks great but I will finalize the review once the example is running on the CI.

(review thread on joblib/test/test_parallel.py, resolved)
fcharras (Contributor, Author)

fcharras commented Jul 12, 2023

The example is running now. The PyPy test for backend abortion fails consistently on this PR though; something new in this PR has probably been delaying backend abortion on PyPy to the point that this test now fails 🤔

edit: what seems to happen is that the abort_backend unit test is delayed by the slow task (10 s sleep) in the previous test.

fcharras (Contributor, Author)

NB: the current pipeline issues are not related to the latest commit; it seems the pipeline setup is down because of dependency issues.

@ogrisel (Contributor) left a comment

First pass of comments on the example and the tests. I will do a second pass on the change itself after I finish some local experiments.

Resolved review threads:
examples/parallel_generator.py (5 threads, 3 marked outdated)
joblib/test/test_parallel.py (1 thread)
ogrisel (Contributor)

ogrisel commented Aug 23, 2023

@fcharras I think we do not need the _jobs_unordered set. Let's see if the tests pass on the CI with the change suggested in #1500.

EDIT: the CI is green (although there was an unrelated failure in a Jupyter-notebook-related test). Feel free to cherry-pick fce3475 into this PR.

ogrisel (Contributor)

ogrisel commented Aug 23, 2023

Once the above comments are addressed, LGTM.

fcharras (Contributor, Author)

Thank you for the review @ogrisel !

Regarding fce3475, I thought about that too, but in this case, after a new task is fired, the main process doesn't keep any reference to it: it could just be garbage collected, and then strange behavior might happen, and that behavior could depend on the backend? I'd prefer to keep the set of references explicitly in _jobs_unordered for this reason.

ogrisel (Contributor)

ogrisel commented Aug 24, 2023

> Regarding fce3475 I thought about that too but in this case after a new task is fired the main process doesn't keep any reference to it, it could just be garbage collected and then strange behavior might happen, and said behavior could depend on the backend ?

It's not possible for a backend to call a callback successfully without keeping a reference to the callback object. And if a backend did not call the callback at all, it would be broken for any value of return_as.

fcharras (Contributor, Author)

fcharras commented Aug 24, 2023

But couldn't the opposite be true: because a given backend does not keep a reference to any task object internally, the callback might not be called if the user-level reference is deleted? Basically, a backend could use this for early interruption of tasks upon deletion of the reference.

ogrisel (Contributor)

ogrisel commented Aug 24, 2023

Since we call:

job = self._backend.apply_async(batch, callback=batch_tracker)

and since the backend has to call the batch_tracker callable once the task is complete, it has to keep a reference to it (or a serialized copy of it, which does not matter). Once the task is complete, the batch_tracker callable is retrieved through the reference held by the backend (or reconstructed from the serialized copy, if the backend chose to do that) and then called. When called, the batch_tracker object re-adds itself to the parallel._jobs queue, so there is no problem.

The naming is a bit odd, since parallel._jobs is actually a queue of task-batch "callbacks", but beyond the naming I do not see any problem with not keeping a reference from the parallel object.
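This reference-lifetime argument can be checked with a small stdlib sketch (using concurrent.futures rather than a joblib backend, with a hypothetical BatchTracker stand-in for joblib's batch callback): the executor holds a reference to the callback until the task completes, so the caller dropping its own name cannot garbage-collect it.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

jobs = queue.Queue()  # stand-in for the parallel._jobs queue

class BatchTracker:
    # hypothetical stand-in for joblib's batch callback, not the real class:
    # when invoked with the finished future, it re-adds itself to the queue
    def __init__(self, value):
        self.value = value
        self.result = None

    def __call__(self, future):
        self.result = future.result()
        jobs.put(self)  # the callback re-registers itself, as described above

with ThreadPoolExecutor(max_workers=2) as pool:
    for i in range(3):
        tracker = BatchTracker(i)
        # the future stores a reference to `tracker` for its done-callback,
        # so deleting our local name cannot garbage-collect it
        pool.submit(lambda x=i: x * x).add_done_callback(tracker)
        del tracker

results = sorted(jobs.get().result for _ in range(3))
print(results)  # [0, 1, 4]
```

All three trackers survive and are recovered from the queue, even though the submitting code kept no reference to any of them.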

ogrisel (Contributor)

ogrisel commented Aug 25, 2023

Please also document the new feature in the changelog, targeting the next release.

@ogrisel (Contributor) left a comment

LGTM!

@tomMoral (Contributor) left a comment

LGTM! thx @fcharras for this nice implem :)

@tomMoral tomMoral merged commit bfd14eb into joblib:master Aug 29, 2023
16 checks passed
Successfully merging this pull request may close these issues.

return_generator: add async flag to yield results in order completed