[core][experimental] asyncio support for accelerated DAGs #43128

stephanie-wang · 2024-02-13T05:00:19Z

Why are these changes needed?

Adds asyncio support for accelerated DAGs. This works by creating two asyncio.Queues on the driver, one for DAG inputs and the other for DAG outputs. Reading/writing to the DAG input/output channels is performed on asyncio's default threadpool executor to avoid blocking the main thread.

The DAG is not thread-safe.

See unit test changes for API example.
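
A minimal sketch of the pattern described above, not the code from this PR: two `asyncio.Queue`s on the driver, with the blocking channel reads and writes pushed to asyncio's default executor. `BlockingChannel`, `submitter_loop`, `fetcher_loop`, `run_demo`, and the `_STOP` sentinel are hypothetical names for illustration. The real channels are shared-memory objects, and here the "DAG" simply echoes its input.

```python
import asyncio
import queue

_STOP = object()  # hypothetical sentinel used to shut both loops down


class BlockingChannel:
    """Hypothetical stand-in for a DAG channel with blocking read/write."""

    def __init__(self):
        self._q = queue.Queue(maxsize=1)

    def write(self, value):
        self._q.put(value)  # blocks while the single slot is full

    def read(self):
        return self._q.get()  # blocks until a value is available


async def submitter_loop(inputs: asyncio.Queue, channel: BlockingChannel):
    loop = asyncio.get_running_loop()
    while True:
        value = await inputs.get()
        # The blocking write runs on the default thread pool, not the event loop.
        await loop.run_in_executor(None, channel.write, value)
        if value is _STOP:
            return


async def fetcher_loop(outputs: asyncio.Queue, channel: BlockingChannel):
    loop = asyncio.get_running_loop()
    while True:
        value = await loop.run_in_executor(None, channel.read)
        if value is _STOP:
            return
        await outputs.put(value)


async def run_demo(values):
    inputs, outputs = asyncio.Queue(), asyncio.Queue()
    channel = BlockingChannel()
    tasks = [
        asyncio.create_task(submitter_loop(inputs, channel)),
        asyncio.create_task(fetcher_loop(outputs, channel)),
    ]
    results = []
    for v in values:
        await inputs.put(v)
        results.append(await outputs.get())
    # Stop both background loops so no executor thread stays blocked.
    await inputs.put(_STOP)
    await asyncio.gather(*tasks)
    return results
```

The sentinel matters in this sketch: cancelling the tasks alone would leave an executor thread blocked in `read()`, and `asyncio.run` waits for the default executor to shut down.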

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: Stephanie Wang <[email protected]>

rkooo567 · 2024-02-20T10:26:10Z

python/ray/dag/compiled_dag_node.py

+ def __init__(
+ self,
+ buffer_size_bytes: Optional[int],
+ enable_asyncio: bool = False,


Add docstring?

rkooo567 · 2024-02-20T10:32:57Z

python/ray/experimental/channel.py

+ c.end_read()
+
+
+class AwaitableBackgroundOutputReader(InputReader):


It is confusing for an "Output"Reader to be a subclass of "Input"Reader.

Do we have to associate these classes with output/input? Why don't we just call them writer/reader?

Good point! :D

rkooo567 · 2024-02-20T10:38:56Z

python/ray/dag/compiled_dag_node.py

+ self._get_or_compile()
+ async with self._dag_submission_lock:
+ await self._dag_submitter.write(args[0])
+ # Allocate a future that the caller can use to get the result.


Can you mention that the future is set from the fetcher's background thread?

Ah actually it is not set from a background thread. Fetching is done in the background thread, and then the result is set in the main thread. But I'll note that down.
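
To illustrate the distinction in this reply, here is a small sketch (an assumption-laden illustration, not the PR's code): the blocking fetch runs on an executor thread, while `set_result` is called after the `await`, on the event loop thread. `blocking_fetch`, `fetch_into`, and `main` are hypothetical names.

```python
import asyncio
import threading


def blocking_fetch():
    """Hypothetical blocking read; returns a value and the thread it ran on."""
    return 42, threading.get_ident()


async def fetch_into(fut: asyncio.Future):
    loop = asyncio.get_running_loop()
    # The blocking read runs on an executor (background) thread.
    value, fetch_thread = await loop.run_in_executor(None, blocking_fetch)
    # Execution resumes on the event loop thread, so the future is set there.
    fut.set_result((value, fetch_thread, threading.get_ident()))


async def main():
    fut = asyncio.get_running_loop().create_future()
    task = asyncio.create_task(fetch_into(fut))
    value, fetch_thread, set_thread = await fut
    await task
    # Report: the value, whether the fetch ran on a different thread than the
    # set, and whether the set happened on the event loop thread.
    return value, fetch_thread != set_thread, set_thread == threading.get_ident()
```

Setting the result on the loop thread avoids having to go through `loop.call_soon_threadsafe`, which would be required if a background thread completed the future directly.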

rkooo567 · 2024-02-20T10:40:16Z

python/ray/dag/compiled_dag_node.py

+ def __init__(
+ self,
+ buffer_size_bytes: Optional[int],
+ enable_asyncio: bool = False,


It feels a little weird to me that this has to be defined in the constructor. Normally you can just create a DAG and use both *_sync and *_async.

I wonder if we can lazily start the background reader/writer the first time we call execute_* instead of defining it at the end of compile?

Hmm I think we can probably handle this better in the future? For now, I think it's better to be explicit. The issue with starting both kinds of readers is that the async reader will create background tasks.

handling it later also sgtm!

rkooo567 · 2024-02-20T10:40:56Z

python/ray/experimental/channel.py

+ loop = asyncio.get_event_loop()
+ while True:
+ res = await self._queue.get()
+ await loop.run_in_executor(None, self._run, res)


I think None will create the threadpool which uses total_num_cpus / 2 threads by default. Should we create a threadpool with a single thread here instead?
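
For reference, a hedged sketch of the alternative suggested here: passing a dedicated single-thread `ThreadPoolExecutor` to `run_in_executor` instead of `None`. For what it's worth, the documented default `max_workers` for `ThreadPoolExecutor` in Python 3.8 through 3.12 is `min(32, os.cpu_count() + 4)`. `blocking_step`, `drain`, and the `channel-io` prefix are hypothetical names, not from the PR.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


def blocking_step(item):
    """Hypothetical blocking call; reports which thread ran it."""
    return item, threading.current_thread().name


async def drain(items):
    loop = asyncio.get_running_loop()
    # Dedicated single-thread pool instead of passing None (the default pool).
    with ThreadPoolExecutor(max_workers=1, thread_name_prefix="channel-io") as pool:
        return [await loop.run_in_executor(pool, blocking_step, i) for i in items]
```

With one worker, every blocking call is serialized on the same thread, which also keeps the calls in submission order.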


stephanie-wang · 2024-02-21T03:10:08Z

FYI @rkooo567, I added some C++ changes to enable thread-safe begin_read / end_read; I lost them accidentally in the merge earlier.


rkooo567

LGTM. One question regarding the thread safety implementation

rkooo567 · 2024-02-28T14:55:47Z

src/ray/core_worker/experimental_mutable_object_manager.cc

- RAY_RETURN_NOT_OK(EnsureGetAcquired(channel));
+
+ // If there is a concurrent ReadRelease, wait for that to finish first.
+ sem_wait(&channel.reader_semaphore);


Q: Does this affect perf, btw? Remember Eric mentioned busy waiting was necessary for perf?

rkooo567 · 2024-02-28T15:00:19Z

src/ray/core_worker/experimental_mutable_object_manager.h

 ReaderChannel(std::unique_ptr<plasma::MutableObject> mutable_object_ptr)
- : mutable_object(std::move(mutable_object_ptr)) {}
+ : mutable_object(std::move(mutable_object_ptr)) {
+ RAY_CHECK(sem_init(&reader_semaphore, /*pshared=*/0, /*value=*/1) == 0);


https://man7.org/linux/man-pages/man3/sem_init.3.html

The doc says this value needs to be nonzero for the semaphore to be shared across processes. Does this mean pshared == 0 doesn't work?

It's only used by the same process.

rkooo567 · 2024-02-28T15:01:08Z

src/ray/core_worker/experimental_mutable_object_manager.h

- /// All channels for which we are registered as a writer. This can overlap
- /// with reader channels (e.g., if the CoreWorker is multithreaded and one
- /// thread reads while the other writes).
+ // TODO(swang): Access to these maps is not threadsafe. This is fine in the


Nit: To be safer, can we create Add/Remove Channel APIs with a FATAL error? And we can add a comment there?

rkooo567 · 2024-02-28T15:02:36Z

src/ray/core_worker/experimental_mutable_object_manager.cc

-#endif
- return Status::OK();
-}
+ sem_post(&channel.reader_semaphore);


This means if release is not called, it can hang indefinitely? Is it okay in case of exceptions being raised? E.g., read 2 values, release 1 value and exception is raised

Yes, but that was already true before. Exceptions get serialized as normal values.
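
A rough Python analog of the semaphore behavior discussed in this thread, assuming a binary semaphore initialized to 1 that is taken on read-acquire and posted on read-release. This is only an illustration using `threading.Semaphore`; the PR's implementation is in C++ with `sem_wait` / `sem_post`, and `ReaderGuard` is a hypothetical name.

```python
import threading


class ReaderGuard:
    """Hypothetical Python analog of a per-channel binary semaphore."""

    def __init__(self):
        self._sem = threading.Semaphore(1)  # like sem_init(..., value=1)

    def read_acquire(self, timeout=None):
        # Waits for any outstanding read to be released first (like sem_wait).
        return self._sem.acquire(timeout=timeout)

    def read_release(self):
        self._sem.release()  # like sem_post


guard = ReaderGuard()
first = guard.read_acquire()
# No release yet: a second acquire cannot proceed. The timeout is only here
# so the example returns instead of hanging.
second = guard.read_acquire(timeout=0.1)
guard.read_release()
third = guard.read_acquire(timeout=0.1)
```

Without the timeout, the second acquire would block until a release happens, which is the hang the question above describes.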

rkooo567 · 2024-03-01T22:07:57Z

Feel free to merge after addressing last comments!

…microbenchmarks (#44330)

Fix microbenchmark after changes made in #43128:
- must begin_read() on channel before calling end_read()
- multi-output DAGs now return a single channel instead of multiple
- add asyncio versions of the DAG benchmarks

Signed-off-by: Stephanie Wang <[email protected]>


stephanie-wang added 12 commits February 5, 2024 10:21

ExperimentalChannelManager works

d909fb1

Signed-off-by: Stephanie Wang <[email protected]>

refactor

8d8b70f

Signed-off-by: Stephanie Wang <[email protected]>

clean

22b6f92

Signed-off-by: Stephanie Wang <[email protected]>

clean

7392072

Signed-off-by: Stephanie Wang <[email protected]>

clean

72f81d4

Signed-off-by: Stephanie Wang <[email protected]>

fix

1003937

Signed-off-by: Stephanie Wang <[email protected]>

fix

7513285

Signed-off-by: Stephanie Wang <[email protected]>

return

07e3299

Signed-off-by: Stephanie Wang <[email protected]>

ReadAcquire thread-safe

fad43c3

Signed-off-by: Stephanie Wang <[email protected]>

Channel I/O

7bd7ab8

Signed-off-by: Stephanie Wang <[email protected]>

asyncio

1a53569

Signed-off-by: Stephanie Wang <[email protected]>

test

19ccf88

Signed-off-by: Stephanie Wang <[email protected]>

stephanie-wang force-pushed the asyncio-channels branch from a4f081b to 19ccf88 Compare February 14, 2024 00:04

stephanie-wang changed the title ~~[WIP][donotmerge] async and pipelined channels for accelerated DAG~~ [WIP] asyncio support for accelerated DAG Feb 14, 2024

stephanie-wang added 5 commits February 14, 2024 11:52

release gil

b9db801

Signed-off-by: Stephanie Wang <[email protected]>

return futures, teardown channels

c59b16b

Signed-off-by: Stephanie Wang <[email protected]>

async with

5b0214a

Signed-off-by: Stephanie Wang <[email protected]>

lint

add2a00

Signed-off-by: Stephanie Wang <[email protected]>

Merge remote-tracking branch 'upstream/master' into asyncio-channels

8ba92a3

stephanie-wang changed the title ~~[WIP] asyncio support for accelerated DAG~~ [core][experimental] asyncio support for accelerated DAGs Feb 20, 2024

stephanie-wang assigned ericl and rkooo567 Feb 20, 2024

rm old

b322646

Signed-off-by: Stephanie Wang <[email protected]>

rkooo567 reviewed Feb 20, 2024


rkooo567 added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Feb 20, 2024

stephanie-wang added 3 commits February 20, 2024 18:39

Revert "rm old"

d80ba56

This reverts commit b322646.

revert cpp

9c492f4

Signed-off-by: Stephanie Wang <[email protected]>

update

af79a44

Signed-off-by: Stephanie Wang <[email protected]>

stephanie-wang removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Feb 21, 2024

stephanie-wang added 6 commits February 20, 2024 19:26

rm print

bf85821

Signed-off-by: Stephanie Wang <[email protected]>

lint and tests

a049bdb

Signed-off-by: Stephanie Wang <[email protected]>

update docstring

80d8e14

Signed-off-by: Stephanie Wang <[email protected]>

handle exceptions

99509cb

Signed-off-by: Stephanie Wang <[email protected]>

fix returns, exceptions

6a2d8f8

Signed-off-by: Stephanie Wang <[email protected]>

windows build

1bf3642

Signed-off-by: Stephanie Wang <[email protected]>

stephanie-wang added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Feb 27, 2024

rkooo567 approved these changes Feb 28, 2024


rkooo567 added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Mar 1, 2024

stephanie-wang merged commit 4f3f1f7 into ray-project:master Mar 8, 2024
8 of 9 checks passed

stephanie-wang deleted the asyncio-channels branch March 8, 2024 05:15

stephanie-wang mentioned this pull request Mar 27, 2024

[core][experimental] Fix experimental microbenchmark and add asyncio microbenchmarks #44330

Merged
