Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: adding the ability to use Ray Serve async functionality #3769

Merged
merged 11 commits into from
Jan 23, 2023

Conversation

zoltan-fedor
Copy link
Contributor

@zoltan-fedor zoltan-fedor commented Dec 27, 2022

As I have explained in issue #2968 , a deployment on Ray Serve can be called async, which can make a lot of sense, as Ray Serve is typically used to serve large ML models - which means it takes a while for each request sent to a cluster of Ray Serve Deployment to return (from a few ms up to about a 100ms depending on the size of the model used).

As such, it makes sense to build a concurrent (eg async) application using an async web framework like FastAPI (or recently even Flask started supporting async`) and then send the requests to Ray Serve in an async fashion (alias without blocking), which means that while the Ray Serve Deployment is working on the request, the application is not blocked, but can work on other requests.

Unfortunately until now Haystack has only supported the calling of Ray Serve deployments in a sync fashion.
This PR is to add the ability to call the same Ray Serve deployments in an async fashion too.

This means that what is deployed into Ray Serve does NOT change, the only thing changes is how we are calling the deployment from the Ray client in Haystack.

Unfortunately making async calls require the methods containing the async call (typically await [call]) code to be marked as async and then that method to be called with await, so unfortunately some duplication of methods was required (see pipelines/base.py::Pipeline.run() and pipelines/base.py::Pipeline.run_async()) , but tried to minimize it as much as possible.

Now you can call a Ray Serve deployment async the following way.

The original (sync) call:

from haystack.pipelines import RayPipeline
raypipeline = RayPipeline.load_from_yaml(...)
prediction = raypipeline.run(query="Why I didn't receive my package?", params={...})

The new, async option was made to be very similar:

from haystack.pipelines import RayPipeline
raypipeline = RayPipeline.load_from_yaml(...)
prediction = await raypipeline.run_async(query="Why I didn't receive my package?", params={...})

Related Issues

Proposed Changes:

I have added the ability to call a Ray Pipeline Deployment from Haystack in an async fashion in addition to the old sync one.

How did you test it?

I have added a dedicated test to the new async method.
I have also tested it on a live Ray Serve cluster.

Notes for the reviewer

@ZanSara, this is what we have discussed earlier - some code repetition was unfortunately required, but managed to keep it to the minimum, see pipelines/base.py::Pipeline.run() and pipelines/base.py::Pipeline.run_async().
The run_async() method is very much the same as run(), the only difference is the await call in it.

Checklist

@zoltan-fedor zoltan-fedor requested a review from a team as a code owner December 27, 2022 01:29
@zoltan-fedor zoltan-fedor requested review from vblagoje and removed request for a team December 27, 2022 01:29
@zoltan-fedor zoltan-fedor changed the title Adding the ability to use Ray Serve async functionality feat: adding the ability to use Ray Serve async functionality Dec 27, 2022
@zoltan-fedor
Copy link
Contributor Author

Hi @vblagoje,
Have you had a chance to look into this? I would hope to get this merged before 1.13 - if possible.
Thanks

@vblagoje
Copy link
Member

Hey @zoltan-fedor this is not my area of expertise but I'll ping a colleague and he'll get back to you

@zoltan-fedor
Copy link
Contributor Author

Thanks @vblagoje, much appreciated.
Yes, the original issue was discussed with @ZanSara, so I was a bit surprised that the PR got assigned to you for review.

@ZanSara
Copy link
Contributor

ZanSara commented Jan 13, 2023

Hey @zoltan-fedor! I've been on holiday, that's why the PR went on @vblagoje in the meantime 🙂

I'll review the Python side of this PR shortly, but before merging I hope we can get someone more expert with Ray from the rest of the team to look over these changes 👍

Copy link
Contributor

@ZanSara ZanSara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of question marks but otherwise really nice PR 😊 It's way simpler than I imagined!

haystack/pipelines/base.py Outdated Show resolved Hide resolved
haystack/pipelines/base.py Outdated Show resolved Hide resolved
@zoltan-fedor
Copy link
Contributor Author

Hi @ZanSara,
Welcome back from your holiday! I thought that you might be an holiday... :-) I hope you had a good time!

Yes, I too thought that adding async will be more complicated and had to spent quite some time on figuring out how to make it the least invasive.
In the end - as you have seen - I think it came together pretty nice, with minimal code duplication required and with a decent user experience. Really the only thing is that from now on both the sync and async version of the run() method (the run() and run_async() methods) will need to be maintained and kept in sync, so their logic stays aligned.

@vblagoje
Copy link
Member

My colleague left the following comments: There is 99% code duplication between Pipeline.run and Pipeline.run_async , not sure however whether this can be implemented in a cleaner way.
What can be improved is:

  • There's no _run_node_async method in Pipeline, although it's called by Pipeline.run_async
  • There's no batch counterpart to _run_async
  • If it's really just about ray, then I'd suggest to move te run_async method to RayPipeline .
    BTW calling ray in async pattern looks good, but I haven't used it so far.

@zoltan-fedor
Copy link
Contributor Author

zoltan-fedor commented Jan 16, 2023

@vblagoje, @ZanSara,

Responding to the comments from @vblagoje's colleague:

  • "There is 99% code duplication between Pipeline.run and Pipeline.run_async , not sure however whether this can be implemented in a cleaner way." >> Yes, by definition there is always code duplication with having both sync and async versions of the same methods. The only way to minimize that is by taking "utility methods" out of the main methods - in this case by taking chunks out of Pipeline.run and call those same (synchronous) utility methods both from Pipeline.run and Pipeline.run_async - for which I don't see any good candidates in Pipeline.run. But if you see any sections in Pipeline.run` that could be taken out into another method, let me know.

  • "There's no _run_node_async method in Pipeline, although it's called by Pipeline.run_async" and "If it's really just about ray, then I'd suggest to move te run_async method to RayPipeline ." >> Both of these have already been addressed by moving the Pipeline.run_async into RayPipeline.run_async based on @ZanSara's recommendation.

  • "There's no batch counterpart to _run_async" >> That is correct, I was debating whether create one or not. It could have been done, but it would have required LOTS of code duplication (to have both sync and async methods of a chain of methods which are handling the batch calls) and with little benefit. Once you have the async run call, there is little benefit to wanting to make batch calls with async. You typically use async in a quick-fire application scenario - which is the opposite of where the batch would be used. As such, considering the cost of it (lots of code duplication), I think we shouldn't add async batch run.

@ZanSara
Copy link
Contributor

ZanSara commented Jan 17, 2023

Thank you @vblagoje for getting the reviewer's comment! My two cents on code duplication: I agree that in the general case I would not like this amount of duplication, but here I believe we should tolerate it.

That part of Pipeline is rock stable lately, and as we are planning to revisit how Pipeline works from the foundations, no point touching it now for a small optimization. Luckily this is a very simple change that will give us a good example how to run pipelines async and is immediately useful to @zoltan-fedor. So I'm ok for merging it as it is. If you don't agree, let's discuss 🙂

@vblagoje
Copy link
Member

I am ok since all three of you agree that it is good to go. I don't know this part of the codebase well, but since three engineers who know the subject matter agree - I am approving.

@zoltan-fedor
Copy link
Contributor Author

Thank you both for reviewing and approving.
I hope this can be merged soon!
Thanks!

@ZanSara ZanSara merged commit e447bd7 into deepset-ai:main Jan 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Adding the ability to use Ray Serve async functionality
3 participants