Workflows horizontal scaling #134
It does seem something's not quite right. Could you provide a copy of what your workflow looks like? I'm curious how you're seeing stale update errors, as there are only a couple of ways this can occur. Also, is your Postgres persistence shared amongst all pods?
Yes, both pods are identical; they connect to the same Postgres connection pool.
Cool, I had a look through it and have a couple of thoughts. Would the
Another thought is how an event is mapped to a workflow: for this to work without collisions, the assumption is that only one workflow instance is ever running for a single
Interested to know your thoughts.
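The routing assumption described above can be sketched in a few lines. This is a hypothetical illustration, not this project's API: `Router`, `start_workflow`, and `route_event` are made-up names, and the sketch only shows why a correlation key shared by two live instances makes event delivery ambiguous.

```python
import uuid

class Router:
    """Toy event router: one running workflow instance per correlation key."""

    def __init__(self):
        self.instances = {}  # correlation key -> workflow instance id

    def start_workflow(self, key):
        if key in self.instances:
            # Two live workflows with the same key: an incoming event would
            # match both, and the pods would race to update the same state.
            raise ValueError(f"instance already running for key {key!r}")
        instance_id = str(uuid.uuid4())
        self.instances[key] = instance_id
        return instance_id

    def route_event(self, key):
        # With exactly one running instance per key, routing is unambiguous.
        return self.instances[key]

router = Router()
wf = router.start_workflow("order-42")
assert router.route_event("order-42") == wf
```

Starting a second workflow with the same key (as happened in the test below, where one id was reused) is exactly the collision case.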
Thanks a lot. Yes, you are right: the problem was with the unique id. I was testing with the same id, which would never happen in practice. Now everything works as expected.
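The resolution above (give every started workflow its own unique id) can be sketched as follows; `new_workflow_id` and the id format are assumptions for illustration, not this project's API:

```python
import uuid

def new_workflow_id(prefix="wf"):
    # A unique id per started workflow means no two running instances
    # can ever map onto the same persisted state.
    return f"{prefix}-{uuid.uuid4()}"

# Unlike reusing one fixed test id, these never collide.
ids = {new_workflow_id() for _ in range(1000)}
assert len(ids) == 1000
```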
Let's say I have a workflows service.
In there I have one workflow with some handlers.
I deployed this service in Kubernetes, scaled up to 2 pods, with RabbitMQ as the message broker and Postgres as the database.
Start 2 or more workflows at the same time, and the pods will try to upsert/retrieve workflow data concurrently.
You will see the following errors:
Basically, a race condition.
Although it seems the service will eventually finish the workflow despite all the errors and retries, it takes a long time to complete with all the retrying, and it's very confusing and does not look scalable.
Any ideas on how to scale horizontally?
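The race described above is what you would expect if the persistence layer uses optimistic concurrency (a version check on write): two pods read the same workflow row, both modify it, and the second write is rejected as stale, forcing a retry. A minimal self-contained sketch, with a made-up `Store` standing in for the Postgres persistence (not this library's actual API):

```python
class Store:
    """Toy persistence with an optimistic-concurrency version column."""

    def __init__(self):
        self.version = 0
        self.data = {}

    def load(self):
        return self.version, dict(self.data)

    def save(self, expected_version, data):
        if expected_version != self.version:
            # The row changed since this pod read it.
            raise RuntimeError("stale update: row changed since read")
        self.data = data
        self.version += 1

def update_with_retry(store, key, value, max_retries=5):
    # Retrying the read-modify-write is why the job eventually finishes,
    # but under contention each failed attempt burns time.
    for _ in range(max_retries):
        version, data = store.load()
        data[key] = value
        try:
            store.save(version, data)
            return
        except RuntimeError:
            continue
    raise RuntimeError("gave up after retries")

store = Store()
# Simulate two pods racing: both read version 0.
v1, d1 = store.load()
v2, d2 = store.load()
store.save(v1, {**d1, "pod": "a"})      # first write wins
try:
    store.save(v2, {**d2, "pod": "b"})  # second write fails as stale
except RuntimeError as e:
    print(e)
update_with_retry(store, "pod", "b")    # the retry loop recovers
```

Reducing contention usually means making sure concurrent work targets distinct rows (e.g. distinct workflow instance ids), rather than tuning the retries themselves.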