Trino pods go down instantly when autoscaling causes pods to terminate, even though terminationGracePeriodSeconds is set to 300 seconds
#22483
Comments
Is this about the Trino Helm chart? If yes, can you include the values to reproduce this?
It is about the Trino Helm chart. Attaching the deployment config and values file to reproduce the issue.
Which chart version are you using? How do you apply the changes you included in …? In the latest chart version, you have to set …
We are using Helm chart version …
That's very old. I don't know how the chart was structured back then, so I can't help with that version. Can you try using the latest version?
We have upgraded the Helm chart to 0.25.0, and the …
I checked that the default Trino Docker image entrypoint doesn't handle signals sent to the container in any special way. The Trino server doesn't do this either. To handle graceful shutdown, you have to configure the pod's lifecycle in the chart values.
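For reference, a minimal sketch of such a preStop hook. The shutdown request (a PUT of `"SHUTTING_DOWN"` to `/v1/info/state`) comes from the Trino graceful shutdown docs; the value keys (`worker.lifecycle`, `worker.terminationGracePeriodSeconds`), the HTTP port (8080), and the `admin` user are assumptions that depend on your chart version and access-control setup:

```yaml
# A minimal sketch, not a drop-in: key names and port are assumptions.
worker:
  terminationGracePeriodSeconds: 300
  lifecycle:
    preStop:
      exec:
        command:
          - /bin/sh
          - -c
          # Ask the worker to drain; it finishes its tasks, then exits on its own.
          - >-
            curl -s -X PUT
            -H "Content-Type: application/json"
            -H "X-Trino-User: admin"
            -d '"SHUTTING_DOWN"'
            http://localhost:8080/v1/info/state
```

Running the curl inside the pod matters: with the default system access control, the graceful shutdown endpoint only accepts the request from the worker itself.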
Hi, we have set the lifecycle preStop hook and …, and also set …, but we still see the worker pods getting terminated abruptly without staying in the Terminating state for 300s, which causes queries to fail. Is there anything else that needs to be set to make sure the pods shut down gracefully?
If any of the tasks take longer than the termination grace period, then queries are going to fail. See the docs at https://trino.io/docs/current/admin/graceful-shutdown.html, which explain how graceful shutdown works. The grace period hence needs to be at least as long as the longest tasks (for simplicity, assume queries) that execute on your cluster.
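To make the sizing concrete: per those docs, a draining worker first sleeps for `shutdown.grace-period` (so the coordinator stops assigning it work), then waits for its active tasks to finish, then sleeps for the grace period once more before exiting. A rough rule of thumb (my illustration, not from the thread):

```
terminationGracePeriodSeconds >= 2 * shutdown.grace-period + longest expected task

e.g. shutdown.grace-period=30s, tasks up to 180s:
     2 * 30 + 180 = 240  ->  set terminationGracePeriodSeconds to at least 240
```

If the pod's grace period is shorter than this, the kubelet sends SIGKILL before the drain completes, and in-flight queries fail.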
This issue was moved to a discussion. You can continue the conversation there.
We have set `terminationGracePeriodSeconds` to 300s on the Trino coordinator and worker nodes. During autoscaling, when the number of worker pods increases and decreases, pods terminate instantly without waiting for the queries on the pod to finish. We have also set `shutdown.grace-period=300s` on the Trino coordinator and workers. The expectation is that the Trino worker pods wait up to 300s for the tasks on the worker to complete instead of terminating instantly.
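For reference, one way the Trino-side setting could be passed through the community chart is its generic config pass-through; the exact key (`additionalConfigProperties`) is an assumption to verify against your chart version's values.yaml:

```yaml
# Hypothetical values fragment; verify the key against your chart version.
additionalConfigProperties:
  - shutdown.grace-period=300s
```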
For comparison, in Starburst we have set `starburstWorkerShutdownGracePeriodSeconds: 300` (which corresponds to `shutdown.grace-period=300s`) and `deploymentTerminationGracePeriodSeconds: 300` (which corresponds to `terminationGracePeriodSeconds`), and there the worker pods terminate only after waiting up to 300s for query tasks to run to completion, as expected.