Restart DelayedJob workers after they crash #5146

javierm · 2023-06-29T14:00:26Z

References

Closes consuldemocracy/installer#212 (Automatically restart delayed job processes when they stop)

Background

We're receiving reports from Consul installations saying that, once in a while, DelayedJob processes stop.

In some cases we've solved this issue by monitoring these processes with systemd or Monit, but implementing support for these tools in a way that works on existing installations is difficult.

On the other hand, DelayedJob provides a simple way to monitor its processes. It might not be as powerful as systemd, but it's much better than doing nothing, and it's easy to make it work on existing installations.

Objectives

Make it easier to maintain Consul installations running in production.

Notes

We might switch to systemd in the future, particularly if we upgrade Puma (see pull request #4922).

Release Notes

⚠️ DelayedJob now creates monitor processes that restart its workers if they crash (see pull request #5146). If you're already monitoring these processes with tools like Monit or systemd, you might want to disable this feature. Also note that, in order to stop DelayedJob, we now need to pass the `-n` option: `RAILS_ENV=production bin/delayed_job -n 2 stop`.

DelayedJob offers the `--monitor` option (aliased as `-m`), which creates a process that monitors the workers and restarts them when they crash. This change means that, in order to stop the DelayedJob workers, we now need to pass the `-n` option when running `bin/delayed_job stop`: `RAILS_ENV=production bin/delayed_job -n 2 stop`.
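As a rough illustration of how the start and stop invocations relate, here is a minimal Ruby sketch. The `delayed_job_command` helper is hypothetical (it is not part of Consul or DelayedJob), and the default of two workers is only taken from the example command above. It assumes `-m` is passed when starting or restarting, and that `-n` is passed for every action so that `stop` uses the same worker count as `start`.

```ruby
# Hypothetical helper for illustration only: builds the argument list for
# bin/delayed_job so that every action shares the same worker count.
def delayed_job_command(action, workers: 2, monitor: true)
  args = ["bin/delayed_job"]
  # This sketch passes -m (--monitor) only when starting or restarting.
  args << "-m" if monitor && %w[start restart].include?(action)
  # -n is always passed, matching the stop command in the release notes.
  args + ["-n", workers.to_s, action]
end

puts delayed_job_command("start").join(" ")
puts delayed_job_command("stop").join(" ")
```

In a real installation these arguments would be run with `RAILS_ENV=production` set, as in the command quoted above.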

Senen

I tested it in the staging server by killing the delayed_job processes by hand, and the processes were restarted automatically.

I also tested a server reboot and the Capistrano delayed_job tasks (`delayed_job:start`, `delayed_job:stop`, `delayed_job:restart`).

I also checked how a deployment creates monitoring processes for existing installations.

javierm added Maintenance 2.0 labels Jun 29, 2023

javierm self-assigned this Jun 29, 2023

javierm added this to Reviewing in Consul Democracy Jun 29, 2023

javierm mentioned this pull request Jun 29, 2023

Automatically restart delayed job processes when they stop consuldemocracy/installer#212

Closed

Senen approved these changes Jun 30, 2023

Consul Democracy automation moved this from Reviewing to Testing Jun 30, 2023

javierm merged commit db4db07 into master Jun 30, 2023
15 checks passed

Consul Democracy automation moved this from Testing to Release 2.0.0 Jun 30, 2023

javierm deleted the delayed_job_monitor branch June 30, 2023 15:23

javierm added Release notes and removed 2.0 labels Jun 30, 2023

javierm mentioned this pull request Jul 4, 2023

Restart DelayedJob workers after they crash consuldemocracy/installer#218

Merged
