Skip to content

Named Queues Proposal

betamatt edited this page May 25, 2011 · 20 revisions

Table of Contents

Purpose

Add Resque-style named queues while retaining DJ-style priority. Named queues provide a convenient system for grouping tasks to be worked by separate pools of workers, which may be scaled and controlled individually.

Named queues should be:

  • arbitrarily named and created on-the-fly (just put a job in a queue and that queue exists)
  • ignored by default (Workers should pull from all queues unless told not to for backwards compatibility)
  • efficient (Backends must properly index so that queues are efficient even when large)
  • ignorant of the priority and delaying system. Priority and run_at should continue to function as before when jobs are segmented by queue.

API Changes

Old ways still work:

  object.delay.method
  Delayed::Job.enqueue job
  handle_asynchronously :send_mailer

Assigning to a queue:

  object.delay(:queue => 'tracking').method
  Delayed::Job.enqueue job, :queue => 'tracking'
  handle_asynchronously :send_mailer, :queue => 'mailers'

Commandline Changes

If no queue option is specified, workers will work all queues. More than one queue may be specified to a worker. Both singular and plural queue option names are provided for convenience and are interchangeable.

rake:

  QUEUE=tracking rake jobs:work
  QUEUES=one,two rake jobs:work

daemon:

  delayed_job --queue=tracking start
  delayed_job --queues=tracking,two start

Schema Change

New column: queue (string, nullable) Change index: add queue to end of existing priority,run_at index

TODO:

  • determine if it's more efficient to make queue not-null and use a default queue name
  • investigate necessary changes to Mongoid, DM backends

Target Release

Since there is both a schema change for the AR backend and implementation changes for all backends to coordinate, this feature should be targeted for the next major release of DJ (3.0). The upgrade process from 2.0 and 2.1 should be documented and a generator provided for the AR schema migration.

Working Branch

https://github.com/collectiveidea/delayed_job/tree/feature/named-queues

Comments/Questions

(John H.) What is the goal you're trying to accomplish here? I'm guessing to separate long-running and short-running jobs so that long-running (or error-prone) classes of jobs don't bottleneck short-running jobs that you might want to be more responsive. What about being able to specify a min-priority on workers? Would that let you accomplish what you're after? To illustrate:

  • create 2 workers
  • set one workers min priority to 50
  • set short-running/responsive jobs to have higher priority than slow or long-running jobs
  • now, one worker will never pick up long-running tasks, and you'll always have one working on just high-priority jobs. You could scale this and manage the priority of jobs to ensure you always have workers waiting for high-priority tasks, and then just have one digging through your low-priority backlog.
(Matt G.) While priority grouping can accomplish many of the same things as named queues, it's inexpressive and difficult to manage. Named queues allow clear management of jobs by queue in ways that aren't possible without segmenting jobs into small, meaningless, priority ranges. For example:
  • temporarily disable processing of a class of jobs (due to load, debugging, etc)
  • only run certain jobs in bursts when there is excess capacity (batch)
  • allow a normally low priority job to have a higher-priority instance (to "catchup" a missed job, etc)
(Zach M.) My opinion is that when I get to this granular of control, I need to move to Resque. I love that DJ is a simple "I want to add background processing to my app". When I exceed that and need to manage workers and queues, I feel like I need to use Resque instead. I'm not saying DJ is unsuitable for named queues, but it seems like when you start using named queues other existing projects will be a better fit.

(Matt G.) I get the argument for not complicating DJ. The "just add backgrounding" use case should always be dirt simple and I don't feel this proposal impacts that case. I worry that "just use Resque" is going to be used as an argument any time we steal a convenient feature. Resque is awesome but it's misleading to assume that anyone with queue needs that are more complicated than average wants to add Redis to their infrastructure.

(Josh G.) Does this allow prioritizing of one queue over another? (Matt G.) No. Specifying queues for a worker filters jobs, setting priority controls the order they are processed.

(Eric A.) I could see named queues being useful for my environment. Jumping to Resque is non-trivial when one considers transactional integrity. DJ can run on the same db, Resque will run on a second datastore introducing XA style transaction problems that I'd rather not tackle.

(Jared M.) I second this proposal. I would love some filtering by queue type. I was thinking of making this proposal this morning (but I was thinking of introducing "tags", and having workers filter on one or many tags. I think I like Matt's approach better). I'm volunteering to work on this if the proposal is accepted.

(Stephen V.T.) This is definitely a feature I would support. We have some jobs (such as PowerPoint processing) that can only happen on Windows, whilst others that run better on a Unix environment. Named queues would allow us to split these tasks effectively. At the moment we use Windows for all our task processing due to this limitation.

(Cameron H.) For the ToDo of determining order of which order the index should be in, have you considered having 3 indexes? I know that this will take up slightly more room on disk, but it will give flexibility. Mysql is getting better at dealing with unions of indexes and in this simple case of only 3 columns there shouldn't be too many issues. If you want to use 1 index, I image that there will be times where you do calls without a queue (when no queue is specified for the worker), which means that the queue column should be added last to the index, putting it first would mean you need to define a queue for every call to use the index. To me this wouldn't lend itself well to calls all tasks and ignoring the queue.