Transactional Guarantees with Background Work Queueing #3565
Looked a bit more into this and I ended up throwing together a library that implements a generic transactional outbox event processor, https://github.com/dillonstreator/txob, as well as a client adapter for [...]. Curious to see what thoughts are on this.
The current state of the FHIR server reveals a crucial gap in transactional guarantees between resource updates and the background work queueing mechanism. This exposes the system and its clients to potential data inconsistencies.
The problem can be seen in the API server's FHIR repo: if the `writeToDatabase` call succeeds but `addBackgroundJobs` fails for some reason (a transient BullMQ issue or a server crash, for example), clients will miss subscriptions, resource downloads, and/or cron registration.
medplum/packages/server/src/fhir/repo.ts
Lines 510 to 515 in bba064a
Proposed Solution
To mitigate this issue, I recommend implementing the transactional outbox pattern. This pattern not only provides transactional guarantees but also makes it possible to retry background job queueing, even though failures with Redis/BullMQ are unlikely.
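To make the idea concrete, here is a minimal in-memory sketch of the write side of the pattern: the resource and its outbox event are committed together or not at all. Everything in it is an assumption for illustration (`FakeDb`, `saveResource`, and the event fields are hypothetical and are not Medplum's repo code or schema); a real implementation would use a database transaction instead of copying state.

```typescript
// Hypothetical event shape; the actual field list for the events table may differ.
interface OutboxEvent {
  id: number;
  type: string;
  data: unknown;
  processedAt: Date | null;
}

// In-memory stand-in for a database with all-or-nothing transactions.
class FakeDb {
  resources: Record<string, unknown> = {};
  events: OutboxEvent[] = [];

  // Runs fn against a copy of the state and commits only if fn does not throw.
  transaction(fn: (tx: FakeDb) => void): void {
    const tx = new FakeDb();
    tx.resources = { ...this.resources };
    tx.events = [...this.events];
    fn(tx); // a throw here skips the commit below
    this.resources = tx.resources;
    this.events = tx.events;
  }
}

// Persists the resource and a ResourceSaved event in the same transaction,
// so there is never a saved resource without a pending event.
function saveResource(db: FakeDb, id: string, resource: unknown): void {
  db.transaction((tx) => {
    tx.resources[id] = resource;
    tx.events.push({
      id: tx.events.length + 1,
      type: 'ResourceSaved',
      data: { resourceId: id },
      processedAt: null,
    });
  });
}
```

Note that no queueing happens inside `saveResource`. The write path only records that work is owed, and a separate processor picks it up later, which is what makes the queueing retryable.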
High-level Implementation Details
Event Table Introduction
The proposed solution involves the creation of a new `events` table. This table would encompass the following fields:
ResourceSaved
A `ResourceSaved` event will be transactionally persisted alongside a resource creation or update.
Event Processor and Handler(s)
The solution introduces an event processor capable of running either in-process or as a separate process. This processor checks for unprocessed events in the `events` table on a set interval and runs the events through their respective handlers (a predefined map keyed by event type/name). The handler(s) are responsible for background job queueing through BullMQ. The processor updates the events based on handler completion results. The processor should be capable of scaling horizontally without concern of duplicate event processing.
For each background job that requires queueing, the `ResourceSaved` event will have a corresponding handler. These handlers must utilize the `addBulk` method to guarantee atomic queueing in BullMQ. Alternatively, a single handler could be used and `addBulk` can be called to guarantee atomic queueing of all background jobs at once (subscriptions, downloads, and crons). The important part here is that the jobs are atomically queued.
Considerations and Drawbacks
Despite the benefits offered by transactional outbox, it is important to acknowledge the drawbacks, including:
[...] `removeOnComplete` is not enabled, as the queueing would be idempotent. https://docs.bullmq.io/guide/jobs/job-ids
[...] `events` table which would store a record for each persisted change to a resource.
Nevertheless, the advantages of improved transactional guarantees and enhanced data consistency significantly outweigh these drawbacks IMO.
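For reference, the processing loop described under "Event Processor and Handler(s)" could look roughly like the sketch below. It is a simplified, synchronous, in-memory version: `ProcessableEvent`, `HandlerMap`, `processEvents`, and the `attempts`/`handlerResults` bookkeeping are all hypothetical names and fields chosen for illustration. A real handler would await BullMQ's `Queue.addBulk`, and a real processor would read from the `events` table; one common way to keep multiple Postgres-backed processors from picking up the same rows is `SELECT ... FOR UPDATE SKIP LOCKED`.

```typescript
// Per-handler bookkeeping so a retry only re-runs handlers that have not succeeded yet.
interface HandlerResult {
  processedAt: Date | null;
  errors: string[];
}

// Hypothetical event shape with retry bookkeeping; not a confirmed schema.
interface ProcessableEvent {
  id: number;
  type: string;
  data: unknown;
  processedAt: Date | null;
  attempts: number;
  handlerResults: Record<string, HandlerResult>;
}

type Handler = (event: ProcessableEvent) => void;
// Event type -> handler name -> handler.
type HandlerMap = Record<string, Record<string, Handler>>;

// One polling pass: run pending events through their handlers and record the outcome.
function processEvents(events: ProcessableEvent[], handlers: HandlerMap, maxAttempts = 5): void {
  for (const event of events) {
    // Skip events that are done or have exhausted their retries.
    if (event.processedAt !== null || event.attempts >= maxAttempts) continue;
    event.attempts += 1;

    const forType: Record<string, Handler> = handlers[event.type] || {};
    let allSucceeded = true;
    for (const name of Object.keys(forType)) {
      let result = event.handlerResults[name];
      if (!result) {
        result = { processedAt: null, errors: [] };
        event.handlerResults[name] = result;
      }
      if (result.processedAt !== null) continue; // succeeded on an earlier attempt
      try {
        forType[name](event);
        result.processedAt = new Date();
      } catch (err) {
        result.errors.push(String(err));
        allSucceeded = false;
      }
    }
    // The event is only marked processed once every handler has succeeded.
    if (allSucceeded) event.processedAt = new Date();
  }
}
```

Because a handler that fails is retried on the next pass, a job may occasionally be queued more than once. That is why idempotent queueing (for example via deterministic BullMQ job ids) matters for the handlers.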