A distributed, fault-tolerant task queue

macwilk/hatchet

 
 

A Distributed, Fault-Tolerant Task Queue

Docs License: MIT Go Reference NPM Downloads

Hatchet Cloud · Documentation · Website · Issues

What is Hatchet?

Hatchet replaces difficult-to-manage legacy queues and pub/sub systems so you can design durable workloads that recover from failure and handle problems like concurrency, fairness, and rate limiting. Instead of managing your own task queue or pub/sub system, you can use Hatchet to distribute your functions between a set of workers with minimal configuration or infrastructure.

What Makes Hatchet Great?

  • ⚡️ Ultra-low Latency and High Throughput Scheduling: Hatchet is built on a low-latency queue, balancing real-time interaction with the reliability required for mission-critical tasks.

  • ☮️ Concurrency, Fairness, and Rate Limiting: Implement FIFO, LIFO, Round Robin, and Priority Queues with Hatchet’s built-in strategies, designed to circumvent common scaling pitfalls with minimal configuration. Read Docs →

  • 🔥🧯 Resilience by Design: With customizable retry policies and integrated error handling, Hatchet ensures your operations recover swiftly from transient failures. You can break large jobs down into small tasks so you can finish a run without rerunning work. Read Docs →

Enhanced Visibility and Control:

  • Observability. All of your runs are fully searchable, allowing you to quickly identify issues. We track latency, error rates, and custom metrics in your run.
  • (Practical) Durable Execution. Replay events and manually pick up execution from specific steps in your workflow.
  • Cron. Set recurring schedules for function runs.
  • One-Time Scheduling. Schedule a function run to execute at a specific time and date in the future.
  • Spike Protection. Smooth out spikes in traffic and only execute what your system can handle.
  • Incremental Streaming. Subscribe to updates as your functions progress in the background worker.
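
To make the resilience and durable-execution ideas above concrete, here is a minimal sketch in plain Python. It is not Hatchet's API: `run_with_retries`, `run_steps`, and the in-memory `results` dictionary are hypothetical names chosen for illustration. The sketch assumes each step's result is stored under its name, so a rerun skips steps that already finished and retries only the step that failed.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# A step is a (name, function) pair; the function receives earlier results.
Step = Tuple[str, Callable[[Dict[str, Any]], Any]]


def run_with_retries(
    fn: Callable[[Dict[str, Any]], Any],
    ctx: Dict[str, Any],
    max_attempts: int = 3,
    base_delay: float = 0.01,
) -> Any:
    """Call fn(ctx), retrying with exponential backoff on any exception."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(ctx)
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))


def run_steps(steps: List[Step], results: Dict[str, Any]) -> Dict[str, Any]:
    """Run steps in order, skipping any step whose result is already stored."""
    for name, fn in steps:
        if name in results:
            continue  # finished in an earlier run, so do not redo the work
        results[name] = run_with_retries(fn, results)
    return results
```

Passing the same `results` dictionary to a second call of `run_steps` resumes from the first step without a stored result. A real system would persist that state outside the worker process rather than in memory.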

Example Use Cases:

  • Fairness for Generative AI: Don't let busy users overwhelm your system. Hatchet lets you distribute requests to your workers fairly with configurable policies.
  • Batch Processing for Document Indexing: Hatchet can handle large-scale batch processing of documents, images, and other data and resume mid-job on failure.
  • Workflow Orchestration for Multi-Modal Systems: Hatchet can handle orchestrating multi-modal inputs and outputs, with full DAG-style execution.
  • Correctness for Event-Based Processing: Respond to external events or internal events within your system and replay events automatically.
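
As an illustration of the fairness use case above, the following sketch shows one way a round-robin policy can keep a busy user from starving others. It is a generic in-memory example, not Hatchet's implementation: the `RoundRobinQueue` class and its methods are hypothetical, and it assumes tasks are grouped by a user key.

```python
from collections import OrderedDict, deque
from typing import Deque, Tuple


class RoundRobinQueue:
    """Keeps one FIFO queue per key and serves the keys in rotation."""

    def __init__(self) -> None:
        self._queues: "OrderedDict[str, Deque[str]]" = OrderedDict()

    def push(self, key: str, task: str) -> None:
        # New keys join the end of the rotation.
        self._queues.setdefault(key, deque()).append(task)

    def pop(self) -> Tuple[str, str]:
        """Return the next (key, task) pair and rotate that key to the back."""
        if not self._queues:
            raise IndexError("pop from an empty RoundRobinQueue")
        key, queue = next(iter(self._queues.items()))
        task = queue.popleft()
        del self._queues[key]
        if queue:
            self._queues[key] = queue  # re-insert at the end of the rotation
        return key, task

    def __len__(self) -> int:
        return sum(len(q) for q in self._queues.values())
```

With three tasks queued for one user and one task for another, the second user's task is served second instead of fourth, which is the behavior a strict FIFO queue would not give you.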

Quick Start

Hatchet is available as a cloud version or self-hosted. See the following docs to get up and running quickly:

Hatchet supports your technology stack with open-source SDKs for Python, TypeScript, and Go. To get started, see the language-specific guides here:

SDK repositories

If you encounter any issues while using the SDKs, please submit an issue in the respective repository:

How does this compare to alternatives (Celery, BullMQ)?

Why build another managed queue? We wanted to build something with the benefits of full transactional enqueueing - particularly for dependent, DAG-style execution - and felt strongly that Postgres solves for 99.9% of queueing use-cases better than most alternatives (Celery uses Redis or RabbitMQ as a broker, BullMQ uses Redis). Since the introduction of SKIP LOCKED and the milestones of recent PG releases (like active-active replication), it's becoming more feasible to horizontally scale Postgres across multiple regions and vertically scale to 10k TPS or more. Many queues (like BullMQ) are built on Redis, where data loss can occur if the server runs out of memory and you're not careful. Using PG helps avoid this entire class of problems.
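
For readers unfamiliar with SKIP LOCKED, the sketch below shows the general Postgres queue pattern it enables: each worker claims the oldest queued row that no other transaction has locked, so concurrent workers do not block on or double-claim the same task. This is not Hatchet's schema or query. The `tasks` table, its `id`, `status`, and `payload` columns, and the `claim_next_task` helper are assumptions for illustration, and `cursor` is assumed to be any Python DB-API cursor connected to Postgres.

```python
from typing import Any, Optional, Tuple

# Hypothetical schema: tasks(id, status, payload). The inner SELECT locks one
# queued row and skips rows that other transactions have already locked.
CLAIM_SQL = """
UPDATE tasks
SET status = 'running'
WHERE id = (
    SELECT id
    FROM tasks
    WHERE status = 'queued'
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload
"""


def claim_next_task(cursor: Any) -> Optional[Tuple[Any, Any]]:
    """Claim one queued task, or return None if nothing is available.

    The caller owns the transaction and must commit it for the claim
    to become visible to other workers.
    """
    cursor.execute(CLAIM_SQL)
    return cursor.fetchone()
```

Because the claim happens inside an ordinary transaction, it can be committed together with other writes, which is the transactional enqueueing benefit described above.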

We also wanted something that was significantly easier to use and debug for application developers. A lot of times the burden of building task observability falls on the infra/platform team (for example, asking the infra team to build a Grafana view for their tasks based on exported prom metrics). We're building this type of observability directly into Hatchet.

For more information on why we built Hatchet, you can check out our writeup on Celery here.

Issues

Please submit any bugs that you encounter via GitHub issues. However, please reach out on Discord before submitting a feature request - as the project is very early, we'd like to build a solid foundation before adding more complex features.

I'd Like to Contribute

See the contributing docs here, and please let us know what you're interested in working on in the #contributing channel on Discord. This will help us shape the direction of the project and will make collaboration much easier!
