forked from numaproj/numaflow

Kubernetes-native platform to run massively parallel data/streaming jobs


qhuai/numaflow

 
 


Numaflow


Summary

Numaflow is a Kubernetes-native tool for running massively parallel stream processing. A Numaflow Pipeline is implemented as a Kubernetes custom resource and consists of one or more source, data processing, and sink vertices.

Numaflow installs in a few minutes and is easier and cheaper to use for simple data processing applications than a full-featured stream processing platform.

Key Features

  • Kubernetes-native: If you know Kubernetes, you already know how to use Numaflow.
  • Language agnostic: Use your favorite programming language.
  • Exactly-Once semantics: No input element is duplicated or lost even as pods are rescheduled or restarted.
  • Auto-scaling with back-pressure: Each vertex automatically scales from zero to whatever is needed.

Data Integrity Guarantees:

  • Minimally provide at-least-once semantics
  • Provide exactly-once semantics for unbounded and near real-time data sources
  • Preserving order is not required

Roadmap

  • Data aggregation (e.g. group-by)

Languages

  • Go 79.1%
  • TypeScript 16.8%
  • Shell 2.1%
  • Makefile 1.0%
  • Smarty 0.3%
  • CSS 0.3%
  • Other 0.4%