
Puro - Highly configurable data streams in Python 3.x


jvtm/puro


Puro


Highly configurable data streams in Python 3.x.

The idea is to use JSON schemas to select objects from input sources and pass them to action chains.
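As a rough illustration of this idea, the sketch below routes objects from an input source to an action chain whenever they match a schema. Everything in it is hypothetical: `matches`, `run`, and the two action functions are not Puro's API. `matches` handles only a tiny subset of JSON Schema (`required` plus per-property `const`) and is a stand-in for a real validator such as the jsonschema library.

```python
from typing import Any, Callable, Dict, Iterable, List

Item = Dict[str, Any]
Action = Callable[[Item], Item]


def matches(schema: Dict[str, Any], item: Item) -> bool:
    """Tiny stand-in for a JSON Schema validator (assumption, not Puro's API).

    Supports only `required` and per-property `const`.
    """
    for key in schema.get("required", []):
        if key not in item:
            return False
    for key, rule in schema.get("properties", {}).items():
        if key in item and "const" in rule and item[key] != rule["const"]:
            return False
    return True


def run(source: Iterable[Item], schema: Dict[str, Any], chain: List[Action]) -> List[Item]:
    """Pass every item selected by `schema` through each action in order."""
    results = []
    for item in source:
        if not matches(schema, item):
            continue  # not selected by the schema, skip it
        for action in chain:
            item = action(item)
        results.append(item)
    return results


# Two hypothetical actions; each returns a new dict instead of mutating.
def add_tag(item: Item) -> Item:
    return {**item, "tag": "seen"}


def upper_name(item: Item) -> Item:
    return {**item, "name": item["name"].upper()}


source = [
    {"type": "user", "name": "ada"},
    {"type": "event", "name": "login"},  # wrong type, not selected
    {"type": "user"},                    # lacks the required "name"
]
schema = {"required": ["type", "name"], "properties": {"type": {"const": "user"}}}
selected = run(source, schema, [add_tag, upper_name])
```

Only the first object satisfies the schema, so it is the only one that reaches the chain.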

Both sources and actions implement their respective plugin interfaces, and they can be reused, configured, and glued together in any order via config files.
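One way such plugin interfaces and config-driven wiring could look is sketched below. This is an assumption about the shape, not the project's actual design: the `Source` and `Action` base classes, the `REGISTRY` mapping, the plugin names, and the config layout are all invented for illustration.

```python
import json
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, List

Item = Dict[str, Any]


class Source(ABC):
    """Hypothetical source plugin interface."""

    @abstractmethod
    def read(self) -> Iterator[Item]:
        ...


class Action(ABC):
    """Hypothetical action plugin interface."""

    @abstractmethod
    def apply(self, item: Item) -> Item:
        ...


class ListSource(Source):
    def __init__(self, items: List[Item]) -> None:
        self.items = items

    def read(self) -> Iterator[Item]:
        return iter(self.items)


class SetField(Action):
    def __init__(self, field: str, value: Any) -> None:
        self.field, self.value = field, value

    def apply(self, item: Item) -> Item:
        return {**item, self.field: self.value}


class RenameField(Action):
    def __init__(self, old: str, new: str) -> None:
        self.old, self.new = old, new

    def apply(self, item: Item) -> Item:
        out = dict(item)
        if self.old in out:
            out[self.new] = out.pop(self.old)
        return out


# Maps the plugin names used in config files to classes (illustrative names).
REGISTRY = {"list": ListSource, "set_field": SetField, "rename_field": RenameField}


def build(spec: Dict[str, Any]):
    """Instantiate a plugin from {"plugin": name, "options": {...}}."""
    return REGISTRY[spec["plugin"]](**spec.get("options", {}))


def run_pipeline(config: Dict[str, Any]) -> List[Item]:
    source = build(config["source"])
    actions = [build(spec) for spec in config["actions"]]
    out = []
    for item in source.read():
        for action in actions:
            item = action.apply(item)
        out.append(item)
    return out


# The config would normally live in a file; it is inlined here as JSON.
CONFIG = json.loads("""
{
  "source": {"plugin": "list", "options": {"items": [{"id": 1}, {"id": 2}]}},
  "actions": [
    {"plugin": "rename_field", "options": {"old": "id", "new": "uid"}},
    {"plugin": "set_field", "options": {"field": "origin", "value": "demo"}}
  ]
}
""")
output = run_pipeline(CONFIG)
```

Reordering or swapping entries in the `actions` list changes the flow without touching any plugin code, which is the kind of reuse the paragraph above describes.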

The priority is to be usable, configurable, and human-friendly. According to initial benchmarks, it can also be made blazingly fast, or at least as fast as any custom processing code would be.

The overall goal is a battle-tested functional core, an easy but powerful configuration language, and an extensible plugin API modelled after real-life data handling scenarios.

These aspects will hopefully make this library worth using.

This is not a message queue. It is a component that sits between message queues, datastores, and similar systems.

Also, only basic plugins will be provided here. While usable, they might not fit your use case. Rather than trying to handle every possible data format, the provided plugins perform only basic, data-agnostic actions. Consider them just a bit more than Hello World examples.

Core

The Python 3.x port and asyncio experiments are ongoing. Stay tuned!

Inputs

Various stream readers and examples will be provided (HTTP, Redis, SQS, Kombu, local dir, ...)

Selectors

jsonschema, plus possibly others such as kmatch.

Actions

Actions can be either data modifiers (modify, sanitize, filter, enrich) or data storers (Redis, disk, databases, message queues, ...).
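The two kinds of action might be distinguished as in the sketch below. It assumes, purely for illustration, that a modifier returns a new item (or `None` to filter it out) and that a storer persists the item and passes it on unchanged. `MemoryStore` is a hypothetical in-memory stand-in for a real backend such as Redis or a database.

```python
from typing import Any, Callable, Dict, List, Optional

Item = Dict[str, Any]
Action = Callable[[Item], Optional[Item]]


# --- data modifiers (all names hypothetical) ---
def sanitize(item: Item) -> Item:
    """Strip surrounding whitespace from string values."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in item.items()}


def drop_empty_names(item: Item) -> Optional[Item]:
    """Filter: returning None removes the item from the flow."""
    return item if item.get("name") else None


def enrich(item: Item) -> Item:
    """Add a derived field."""
    return {**item, "name_length": len(item["name"])}


# --- data storer ---
class MemoryStore:
    """Keeps items in a list; stands in for Redis, disk, a database, ..."""

    def __init__(self) -> None:
        self.saved: List[Item] = []

    def __call__(self, item: Item) -> Item:
        self.saved.append(item)
        return item  # pass through so later actions still see the item


def process(item: Item, chain: List[Action]) -> Optional[Item]:
    """Run one item through the chain, stopping early if it is filtered out."""
    current: Optional[Item] = item
    for action in chain:
        current = action(current)
        if current is None:
            return None
    return current


store = MemoryStore()
chain = [sanitize, drop_empty_names, enrich, store]
results = [process(i, chain) for i in [{"name": "  ada "}, {"name": "   "}]]
```

The second item becomes an empty name after sanitizing, is filtered out, and therefore never reaches the storer.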

Work In Progress

This project is a full rewrite of an earlier, abandoned Python 2.x project.

Pieces will be committed here as they are ported to Python 3.6+ syntax and updated to use the latest and greatest helper libraries.

Once the basic pieces exist, the flow can be extended with plugins for statistics, throttling, logging, and so on.
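Statistics and logging plugins could plausibly be ordinary pass-through actions placed anywhere in a chain. The sketch below assumes that actions are plain callables taking and returning an item; `CountByField` and `LogItem` are hypothetical names, not part of Puro.

```python
import logging
from collections import Counter
from typing import Any, Dict

Item = Dict[str, Any]


class CountByField:
    """Statistics action: counts items per value of one field, changes nothing."""

    def __init__(self, field: str) -> None:
        self.field = field
        self.counts: Counter = Counter()

    def __call__(self, item: Item) -> Item:
        self.counts[item.get(self.field, "<missing>")] += 1
        return item


class LogItem:
    """Logging action: emits one log record per item, changes nothing."""

    def __init__(self, logger: logging.Logger) -> None:
        self.logger = logger

    def __call__(self, item: Item) -> Item:
        self.logger.info("processing %r", item)
        return item


stats = CountByField("type")
log = LogItem(logging.getLogger("puro.sketch"))  # logger name is arbitrary

items = [{"type": "user"}, {"type": "event"}, {"type": "user"}, {}]
passed = [log(stats(item)) for item in items]
```

Because both actions return the item untouched, they can be added to or removed from a flow without affecting its output.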
