A repository with Snowplow recovery mechanism for Realtime data pipeline S3 backup files

grzegorzewald/SnowplowRecovery

Snowplow Real-time data pipeline S3 Recovery/data fix

A repository with a recovery mechanism for the S3 backup files of the Snowplow real-time data pipeline.

How to use

The recovery mechanism can either emit base64-encoded records to stdout or write them directly to a Kinesis stream.
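The two output modes can be sketched as follows. This is a minimal illustration, not the code in this repository: the function name `emit_records`, its parameters, and the use of the record index as partition key are all assumptions. The Kinesis client is expected to behave like a boto3 Kinesis client (`put_record` with `StreamName`, `Data`, `PartitionKey`).

```python
import base64
import sys
from typing import Iterable, Optional, TextIO


def emit_records(
    records: Iterable[bytes],
    kinesis_client=None,            # hypothetical: e.g. boto3.client("kinesis")
    stream_name: Optional[str] = None,
    out: Optional[TextIO] = None,
) -> int:
    """Send each raw record to Kinesis if a client is given,
    otherwise print it base64-encoded, one record per line.

    Returns the number of records emitted.
    """
    target = out if out is not None else sys.stdout
    count = 0
    for index, record in enumerate(records):
        if kinesis_client is None:
            # stdout mode: base64 keeps binary payloads line-safe
            target.write(base64.b64encode(record).decode("ascii") + "\n")
        else:
            # Kinesis mode: the partition key choice here is an assumption
            kinesis_client.put_record(
                StreamName=stream_name,
                Data=record,
                PartitionKey=str(index),
            )
        count += 1
    return count
```

Calling `emit_records(records)` prints to stdout; passing `kinesis_client=boto3.client("kinesis")` and a `stream_name` would write to the stream instead, assuming boto3 is installed and credentials are configured.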

Raw backup files can be read either from an S3 bucket (with or without a file name prefix) or from a local file (currently only a single local file is supported).
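Selecting the input source might look like the sketch below. The helper names `list_backup_keys` and `read_backup` are hypothetical and not taken from this repository. The S3 client is assumed to be a boto3 S3 client. The sketch returns raw file bytes only; splitting them into records depends on the backup file format, which is not covered here.

```python
from typing import Iterator, Optional


def list_backup_keys(s3_client, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every object key in `bucket` that starts with `prefix`.

    An empty prefix lists the whole bucket.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page has no matching objects
        for obj in page.get("Contents", []):
            yield obj["Key"]


def read_backup(source: str, s3_client=None, bucket: Optional[str] = None) -> bytes:
    """Return the raw bytes of one backup file.

    With an S3 client, `source` is treated as an object key in `bucket`;
    without one, it is treated as a path to a single local file.
    """
    if s3_client is not None:
        return s3_client.get_object(Bucket=bucket, Key=source)["Body"].read()
    with open(source, "rb") as handle:
        return handle.read()
```

For S3 input, one would iterate over `list_backup_keys(client, bucket, prefix)` and call `read_backup(key, s3_client=client, bucket=bucket)` for each key; for local input, a single `read_backup(path)` call is enough.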

How to fix bad records?

Line 48 is the answer.

Notes

The Thrift schema is taken from the official Snowplow repository: collector-payload.thrift
