Estimating cardinality for a data stream using Go and Apache Kafka

About this project:

This project is a solution to the Data Engineer Challenge. It sends statistics to the output topic, and the counting error margin is less than 1%. It doesn't address latency, but it does work with historical data: simply re-run it on a Kafka topic, since the entire point is for it to be stateless.

The original solution that got me a job is at commit: 56103583efeb057546bd402882908f7a0fda9ba2

The current, improved solution is for my personal satisfaction...
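
The README doesn't describe the estimator itself, but an error margin below 1% is what you would get from a HyperLogLog sketch with 14-bit precision (roughly 0.81% standard error). Purely as an illustration, a minimal Go sketch of that idea might look like the following, assuming the github.com/axiomhq/hyperloglog package and a uid field in each JSON message (neither of which is confirmed here):

package main

import (
	"encoding/json"
	"fmt"

	"github.com/axiomhq/hyperloglog"
)

// UserEvent is an assumed shape of the input messages;
// the real field names in stream.jsonl.gz may differ.
type UserEvent struct {
	UID string `json:"uid"`
}

func main() {
	// New14 uses 2^14 registers, giving roughly 0.81% standard error.
	sketch := hyperloglog.New14()

	// In the real program these lines would come from the users Kafka topic.
	lines := [][]byte{
		[]byte(`{"uid":"alice"}`),
		[]byte(`{"uid":"bob"}`),
		[]byte(`{"uid":"alice"}`),
	}

	for _, line := range lines {
		var ev UserEvent
		if err := json.Unmarshal(line, &ev); err != nil {
			continue // skip malformed messages
		}
		sketch.Insert([]byte(ev.UID))
	}

	fmt.Printf("estimated unique users: %d\n", sketch.Estimate())
}

Because a sketch like this is small and can be rebuilt at any time by re-reading the topic, the service stays stateless, which is what makes re-running it over historical data straightforward.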

How to run this project:

You can simply use go run, or, if you prefer, Docker. If you use Docker, make sure the container has access to the host network so it can reach the local Kafka instance. The project reads a few environment variables, but they all have sensible defaults, so you usually don't need to set them. Here is the list of environment variables and their default values:

KAFKA_BROKER=localhost:9092
USERS_TOPIC=users
STATS_TOPIC=stats
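
For illustration only, reading those variables with their defaults could look roughly like this (a sketch, not necessarily how the project's configuration code is written):

package main

import (
	"fmt"
	"os"
)

// getenv returns the value of the environment variable key,
// or fallback if the variable is unset or empty.
func getenv(key, fallback string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return fallback
}

func main() {
	broker := getenv("KAFKA_BROKER", "localhost:9092")
	usersTopic := getenv("USERS_TOPIC", "users")
	statsTopic := getenv("STATS_TOPIC", "stats")

	fmt.Println("broker:", broker, "users topic:", usersTopic, "stats topic:", statsTopic)
}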

To send messages to a Kafka topic, you can use this command:

gzcat stream.jsonl.gz | kafka-console-producer --bootstrap-server localhost:9092 --topic users

To receive messages from a Kafka topic, you can use this command:

kafka-console-consumer --bootstrap-server localhost:9092 --topic stats
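
To give a rough picture of what sits between those two commands, here is a hedged sketch of a consume-and-produce loop using github.com/segmentio/kafka-go (assumed from the repository name; the actual reader and writer configuration, consumer group, output format, and reporting interval may all differ). A plain map stands in for the real cardinality estimator:

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/segmentio/kafka-go"
)

func main() {
	ctx := context.Background()

	// Read user events from the users topic (addresses match the defaults above).
	reader := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"},
		Topic:   "users",
		GroupID: "cardinality-example", // hypothetical consumer group
	})
	defer reader.Close()

	// Write statistics to the stats topic.
	writer := &kafka.Writer{
		Addr:  kafka.TCP("localhost:9092"),
		Topic: "stats",
	}
	defer writer.Close()

	seen := make(map[string]struct{}) // stand-in for the real estimator

	for count := 1; ; count++ {
		m, err := reader.ReadMessage(ctx)
		if err != nil {
			log.Fatal(err)
		}
		// The real program would parse the JSON message and feed a user
		// identifier into its estimator; here the raw value is used.
		seen[string(m.Value)] = struct{}{}

		// Publish the current count every 1000 messages (an arbitrary interval).
		if count%1000 == 0 {
			stats := fmt.Sprintf(`{"unique_users":%d}`, len(seen))
			if err := writer.WriteMessages(ctx, kafka.Message{Value: []byte(stats)}); err != nil {
				log.Println("failed to write stats:", err)
			}
		}
	}
}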

You can find sample data (stream.jsonl.gz) here.
