Full explanation of how to stream Twitter data by keyword using Kafka and Python on Windows
1. Install Java and Kafka
   - Download Java Development Kit 8 from the Oracle website and install it
   - Download Apache Kafka from the Apache Kafka website and extract it to the `C:\` directory
   - Edit the environment variables and add `C:\<Your Kafka Version>\bin\windows` to the `Path` variable
   - Create two folders named `kafka` and `zookeeper`, at `C:\<Your Kafka Version>\data\kafka` and `C:\<Your Kafka Version>\data\zookeeper`
   - Edit the Zookeeper config file at `C:\<Your Kafka Version>\config\zookeeper.properties`, search for `dataDir`, and change it to `dataDir=C:/<Your Kafka Version>/data/zookeeper` (e.g. `C:/kafka_2.13-2.6.0/data/zookeeper`)
   - Edit the server properties file at `C:\<Your Kafka Version>\config\server.properties`, search for `log.dirs`, and change it to `log.dirs=C:/<Your Kafka Version>/data/kafka`
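After both edits, the relevant lines of the two config files should look roughly like this (replace the placeholder with your actual Kafka folder name, e.g. `kafka_2.13-2.6.0`):

```properties
# config/zookeeper.properties
dataDir=C:/<Your Kafka Version>/data/zookeeper

# config/server.properties
log.dirs=C:/<Your Kafka Version>/data/kafka
```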
2. Apply for a Twitter Developer account to get the API keys
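Once the application is approved, the developer dashboard gives you a consumer key/secret and an access token/secret. One way to keep them out of the scripts is a small credentials module; the file and variable names below are only an illustration, not necessarily what the repository's scripts expect:

```python
# twitter_credentials.py -- hypothetical module; fill these in with the values
# from your Twitter Developer dashboard and keep them out of version control.
CONSUMER_KEY = "<your_consumer_key>"
CONSUMER_SECRET = "<your_consumer_secret>"
ACCESS_TOKEN = "<your_access_token>"
ACCESS_TOKEN_SECRET = "<your_access_token_secret>"
```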
3. Download XAMPP or MySQL Workbench to view the database
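If you want to prepare the MySQL side up front, a table along these lines is enough to hold the streamed tweets. Everything here (the `twitter_db` database, the `tweets` table and its columns, and XAMPP's default `root` user with an empty password) is an assumption for illustration; adjust it to whatever schema `consumers.py` actually writes to:

```python
# create_table.py -- hypothetical helper script; the database name, credentials
# and column layout are assumptions, not taken from the repository.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="root", password="")
cursor = conn.cursor()
cursor.execute("CREATE DATABASE IF NOT EXISTS twitter_db")
cursor.execute(
    """
    CREATE TABLE IF NOT EXISTS twitter_db.tweets (
        id BIGINT AUTO_INCREMENT PRIMARY KEY,
        username VARCHAR(255),
        tweet TEXT,
        created_at DATETIME
    )
    """
)
conn.commit()
conn.close()
```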
4. Run the pipeline
   - Open a command prompt in the Kafka installation directory and start Zookeeper with `zookeeper-server-start.bat config\zookeeper.properties`
   - Open another command prompt and start the Kafka broker with `kafka-server-start.bat config\server.properties`
   - Create a Kafka topic with `kafka-topics.bat --zookeeper localhost:2181 --create --topic <topic_name> --partitions <number_of_partitions> --replication-factor 3` (note that the replication factor cannot exceed the number of running brokers, so use `--replication-factor 1` when running a single local broker)
   - Run `python producers.py` to start streaming tweets for your keyword into the Kafka topic (a hedged sketch of such a producer is given after this list)
   - Run `python consumers.py` to read the tweets from the topic and store them in MySQL (see the consumer sketch after this list)
   - Check the MySQL database; once enough data has been collected, run the ETL dump with `python dump.py` (see the dump sketch after this list)
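The repository's `producers.py` is not reproduced here, but the following is a minimal sketch of what a keyword producer can look like, assuming tweepy 3.x and kafka-python. The topic name, keyword, and the credentials module from step 2 are illustrative placeholders:

```python
# producer_sketch.py -- a minimal sketch, not the repository's actual producers.py.
import json

import tweepy
from kafka import KafkaProducer

from twitter_credentials import (  # hypothetical module from step 2
    CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET,
)

TOPIC = "tweets"    # must match the topic created with kafka-topics.bat
KEYWORD = "python"  # the keyword you want to track

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)


class KafkaStreamListener(tweepy.StreamListener):
    """Forwards every matching tweet to the Kafka topic as JSON."""

    def on_status(self, status):
        producer.send(TOPIC, {
            "username": status.user.screen_name,
            "tweet": status.text,
            "created_at": str(status.created_at),
        })

    def on_error(self, status_code):
        # Returning False disconnects the stream (e.g. on rate limiting).
        return False


auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)

stream = tweepy.Stream(auth=auth, listener=KafkaStreamListener())
stream.filter(track=[KEYWORD])  # blocks and streams tweets containing the keyword
```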
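Similarly, a minimal consumer sketch, assuming kafka-python and mysql-connector-python and the hypothetical `twitter_db.tweets` table from step 3; again, this is not the repository's actual `consumers.py`:

```python
# consumer_sketch.py -- a minimal sketch of a Kafka-to-MySQL consumer.
import json

import mysql.connector
from kafka import KafkaConsumer

TOPIC = "tweets"  # same topic the producer writes to

consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

conn = mysql.connector.connect(
    host="localhost", user="root", password="", database="twitter_db"
)
cursor = conn.cursor()

for message in consumer:  # blocks, yielding one record per tweet
    tweet = message.value
    cursor.execute(
        "INSERT INTO tweets (username, tweet, created_at) VALUES (%s, %s, %s)",
        (tweet["username"], tweet["tweet"], tweet["created_at"]),
    )
    conn.commit()  # committing per message keeps the example simple
```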
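Finally, a minimal sketch of an ETL dump using pandas, only to illustrate the idea behind `dump.py`; the query, connection details, and output file name are assumptions:

```python
# dump_sketch.py -- a minimal sketch of dumping the collected tweets to CSV.
import mysql.connector
import pandas as pd

conn = mysql.connector.connect(
    host="localhost", user="root", password="", database="twitter_db"
)

# Pull everything collected so far and write it out as a CSV snapshot.
df = pd.read_sql("SELECT * FROM tweets", conn)
df.to_csv("tweets_dump.csv", index=False)

conn.close()
print(f"Dumped {len(df)} rows to tweets_dump.csv")
```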