- Installing mongodb in Ubuntu 20.04 x64
$ sudo dpkg -i mongodb-org-server_4.4.0_amd64.deb
- As a personal preference, add these settings to .bashrc: mongo-status, mongo-start, mongo-stop, mongo-restart
export APPLICATIONS_HOME="${HOME}/Documents/_applications"
export MONGODB_HOME="${APPLICATIONS_HOME}/mongodb-linux-x86_64-ubuntu1804-4.2.0"
export MONGODB_BIN="${MONGODB_HOME}/bin"
alias mongo-restart='sudo systemctl restart mongodb'
alias mongo-start='sudo systemctl start mongodb'
alias mongo-stop='sudo systemctl stop mongodb'
alias mongo-status='sudo systemctl status mongodb'
alias mongo-enable='sudo systemctl enable mongodb'
alias mongo-disable='sudo systemctl disable mongodb'
- Configure mongodb to start automatically with the server
$ sudo cp mongodb.service /etc/systemd/system
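The mongodb.service file itself isn't reproduced in these notes; a minimal sketch of what such a unit might look like (the mongod path, config file, and user are assumptions — adjust to the actual install):

```ini
[Unit]
Description=MongoDB Database Server
After=network.target

[Service]
Type=simple
# Assumption: mongod from the .deb lives at /usr/bin/mongod and reads /etc/mongod.conf;
# point ExecStart at $MONGODB_BIN/mongod instead if running the tarball build.
User=mongodb
ExecStart=/usr/bin/mongod --config /etc/mongod.conf
Restart=on-failure

[Install]
WantedBy=multi-user.target
```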
- Reload systemd after copying the mongodb service
$ sudo systemctl daemon-reload
- Enable auto-start of the mongodb service
$ mongo-enable
- Start/restart the mongodb service
$ mongo-start
$ mongo-restart
- Check the mongodb version
$ mongod --version
> db.version() // from within the mongo shell
- Issue installing MongoDB in Ubuntu 18.04
- Installing mongodb UI
- Sample MongoDB data sets
- How to connect Kafka to MongoDB Source
- Topic name must be in the form logicalName.databaseName.collectionName
- Kafka Debezium Connector for MongoDB
- So in essence, create the topic name as mongoConn.sampleGioDB.books
- mongodb.name = mongoConn (see connect-mongodb-source.properties)
- db name is sampleGioDB (in Mongo)
- collection name is books (in Mongo)
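The topic name is just those three parts joined with dots; a quick sketch using the names from these notes:

```shell
# Debezium topic naming: <mongodb.name>.<database>.<collection>
logical_name="mongoConn"     # mongodb.name from connect-mongodb-source.properties
database="sampleGioDB"
collection="books"
topic="${logical_name}.${database}.${collection}"
echo "${topic}"              # prints mongoConn.sampleGioDB.books
```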
$ kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic mongoConn.sampleGioDB.books
$ kafka-topics --list --zookeeper localhost:2181
$ cd $CONFLUENT_HOME
$ bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka/connect-mongodb-source.properties ### This worked, so use this
$ bin/connect-distributed etc/schema-registry/connect-avro-distributed.properties etc/kafka/connect-mongodb-source.properties ### Doesn't work; cause still to be investigated
$ kafka-console-consumer --bootstrap-server localhost:9092 --topic mongoConn.sampleGioDB.books --from-beginning
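connect-mongodb-source.properties is referenced above but not shown; a sketch of what it might contain for the Debezium MongoDB connector (the replica-set name rs0 and host are assumptions; property names follow the Debezium 1.2 line):

```properties
name=mongodb-source-connector
connector.class=io.debezium.connector.mongodb.MongoDbConnector
tasks.max=1
# Logical name; becomes the topic prefix: mongoConn.sampleGioDB.books
mongodb.name=mongoConn
# Assumption: Debezium requires Mongo to run as a replica set (see the section below)
mongodb.hosts=rs0/localhost:27017
database.whitelist=sampleGioDB
collection.whitelist=sampleGioDB.books
```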
Setting up Mongo to run as replicated instances, i.e. a replica set (ideal when Mongo is to be used with the Kafka connector)
- copy the replicated mongo configurations/services
$ sudo cp replicated_mongodb*.conf /etc
$ sudo cp replicated_mongodb*.service /etc/systemd/system
$ sudo systemctl daemon-reload
- create DB directories (see replicated_mongodb${n}.conf)
$ mkdir -p $HOME/Documents/_mongoDBData/db1
$ mkdir -p $HOME/Documents/_mongoDBData/db2
$ mkdir -p $HOME/Documents/_mongoDBData/db3
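The replicated_mongodb${n}.conf files aren't shown in these notes; a sketch of what replicated_mongodb1.conf might contain (the replica-set name rs0, and ports 27017-27019 spread across the three files, are assumptions):

```yaml
# replicated_mongodb1.conf -- member 1 of the replica set
storage:
  dbPath: /home/<user>/Documents/_mongoDBData/db1
net:
  port: 27017
  bindIp: 127.0.0.1
replication:
  replSetName: rs0
systemLog:
  destination: file
  path: /home/<user>/Documents/_mongoDBData/db1/mongod.log
  logAppend: true
```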
- As a personal preference, add these settings to .bashrc: mongodb-status, mongodb-start, mongodb-stop, mongodb-restart
alias mongodb-restart='sudo systemctl restart replicated_mongodb1.service replicated_mongodb2.service replicated_mongodb3.service'
alias mongodb-start='sudo systemctl start replicated_mongodb1.service replicated_mongodb2.service replicated_mongodb3.service'
alias mongodb-stop='sudo systemctl stop replicated_mongodb1.service replicated_mongodb2.service replicated_mongodb3.service'
alias mongodb-status='sudo systemctl status replicated_mongodb1.service replicated_mongodb2.service replicated_mongodb3.service'
alias mongodb-enable='sudo systemctl enable replicated_mongodb1.service replicated_mongodb2.service replicated_mongodb3.service'
alias mongodb-disable='sudo systemctl disable replicated_mongodb1.service replicated_mongodb2.service replicated_mongodb3.service'
- Open 3 terminal tabs and run each command to get the host names that will be used for primary/secondary nodes
$ mongo --port 27017 # open in terminal tab 1
$ mongo --port 27018 # open in terminal tab 2
$ mongo --port 27019 # open in terminal tab 3
> db.serverStatus() // find the `host` of the node from each `mongo --port 2701n`
> db.help()
$ mongo --port 27017 # this is going to be the primary node
> rs.status()
> rs.initiate()
> rs.add("127.0.0.1:27018") // OR rs.add("gio-Satellite-P70-A:27018") -- based on the host from db.serverStatus()
> rs.add("127.0.0.1:27019") // OR rs.add("gio-Satellite-P70-A:27019") -- based on the host from db.serverStatus()
> rs.status()
$ mongo --port 27017 # reconnect to the primary node to adjust member priorities
> cfg = rs.conf()
> cfg.members[0].priority = 1
> cfg.members[1].priority = 0.5
> cfg.members[2].priority = 0.5
> rs.reconfig(cfg)
- OR, for a single replicated instance only, copy that one replicated mongo configuration/service
$ sudo cp replicated_mongodb1.conf /etc
$ sudo cp replicated_mongodb1.service /etc/systemd/system
$ sudo systemctl daemon-reload
- create DB directories (see replicated_mongodb1.conf)
$ mkdir -p $HOME/Documents/_mongoDBData/db1
- As a personal preference, add these settings to .bashrc: mongodb-status, mongodb-start, mongodb-stop, mongodb-restart
alias mongodb-restart='sudo systemctl restart replicated_mongodb1.service'
alias mongodb-start='sudo systemctl start replicated_mongodb1.service'
alias mongodb-stop='sudo systemctl stop replicated_mongodb1.service'
alias mongodb-status='sudo systemctl status replicated_mongodb1.service'
alias mongodb-enable='sudo systemctl enable replicated_mongodb1.service'
alias mongodb-disable='sudo systemctl disable replicated_mongodb1.service'
- Kafka connect plugin install
- How to install connector plugins in Kafka Connect
- Install the needed Kafka Connectors from Confluent Kafka Connectors Hub. e.g.
$ confluent-hub install confluentinc/kafka-connect-elasticsearch:5.5.1
$ confluent-hub install debezium/debezium-connector-mongodb:1.2.1
$ confluent-hub install debezium/debezium-connector-mysql:1.2.1
$ confluent-hub install mongodb/kafka-connect-mongodb:1.2.0
$ confluent-hub install jcustenborder/kafka-connect-redis:0.0.2.11
$ confluent-hub install hpgrahsl/kafka-connect-mongodb:1.4.0
- Create a plugins directory under $CONFLUENT_HOME, then create symlinks to the lib directories where the .jar files are
$ cd $CONFLUENT_HOME
$ mkdir plugins
$ cd plugins
$ ln -s ../share/confluent-hub-components/confluentinc-kafka-connect-elasticsearch/lib elasticsearch
$ ln -s ../share/confluent-hub-components/debezium-debezium-connector-mongodb/lib debezium-mongodb
$ ln -s ../share/confluent-hub-components/debezium-debezium-connector-mysql/lib debezium-mysql
$ ln -s ../share/confluent-hub-components/mongodb-kafka-connect-mongodb/lib mongodb
$ ln -s ../share/confluent-hub-components/jcustenborder-kafka-connect-redis/lib redis
- Add the plugins path in the following files:
  - $CONFLUENT_HOME/etc/kafka/connect-distributed.properties
  - $CONFLUENT_HOME/etc/kafka/connect-standalone.properties
  - $CONFLUENT_HOME/etc/schema-registry/connect-avro-distributed.properties
  - $CONFLUENT_HOME/etc/schema-registry/connect-avro-standalone.properties
plugin.path=$HOME/Documents/_applications/confluent-5.5.1/share/java,$HOME/Documents/_applications/confluent-5.5.1/share/confluent-hub-components,$HOME/Documents/_applications/confluent-5.5.1/plugins
- The above plugins' .jar files should work, but if they don't get added to the classpath, then manually add them via $CLASSPATH in .bashrc, e.g.
export CLASSPATH="$HOME/Documents/_applications/confluent-5.5.1/share/confluent-hub-components/debezium-debezium-connector-mongodb/*"
- To verify that Mongo is connected to Kafka and the event stream is flowing
- use Trifecta UI localhost
- OR use Confluent KSql
- KSql Quick Reference
- KSql Create Stream
- KSql Print Kafka Topic's Content
- alias confluentKSqlStart='cd ${CONFLUENT_HOME}; bin/ksql-server-start etc/ksqldb/ksql-server.properties'
- use
$ trifectaStart
$ confluentKSqlStart
$ ksql
ksql> CREATE STREAM sampleGioBooks (id VARCHAR) WITH (kafka_topic='mongoConn.sampleGioDB.books', value_format='JSON');
ksql> describe extended sampleGioBooks;
ksql> print 'mongoConn.sampleGioDB.books' from beginning;