
ideal-spork

A toy, starting with Spark Streaming + Kafka (the repo name was randomly generated). Currently, either the bundled Kafka producer terminal or the Producer application will emit a series of messages, which are transmitted over Kafka, picked up by the Consumer, written to Cassandra, and said out loud (if you're on a Mac) with the say command. :p

Getting Started

Building and booting the system

[tbd]

Interacting with the system

The primary means of interaction is through the chat interface, where you can register a user, log in, select another user to chat with, and have a conversation. Obviously, when running this locally, the only other users will be those on your local network, and therefore probably all AIs unless you share the link with other humans. The aim is to have user interactions handled the same whether a user is AI or human.

The only other typical interaction will be to inspect Cassandra tables. To access these, from the Cassandra host do:

$ cqlsh
cqlsh> select * from test.messages;

Notes

Initial start based on info gleaned from:

The above is sufficient for the general kinds of operations available within the Spark ecosystem, including GraphX ops and whatnot.

For a primer on asynchrony, check out this blog post

Models

The Models object contains interfaces for grabbing various kinds of predictions, whether those are run within the app or accessed through API calls to external services (as with models that are tied to another language, such as TensorFlow networks).
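As a rough illustration only (the trait and class names below are hypothetical, not the repo's actual Models object), the interface might look something like this:

import scala.concurrent.Future

// Hypothetical shape of the Models interfaces described above: one common
// trait for predictions, with an in-process implementation and one that
// calls out to an external service (e.g. a TensorFlow model behind HTTP).
trait PredictionModel {
  def predict(input: String): Future[String]
}

class InProcessModel extends PredictionModel {
  // e.g. wrap a CoreNLP pipeline here
  def predict(input: String): Future[String] =
    Future.successful(s"local prediction for: $input")
}

class ExternalServiceModel(endpoint: String) extends PredictionModel {
  // In a real implementation this would be an HTTP call to `endpoint`
  def predict(input: String): Future[String] =
    Future.successful(s"remote prediction from $endpoint for: $input")
}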

Worth looking into:

Interaction

The main idea is chatting with the AI in a chatroom. The AI will be logged into a user session just like a human, and the chat application doesn't have to be aware of the AI (aside from possibly exposing whatever the AI pipe needs in order to know when a message has come in and to read it). This means the chat app can be put together from a tutorial, largely independent of the design of the AI. The current state of the app is just an endpoint hooked up to a Kafka producer, so this concept hasn't been put into practice at all, but that's the plan.

This also could be used to capture human<->human chats as data for training the bot.

Http4s bundles twirl for serving templates, so this should be able to avoid JS (or resort to Scala.js if necessary).
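For reference, a minimal sketch of what the http4s side could look like, assuming a recent http4s (0.23.x) with the ember backend (the repo itself likely uses an older version); the "send" route and msg parameter follow the description under "Running the application" below and are illustrative, not the repo's actual code:

import cats.effect.{IO, IOApp}
import com.comcast.ip4s._
import org.http4s.HttpRoutes
import org.http4s.dsl.io._
import org.http4s.ember.server.EmberServerBuilder
import org.http4s.implicits._

object ChatServer extends IOApp.Simple {
  // Matches ?msg=... on the "send" route
  object MsgParam extends QueryParamDecoderMatcher[String]("msg")

  val routes = HttpRoutes.of[IO] {
    case GET -> Root                         => Ok("ideal-spork chat")
    case GET -> Root / "send" :? MsgParam(m) => Ok(s"received: $m")
  }

  def run: IO[Unit] =
    EmberServerBuilder.default[IO]
      .withHost(ipv4"0.0.0.0")
      .withPort(port"8080")
      .withHttpApp(routes.orNotFound)
      .build
      .useForever
}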

Developing

To get started, you have to have the Stanford CoreNLP models downloaded to the "lib" folder.

Launching Zookeeper

Kafka requires a Zookeeper cluster. It comes bundled with Zookeeper and a starter config, so you can test locally without having to put together a proper cluster. Run:

$KAFKA_HOME/bin/zookeeper-server-start.sh $KAFKA_HOME/config/zookeeper.properties

Launching Kafka

Once Zookeeper is up, launch Kafka:

$KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties

To set up a topic (only has to be done once per topic):

$KAFKA_HOME/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
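The repo's Producer isn't shown here, but as a sketch, a bare-bones producer using the plain kafka-clients API (broker address and topic name matching the commands above) would look roughly like:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object MinimalProducer extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

  val producer = new KafkaProducer[String, String](props)
  // Publish a single message to the "test" topic created above
  producer.send(new ProducerRecord[String, String]("test", "hello from ideal-spork"))
  producer.flush()
  producer.close()
}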

Example of a larger-scale Kafka deploy setup from clairvoyansoft's blog
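On the consuming side, the Consumer reads this topic with Spark Streaming. A minimal direct-stream sketch using the spark-streaming-kafka-0-10 integration (broker, topic, and group id below are illustrative and match the local setup above; this just prints values rather than writing to Cassandra and calling say):

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object StreamingConsumer extends App {
  val conf = new SparkConf().setAppName("ideal-spork-consumer").setMaster("local[*]")
  val ssc  = new StreamingContext(conf, Seconds(1))

  val kafkaParams = Map[String, Object](
    "bootstrap.servers"  -> "localhost:9092",
    "key.deserializer"   -> classOf[StringDeserializer],
    "value.deserializer" -> classOf[StringDeserializer],
    "group.id"           -> "ideal-spork",
    "auto.offset.reset"  -> "latest"
  )

  val stream = KafkaUtils.createDirectStream[String, String](
    ssc,
    LocationStrategies.PreferConsistent,
    ConsumerStrategies.Subscribe[String, String](Seq("test"), kafkaParams)
  )

  // Print each message value as it arrives
  stream.map(_.value).print()

  ssc.start()
  ssc.awaitTermination()
}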

Launching Cassandra

If Cassandra's 'bin' directory is included on your PATH, you can simply run cassandra to start the server (the terminal process will return, but the server will still be up). You can verify this and interact with the database by opening the CQL shell: cqlsh.

Before reading/writing, you have to set up the keyspace and table (this only has to be done once, as long as the keyspace and table don't change):

cqlsh> CREATE KEYSPACE IF NOT EXISTS test WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> CREATE TABLE IF NOT EXISTS test.messages(id UUID, message TEXT, response TEXT, PRIMARY KEY(id));
cqlsh>
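For reference, writing to that table from Spark is typically done with the DataStax spark-cassandra-connector. The following is a sketch under that assumption (not necessarily the repo's exact code), with columns matching the test.messages table created above:

import com.datastax.spark.connector._
import java.util.UUID
import org.apache.spark.{SparkConf, SparkContext}

object CassandraWriteExample extends App {
  // spark.cassandra.connection.host must point at the Cassandra node
  val conf = new SparkConf()
    .setAppName("cassandra-write-example")
    .setMaster("local[*]")
    .set("spark.cassandra.connection.host", "127.0.0.1")
  val sc = new SparkContext(conf)

  // Tuple fields map positionally onto the named columns
  val rows = sc.parallelize(Seq((UUID.randomUUID(), "hello", "hi there")))
  rows.saveToCassandra("test", "messages", SomeColumns("id", "message", "response"))

  sc.stop()
}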

After sending some messages through, view them with:

cqlsh> select * from test.messages;

Notes on Cassandra:

Running the application

Start the Consumer and the Producer, then navigate to localhost:8080. The root page won't display anything at the moment, but you can hit the "send" route and add body text under the msg parameter to send a message along and have the computer say it.
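For a quick way to exercise that route from Scala (assuming msg is passed as a query parameter; the route name and parameter follow the description above, and the real app may expect the message in the request body instead):

import java.net.{URI, URLEncoder}
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.nio.charset.StandardCharsets

object SendMessage extends App {
  // Assumes the app is listening on localhost:8080 as described above
  val msg = URLEncoder.encode("hello from the send route", StandardCharsets.UTF_8)
  val request = HttpRequest.newBuilder()
    .uri(URI.create(s"http://localhost:8080/send?msg=$msg"))
    .GET()
    .build()

  val response = HttpClient.newHttpClient()
    .send(request, HttpResponse.BodyHandlers.ofString())
  println(s"${response.statusCode} ${response.body}")
}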

Copyright (c) 2017 Benjamin Morris
