author Priyansh <[email protected]> 2021-12-25 02:45:48 -0500
committer Priyansh <[email protected]> 2021-12-25 02:45:48 -0500
commit 6723f72262da53121227121bd099adad8d077ae2
tree 104520677e93a9e54b6ea963bce5a926c2b5c94a
parent 5ef3baaf6575a4ad3b4e37897637d87af1af418a
update readme
-rw-r--r-- README.md 11
1 files changed, 10 insertions, 1 deletions
diff --git a/README.md b/README.md
index 50d97f6..262edee 100644
--- a/README.md
+++ b/README.md
@@ -4,4 +4,13 @@ bin/zookeeper-server-start.sh config/zookeeper.properties
# Kafka Server
bin/kafka-server-start.sh config/server.properties
# Kafka Topic creation
-bin/kafka-topics.sh --create --partitions 2 --replication-factor 1 --topic twitterdata --bootstrap-server localhost:9092
\ No newline at end of file
+bin/kafka-topics.sh --create --partitions 2 --replication-factor 1 --topic twitterdata --bootstrap-server localhost:9092
+
+## Goals
+
+[x] Create a Streaming application that reads from a Kafka topic and writes to a Kafka topic.
+[x] Read tweets from the Kafka topic and write to a Cassandra table.
+[ ] Run a Sentiment Analysis on the fetched static data from the Cassandra database. Use Spacy library.
+[ ] Run the Sentiment Analysis on the dynamically fetched data from the Kafka streaming tweets and delete the data from the Kafka topic as soon as the data is consumed and written to the Cassandra database.
+[ ] Extend this system to multiple producers and consumers.
+[ ] Try extending this to PubSub model.
\ No newline at end of file
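
Note on the topic-creation line in this hunk: if the topic ever needs to be created from a script rather than by hand, the same command can be assembled from parameters. The following is a minimal sketch, not part of this commit. The function name `build_topic_command` and its defaults are hypothetical; the defaults simply mirror the flags already in the README.

```python
import shlex


def build_topic_command(topic, partitions=2, replication_factor=1,
                        bootstrap_server="localhost:9092"):
    """Return the kafka-topics.sh argument list for creating a topic.

    Hypothetical helper: the defaults copy the values used in the README.
    """
    return [
        "bin/kafka-topics.sh", "--create",
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
        "--topic", topic,
        "--bootstrap-server", bootstrap_server,
    ]


command = build_topic_command("twitterdata")
# Render the argument list as a single shell-quoted command line.
print(shlex.join(command))
```

To actually run it, the list could be passed to `subprocess.run(command, check=True)` from the Kafka installation directory, assuming the broker from the earlier README steps is already listening on `localhost:9092`.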