| -rw-r--r-- | README.md | 3 |
| -rw-r--r-- | README.txt | 30 |
2 files changed, 30 insertions, 3 deletions
diff --git a/README.md b/README.md
deleted file mode 100644
index 7719109..0000000
--- a/README.md
+++ /dev/null
@@ -1,3 +0,0 @@
-## EAS 504 Project
-
-To be updated soon...
diff --git a/README.txt b/README.txt
new file mode 100644
index 0000000..25c57ce
--- /dev/null
+++ b/README.txt
@@ -0,0 +1,30 @@
+EAS 4/587 -- Data Intensive Computing
+
+1. Motivation
+
+1.1. Reddit is a social media website where users can post links to
+articles, images, videos, etc. and other users can comment on them.
+Authors of the posts, generally look to drive maximum engagement from
+their posts.
+
+1.2. Unlike other social media websites, Reddit has a unique feature of
+upvoting and downvoting the posts. Also, since the posts are publicly
+visible, factors like the time of posting, the number of upvotes, the
+number of comments, etc. matter a lot.
+
+1.3. Since there are a lot of posts being made every minute, significant
+posts can get lost in the crowd. Also, the posts that are made at a
+particular time of the day, may not be visible to the users who are
+active at a different time of the day.
+
+2. Problem Statement
+
+2.1 Fetch the data using the Reddit Developer API from different
+programming related subreddits (communities). Since, there are a lot
+of subreddits on Reddit; we will keep the scope of the project
+limited.
+
+2.2 Analyze the data and find relevant insights after cleaning and
+preprocessing the data.
+
+Further report is available in reports folder.
