Siddhantmest / Reddit-Post-Popularity

Predicting the popularity of Reddit posts, using NLP techniques

Reddit Post Popularity

The aim of this group project is to predict the popularity of Reddit posts.

Steps followed

1. Import required libraries and create a dataframe of the JSON file
2. Perform data cleaning, text preprocessing, and feature engineering
3. Translate non-English post titles with Google Translate, then perform sentiment analysis on them
4. Run a correlation analysis on the cleaned data to derive insights
5. Train a model on the training data and evaluate its performance on the test data
6. Verify the results against the insights from the correlation analysis
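
As a rough illustration of steps 1, 2, and 4, the sketch below loads a few posts from JSON, cleans the titles, derives simple features, and computes a correlation against the score. It is a minimal sketch under assumptions, not the notebook's code: the field names `title` and `score`, the sample records, and the helpers `clean_title`, `build_features`, and `pearson` are all hypothetical, and it uses plain Python lists where the notebook uses a dataframe. Translation and sentiment analysis (step 3) are left out because they depend on external services.

```python
import json
import math
import re

# Hypothetical sample records; the real dataset's field names are not
# stated in this README, so "title" and "score" are assumptions.
RAW_JSON = """
[
  {"title": "Check out my NEW puppy!!!", "score": 120},
  {"title": "why is the sky blue?", "score": 45},
  {"title": "  Big news: Python 4 announced  ", "score": 300},
  {"title": "meh", "score": 2}
]
"""


def clean_title(title):
    """Lowercase, replace non-alphanumeric characters, collapse whitespace."""
    text = title.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def build_features(post):
    """Derive simple numeric features from one post (illustrative only)."""
    cleaned = clean_title(post["title"])
    return {
        "clean_title": cleaned,
        "word_count": len(cleaned.split()),
        "char_count": len(cleaned),
        "score": post["score"],
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


posts = json.loads(RAW_JSON)
rows = [build_features(p) for p in posts]

word_counts = [r["word_count"] for r in rows]
scores = [r["score"] for r in rows]
r_value = pearson(word_counts, scores)
print(f"word_count vs score: r = {r_value:.3f}")
```

With real data, the same correlation would be computed for each engineered feature to see which ones relate to popularity.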

The notebook Reddit Post Popularity.ipynb cleans the data, and the notebook Reddit Post Popularity - modeling.ipynb builds a model on the cleaned data.
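
The train-and-evaluate step in the modeling notebook might look roughly like the sketch below. This README does not say which model or metric the project uses, so the one-feature least-squares line, the mean absolute error metric, the sample rows, and the helper names `split_rows`, `fit_line`, and `mean_absolute_error` are all assumptions chosen to keep the example dependency-free.

```python
import random

# Hypothetical cleaned rows as (word_count, score) pairs; not real project data.
cleaned_rows = [
    (3, 30), (5, 52), (8, 79), (2, 21), (10, 101),
    (6, 58), (4, 41), (7, 72), (9, 88), (1, 12),
]


def split_rows(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(rows):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(rows, slope, intercept):
    """Average absolute gap between predicted and actual scores."""
    return sum(abs(y - (slope * x + intercept)) for x, y in rows) / len(rows)


train, test = split_rows(cleaned_rows)
slope, intercept = fit_line(train)
mae = mean_absolute_error(test, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} test MAE={mae:.2f}")
```

Evaluating only on the held-out rows keeps the reported error from being flattered by data the model has already seen.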

Languages

Language: Jupyter Notebook 100.0%