awzucker / world_news_nlp

An NLP project in Python using SpaCy, NLTK, and scikit-learn to predict positive user engagement (measured in "upvotes") with posts from a sample online "world news" message board.

World News NLP Project

Problem

Using the industry-standard NLP libraries SpaCy, NLTK, and scikit-learn, this study examines which keywords in a post title most positively affect user engagement. The exploratory data analysis and visualizations in the accompanying notebook also factor in other features of the supplied data, including the author, time, and date of each post. For the purposes of this study, positive user engagement is measured in upvotes.


Datasets Used

  • world_news_posts.csv: Supplied dataset with roughly 500,000 titles of posts on a "world news" message board, including the date, time, and author of each post, along with user interaction data.
  • world_news_posts_az.csv: Cleaned version of the original world_news_posts dataset with additional engineered features.
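The README does not list which features were engineered for the cleaned file. The sketch below shows one plausible way to derive time-based and title-based features with pandas. The column names (`date_created`, `time_created`, `title`, `author`, `up_votes`), the derived feature names, and the helper `engineer_features` are hypothetical and may not match the actual CSV.

```python
import pandas as pd

# Tiny in-memory stand-in for the raw file (hypothetical column names).
# With the real data this would be: raw = pd.read_csv("world_news_posts.csv")
raw = pd.DataFrame(
    {
        "date_created": ["2008-01-25", "2008-01-26"],
        "time_created": ["03:34:06", "15:10:00"],
        "title": [
            "Earthquake strikes coastal city",
            "Parliament debates new budget bill",
        ],
        "author": ["user_a", "user_b"],
        "up_votes": [950, 12],
    }
)


def engineer_features(df):
    """Return a copy of df with a few illustrative engineered columns."""
    out = df.copy()
    # Combine the separate date and time strings into one timestamp.
    posted_at = pd.to_datetime(out["date_created"] + " " + out["time_created"])
    out["post_hour"] = posted_at.dt.hour
    out["post_weekday"] = posted_at.dt.dayofweek  # Monday=0 ... Sunday=6
    # Simple text-length feature: number of whitespace-separated words.
    out["title_word_count"] = out["title"].str.split().str.len()
    return out


cleaned = engineer_features(raw)
print(cleaned[["post_hour", "post_weekday", "title_word_count"]])
```

Features like these could then be written out with `cleaned.to_csv(...)` to produce a cleaned file alongside the original.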

Data Dictionary

Feature | Type | Dataset | Description

Analysis Summary


Conclusions & Considerations


Sources Cited:


Languages

Language: Jupyter Notebook 100.0%