atul6876 / NYT_Topic_Modeling

Topic modeling on New York Times articles from April 2017 (text used: headline and snippet)

NYT_Topic_Modeling

I'm using each article's headline and snippet, combined into one text body, to discover the topics being discussed in these articles. For this I'm using the widely used LDA (Latent Dirichlet Allocation) model. As with most NLP tasks, the majority of the time goes into cleaning the data, and this project is no different. So far, I've uploaded code for basic preprocessing, stopword removal, some simple visualizations of word frequencies, POS tagging, and lemmatization.
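As a rough illustration of the first steps described above (combining headline and snippet, basic cleaning, stopword removal, and word-frequency counting), here is a minimal standard-library sketch. It is not the notebook's actual code: the sample articles are made up, the stopword list is a tiny hypothetical stand-in for a fuller list, and the function names are assumptions. POS tagging, lemmatization, and the LDA step itself are left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real run would likely
# use a fuller list from an NLP library.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "and",
             "for", "is", "at", "as", "by", "with"}


def combine_text(headline, snippet):
    """Join headline and snippet into a single text body."""
    return f"{headline} {snippet}".strip()


def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stopwords and very short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 2]


def word_frequencies(docs):
    """Count token occurrences across all documents after preprocessing."""
    counts = Counter()
    for doc in docs:
        counts.update(preprocess(doc))
    return counts


# Made-up sample records standing in for the April 2017 article data.
articles = [
    {"headline": "Senate Votes on the Budget",
     "snippet": "The budget vote is expected in the Senate."},
    {"headline": "Rain Expected in the City",
     "snippet": "Forecasters expect rain for the weekend."},
]

docs = [combine_text(a["headline"], a["snippet"]) for a in articles]
freqs = word_frequencies(docs)
print(freqs.most_common(5))
```

The token lists returned by `preprocess` are the kind of input a bag-of-words representation, and then an LDA model, would typically be built from.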

Languages

Language:Jupyter Notebook 100.0%