jass228 / TopicModeling

TopicModeling

Description

The goal of this project is to study the performance of topic modeling algorithms on the News Category Dataset.

Project Plan

  • Analyse the text corpus (mean document size, types of words used, stopwords, most common words, etc.)
  • Select 3 methodologies of Topic Modeling/Clustering for our problem
  • Define one or multiple metrics to measure the quality of our models
  • Run a comparative test across the models
  • Conclude on the best methodology to use in our case and identify areas for improvement in our analysis
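
As an illustration of the first step of the plan, the corpus analysis could start from something like the sketch below. It is only a minimal example under stated assumptions, not the code used in the notebooks: the function names (`tokenize`, `corpus_stats`), the tiny stopword list, and the three sample headlines are all hypothetical, and a real run would load the News Category Dataset and use a fuller stopword list.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real analysis
# would use a much fuller list).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "on", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def corpus_stats(docs, top_n=3):
    """Compute simple descriptive statistics for a list of documents."""
    tokenized = [tokenize(doc) for doc in docs]
    lengths = [len(tokens) for tokens in tokenized]
    all_tokens = [tok for tokens in tokenized for tok in tokens]
    content_tokens = [tok for tok in all_tokens if tok not in STOPWORDS]
    n_stopwords = len(all_tokens) - len(content_tokens)
    return {
        # Mean number of tokens per document
        "mean_length": sum(lengths) / len(lengths) if lengths else 0.0,
        # Share of all tokens that are stopwords
        "stopword_ratio": n_stopwords / len(all_tokens) if all_tokens else 0.0,
        # Most frequent words once stopwords are removed
        "most_common": Counter(content_tokens).most_common(top_n),
    }


# Hypothetical headlines standing in for the real dataset
sample_docs = [
    "The senate votes on the new budget",
    "New budget cuts hit schools",
    "Schools reopen in the fall",
]

stats = corpus_stats(sample_docs)
print(stats)
```

The same statistics can then be compared per news category to see whether document length or vocabulary differs enough to affect the topic models.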

Languages

Language:Jupyter Notebook 100.0%