abhishakvarshney / News-Classification

News-Classification

Problem Statement: News Classification

Large volumes of news articles are received every day from various sources. To simplify downstream processing, one of the first steps is to sort the news into well-defined categories. In this problem, you are given a set of news articles and a set of categories. The task is to develop a model that can predict the category of unseen news articles.

Dataset

There are three files:

  1. categories.csv - This file defines the available categories.
  2. news_details.xlsx - This file contains the news details, such as the short snippet, title and description.
  3. category_mapping.xlsx - This file contains the mapping between news articles and categories.
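
The three files can be combined into a single labelled table before any modelling. The sketch below shows one way to do that with pandas. The column names `news_id`, `category_id` and `category` are assumptions made for illustration; the README does not document the real schema, so adjust them to match the actual files. Reading `.xlsx` files with pandas also requires an Excel engine such as openpyxl.

```python
import pandas as pd


def join_news_with_categories(news, mapping, categories):
    """Attach a category name to every article via the mapping table.

    Assumed (hypothetical) columns:
      news:       news_id, plus text columns such as title/snippet/description
      mapping:    news_id, category_id
      categories: category_id, category
    """
    # Keep only articles that have a mapping row.
    labelled = news.merge(mapping, on="news_id", how="inner")
    # Translate category ids into readable category names.
    return labelled.merge(categories, on="category_id", how="left")


def load_dataset(data_dir="."):
    """Read the three files from data_dir and return one labelled table."""
    categories = pd.read_csv(f"{data_dir}/categories.csv")
    news = pd.read_excel(f"{data_dir}/news_details.xlsx")
    mapping = pd.read_excel(f"{data_dir}/category_mapping.xlsx")
    return join_news_with_categories(news, mapping, categories)
```

If an article can map to more than one category, the inner join produces one row per (article, category) pair, which would need to be handled before training a single-label model.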

Expectation:

You are expected to develop a model to predict the categories of unseen news and summarise the following details:

  • Preprocessing steps performed on the dataset
  • Feature selection
  • Different types of models tested
  • Model tuning techniques used to get the best results
  • Any other comments/experiment details.
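
As a starting point for the items above, here is a minimal baseline sketch using scikit-learn. It is not the method used in this repository; it only illustrates how preprocessing, feature extraction, a model and tuning can be wired together. It assumes each article has exactly one category, uses TF-IDF features with logistic regression, and tunes two hyperparameters by grid search. The function name `build_search` and the parameter grid are illustrative choices.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


def build_search(cv=3):
    """Return a grid search over a TF-IDF + logistic regression pipeline."""
    pipeline = Pipeline([
        # Preprocessing and feature extraction: lowercase the text,
        # drop English stop words, and weight terms by TF-IDF.
        ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
        # A simple linear baseline classifier.
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    # Tuning: try unigrams vs. unigrams+bigrams and several
    # regularisation strengths (illustrative values).
    param_grid = {
        "tfidf__ngram_range": [(1, 1), (1, 2)],
        "clf__C": [0.1, 1.0, 10.0],
    }
    # Macro-averaged F1 treats all categories equally, which helps
    # when some categories are rare.
    return GridSearchCV(pipeline, param_grid, cv=cv, scoring="f1_macro")
```

A typical use would be `search = build_search(); search.fit(texts, labels)`, where `texts` is a list of article strings (for example title and description concatenated) and `labels` the matching category names. `search.best_params_` then reports the selected settings, and `search.predict(new_texts)` classifies unseen articles. Other model families can be compared by swapping the `clf` step.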
