20 Newsgroups Classification

The 20 Newsgroups dataset, a popular benchmark for NLP tasks, contains about 20,000 documents divided into 20 categories. Our task was to classify the documents by their content using three approaches: machine learning, neural networks, and transformers.

The notebook in this folder contains the code, a detailed explanation, and the results we obtained.
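
As a rough illustration of what content-based document classification looks like, here is a minimal sketch of a bag-of-words multinomial Naive Bayes classifier written with the Python standard library only. It is not the code from the notebook: the `NaiveBayes` class, the `tokenize` helper, and the four toy training sentences are all hypothetical stand-ins, and only the two label names are real 20 Newsgroups categories.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; a real pipeline would do more cleaning.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for tok in tokenize(doc):
                if tok in self.vocab:  # skip words never seen in training
                    score += math.log((counts[tok] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy stand-in documents (hypothetical, not taken from the dataset).
train_docs = [
    "the team won the hockey game last night",
    "goalie saves and playoff hockey scores",
    "the rocket launch reached orbit",
    "nasa shuttle orbit and moon mission",
]
train_labels = ["rec.sport.hockey", "rec.sport.hockey", "sci.space", "sci.space"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("hockey playoff game tonight"))
print(model.predict("a new mission to orbit the moon"))
```

On the real dataset the same idea would be applied to all 20 categories, typically with a library implementation and a stronger text representation such as TF-IDF.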

For this project I worked in a group with three colleagues from my university course.

Notebook

