Udrasht / Multinomial-Naive-Bayes-from-Scratch

Classify whether a message is spam or not using Multinomial Naive Bayes.

SMAI-Mini-Project-1

Classify whether a message is spam or not using Multinomial Naive Bayes.

Introduction

This question has you work and experiment with the Multinomial Naïve Bayes classifier. First, you will transform the data in the given CSV file into a count matrix and calculate the class priors. Then compute the likelihoods according to Multinomial Naive Bayes and combine them with the priors to classify the test data. Please note that sklearn implementations may be used only for the final question of the assignment.
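The steps above can be sketched as follows. This is a minimal illustration, not the assignment's reference solution. It assumes a count matrix `X` of shape (documents, vocabulary) and an array of class labels `y`. The function names `fit_multinomial_nb` and `predict_multinomial_nb` and the Laplace smoothing constant `alpha` are assumptions introduced here and do not come from the original notebook.

```python
import numpy as np

def fit_multinomial_nb(X, y, alpha=1.0):
    """Estimate log priors and log likelihoods from a count matrix.

    X: (n_docs, n_words) array of word counts.
    y: (n_docs,) array of class labels.
    alpha: assumed Laplace smoothing constant (not from the source).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    log_prior = np.empty(len(classes))
    log_likelihood = np.empty((len(classes), X.shape[1]))
    for i, c in enumerate(classes):
        X_c = X[y == c]
        # Prior: fraction of training documents that belong to class c.
        log_prior[i] = np.log(X_c.shape[0] / X.shape[0])
        # Likelihood: smoothed share of each word among all words in class c.
        word_counts = X_c.sum(axis=0) + alpha
        log_likelihood[i] = np.log(word_counts / word_counts.sum())
    return classes, log_prior, log_likelihood

def predict_multinomial_nb(X, classes, log_prior, log_likelihood):
    """Return the class with the highest log posterior for each row of X."""
    # Score = log prior + sum over words of count * log likelihood.
    scores = np.asarray(X, dtype=float) @ log_likelihood.T + log_prior
    return classes[np.argmax(scores, axis=1)]
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing term keeps a word that never appears in one class from forcing that class's probability to zero.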

The dataset consists of spam SMS messages. Each sample has one attribute, the message text, and a class label that is either spam or ham. The data is in spam.csv and contains roughly 5,000 to 6,000 samples. For your convenience, the data is already pre-processed and loaded, but I suggest you take a look at that code for your own knowledge. The vectorization part is left up to you and can be done easily with the help of the given example code.
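One possible way to turn messages into a count matrix without sklearn is sketched below. It assumes simple lowercase whitespace tokenization, which may differ from the example code shipped with the notebook. The helper names `build_vocabulary` and `to_count_matrix` are hypothetical.

```python
import numpy as np

def build_vocabulary(messages):
    """Map each distinct token to a column index, in order of first appearance."""
    vocab = {}
    for msg in messages:
        # Assumed tokenization: lowercase, split on whitespace.
        for token in msg.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab

def to_count_matrix(messages, vocab):
    """Return an (n_messages, n_words) matrix of token counts."""
    counts = np.zeros((len(messages), len(vocab)), dtype=int)
    for row, msg in enumerate(messages):
        for token in msg.lower().split():
            col = vocab.get(token)
            # Tokens unseen when the vocabulary was built are ignored.
            if col is not None:
                counts[row, col] += 1
    return counts
```

Building the vocabulary from the training messages only, and reusing it for the test messages, keeps the columns of both matrices aligned.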

Languages

Language: Jupyter Notebook 100.0%