Spam/Ham SMS detector

A random forest model was trained on a spam/ham SMS dataset from Kaggle and reached 98% accuracy on the held-out test subset of that dataset. The trained model was saved and loaded inside a Flask web app, which was deployed on AWS Elastic Beanstalk. In parallel, a free Twilio account was created and a free mobile number was activated. The deployed app's URL was set as the webhook for that number, so every time a message is sent to the number, the webhook is triggered and the Flask app replies with the prediction. Message logs are stored in AWS DynamoDB. Along with the prediction, users are asked whether it was correct, and their reply is stored in the database as well. As misclassifications accumulate, the model can be retrained on them and redeployed.
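The train-and-save step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the TF-IDF features, the `n_estimators` value, the use of `pickle`, and the six made-up messages are all assumptions. The real project trains on the Kaggle dataset.

```python
import pickle

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Tiny made-up sample for illustration only (assumption, not the Kaggle data).
messages = [
    "WINNER! Claim your free prize now, call this number",
    "Free entry in a weekly cash competition, text WIN",
    "URGENT: your account has won a cash reward",
    "Are we still meeting for lunch today?",
    "I'll call you when I get home",
    "Can you send me the notes from class?",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Text vectorizer followed by a random forest, wrapped in one pipeline so the
# saved object can classify raw strings directly.
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(n_estimators=50, random_state=0),
)
model.fit(messages, labels)

# Serialize the fitted pipeline and restore it, as a web app would at startup.
saved = pickle.dumps(model)
restored = pickle.loads(saved)

prediction = str(restored.predict(["Free prize waiting, call now"])[0])
print(prediction)
```

Keeping the vectorizer and the classifier in a single pipeline means the web app only needs to load one object and call `predict` on the incoming message text.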

To test it

Send an SMS to the deployed number to get a prediction of whether the message is spam or ham. You can then validate the prediction by replying with 'correct' or 'wrong'.

(The Twilio account has been suspended, so the number is no longer live.)
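The message flow described here can be sketched with the standard library alone. This is a hedged illustration, not the deployed app's code: `handle_sms`, `twiml_message`, the reply wording, and the in-memory `log` list (standing in for the DynamoDB table) are hypothetical names. In a Flask route, `body` would come from the `Body` form field that Twilio posts to the webhook, and the returned string would be sent back as the TwiML response.

```python
from xml.etree.ElementTree import Element, SubElement, tostring

FEEDBACK_WORDS = {"correct", "wrong"}


def twiml_message(text):
    """Wrap reply text in a TwiML <Response><Message> document."""
    response = Element("Response")
    SubElement(response, "Message").text = text
    return tostring(response, encoding="unicode")


def handle_sms(body, classify, log):
    """Return the TwiML reply for one incoming SMS body.

    classify: callable mapping message text to "spam" or "ham".
    log: list standing in for the database table (assumption).
    """
    text = body.strip()
    if text.lower() in FEEDBACK_WORDS:
        # The user is evaluating the previous prediction.
        log.append({"type": "feedback", "value": text.lower()})
        return twiml_message("Thanks, your feedback was recorded.")
    label = classify(text)
    log.append({"type": "prediction", "message": text, "label": label})
    return twiml_message(f"Prediction: {label}. Reply 'correct' or 'wrong'.")


# Usage with a stand-in keyword classifier instead of the trained model.
log = []
print(handle_sms("You won a free prize", lambda t: "spam" if "free" in t else "ham", log))
print(handle_sms("wrong", lambda t: "ham", log))
```

Stored feedback entries marked 'wrong' are the ones that would later be collected for retraining.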

Tech stack

Python
Jupyter notebooks
Twilio API
Flask
AWS DynamoDB
AWS Elastic Beanstalk

Reference

https://www.kaggle.com/datasets/uciml/sms-spam-collection-dataset

