Classifying Fake News Using NLP and ML

A formal report written in R that uses Natural Language Processing and Machine Learning to classify news article claims as either true or false.

This project used a dataset of 1,911 unique PolitiFact claims and their associated truth ratings. Features were extracted from each claim using a bag-of-n-grams model with tf-idf as the scoring metric. The vocabulary size was first reduced using lemmatization and stop word removal, among other text cleaning procedures.
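
As a rough illustration of the feature extraction step, the sketch below builds unigram and bigram counts for a few toy claims and weights them with tf-idf, using base R only. The example claims, the tiny stop word list, and the helper names `tokenize` and `ngrams` are hypothetical and not taken from the report. Lemmatization is omitted, and the idf formula shown (log of documents over document frequency) is one common variant that may differ from the weighting the report uses.

```r
# Toy claims standing in for the PolitiFact data (hypothetical examples).
claims <- c(
  "The unemployment rate fell last year",
  "The unemployment rate doubled last year",
  "Taxes fell for most families"
)

# A tiny illustrative stop word list; a real one would be much longer.
stop_words <- c("the", "for", "a", "an", "of")

# Lowercase, strip non-letters, split on spaces, drop stop words.
tokenize <- function(text) {
  words <- strsplit(gsub("[^a-z ]", "", tolower(text)), " +")[[1]]
  words[nzchar(words) & !(words %in% stop_words)]
}

# Build all n-grams of length 1..max_n from a token vector.
ngrams <- function(tokens, max_n = 2) {
  out <- character(0)
  for (n in seq_len(max_n)) {
    if (length(tokens) >= n) {
      starts <- seq_len(length(tokens) - n + 1)
      grams <- vapply(
        starts,
        function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
        character(1)
      )
      out <- c(out, grams)
    }
  }
  out
}

docs <- lapply(claims, function(x) ngrams(tokenize(x)))
vocab <- sort(unique(unlist(docs)))

# Term counts: one row per claim, one column per n-gram.
counts <- t(vapply(
  docs,
  function(d) as.numeric(table(factor(d, levels = vocab))),
  numeric(length(vocab))
))
colnames(counts) <- vocab

# tf: share of each n-gram within its claim.
# idf: log of (number of claims / number of claims containing the n-gram).
tf <- counts / rowSums(counts)
idf <- log(nrow(counts) / colSums(counts > 0))
tfidf <- sweep(tf, 2, idf, `*`)
```

Each row of `tfidf` is the feature vector for one claim. N-grams that appear in fewer claims receive a larger idf weight, so distinctive terms score higher than common ones.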

Seven machine learning classification models (including a random forest, a multilayer perceptron, and a recurrent neural network) were fit and a maximum classification accuracy of 71% was achieved.
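
For readers unfamiliar with how a classification accuracy figure like this is typically obtained, here is a minimal sketch of a train/test split and accuracy calculation in base R. The data are synthetic, and logistic regression is used only because it ships with base R; it is an assumption for illustration and may not be one of the seven models in the report.

```r
set.seed(1)

# Synthetic stand-in data: 200 "claims" with two numeric features
# (in the project these would be tf-idf scores) and a true/false label.
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
label <- as.integer(x1 + 0.5 * x2 + rnorm(n) > 0)
data <- data.frame(x1, x2, label)

# Hold out 25% of the rows for testing.
test_idx <- sample(n, size = n / 4)
train <- data[-test_idx, ]
test <- data[test_idx, ]

# Logistic regression as a simple baseline classifier (illustrative only).
model <- glm(label ~ x1 + x2, data = train, family = binomial)

# Predict probabilities, threshold at 0.5, and score accuracy
# as the share of held-out rows classified correctly.
prob <- predict(model, newdata = test, type = "response")
pred <- as.integer(prob > 0.5)
accuracy <- mean(pred == test$label)
```

Accuracy here is measured only on rows the model did not see during fitting, which is the usual way to estimate how a classifier performs on new claims.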

Viewing the Project

The project can be viewed either as a formal PDF here or as a Bookdown website here.

License

MIT License
Newspaper icon by Icons8

Project website: https://oliver-be.ml/fake-news-nlp/