A simple implementation of text classification on a highly unbalanced dataset.
A quick snapshot of what you can expect:
- Text preprocessing in R using the 'tm' library
- Vectorization of the pre-processed text using the Keras text vectorization layer
- Training a neural-network-based classifier using Keras layers
- Creating a document-term matrix from the text and removing sparse terms
- Using the document-term matrix as the feature set for classification
- Training the classifier with a boosted model
- Evaluating the model on the test set at different thresholds
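
To give a feel for the last step, here is a minimal base R sketch of evaluating a classifier at several probability thresholds. The labels and scores are made-up toy values, and `evaluate_at_threshold` is a hypothetical helper name used only for illustration; neither is the data or code used later. On an unbalanced dataset, precision and recall for the rare class are usually more informative than accuracy, so those are what the sketch reports.

```r
# Toy data (assumed for illustration only): 1 marks the rare positive class.
y_true <- c(0, 0, 0, 0, 0, 0, 0, 1, 0, 1)
y_prob <- c(0.05, 0.10, 0.20, 0.35, 0.15, 0.40, 0.60, 0.55, 0.30, 0.90)

# Hypothetical helper: precision, recall and F1 for a single cutoff.
evaluate_at_threshold <- function(y_true, y_prob, threshold) {
  # Predict the positive class when the score reaches the cutoff.
  y_pred <- as.integer(y_prob >= threshold)

  tp <- sum(y_pred == 1 & y_true == 1)  # true positives
  fp <- sum(y_pred == 1 & y_true == 0)  # false positives
  fn <- sum(y_pred == 0 & y_true == 1)  # false negatives

  # Return NA when a metric is undefined (no predicted or no actual positives).
  precision <- if (tp + fp > 0) tp / (tp + fp) else NA_real_
  recall    <- if (tp + fn > 0) tp / (tp + fn) else NA_real_
  f1 <- if (!is.na(precision) && !is.na(recall) && precision + recall > 0) {
    2 * precision * recall / (precision + recall)
  } else {
    NA_real_
  }

  data.frame(threshold = threshold, tp = tp, fp = fp, fn = fn,
             precision = precision, recall = recall, f1 = f1)
}

# Evaluate the same scores at a few cutoffs and stack the rows.
thresholds <- c(0.3, 0.5, 0.7)
results <- do.call(
  rbind,
  lapply(thresholds, function(t) evaluate_at_threshold(y_true, y_prob, t))
)
print(results)
```

Raising the threshold in this toy example trades recall for precision, which is the trade-off the threshold comparison is meant to expose.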