I use the TensorFlow framework to build a CharCNN model that detects sarcasm in English text. You can classify other characteristics by providing your own data in the data folder.
The architecture comes from the paper Character-level Convolutional Networks for Text Classification.
I use a dataset supplied by the Google API. You can find it here.
The model uses 1D convolutions to extract features from the text, then passes those features through fully connected layers to classify it.
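To make the idea of convolving over characters concrete, here is a minimal pure-Python sketch. It is not the TensorFlow code in this repository: the alphabet, the function names (`one_hot_encode`, `conv1d`, `max_pool1d`) and the hand-written filter are all assumptions chosen for illustration. It shows a single filter sliding over one-hot encoded characters, followed by ReLU and max pooling.

```python
# Toy illustration only; the real model uses TensorFlow layers with learned weights.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # assumed, simplified alphabet


def one_hot_encode(text, alphabet=ALPHABET):
    """Map each character to a one-hot vector; unknown characters become all zeros."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    rows = []
    for ch in text.lower():
        row = [0.0] * len(alphabet)
        if ch in index:
            row[index[ch]] = 1.0
        rows.append(row)
    return rows


def conv1d(sequence, kernel, bias=0.0):
    """Apply one filter over the sequence (no padding, stride 1), then ReLU.

    sequence: list of T vectors of size C
    kernel:   list of K vectors of size C
    """
    k = len(kernel)
    outputs = []
    for start in range(len(sequence) - k + 1):
        total = bias
        for offset in range(k):
            total += sum(x * w for x, w in zip(sequence[start + offset], kernel[offset]))
        outputs.append(max(0.0, total))  # ReLU
    return outputs


def max_pool1d(feature_map, size):
    """Non-overlapping max pooling over windows of the given size."""
    return [max(feature_map[i:i + size])
            for i in range(0, len(feature_map) - size + 1, size)]


# A hand-made filter that responds to the character pair "ab".
features = conv1d(one_hot_encode("cabab"), one_hot_encode("ab"))
pooled = max_pool1d(features, 2)
```

In the trained model the filter weights are learned rather than written by hand, many filters run in parallel, and the pooled features are flattened and fed to the fully connected layers.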
Both Small CharCNN and Large CharCNN need only 5 epochs to reach nearly 85% accuracy, but Large CharCNN takes much longer to train because of its high number of parameters.
git clone https://github.com/hoangcaobao/CharCNN.git
cd CharCNN
pip install -r requirements.txt
Go to the data folder and put your data there as a JSON file. I have already put the sarcasm data in this folder.
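If you bring your own data, a loader along these lines may help you check the file before training. This is only a sketch under assumptions: it assumes the JSON file is a list of records, and the key names `headline` and `is_sarcastic` are guesses that you should replace with the keys your file actually uses. The helper `load_examples` is hypothetical and not part of this repository.

```python
import json


def load_examples(path, text_key="headline", label_key="is_sarcastic"):
    """Read a JSON list of records and return (texts, labels).

    text_key and label_key are assumed names; adjust them to your data.
    """
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    texts = [record[text_key] for record in records]
    labels = [int(record[label_key]) for record in records]
    return texts, labels
```

For example, `texts, labels = load_examples("data/sarcasm.json")` would return two lists of equal length if the file matches the assumed layout, and raise a `KeyError` if the keys differ.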
Training takes a long time, but it is required to obtain the model weights before the next step.
python3 train.py --data-path ${} --epochs ${} --num-classes ${}
For example:
python3 train.py --data-path data/sarcasm.json --epochs 5 --num-classes 2
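For readers who want to see how flags like these are typically read in Python, here is a minimal `argparse` sketch. It is not the actual content of train.py: the function name `build_parser`, the help texts and the default values are assumptions, and only the three flag names come from the command above.

```python
import argparse


def build_parser():
    """Build a parser for the three training flags (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Train a CharCNN classifier (sketch).")
    parser.add_argument("--data-path", required=True,
                        help="path to the JSON training data")
    parser.add_argument("--epochs", type=int, default=5,
                        help="number of training epochs")
    parser.add_argument("--num-classes", type=int, default=2,
                        help="number of target classes")
    return parser


# argparse converts dashes to underscores: --data-path becomes args.data_path.
args = build_parser().parse_args(
    ["--data-path", "data/sarcasm.json", "--epochs", "5", "--num-classes", "2"]
)
```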
python3 predict.py --data-path ${} --test-path ${} --num-classes ${}
For example:
python3 predict.py --data-path data/sarcasm.json --test-path test/sentences.json