VDCNN-for-text-classification

Modifications

Average pooling with dropout before the output layer replaces the fully connected layers of size 2048
Alphabet distinguishes between unknown, padded and space characters
Depth increased
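
A minimal sketch of how an alphabet that separates unknown, padded and space characters might be encoded. The index layout (0 for padding, 1 for unknown, 2 for space), the character set, and the names `ALPHABET`, `CHAR_TO_INDEX` and `encode` are assumptions for illustration, not the mapping used in this repository.

```python
import string

# Hypothetical index layout (an assumption, not the repository's actual mapping):
# padding, unknown characters and spaces each get their own reserved index.
PAD, UNK, SPACE = 0, 1, 2

# Assumed character set: lowercase letters, digits and ASCII punctuation.
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation
CHAR_TO_INDEX = {ch: i + 3 for i, ch in enumerate(ALPHABET)}
CHAR_TO_INDEX[" "] = SPACE


def encode(text, length):
    """Map text to a fixed-length list of integer indices."""
    # Characters outside the alphabet map to UNK; long inputs are truncated.
    indices = [CHAR_TO_INDEX.get(ch, UNK) for ch in text.lower()[:length]]
    # Short inputs are right-padded with PAD up to the requested length.
    return indices + [PAD] * (length - len(indices))
```

Keeping the three cases on separate indices lets the embedding layer learn different vectors for "no character here", "character not in the alphabet" and "word boundary".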

To retrain a model, run the corresponding shell script, e.g.:

$ bash train_ag_news.sh

Best result reported in the paper / my model:

                                           imdb    ag_news      yahoo_answer
VDCNN (17 layers, avg-pooling + dropout)           91.33 / -    -/

The k-max pooling implementation interferes with the learning rate. Using MXNet's native pooling fixes the issue.
Add bucketing to massively reduce training time.
Benchmark on AG news dataset
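
The bucketing idea noted above can be sketched in plain Python: each sequence goes into the smallest bucket that fits it, so a batch is padded only to its bucket's length rather than to the global maximum. The bucket sizes, the `assign_buckets` helper and the padding value 0 are assumptions for illustration; this is not the repository's code and does not use MXNet's own bucketing support.

```python
from collections import defaultdict


def assign_buckets(sequences, bucket_sizes, pad_value=0):
    """Group sequences by the smallest bucket that fits, padding to that size."""
    buckets = defaultdict(list)
    sizes = sorted(bucket_sizes)
    for seq in sequences:
        # Pick the smallest bucket that holds the sequence; sequences longer
        # than the largest bucket fall back to it and are truncated.
        size = next((s for s in sizes if len(seq) <= s), sizes[-1])
        trimmed = list(seq[:size])
        buckets[size].append(trimmed + [pad_value] * (size - len(trimmed)))
    return dict(buckets)
```

With buckets of 4 and 8, a sequence of length 2 is padded to 4 instead of 8, which is where the training-time saving would come from.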

An MXNet implementation of this paper: https://arxiv.org/pdf/1606.01781.pdf
