adversarial_text

Qi Lei, Lingfei Wu, Pin-Yu Chen, Alexandros G. Dimakis, Inderjit S. Dhillon, Michael Witbrock. "Discrete Adversarial Attacks and Submodular Optimization with Applications to Text Classification." Systems and Machine Learning (SysML), 2019. (arXiv, slides)
Press coverage: <Nature Story> <VentureBeat> <Tech Talks> <机器之心>

step 1: train the original model

Download the training and testing datasets and place them at ./data/train.tsv and ./data/test.tsv. Each line should consist of the text and the label, separated by a tab (\t).
cd src/
make train_LSTM (to train the LSTM classifier)
make train_CNN (to train the word-level CNN classifier)
Move the trained models to the target directory, e.g. ../model/model_lstm.pt and ../model/model_cnn.pt
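
Before training, it can help to confirm that the data files match the format described in step 1. The following is a minimal sketch, not part of this repository: the function name check_tsv is hypothetical, and it assumes every row is exactly "text<TAB>label" with the text first and no header row.

```python
def check_tsv(lines):
    """Return (line_number, problem) pairs for rows that are not 'text<TAB>label'.

    Assumption: each row holds exactly two tab-separated fields, text first,
    label second, and the file has no header row.
    """
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        fields = raw.rstrip("\n").split("\t")
        if len(fields) != 2:
            problems.append(
                (lineno, f"expected 2 tab-separated fields, found {len(fields)}")
            )
        elif not fields[0].strip() or not fields[1].strip():
            problems.append((lineno, "empty text or label"))
    return problems
```

To check a file, pass an open file handle, for example check_tsv(open("./data/train.tsv", encoding="utf-8")), and inspect the returned list. An empty list means every row has the expected shape.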

Download the sentence paraphrasing model from https://github.com/vsuthichai/paraphraser
Put it in the same parent directory as the text_adversarial repository.

In the Makefile, set the input parameter model_path to the models generated above. Also set the input parameter first_label to the first label name that appears in the training file (e.g. FAKE for the news data); otherwise the model cannot distinguish positive from negative labels.
Run "make attack_cnn" to generate adversarial examples for the word-level CNN (wcnn) model
Run "make attack_lstm" to generate adversarial examples for the LSTM classifier
To use joint sentence-level and word-level attacks, do step 3 and run the following:
make attack_cnn_joint
make attack_lstm_joint
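
The value for first_label mentioned above can be read from the training file rather than looked up by hand. This is a rough sketch under two assumptions that the repository does not confirm: that "first label" means the label on the first non-empty row of the training file, and that rows are "text<TAB>label". The helper name first_label_of is hypothetical.

```python
def first_label_of(lines):
    """Return the label on the first non-empty row.

    Assumption: rows are 'text<TAB>label', so the label is whatever
    follows the last tab on the row.
    """
    for raw in lines:
        row = raw.rstrip("\n")
        if not row.strip():
            continue  # skip blank lines
        _text, sep, label = row.rpartition("\t")
        if not sep:
            raise ValueError("first non-empty row has no tab separator")
        return label.strip()
    raise ValueError("no non-empty rows found")
```

For example, from the src/ directory, first_label_of(open("../data/train.tsv", encoding="utf-8")) returns the string to put in the Makefile's first_label parameter.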

Finally, the datasets we used can be obtained from https://www.dropbox.com/sh/jdkhvdgzmytu78i/AACo53pUyerYO6jwVds5SZyPa?dl=0
The dataset in the ./data folder is the fake news dataset.