Detecting Textual Adversarial Examples through Randomized Substitution and Vote

Randomized Substitution and Vote (RS&V)

This repository contains code to reproduce results from the paper:

Detecting Textual Adversarial Examples through Randomized Substitution and Vote (UAI 2022)

Xiaosen Wang, Yifeng Xiong, Kun He
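At a high level, RS&V detects adversarial examples by building many randomized copies of an input (replacing words with synonyms at random), classifying each copy, and voting over the predictions. The following is a minimal sketch of that idea only, not the repository's implementation; the `model` callable and `synonyms` mapping are assumed placeholders:

```python
import random
from collections import Counter

def randomized_substitution_and_vote(text, model, synonyms, votenum=25, randomrate=0.6):
    """Build `votenum` randomized copies of `text` by replacing each word
    that has synonyms with probability `randomrate`, classify every copy
    with `model`, and return the majority label. A voted label that
    disagrees with the model's label on the original text flags the
    input as likely adversarial."""
    words = text.split()
    votes = []
    for _ in range(votenum):
        copy = [
            random.choice(synonyms[w])
            if w in synonyms and random.random() < randomrate else w
            for w in words
        ]
        votes.append(model(" ".join(copy)))
    majority_label, _ = Counter(votes).most_common(1)[0]
    return majority_label
```

In the actual pipeline the substitution and voting stages are split across `detect_transfer.py` and `detect_eval.py`, and synonyms come from the counter-fitted vectors rather than a hand-written dictionary.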

Datasets and Dependencies

There are three datasets used in our experiments. Download them and place them in the directories ./data/ag_news, ./data/imdb, and ./data/yahoo_answers, respectively.

There are three dependencies for this project. Download the files glove.840B.300d.txt and counter-fitted-vectors.txt and put them into the directory ./data/vectors, and put the directory stanford-postagger-2018-10-16/ into ./data/aux_files.

You can run get_data_and_dependencies.sh to fetch the test data and dependencies:

bash get_data_and_dependencies.sh

File Description

  • ./model: Code for the model architectures.

  • ./utils: Helper functions for training models and processing data.

  • ./adversary: Files for attack methods.

  • ./data: Datasets and GloVe vectors.

  • cnn_classifier.py, bert_classifier.py, robert_classifier.py: Training code for the CNN, BERT, and RoBERTa classifiers.

  • cnn_attack.py: Attacking the CNN model.

  • bert_attack.py: Attacking the BERT and RoBERTa models.

  • build_embs.py: Generating the dictionary, embedding matrix and distance matrix.

  • synonym_selector.py: Generating the synonym set.

  • detect_transfer.py: Converting adversarial examples through Randomized Substitution.

  • detect_eval.py: Voting and detection.
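The substitution step performed by detect_transfer.py keeps a small fraction of positions fixed and replaces the remaining words with synonyms at random. A minimal sketch of that step follows; the function name, the sampling details, and the `synonyms` mapping are assumptions for illustration, not the script's code:

```python
import random

def randomized_substitute(words, synonyms, randomrate=0.6, fixrate=0.02):
    """Keep a `fixrate` fraction of positions unchanged (at least one),
    and replace each remaining word that has synonyms with probability
    `randomrate`. Returns a new word list of the same length."""
    n_fixed = max(1, int(len(words) * fixrate))
    fixed = set(random.sample(range(len(words)), n_fixed))
    out = []
    for i, w in enumerate(words):
        if i not in fixed and w in synonyms and random.random() < randomrate:
            out.append(random.choice(synonyms[w]))
        else:
            out.append(w)
    return out
```

Running this `votenum` times per input yields the randomized copies that detect_eval.py later votes over.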

Experiments

  1. Generating the dictionary, embedding matrix and distance matrix:

    python build_embs.py --data_dir ./data/ --task_name ag_news
  2. Training and attacking the models:

    For CNN:

    python cnn_classifier.py --output_dir ./output/model_file/ag_news/cnn --data_dir ./data/ --task_name ag_news --max_seq_length 128 --do_train --do_eval --vGPU 0
    python cnn_attack.py  --output_dir ./output/model_file/ag_news/cnn  --data_dir ./data/ --attack textfooler --task_name ag_news --max_seq_length 128  --max_candidate 50 --save_to_file ./output/adv_example/ag_news_cnn_textfooler --vGPU 0

    For BERT:

    python bert_classifier.py  --output_dir ./output/model_file/ag_news/bert --bert_model bert-base-uncased  --data_dir ./data/  --task_name ag_news --max_seq_length 128  --do_train --do_eval  --vGPU 0
    python bert_attack.py --data_dir ./data/ --task_name ag_news --attack textfooler --output_dir ./output/model_file/ag_news/bert/ --attack_batch 1000 --save_to_file ./output/adv_example/ag_news --bert_model bert-base-uncased  --max_candidate 50 --max_seq_length 128 --vGPU 0
  3. Evaluating the detection performance:

    python detect_transfer.py --task_name ag_news --data_dir ./data/  --votenum 25 --randomrate 0.6 --fixrate 0.02 --advfile ./output/adv_example/ag_news_cnn_textfooler.pkl --out_file ./output/transfer/ag_news_cnn_textfooler.pkl
    python detect_eval.py --task_name ag_news --data_dir ./data/  --max_seq_length 128  --modeltype cnn --output_dir ./output/model_file/ag_news/cnn --eval_file ./output/transfer/ag_news_cnn_textfooler.pkl
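detect_eval.py votes over the predictions on the randomized copies produced by detect_transfer.py and compares the voted label with the model's label on the original input. A toy illustration of that decision follows; the function name and the exact decision rule are assumptions, not the script's code:

```python
from collections import Counter

def is_adversarial(orig_label, vote_labels):
    """Flag an input as adversarial when the majority label over its
    randomized copies disagrees with the model's label on the
    original input."""
    majority, _ = Counter(vote_labels).most_common(1)[0]
    return majority != orig_label
```

For a benign input the randomized copies mostly keep the original label, so the vote agrees; for an adversarial example the substitutions tend to undo the attack, so the vote recovers the true label and disagrees.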
    

Contact

Questions and suggestions can be sent to xswanghuster@gmail.com.


License

MIT License
