zzk0 / SEM

Code for Natural Language Adversarial Attacks and Defenses in Word Level

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Synonyms Encoding Method (SEM)

This repository contains necessary code for reproducing main results in the paper:

Natural Language Adversarial Attacks and Defenses in Word Level

We also add the code for IGA into the framework TextAttack.

Datesets

There are three datasets used in our experiments:

Requirements

The code was tested with:

  • python 3.6.5
  • numpy 1.16.4
  • tensorflow 1.8.0
  • tensorflow-gpu 1.5.0
  • pandas 0.23.0
  • keras 2.2.0
  • scikit-learn 0.19.1
  • scipy 1.0.1

File Description

  • textrnn.py,textcnn.py,textbirnn.py : The models for LSTM, Word-CNN and Bi-LSTM.
  • train_orig.py,train_enc.py: Training models with or without SEM.
  • glove_utils.py : Loading the glove model and create embedding matrix for word dictionary.
  • attack_utils.py : Helper functions for calculating the classification and score for the input.
  • build_embeddings.py : Generating the embedding matrix for original word dictionary and encoded word dictionary
  • improved_genetic.py : Attacking the models with or without defense by the improved genetic algorithm (IGA).

Experiments

  1. Generating the embedding matrix for original dictionary and encoded dictionary:

    python build_embedding.py
  2. Training the models with the original word dictionary:

    python train_orig.py --data aclImdb --sn 10 --sigma 0.5 --nn_type textrnn
  3. Training the models with the encoded word dictionary:

    python train_enc.py --data aclImdb --sn 10 --sigma 0.5 --nn_type textrnn
  4. To attack the models by IGA, run:

    python improved_genetic.py --pre enc --sn 10 --data aclImdb --sigma 0.5 --time xxx --nn_type textrnn

Contact

This repository is under active development. Questions and suggestions can be sent to xswanghuster@gmail.com.

About

Code for Natural Language Adversarial Attacks and Defenses in Word Level


Languages

Language:Python 100.0%