google-research / domain-robust


This is not an officially supported Google product.

Welcome to DomainRobust

DomainRobust is a PyTorch testbed for evaluating the adversarial robustness of domain adaptation algorithms. Specifically, we consider an unsupervised domain adaptation (UDA) setting: given a labeled dataset from a source domain and an unlabeled dataset from a (related) target domain, the goal is to train a classifier that is robust against adversarial attacks on the target domain. The implementation builds on DomainBed, AutoAttack, and CleverHans.

Available algorithms

The following algorithms are currently available:

  • Empirical Risk Minimization (ERM): standard ERM without adversarial training.
  • Domain Adversarial Neural Network (DANN, Ganin et al., 2015): standard DANN without adversarial training.
  • Adversarial Training (AT, Madry et al., 2017). Three variants are supported: (i) AT only on the labeled source data, (ii) AT on pseudo-labeled target data (where pseudo labels are obtained a priori using DANN), and (iii) AT on the source data along with a DANN regularizer. See our paper for details.
  • TRadeoff-inspired Adversarial DEfense via Surrogate-loss minimization (TRADES, Zhang et al., 2019). Two variants are supported: (i) TRADES only on the labeled source data, and (ii) TRADES on pseudo-labeled target data (where pseudo labels are obtained a priori using DANN). See our paper for details.
  • Adversarially Robust Training method for UDA (ARTUDA, Lo and Patel, 2022).
  • Meta Self-training for Robust Unsupervised Domain Adaptation (SRoUDA, Zhu et al., 2022).
  • Divergence Aware adveRsarial Training (DART, in submission).
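As a rough illustration of the TRADES objective named above (Zhang et al., 2019), the loss combines a cross-entropy term on clean inputs with a beta-weighted KL divergence between the model's predictions on clean and adversarial inputs. The NumPy sketch below is illustrative only (DomainRobust itself is PyTorch-based), and the function and parameter names are our own:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def trades_loss(clean_logits, adv_logits, labels, beta=6.0):
    """TRADES-style objective: clean cross-entropy plus a beta-weighted
    KL divergence between predictions on clean and adversarial inputs."""
    p_clean = softmax(clean_logits)
    p_adv = softmax(adv_logits)
    n = len(labels)
    # Cross-entropy on the clean examples (accuracy term)
    ce = -np.log(p_clean[np.arange(n), labels] + 1e-12).mean()
    # KL(p_clean || p_adv) (robustness term)
    kl = (p_clean * (np.log(p_clean + 1e-12) -
                     np.log(p_adv + 1e-12))).sum(axis=1).mean()
    return ce + beta * kl
```

When the adversarial logits equal the clean logits, the KL term vanishes and the loss reduces to plain cross-entropy; beta trades off clean accuracy against robustness.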

Available datasets

DomainRobust includes several benchmark datasets; the examples below use the DIGIT benchmark.

Quick start

To download the datasets:

python3 -m domainrobust.download \
       --data_dir=./domainrobust/data

We first generate a pretrained model using DANN (this pretrained model is later used by algorithms such as SRoUDA and DART):

python3 -m domainrobust.scripts.train \
       --data_dir=/my/datasets/path \
       --output_dir=/my/pretrained/model/path \
       --algorithm DANN \
       --dataset DIGIT \
       --task domain_adaptation \
       --source_envs 0 \
       --target_envs 2
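Several of the algorithms above (the pseudo-label variants of AT and TRADES, and SRoUDA) rely on pseudo labels produced by this pretrained DANN model on the unlabeled target data. A minimal sketch of deriving pseudo labels from a model's logits, with names and the confidence threshold of our own choosing (not DomainRobust flags):

```python
import numpy as np

def pseudo_labels(logits, threshold=0.0):
    """Derive pseudo labels for unlabeled target examples from a
    pretrained model's logits: take the argmax class, optionally
    keeping only predictions above a confidence threshold."""
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    labels = probs.argmax(axis=1)          # predicted class per example
    keep = probs.max(axis=1) >= threshold  # confidence mask
    return labels, keep
```

The kept (label, example) pairs can then be treated as labeled target data during adversarial training.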

To train a single model:

python3 -m domainrobust.scripts.train \
       --data_dir=/my/datasets/path \
       --output_dir=/output/path \
       --algorithm AT \
       --dataset DIGIT \
       --task domain_adaptation \
       --source_envs 0 \
       --target_envs 2 \
       --eps 0.008 \
       --atk_lr 0.004 \
       --atk_iter 5 \
       --attack pgd \
       --source_type clean \
       --pretrain_model_dir=/my/pretrained/model/path \
       --pretrain
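The --eps, --atk_lr, and --atk_iter flags parameterize the PGD attack: the L-infinity perturbation radius, the step size per iteration, and the number of iterations. A minimal NumPy sketch of L-infinity PGD on a toy logistic model, purely to illustrate how these three quantities interact (the actual attacks in DomainRobust build on CleverHans and AutoAttack):

```python
import numpy as np

def pgd_attack(x, y, w, b, eps=0.008, atk_lr=0.004, atk_iter=5):
    """L_inf PGD against a logistic model p = sigmoid(x @ w + b).

    eps      -- radius of the L_inf perturbation ball (cf. --eps)
    atk_lr   -- step size per iteration (cf. --atk_lr)
    atk_iter -- number of gradient steps (cf. --atk_iter)
    """
    x_adv = x.copy()
    for _ in range(atk_iter):
        z = x_adv @ w + b
        p = 1.0 / (1.0 + np.exp(-z))
        # Gradient of binary cross-entropy w.r.t. the input: (p - y) * w
        grad = np.outer(p - y, w)
        # Ascent step in the sign direction, then project to the eps-ball
        x_adv = x_adv + atk_lr * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)  # keep a valid pixel range
    return x_adv
```

With the defaults above, 5 steps of size 0.004 can reach the boundary of the 0.008-ball; the projection guarantees the perturbation never exceeds eps.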

To launch a sweep (over a range of hyperparameters and possibly multiple algorithms and datasets):

python -m domainrobust.scripts.sweep launch \
       --data_dir=/my/datasets/path \
       --output_dir=/my/sweep/output/path \
       --command_launcher MyLauncher \
       --algorithms AT TRADES SROUDA \
       --datasets DIGIT \
       --n_hparams 20 \
       --n_trials 3 \
       --task domain_adaptation \
       --source_envs 0 \
       --target_envs 1 2 3 4 \
       --eps 0.008 \
       --atk_lr 0.004 \
       --atk_iter 5 \
       --attack pgd \
       --source_type clean \
       --pretrain_model_dir=/my/pretrained/model/path \
       --pretrain

Here, MyLauncher is one of the three launchers implemented in command_launchers.py: local, dummy, or multi_gpu. The command above trains multiple models, one per randomly sampled hyperparameter set (the number of sets is controlled by n_hparams). For each model (defined by a particular choice of algorithm, dataset, and hyperparameter set), an output directory is created automatically. When training completes, an empty file named Done is written to that directory, alongside a training log and checkpoints of the best models selected by clean/robust source/target validation-set accuracy.
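Since completion is signalled by the empty Done marker file, a sweep's progress can be checked with a few lines of Python. This sketch assumes one sub-directory per run directly under the sweep output directory, which may not match the exact layout DomainRobust produces:

```python
import os

def completed_runs(sweep_dir):
    """Return the sweep sub-directories whose training has finished,
    as signalled by the empty 'Done' marker file the trainer writes."""
    done = []
    for name in sorted(os.listdir(sweep_dir)):
        run_dir = os.path.join(sweep_dir, name)
        if os.path.isdir(run_dir) and os.path.exists(os.path.join(run_dir, "Done")):
            done.append(name)
    return done
```

Runs missing the marker are the ones delete_incomplete (below) would clean up.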

Once all jobs have either succeeded or failed, you can remove the records of the failed jobs with python -m domainrobust.scripts.sweep delete_incomplete; this deletes the folders of incomplete jobs, which would otherwise prevent a subsequent sweep from relaunching jobs with the same hyperparameters. After deleting the incomplete jobs, re-launch them with python -m domainrobust.scripts.sweep launch, making sure to pass the same command-line arguments as in the initial launch.

To collect the results from all folders:

python -m domainrobust.scripts.collect_results \
       --latex \
       --input_dir=/my/sweep/output/path \
       --task domain_adaptation \
       --attack pgd

To evaluate on existing models:

python -m domainrobust.scripts.test \
       --data_dir=/my/datasets/path \
       --input_dir=/gcs/xcloud-shared/${USER}/output \
       --dataset DIGIT \
       --eps 0.008 \
       --atk_lr 0.004 \
       --atk_iter 20 \
       --attack pgd 

About

License: MIT

