Papers-of-Robust-ML

Related papers for robust machine learning

General Defenses (training phase)
General Defenses (inference phase)
Adversarial Detection
Verification
Theoretical Analysis
Empirical Analysis

General Defenses (training phase)

Intriguing Properties of Adversarial Training at Scale (ICLR 2020)
This paper investigates the effects of BN and deeper models for adversarial training on ImageNet.
You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle (NeurIPS 2019)
This paper provides a fast method for adversarial training from the perspective of optimal control.
Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness
This paper rethinks the drawbacks of softmax cross-entropy in the adversarial setting, and proposes the MMC method to induce high-density regions in the feature space.
Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Accuracy
This paper introduces the mixup method into adversarial training to improve the model performance on clean images.
Theoretically Principled Trade-off between Robustness and Accuracy (ICML 2019)
A variant of adversarial training: TRADES, which won the defense track of the NeurIPS 2018 Adversarial Competition.
Robust Decision Trees Against Adversarial Examples (ICML 2019)
A method to enhance the robustness of tree models, including GBDTs.
Adversarial Training for Free! (NeurIPS 2019)
A fast method for adversarial training, which shares the back-propagation gradients between updating the weights and crafting adversarial examples.
Improving Adversarial Robustness via Promoting Ensemble Diversity (ICML 2019)
Previous work constructs ensemble defenses by individually enhancing each member and then directly averaging the predictions. In this work, the authors propose adaptive diversity promoting (ADP) to further improve robustness by promoting ensemble diversity, as a method orthogonal to other defenses.
Ensemble Adversarial Training: Attacks and Defenses (ICLR 2018)
Ensemble adversarial training uses several pre-trained models; in each training batch, adversarial examples are crafted against one model selected at random from the currently trained model and the pre-trained models.
Max-Mahalanobis Linear Discriminant Analysis Networks (ICML 2018)
This is one of our works. We explicitly model the feature distribution as a Max-Mahalanobis distribution (MMD), which has the maximal margin among classes and can lead to guaranteed robustness.
A Spectral View of Adversarially Robust Features (NeurIPS 2018)
Given the entire dataset X, use the eigenvectors of spectral graph as robust features. [Appendix]
Adversarial Logit Pairing
Adversarial training by pairing the clean and adversarial logits.
Deep Defense: Training DNNs with Improved Adversarial Robustness (NeurIPS 2018)
They follow the linear assumption of the DeepFool method. DeepDefense pushes the decision boundary away from correctly classified samples and pulls it closer to misclassified ones.
Feature Denoising for Improving Adversarial Robustness (CVPR 2019)
This paper applies non-local neural networks and large-scale adversarial training on 128 GPUs (with the training tricks from 'Accurate, large minibatch SGD: Training ImageNet in 1 hour'), which shows a large improvement over the previous SOTA trained with 50 GPUs.
Towards Deep Learning Models Resistant to Adversarial Attacks (ICLR 2018)
This paper proposes the projected gradient descent (PGD) attack and PGD-based adversarial training.
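
As a rough illustration of the PGD attack mentioned in the last entry, here is a minimal NumPy sketch of an L-infinity PGD attack against a toy logistic-regression model. The model, the weights `w` and `b`, and the values of `epsilon`, `step_size` and `num_steps` are all assumptions chosen for the example; they are not taken from the paper. PGD-based adversarial training would then, roughly, compute the training loss on `x_adv` instead of `x`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_input_grad(w, b, x, y):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. the input x."""
    p = sigmoid(x @ w + b)
    tiny = 1e-12  # avoids log(0)
    loss = -(y * np.log(p + tiny) + (1 - y) * np.log(1 - p + tiny))
    grad_x = (p - y) * w  # d(loss)/dx for the logistic model
    return loss, grad_x

def pgd_attack(w, b, x, y, epsilon=0.1, step_size=0.02, num_steps=10, rng=None):
    """L-infinity PGD: signed gradient ascent on the loss, projected onto the epsilon-ball."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Random start inside the epsilon-ball around x.
    x_adv = x + rng.uniform(-epsilon, epsilon, size=x.shape)
    for _ in range(num_steps):
        _, grad = loss_and_input_grad(w, b, x_adv, y)
        x_adv = x_adv + step_size * np.sign(grad)          # ascent step
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)   # projection step
    # Note: no clipping to a valid pixel range, since this toy input is not an image.
    return x_adv

# Hypothetical toy model and input.
w = np.array([1.0, -2.0, 0.5])
b = 0.1
x = np.array([0.2, -0.1, 0.4])
y = 1.0

x_adv = pgd_attack(w, b, x, y)
clean_loss, _ = loss_and_input_grad(w, b, x, y)
adv_loss, _ = loss_and_input_grad(w, b, x_adv, y)
print("clean loss:", clean_loss, "adversarial loss:", adv_loss)
```

The perturbation stays within `epsilon` of `x` in every coordinate, and the loss on `x_adv` is higher than on `x`.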

General Defenses (inference phase)

Barrage of Random Transforms for Adversarially Robust Defense (CVPR 2019)
This paper applies a set of different random transformations as an off-the-shelf defense.
Mitigating Adversarial Effects Through Randomization (ICLR 2018)
Uses random resizing and random padding to disturb adversarial examples; this method won 2nd place in the defense track of the NeurIPS 2017 Adversarial Competition.
Countering Adversarial Images Using Input Transformations (ICLR 2018)
Applies bit-depth reduction, JPEG compression, total variance minimization and image quilting as input preprocessing to defend against adversarial attacks.
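
Two of the input transformations listed above, bit-depth reduction and random resizing with random padding, can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the function names, the nearest-neighbour resize, the zero padding and the sizes (32 to 40 pixels, 3 bits) are choices made for this example, not the settings used in the papers.

```python
import numpy as np

def reduce_bit_depth(image, bits=3):
    """Quantize an image with values in [0, 1] to 2**bits levels per channel."""
    levels = 2 ** bits - 1
    return np.round(image * levels) / levels

def random_resize_and_pad(image, out_size, rng):
    """Upscale an HxWxC image to a random size, then zero-pad it at a random offset."""
    h, w, c = image.shape
    # Random target size between the input size and the output size (inclusive).
    new_size = int(rng.integers(max(h, w), out_size + 1))
    # Nearest-neighbour resize via integer index maps.
    rows = np.arange(new_size) * h // new_size
    cols = np.arange(new_size) * w // new_size
    resized = image[rows][:, cols]
    # Random placement of the resized image inside the padded canvas.
    top = int(rng.integers(0, out_size - new_size + 1))
    left = int(rng.integers(0, out_size - new_size + 1))
    padded = np.zeros((out_size, out_size, c), dtype=image.dtype)
    padded[top:top + new_size, left:left + new_size] = resized
    return padded

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))  # stand-in for a real input image
quantized = reduce_bit_depth(image, bits=3)
transformed = random_resize_and_pad(quantized, out_size=40, rng=rng)
print(transformed.shape)
```

At inference time the classifier would be applied to `transformed` rather than to the raw input; a fresh random draw is used for every query.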

Adversarial Detection

Towards Robust Detection of Adversarial Examples (NeurIPS 2018)
This is one of our works. We train the networks with reverse cross-entropy (RCE), which maps normal features to low-dimensional manifolds so that detectors can better separate adversarial examples from normal ones.
A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks (NeurIPS 2018)
Fit a GDA on learned features, and use Mahalanobis distance as the detection metric.
Robust Detection of Adversarial Attacks by Modeling the Intrinsic Properties of Deep Neural Networks (NeurIPS 2018)
They fit a GMM on learned features, and use the probability as the detection metric.
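
The Mahalanobis-distance detector described above can be sketched as follows: fit class-conditional Gaussians with a shared covariance on features, then score a test point by its (negated) distance to the closest class mean. The 2-D synthetic points below stand in for learned network features, and the function names are hypothetical; only the distance score is sketched, not the full pipeline of the paper.

```python
import numpy as np

def fit_gda(features, labels, num_classes):
    """Fit class means and a shared (tied) covariance; return means and precision matrix."""
    means = np.stack([features[labels == k].mean(axis=0) for k in range(num_classes)])
    centered = features - means[labels]
    cov = centered.T @ centered / len(features)
    precision = np.linalg.pinv(cov)
    return means, precision

def mahalanobis_score(x, means, precision):
    """Negative squared Mahalanobis distance to the closest class mean (higher = more normal)."""
    diffs = means - x  # shape (num_classes, dim)
    dists = np.einsum("kd,de,ke->k", diffs, precision, diffs)
    return -dists.min()

# Synthetic stand-in for learned features: two well-separated classes.
rng = np.random.default_rng(0)
features = np.concatenate([
    rng.normal(0.0, 1.0, size=(200, 2)),
    rng.normal(5.0, 1.0, size=(200, 2)),
])
labels = np.repeat([0, 1], 200)

means, precision = fit_gda(features, labels, num_classes=2)
score_normal = mahalanobis_score(np.array([0.1, -0.2]), means, precision)
score_outlier = mahalanobis_score(np.array([20.0, -20.0]), means, precision)
print(score_normal, score_outlier)
```

A detector would flag an input as adversarial or out-of-distribution when its score falls below a threshold chosen on validation data.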

Verification

Automated Verification of Neural Networks: Advances, Challenges and Perspectives
This paper provides an overview of the main verification methods and introduces previous work on combining automated verification with machine learning. The authors also give some insights into future trends in combining these two domains.
Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope (ICML 2018)
Using robust optimization (via a linear program), they obtain a point-wise robustness bound within which no adversarial example exists. Experiments are done on MNIST.
Scaling Provable Adversarial Defenses (NeurIPS 2018)
They add three tricks to improve the scalability of the previously proposed method. Experiments are done on MNIST and CIFAR-10.
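
To make the idea of a point-wise certificate concrete, here is a minimal sketch using interval bound propagation on a tiny ReLU network. Note that this is a simpler and looser relaxation than the LP-based convex outer bound used in the papers above; it is swapped in here only because it fits in a few lines. The network weights and the function names are assumptions for illustration.

```python
import numpy as np

def interval_affine(lower, upper, W, b):
    """Propagate an axis-aligned box through the affine map x -> W @ x + b."""
    center = (lower + upper) / 2
    radius = (upper - lower) / 2
    new_center = W @ center + b
    new_radius = np.abs(W) @ radius
    return new_center - new_radius, new_center + new_radius

def certify(x, epsilon, layers, true_class):
    """Return True if no point in the L-infinity ball around x can change the prediction.

    The check is sound but incomplete: False means "could not certify",
    not necessarily that an adversarial example exists.
    """
    lower, upper = x - epsilon, x + epsilon
    for i, (W, b) in enumerate(layers):
        lower, upper = interval_affine(lower, upper, W, b)
        if i < len(layers) - 1:  # ReLU on hidden layers only
            lower, upper = np.maximum(lower, 0), np.maximum(upper, 0)
    other_upper = np.delete(upper, true_class)
    return bool(lower[true_class] > other_upper.max())

# Hypothetical two-layer network with two inputs and two classes.
layers = [
    (np.eye(2), np.zeros(2)),
    (np.array([[1.0, -1.0], [-1.0, 1.0]]), np.zeros(2)),
]
x = np.array([1.0, 0.2])

certified_small = certify(x, 0.1, layers, true_class=0)
certified_large = certify(x, 0.5, layers, true_class=0)
print(certified_small, certified_large)
```

For this toy network the input is certified at radius 0.1 but not at radius 0.5, where the point `[0.5, 0.7]` is in fact classified differently.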

Theoretical Analysis

Adversarial Examples Are a Natural Consequence of Test Error in Noise (ICML 2019)
This paper connects general corruption robustness with adversarial robustness, and recommends that adversarial defense methods also be tested on general-purpose noise.
Adversarial Examples Are Not Bugs, They Are Features (NeurIPS 2019)
They claim that adversarial examples can be directly attributed to the presence of non-robust features, which are highly predictive but locally quite sensitive.
On Evaluating Adversarial Robustness
Some analyses on how to correctly evaluate the robustness of adversarial defenses.
Robustness of Classifiers: from Adversarial to Random Noise (NeurIPS 2016)
Adversarial Vulnerability for Any Classifier (NeurIPS 2018)
A uniform upper bound on robustness for any classifier on data sampled from smooth generative models.
Adversarially Robust Generalization Requires More Data (NeurIPS 2018)
This paper shows that robust generalization requires a much larger sample complexity than standard generalization on two simple data distribution models.

Empirical Analysis

Adversarial Example Defenses: Ensembles of Weak Defenses are not Strong
This paper tests several ensembles of existing detection-based defenses, and claims that these ensemble defenses can still be evaded by white-box attacks.
