Provable Training and Verification Approaches Towards Robust Neural Networks
Recently, provable (i.e., certified) adversarial robustness training and verification methods have demonstrated their effectiveness against adversarial attacks. In contrast to empirical robustness evaluated against existing attacks, provable robustness verification provides a rigorous lower bound on the robustness of a given neural network, so that no existing or future attack can break the certified guarantee.
Note that training methods for robust networks are usually coupled with corresponding verification approaches. For instance, after training, the robustness is often measured on the test set as "robust accuracy" (RACC). A data sample is considered provably robust if and only if we can prove that no adversarial example exists in its neighborhood, i.e., the model always outputs the current prediction label within that neighborhood. The neighborhood is usually defined by an Lp-norm distance.
A tighter provable robustness bound can be achieved by better robust training approaches, tighter robustness verification approaches, or both jointly.
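To make the metric concrete, here is a minimal sketch of how RACC is typically computed; `model` and `verify` are placeholders for an arbitrary classifier and a sound verification routine, not any specific implementation:

```python
def robust_accuracy(model, verify, test_set, eps):
    """Fraction of test samples that are correctly classified AND
    provably robust within the eps-neighborhood (RACC)."""
    certified = 0
    for x, y in test_set:
        # A sample counts only if the clean prediction is correct and the
        # verifier proves the prediction cannot change inside the eps-ball.
        if model(x) == y and verify(model, x, eps):
            certified += 1
    return certified / len(test_set)
```

Since sound verifiers may be incomplete, the reported RACC is a lower bound on the true robust accuracy.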
Scope of the Repo
Current works mainly focus on image classification tasks with the datasets MNIST, CIFAR-10, ImageNet, Fashion-MNIST, and SVHN.
We focus on perturbations measured by the L-2 and L-infty norms.
This repo mainly records recent progress in the above settings; advances in other settings are recorded in the attached paper list.
We only consider single-model robustness.
Contact & Updates
We are trying to keep track of all important advances in provable robustness approaches, but may still miss some.
Please feel free to contact us (Linyi (linyi2@illinois.edu) @ UIUC Secure Learning Lab & Illinois ASE Group) or commit your updates :)
Main Leaderboard
ImageNet
All input images contain three channels; each pixel is in range [0, 255].
L2
eps = 0.2
Defense | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Certified Robustness to Adversarial Examples with Differential Privacy | Lecuyer et al | Inception V3 | 40% |
eps = 0.5
Defense | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers | Salman et al | ResNet-50 | 56% | |
Certified Adversarial Robustness via Randomized Smoothing | Cohen et al | ResNet-50 | 49% |
All approaches above use Randomized Smoothing (Cohen et al) to derive the certification, with failure probability 0.1%.
eps = 1.0
Defense | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers | Salman et al | ResNet-50 | 43% | |
Certified Adversarial Robustness via Randomized Smoothing | Cohen et al | ResNet-50 | 37% |
All approaches above use Randomized Smoothing (Cohen et al) to derive the certification, with failure probability 0.1%.
eps = 2.0
Defense | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers | Salman et al | ResNet-50 | 27% | |
Certified Adversarial Robustness via Randomized Smoothing | Cohen et al | ResNet-50 | 19% |
All approaches above use Randomized Smoothing (Cohen et al) to derive the certification, with failure probability 0.1%.
eps = 3.0
Defense | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers | Salman et al | ResNet-50 | 20% | |
Certified Adversarial Robustness via Randomized Smoothing | Cohen et al | ResNet-50 | 12% |
All approaches above use Randomized Smoothing (Cohen et al) to derive the certification, with failure probability 0.1%.
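For reference, a minimal sketch of the certification step of Randomized Smoothing (following Cohen et al); `f` stands for an arbitrary base classifier that labels a batch of noisy copies, and the sample sizes and `alpha` are illustrative defaults (real implementations batch the sampling):

```python
import numpy as np
from scipy.stats import norm
from statsmodels.stats.proportion import proportion_confint

def certify(f, x, sigma, n0=100, n=100_000, alpha=0.001, num_classes=1000):
    """Certify an L2 radius around x for the smoothed classifier
    g(x) = argmax_c P[f(x + eps) = c], with eps ~ N(0, sigma^2 I)."""
    # Step 1: guess the top class from a small noisy sample.
    counts0 = np.bincount(f(x + sigma * np.random.randn(n0, *x.shape)),
                          minlength=num_classes)
    c_hat = counts0.argmax()
    # Step 2: lower-bound the probability of that class with a one-sided
    # Clopper-Pearson interval that fails with probability at most alpha.
    counts = np.bincount(f(x + sigma * np.random.randn(n, *x.shape)),
                         minlength=num_classes)
    p_lower = proportion_confint(counts[c_hat], n, alpha=2 * alpha,
                                 method="beta")[0]
    if p_lower <= 0.5:
        return None, 0.0                     # abstain: cannot certify
    return c_hat, sigma * norm.ppf(p_lower)  # certified L2 radius
```

The returned radius is sound except with probability `alpha`, which is what the 0.1% failure probability in the tables refers to.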
L-Infty
eps=1/255
Defense | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers | Salman et al | ResNet-50 | 36.8% | transformed from L-2 robustness; failure prob. 0.001 |
Certified Adversarial Robustness via Randomized Smoothing | Cohen et al | ResNet-50 | 28.6% | transformed from L-2 robustness by Salman et al; failure prob. 0.001 |
On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models | Gowal et al | WideResNet-10-10 | 6.13% | Dataset downscaled to 64 x 64 |
eps=1.785/255
Defense | Author | Model Structure | RACC | Note |
---|---|---|---|---|
MixTrain: Scalable Training of Verifiably Robust Neural Networks | Wang et al | ResNet | 19.4% | |
Scaling provable adversarial defenses | Wong et al | ResNet | 5.1% | Run and reported by Wang et al |
In the above table, the dataset is ImageNet-200 rather than the ImageNet-1000 used in the other tables.
CIFAR-10
All input images have three channels; 32 x 32 x 3 size; each pixel is in range [0, 255].
L2
eps=0.14
Defense | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Scaling provable adversarial defenses | Wong et al | ResNet | 51.96% | 36/255; transformed from L-infty 2/255 |
Certified Robustness to Adversarial Examples with Differential Privacy | Lecuyer et al | ResNet | 40% | |
(Verification) Efficient Neural Network Robustness Certification with General Activation Functions | Zhang et al | ResNet-20 | 0% | Reported by Cohen et al |
eps=0.25
Defense | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers | Salman et al | ResNet-110 | 82% | |
Unlabeled Data Improves Adversarial Robustness | Carmon et al | ResNet 28-10 | 72% | interpolated from Fig. 1 |
Certified Adversarial Robustness via Randomized Smoothing | Cohen et al | ResNet-110 | 61% | |
(Verification) Efficient Neural Network Robustness Certification with General Activation Functions | Zhang et al | ResNet-20 | 0% | Reported by Cohen et al |
All approaches above use Randomized Smoothing (Cohen et al) to derive the certification, with failure probability 0.1%.
eps=0.5
Defense | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers | Salman et al | ResNet-110 | 65% | |
Unlabeled Data Improves Adversarial Robustness | Carmon et al | ResNet 28-10 | 61% | interpolated from Fig. 1 |
Certified Adversarial Robustness via Randomized Smoothing | Cohen et al | ResNet-110 | 43% |
All approaches above use Randomized Smoothing (Cohen et al) to derive the certification, with failure probability 0.1%.
eps=1.0
Defense | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers | Salman et al | ResNet-110 | 39% | |
Certified Adversarial Robustness via Randomized Smoothing | Cohen et al | ResNet-110 | 22% |
All approaches above use Randomized Smoothing (Cohen et al) to derive the certification, with failure probability 0.1%.
eps=1.5
Defense | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers | Salman et al | ResNet-110 | 32% | |
Certified Adversarial Robustness via Randomized Smoothing | Cohen et al | ResNet-110 | 14% |
All approaches above use Randomized Smoothing (Cohen et al) to derive the certification, with failure probability 0.1%.
L-Infty
eps=2/255
eps=8/255
MNIST
All input images are grayscale; 28 x 28 size; each pixel is in range [0, 1].
L2
eps=1.58
eps=1.58 is transformed from L-infty eps=0.1.
Defense/Verification | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Scaling provable adversarial defenses | Wong et al | Small CNN | 88.14% |
L-Infty
eps=0.1
eps=0.3
eps=0.4
Defense/Verification | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Towards Stable and Efficient Training of Verifiably Robust Neural Networks | Zhang et al | large CNN | 87.04% | best reported number |
On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models | Gowal et al | CNN | 85.12% | |
Fast and Stable Interval Bounds Propagation for Training Verifiably Robust Models | Morawiecki et al | large CNN | 84.42% | |
(Verification) Evaluating Robustness of Neural Networks with Mixed Integer Programming | Tjeng et al | small CNN | 51.02% |
SVHN
Images are 32 x 32 x 3 (three color channels), with pixel values in [0, 255]. When calculating eps, the values are rescaled to [0, 1].
L2
eps=0.1
Defense/Verification | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Certified Adversarial Robustness via Randomized Smoothing | Cohen et al | ResNet-20 | ~95% | Interpolated from Cohen et al |
Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks | Tsuzuku et al | ResNet-20 | 0% | Interpolated from Cohen et al |
eps=0.2
Defense/Verification | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Certified Adversarial Robustness via Randomized Smoothing | Cohen et al | ResNet-20 | ~88% | Interpolated from Cohen et al |
Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks | Tsuzuku et al | ResNet-20 | 0% | Interpolated from Cohen et al |
L-Infty
eps=0.01
Defense/Verification | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Training Verified Learners with Learned Verifiers | Dvijotham et al | Predictor-Verifier | 62.44% | |
On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models | Gowal et al | CNN | 62.40% | |
Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope | Wong et al | CNN | 59.33% | |
Differentiable Abstract Interpretation for Provably Robust Neural Networks | Mirman et al | small CNN | 11.0% |
eps=8/255
Defense/Verification | Author | Model Structure | RACC | Note |
---|---|---|---|---|
On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models | Gowal et al | large CNN | 47.63% | Reported by Morawiecki et al |
Fast and Stable Interval Bounds Propagation for Training Verifiably Robust Models | Morawiecki et al | small CNN | 46.03% |
Fashion-MNIST
This is a MNIST-like dataset. Images are 28 x 28 and grayscale. Values are in [0, 1].
L-Infty
eps=0.1
Defense/Verification | Author | Model Structure | RACC | Note |
---|---|---|---|---|
Towards Stable and Efficient Training of Verifiably Robust Neural Networks | Zhang et al | large CNN | 78.73% | best reported number |
On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models | Gowal et al | large CNN | 77.63% | best reported number, as reported by Zhang et al |
Provably Robust Boosted Decision Stumps and Trees against Adversarial Attacks | Andriushchenko & Hein | Boosted trees | 76.83% | |
Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope | Wong et al | CNN | 65.47% |
* Within one dataset, L-2 and L-Infty balls are mutually transformable (see the sketch below). After transformation, a corresponding tight bound may exist but is not listed.
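The transformation rests on standard norm-ball containments. As a sketch, with $d$ the input dimension:

```latex
\|\delta\|_2 \le \epsilon \;\Rightarrow\; \|\delta\|_\infty \le \epsilon,
\qquad\qquad
\|\delta\|_\infty \le \epsilon \;\Rightarrow\; \|\delta\|_2 \le \epsilon\sqrt{d}
```

Hence an L-2 certificate of radius eps * sqrt(d) implies an L-Infty certificate of radius eps, and an L-Infty certificate of radius eps implies an L-2 certificate of the same radius. For example, assuming the standard 224 x 224 x 3 ImageNet input (d = 150528, sqrt(d) ≈ 388), certifying L-Infty radius 1/255 requires an L-2 radius of about 388/255 ≈ 1.52, which is how the L-2 smoothing results above are transformed into the L-Infty table.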
Notes:
-
Some papers report results under rarely used epsilon values, which makes comparison difficult. Some papers use the epsilon measured after input normalization instead of the raw one, which may also cause confusion.
We would suggest adopting common evaluation epsilon values and settings.
-
Instead of evaluating on the above benchmarks and reporting robust accuracy, some papers report the average certified (robust) radius; a minimal sketch of that metric appears after these notes. We will add a comparison table for this metric later.
-
Besides the on-the-board results, all these papers have their own unique takeaways. For interested readers and stakeholders, we recommend not only valuing the approaches with higher numbers, but also digging into their technical meat.
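For reference, a minimal sketch of the average-radius metric; `certified_radius` is a placeholder for a routine (e.g., the `certify` sketch above) that returns the certified radius of a sample, with 0 for misclassified or abstained samples:

```python
import numpy as np

def average_certified_radius(certified_radius, test_set):
    """Mean certified radius over the test set; misclassified or
    abstained samples contribute a radius of 0."""
    return float(np.mean([certified_radius(x, y) for x, y in test_set]))
```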
Reference: Empirical Robustness
For comparison, here we cite numbers from the MadryLab repositories for the MNIST challenge and the CIFAR-10 challenge, which record the best attacks against their robust models (whose weights are kept secret).
CIFAR-10
L-Infty
eps=8/255
Black-Box
Attack | Submitted by | Accuracy | Submission Date |
---|---|---|---|
PGD on the cross-entropy loss for the adversarially trained public network | (initial entry) | 63.39% | Jul 12, 2017 |
White-Box
Attack | Submitted by | Accuracy | Submission Date |
---|---|---|---|
MultiTargeted | Sven Gowal | 44.03% | Aug 28, 2019 |
MNIST
L-Infty
eps=0.3
Black-Box
Attack | Submitted by | Accuracy | Submission Date |
---|---|---|---|
AdvGAN from "Generating Adversarial Examples with Adversarial Networks" | AdvGAN | 92.76% | Sep 25, 2017 |
White-Box
Attack | Submitted by | Accuracy | Submission Date |
---|---|---|---|
First-Order Adversary with Quantized Gradients | Zhuanghua Liu | 88.32% | Oct 16, 2019 |
An (Incomplete) Paper List
Works in this field include provable training approaches and verification approaches. Related analysis or discussion papers are also listed.
Exact Verifiers
-
Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks
(CAV 2017, arxiv:1702.01135)
Feb 2017
Guy Katz, Clark Barrett, David Dill, Kyle Julian, Mykel Kochenderfer
-
(Planet) Formal Verification of Piece-Wise Linear Feed-Forward Neural Networks
(ATVA 2017, arxiv: 1705.01320)
May 2017
Ruediger Ehlers
-
(Survey paper) Algorithms for Verifying Deep Neural Networks
(arxiv: 1903.06758)
Mar 2019
Changliu Liu, Tomer Arnon, Christopher Lazarus, Clark Barrett, Mykel J. Kochenderfer
MILP (Mixed Integer Linear Programming)
A fast exact verifier.
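As a sketch of the standard big-M-style encoding behind these verifiers (pre-activation bounds $l \le x \le u$ with $l < 0 < u$ are computed beforehand), a ReLU $y = \max(0, x)$ turns into linear constraints plus one binary variable $a$:

```latex
y \ge 0, \qquad y \ge x, \qquad y \le x - l\,(1 - a), \qquad y \le u\,a, \qquad a \in \{0, 1\}
```

When $a = 0$ the constraints force $y = 0$ (inactive case), and when $a = 1$ they force $y = x$ (active case); verification then optimizes a margin objective subject to these constraints for every unstable neuron.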
-
Evaluating Robustness of Neural Networks with Mixed Integer Programming
(ICLR 2019, arxiv:1711.07356)
Nov 2017
Vincent Tjeng, Kai Xiao, Russ Tedrake
-
Training for Faster Adversarial Robustness Verification via Inducing ReLU Stability
(ICLR 2019, arxiv: 1809.03008)
Sep 2018
Kai Y. Xiao, Vincent Tjeng, Nur Muhammad Shafiullah, Aleksander Madry
Lipschitz-Based
-
(Heuristic, CLEVER) Evaluating the Robustness of Neural Networks: An Extreme Value Theory Approach
(ICLR 2018, arxiv: 1801.10578)
Jan 2018
*Tsui-Wei Weng, *Huan Zhang, Pin-Yu Chen, Jinfeng Yi, Dong Su, Yupeng Gao, Cho-Jui Hsieh, Luca Daniel
-
Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks
(NeurIPS 2018, arxiv: 1802.04034)
Feb 2018
Yusuke Tsuzuku, Issei Sato, Masashi Sugiyama
-
(Fast-Lip) Towards Fast Computation of Certified Robustness for ReLU Networks
(ICML 2018, arxiv: 1804.09699)
Apr 2018
*Tsui-Wei Weng, *Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inderjit S. Dhillon, Luca Daniel
-
On Extensions of CLEVER: A Neural Network Robustness Evaluation Algorithm
(GlobalSIP 2018, arxiv: 1810.08640)
Oct 2018
*Tsui-Wei Weng, *Huan Zhang, Pin-Yu Chen, Aurelie Lozano, Cho-Jui Hsieh, Luca Daniel
-
RecurJac: An Efficient Recursive Algorithm for Bounding Jacobian Matrix of Neural Networks and Its Applications
(AAAI 2019, arxiv: 1810.11783)
Oct 2018
Huan Zhang, Pengchuan Zhang, Cho-Jui Hsieh
IBP (Interval Bound Propagation)
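A minimal sketch of the core computation these papers build on (a generic illustration, not any specific paper's code): intervals are propagated through affine layers in center-radius form, and through monotone activations directly.

```python
import numpy as np

def ibp_affine(l, u, W, b):
    """Propagate elementwise bounds l <= x <= u through x -> W @ x + b."""
    center, radius = (u + l) / 2, (u - l) / 2
    new_center = W @ center + b
    new_radius = np.abs(W) @ radius  # worst case aligns perturbation signs with W
    return new_center - new_radius, new_center + new_radius

def ibp_relu(l, u):
    """ReLU is monotone, so it maps interval endpoints directly."""
    return np.maximum(l, 0), np.maximum(u, 0)
```

IBP training then minimizes a loss on the worst-case logits implied by the output bounds.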
-
(IBP + Dual) Training Verified Learners with Learned Verifiers
(arxiv: 1805.10265)
May 2018
Krishnamurthy Dvijotham, Sven Gowal, Robert Stanforth, Relja Arandjelovic, Brendan O'Donoghue, Jonathan Uesato, Pushmeet Kohli
-
On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models
(NeurIPS 2018 Workshop Best Paper, arxiv: 1810.12715)
Oct 2018
Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann, Pushmeet Kohli
-
Fast and Stable Interval Bounds Propagation for Training Verifiably Robust Models
(arxiv: 1906.00628)
Jun 2019
Paweł Morawiecki, Przemysław Spurek, Marek Śmieja, Jacek Tabor
-
Towards Stable and Efficient Training of Verifiably Robust Neural Networks
(arxiv: 1906.06316)
Jun 2019
Huan Zhang, Hongge Chen, Chaowei Xiao, Bo Li, Duane Boning, Cho-Jui Hsieh
Linear Relaxations
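The common building block of these works is a per-neuron linear relaxation. As a sketch, an unstable ReLU with pre-activation bounds $l < 0 < u$ can be sandwiched by linear functions:

```latex
\lambda x \;\le\; \max(0, x) \;\le\; \frac{u}{u - l}\,(x - l),
\qquad \lambda \in [0, 1], \quad l \le x \le u
```

Fast-Lin fixes $\lambda = u/(u-l)$, while CROWN picks the lower-bound slope adaptively; propagating such linear bounds layer by layer yields certified bounds on the network output.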
-
(Fast-Lin) Towards Fast Computation of Certified Robustness for ReLU Networks
(ICML 2018, arxiv: 1804.09699)
Apr 2018
*Tsui-Wei Weng, *Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inderjit S. Dhillon, Luca Daniel
-
(Zonotope) Differentiable Abstract Interpretation for Provably Robust Neural Networks
(ICML 2018)
Jul 2018
Matthew Mirman, Timon Gehr, Martin Vechev
-
(Zonotope) Boosting Robustness Certification of Neural Networks
(ICLR 2019)
Sep 2018
Gagandeep Singh, Timon Gehr, Markus Püschel, Martin Vechev
-
(CROWN) Efficient Neural Network Robustness Certification with General Activation Functions
(NIPS 2018, arxiv: 1811.00866)
Nov 2018
*Huan Zhang, *Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, Luca Daniel
-
(Zonotope) Fast and Effective Robustness Certification
(NIPS 2018)
Dec 2018
Gagandeep Singh, Timon Gehr, Matthew Mirman, Markus Püschel, Martin Vechev
-
(Zonotope) An Abstract Domain for Certifying Neural Networks
(POPL 2019)
Jan 2019
Gagandeep Singh, Timon Gehr, Markus Püschel, Martin Vechev
-
(Unification) A Convex Relaxation Barrier to Tight Robustness Verification of Neural Networks
(NeurIPS 2019, arxiv: 1902.08722)
Feb 2019
Hadi Salman, Greg Yang, Huan Zhang, Cho-Jui Hsieh, Pengchuan Zhang
-
Towards Stable and Efficient Training of Verifiably Robust Neural Networks
(arxiv: 1906.06316)
Jun 2019
Huan Zhang, Hongge Chen, Chaowei Xiao, Bo Li, Duane Boning, Cho-Jui Hsieh
-
(kReLU) Beyond the Single Neuron Convex Barrier for Neural Network Certification
(NeurIPS 2019)
Nov 2019
Gagandeep Singh, Rupanshu Ganvir, Markus Püschel, Martin Vechev
Linear Dual Space Relaxations
-
Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope
(ICML 2018, arxiv: 1711.00851)
Nov 2017
Eric Wong, J. Zico Kolter
-
A Dual Approach to Scalable Verification of Deep Networks
(UAI 2018 Best Paper, arxiv: 1803.06567)
Mar 2018
Krishnamurthy (Dj) Dvijotham, Robert Stanforth, Sven Gowal, Timothy Mann, Pushmeet Kohli
-
Scaling Provable Adversarial Defenses
(NIPS 2018, arxiv: 1805.12514)
May 2018
Eric Wong, Frank R. Schmidt, Jan Hendrik Metzen, J. Zico Kolter
-
Robustra: Training Provable Robust Neural Networks over Reference Adversarial Space
(IJCAI 2019)
Aug 2019
*Linyi Li, *Zexuan Zhong, Bo Li, Tao Xie
SDP and SDP-Dual
-
Certified Defenses against Adversarial Examples
(ICLR 2018, arxiv: 1801.09344)
Jan 2018
Aditi Raghunathan, Jacob Steinhardt, Percy Liang
-
Semidefinite relaxations for certifying robustness to adversarial examples
(NIPS 2018, arxiv: 1811.01057)
Nov 2018
Aditi Raghunathan, Jacob Steinhardt, Percy Liang
-
Safety Verification and Robustness Analysis of Neural Networks via Quadratic Constraints and Semidefinite Programming
(arxiv: 1903.01287)
Mar 2019
Mahyar Fazlyab, Manfred Morari, George J. Pappas
-
(SDP for Lipschitz) Efficient and Accurate Estimation of Lipschitz Constants for Deep Neural Networks
(NeurIPS 2019, arxiv: 1906.04893)
Jun 2019
Mahyar Fazlyab, Alexander Robey, Hamed Hassani, Manfred Morari, George J. Pappas
-
Efficient Neural Network Verification with Exactness Characterization
(UAI 2019, paper 164)
Jul 2019
Krishnamurthy (Dj) Dvijotham, Robert Stanforth, Sven Gowal, Chongli Qin, Soham De, Pushmeet Kohli
Differential Privacy and Randomized Smoothing
-
Certified Robustness to Adversarial Examples with Differential Privacy
(S&P 2019, arxiv: 1802.03471)
Feb 2018
Mathias Lecuyer, Vaggelis Atlidakis, Roxana Geambasu, Daniel Hsu, Suman Jana
-
Certified Adversarial Robustness via Randomized Smoothing
(ICML 2019, arxiv: 1902.02918)
Feb 2019
Jeremy M Cohen, Elan Rosenfeld, J. Zico Kolter
-
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers
(NeurIPS 2019, arxiv: 1906.04584)
Jun 2019
Hadi Salman, Greg Yang, Jerry Li, Pengchuan Zhang, Huan Zhang, Ilya Razenshteyn, Sebastien Bubeck
-
A Stratified Approach to Robustness for Randomly Smoothed Classifiers
(arxiv: 1906.04948)
Jun 2019
Guang-He Lee, Yang Yuan, Shiyu Chang, Tommi S. Jaakkola
Hybrid
-
(ReluVal) Formal Security Analysis of Neural Networks Using Symbolic Intervals
(USENIX Security 2018, arxiv: 1804.10829)
Apr 2018
Shiqi Wang, Kexin Pei, Justin Whitehouse, Junfeng Yang, Suman Jana
-
(Zonotope) AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation
(S&P 2018)
May 2018
Timon Gehr, Matthew Mirman, Dana Drachsler-Cohen, Petar Tsankov, Swarat Chaudhuri, Martin Vechev
-
Optimization + Abstraction: A Synergistic Approach for Analyzing Neural Network Robustness
(PLDI 2019, arxiv: 1904.09959)
Apr 2019
Greg Anderson, Shankara Pailoor, Isil Dillig, Swarat Chaudhuri
Ensemble
-
(Cascade) Scaling Provable Adversarial Defenses
(NIPS 2018, arxiv: 1805.12514)
May 2018
Eric Wong, Frank R. Schmidt, Jan Hendrik Metzen, J. Zico Kolter
-
(Cascade) Enhancing Certifiable Robustness via a Deep Model Ensemble
(ICLR 2019 Workshop, arxiv: 1910.14655)
Oct 2019
Huan Zhang, Minhao Cheng, Cho-Jui Hsieh
Distributional and Probabilistic
-
Certifying Some Distributional Robustness with Principled Adversarial Training
(ICLR 2018, arxiv: 1710.10571)
Oct 2017
Aman Sinha, Hongseok Namkoong, John Duchi
-
PROVEN: Certifying Robustness of Neural Networks with a Probabilistic Approach
(ICML 2019, arxiv: 1812.08329)
Dec 2018
Tsui-Wei Weng, Pin-Yu Chen, Lam M. Nguyen, Mark S. Squillante, Ivan Oseledets, Luca Daniel
Theory and Analysis
-
(Lp Bound Unreliable) On the sensitivity of adversarial robustness to input data distributions
(ICLR 2019, arxiv: 1902.08336)
Feb 2019
Gavin Weiguang Ding, Kry Yik Chau Lui, Xiaomeng Jin, Luyu Wang, Ruitong Huang
-
Universal Approximation with Certified Networks
(ICLR 2020 Submission, arxiv: 1909.13846)
Sep 2019
Maximilian Baader, Matthew Mirman, Martin Vechev
Other Approaches
-
Provable Robustness of ReLU networks via Maximization of Linear Regions
(AISTATS 2019, arxiv: 1810.07481)
Oct 2018
Francesco Croce, Maksym Andriushchenko, Matthias Hein
-
Provable Certificates for Adversarial Examples: Fitting a Ball in the Union of Polytopes
(ICML 2019 SPML Workshop, arxiv: 1903.08778)
Mar 2019
Matt Jordan, Justin Lewis, Alexandros G. Dimakis
Dealing with General Settings
Note that many of the papers above can be generalized to activation functions beyond ReLU. Here we only list works whose main contribution is handling such general settings.
-
(Zonotope) An Abstract Domain for Certifying Neural Networks
(POPL 2019)
Jan 2019
Gagandeep Singh, Timon Gehr, Markus Püschel, Martin Vechev
-
(Non-linear Specs) Verification of Non-Linear Specifications for Neural Networks
(ICLR 2019, arxiv: 1902.09592)
Feb 2019
Chongli Qin, Krishnamurthy (Dj) Dvijotham, Brendan O'Donoghue, Rudy Bunel, Robert Stanforth, Sven Gowal, Jonathan Uesato, Grzegorz Swirszcz, Pushmeet Kohli
-
(Geometric Perturbation) Certifying Geometric Robustness of Neural Networks
(NeurIPS 2019)
Nov 2019
Mislav Balunovic, Maximilian Baader, Gagandeep Singh, Timon Gehr, Martin Vechev
Applications
The following papers either apply the above approaches to specific domains, or deal with different but closely related problems.
NLP
-
Certified Robustness to Adversarial Word Substitutions
(EMNLP 2019, arxiv: 1909.00986)
Sep 2019
Robin Jia, Aditi Raghunathan, Kerem Göksel, Percy Liang
-
Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation
(arxiv: 1909.01492)
Sep 2019
Po-Sen Huang, Robert Stanforth, Johannes Welbl, Chris Dyer, Dani Yogatama, Sven Gowal, Krishnamurthy Dvijotham, Pushmeet Kohli
Tree Model
-
Provably Robust Boosted Decision Stumps and Trees against Adversarial Attacks
(NeurIPS 2019, arxiv: 1906.03526)
Jun 2019
Maksym Andriushchenko, Matthias Hein
-
Robustness Verification of Tree-based Models
(NeurIPS 2019, arxiv: 1906.03849)
Jun 2019
Hongge Chen, Huan Zhang, Si Si, Yang Li, Duane Boning, Cho-Jui Hsieh
CNN
-
CNN-Cert: An Efficient Framework for Certifying Robustness of Convolutional Neural Networks
(arxiv: 1811.12395)
Nov 2018
Akhilan Boopathy, Tsui-Wei Weng, Pin-Yu Chen, Sijia Liu, Luca Daniel
CV
-
(blackbox CV) Towards Practical Verification of Machine Learning: The Case of Computer Vision Systems
(arxiv: 1712.01785)
Dec 2017
Kexin Pei, Yinzhi Cao, Junfeng Yang, Suman Jana
Reinforcement Learning
-
(Policy Verify) Verification of Neural Network Control Policy Under Persistent Adversarial Perturbation
(arxiv:1908.06353)
Aug 2019
Yuh-Shyang Wang, Tsui-Wei Weng, Luca Daniel
Probabilistic Models
-
Verification of Deep Probabilistic Models
(NIPS 2018 Workshop, arxiv: 1812.02795)
Dec 2018
Krishnamurthy Dvijotham, Marta Garnelo, Alhussein Fawzi, Pushmeet Kohli
Maintained by Linyi.
Last updated: Nov 14, 2019