ryderling / DEEPSEC

DEEPSEC: A Uniform Platform for Security Analysis of Deep Learning Model

Paper uses averages instead of the minimum for security analysis

carlini opened this issue · comments

Perhaps the one key factor that differentiates security (and adversarial robustness) from other general forms of robustness is the worst-case mindset from which we evaluate. This paper uses the mean throughout to evaluate both attacks and defenses.

Using the mean over various attacks to compute the “security” of a defense completely misunderstands what it means to perform a security evaluation in the first place.

For example, the paper bolds the column for the NAT defense when evaluated on CIFAR-10 because it gives the highest “average security” against all attacks. However, this is fundamentally the incorrect evaluation to make: the only metric that matters in security is how well a defense withstands attacks targeting that defense. And in this setting, the alternate adversarial training approach of Madry et al. is strictly stronger.
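The difference between the two evaluation criteria can be sketched in a few lines. The numbers below are purely illustrative (not taken from the paper): one hypothetical defense averages well but collapses under a single adaptive attack, while the other holds up against its strongest attack.

```python
# Hypothetical per-attack robust accuracies for two defenses.
# These numbers are made up for illustration only.
accuracies = {
    "DefenseA": [0.80, 0.75, 0.70, 0.02],  # collapses under one adaptive attack
    "DefenseB": [0.50, 0.47, 0.46, 0.45],  # survives its strongest attack
}

for name, accs in accuracies.items():
    mean_acc = sum(accs) / len(accs)   # the paper's averaging criterion
    worst_acc = min(accs)              # the security-relevant criterion
    print(f"{name}: mean={mean_acc:.2f}, worst-case={worst_acc:.2f}")
```

Under the mean, DefenseA looks stronger; under the worst case, DefenseB does. Ranking by the mean rewards defenses that do well against attacks no real adversary would bother using.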

For this reason, when the paper says that all the defenses are "more or less" effective, it completely misrepresents what is actually going on. In fact, almost all of the defenses studied offer 0% robustness to any actual attack. By obscuring this fact, the work of Madry et al. and Xu et al., which actually mostly satisfies its security claims, isn't appropriately recognized.

commented

We do agree that security is a worst-case guarantee, and that no system/model is absolutely secure from this perspective. However, from a practical or statistical point of view, security is often relative. For instance, Windows 10 is generally considered more secure for users than Windows XP, even though, by your criterion, both offer a 0% security guarantee against actual attacks.

Therefore, from a statistical or practical security perspective, rather than whether one particular defense is secure or not in the worst case scenario, we try to capture the overall security differences of different types of defenses, such as "For complete defenses, most of them have the capability of defending against some adversarial attacks, but no defense is universal. Particularly, the defenses that retrain their models usually perform better than others without retraining".
We believe it is also important to let readers know which model is statistically more secure, even while we agree that no model is 100% secure against worst-case attacks. In addition, we would like to point out that Madry et al. is only more secure within a strict perturbation-magnitude bound, because it is trained intensively within the norm-bounded ball, and this cannot be viewed as the "worst-case scenario" in security. So it is not exactly accurate to claim that "Madry et al. is more robust than other models"; it is certainly not secure against powerful attackers who can add larger perturbations in a worst-case attack. That does not mean the model is not useful, however.

Definitely, there is no such thing as perfect security and some things can be more robust than others.

Fortunately, we have a way to measure this. Accuracy.

For example, the model of Madry et al. gives roughly 45% accuracy on CIFAR-10 with l_infinity distortion of 0.031. PixelDefend gives 9%. I'm not making the argument "neither model gives 100% robustness and so they're both useless", but rather the argument that we must compare worst-case robustness when dealing with security. That is what makes security different from other fields.
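For concreteness, the l_infinity bound of 0.031 (i.e. 8/255 on CIFAR-10 pixel values in [0, 1]) means every pixel of the adversarial input may differ from the clean input by at most that amount. A minimal sketch of projecting a perturbed input back into that ball (the helper name `project_linf` is our own, not from any library):

```python
# Project a perturbed input back into the l_infinity ball of radius eps
# around the clean input; eps = 8/255 is approximately the 0.031 bound
# used by Madry et al. on CIFAR-10. Pure-Python lists for illustration.
def project_linf(x_adv, x_clean, eps=8 / 255):
    return [min(max(a, c - eps), c + eps) for a, c in zip(x_adv, x_clean)]

x_clean = [0.50, 0.50, 0.50]
x_adv = [0.60, 0.45, 0.20]  # candidate adversarial pixels, some out of bounds
x_proj = project_linf(x_adv, x_clean)

# After projection, no pixel deviates from the clean input by more than eps.
print(max(abs(a - c) for a, c in zip(x_proj, x_clean)))
```

Any attacker not restricted to this ball (e.g. one allowed a larger eps, or a different norm) is outside the threat model under which the 45% figure holds, which is the point of the response above.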