adversariel / cleverhans

An adversarial example library for constructing attacks, building defenses, and benchmarking both

CleverHans (latest release: v2.1.0)

This repository contains the source code for CleverHans, a Python library to benchmark machine learning systems' vulnerability to adversarial examples. You can learn more about such vulnerabilities on the accompanying blog.

The CleverHans library is under continual development, always welcoming contributions of the latest attacks and defenses. In particular, we always welcome help towards resolving the issues currently open.

Setting up CleverHans

Dependencies

This library uses TensorFlow to accelerate graph computations performed by many machine learning models. Installing TensorFlow is therefore a prerequisite.

Instructions are available in the TensorFlow installation documentation. For better performance, we also recommend installing TensorFlow with GPU support, which the same documentation covers in detail.

Installing TensorFlow will take care of all other dependencies like numpy and scipy.

Installation

Once dependencies have been taken care of, you can install CleverHans using pip or by cloning this GitHub repository.

pip installation

If you are installing CleverHans using pip, run the following command after installing TensorFlow:

pip install cleverhans

This will install the latest version uploaded to PyPI. If you'd instead like to install the bleeding edge version, use:

pip install -e git+https://github.com/tensorflow/cleverhans.git#egg=cleverhans

Manual installation

If you are installing CleverHans manually, install TensorFlow first. Then, run the following command to clone the CleverHans repository into a folder of your choice:

git clone https://github.com/tensorflow/cleverhans

You can then install the local package in "editable" mode in order to add it to your PYTHONPATH:

pip install -e ./cleverhans

Currently supported setups

Although CleverHans is likely to work on many other machine configurations, we currently test it with Python {2.7, 3.5} and TensorFlow {1.4, 1.8} on Ubuntu 14.04.5 LTS (Trusty Tahr). Support for TensorFlow 1.3 and earlier is deprecated. After 2018-11-01 we will not fix bugs reported for these versions, and we will remove the wrapper code needed for backwards compatibility with them.
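As a rough way to compare a local machine against the tested setup above, the following sketch reports the running Python version and whether the tensorflow and cleverhans packages can be located. The helper names is_installed and describe_setup are hypothetical and not part of CleverHans. The sketch relies on importlib.util, so it assumes Python 3.4 or later, and it does not check the TensorFlow version.

```python
import importlib.util
import sys

# Python versions this README lists as tested; informational only.
TESTED_PYTHON = {(2, 7), (3, 5)}


def is_installed(module_name):
    """Return True if a top-level module can be located (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


def describe_setup(python_version=None):
    """Summarise the local setup against the tested configuration.

    python_version is an optional (major, minor) tuple; it defaults to
    the interpreter that is running this code.
    """
    major_minor = tuple((python_version or sys.version_info)[:2])
    return {
        "python": "%d.%d" % major_minor,
        "python_tested": major_minor in TESTED_PYTHON,
        "tensorflow_installed": is_installed("tensorflow"),
        "cleverhans_installed": is_installed("cleverhans"),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_setup().items()):
        print("%s: %s" % (key, value))
```

A version outside the tested set is not necessarily unsupported; it only means the configuration is not one of those covered by the project's tests.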

Tutorials

To help you get started with the functionalities provided by this library, the `cleverhans_tutorials/` folder comes with the following tutorials:

  • MNIST with FGSM (code): this tutorial covers how to train an MNIST model using TensorFlow, craft adversarial examples using the fast gradient sign method, and make the model more robust to adversarial examples using adversarial training.
  • MNIST with FGSM using Keras (code): this tutorial covers how to define an MNIST model with Keras and train it using TensorFlow, craft adversarial examples using the fast gradient sign method, and make the model more robust to adversarial examples using adversarial training.
  • MNIST with JSMA (code): this tutorial covers how to define an MNIST model with Keras, train it using TensorFlow, and craft adversarial examples using the Jacobian-based saliency map approach.
  • MNIST using a black-box attack (code): this tutorial implements the black-box attack described in this paper. The adversary trains a substitute model: a copy that imitates the black-box model by observing the labels that the black-box model assigns to inputs chosen carefully by the adversary. The adversary then uses the substitute model’s gradients to find adversarial examples that are misclassified by the black-box model as well.

Some models used in the tutorials are defined using Keras, which should be installed before running these tutorials. Installation instructions for Keras can be found here. Note that you should configure Keras to use the TensorFlow backend. You can find instructions for setting the Keras backend on this page.
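To give a feel for the fast gradient sign method used in the first two tutorials, here is a minimal NumPy sketch on a hand-written logistic regression model. It is not the CleverHans implementation and does not use the library's API. The function name fgsm, the toy weights, and the input values are all illustrative assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fgsm(x, y, w, b, eps, clip_min=0.0, clip_max=1.0):
    """One-step fast gradient sign perturbation for logistic regression.

    x: (n, d) inputs, y: (n,) labels in {0, 1}, w: (d,) weights, b: bias.
    Each feature moves by eps in the direction that increases the
    cross-entropy loss; the result is clipped to the valid input range.
    """
    p = sigmoid(np.dot(x, w) + b)
    # For cross-entropy loss, dL/dx = (p - y) * w for each example.
    grad_x = (p - y)[:, None] * w[None, :]
    x_adv = x + eps * np.sign(grad_x)
    return np.clip(x_adv, clip_min, clip_max)


# Toy model and a single correctly classified input (illustrative values).
w = np.array([2.0, -2.0])
b = 0.0
x = np.array([[0.8, 0.2]])
y = np.array([1.0])

x_adv = fgsm(x, y, w, b, eps=0.4)
pred_clean = (sigmoid(np.dot(x, w) + b) > 0.5).astype(int)
pred_adv = (sigmoid(np.dot(x_adv, w) + b) > 0.5).astype(int)
print("clean prediction:", pred_clean, "adversarial prediction:", pred_adv)
```

In this toy case a perturbation of 0.4 per feature is enough to flip the prediction. The tutorials apply the same idea to MNIST models, where the gradient is computed by TensorFlow rather than by hand.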

Examples

The examples/ folder contains additional scripts to showcase different uses of the CleverHans library or get you started competing in different adversarial example contests.

List of attacks

You can find a full list of attacks along with their function signatures at cleverhans.readthedocs.io.

Reporting benchmarks

When reporting benchmarks, please:

  • Use a versioned release of CleverHans. You can find a list of released versions here.
  • Either use the latest version, or, if comparing to an earlier publication, use the same version as the earlier publication.
  • Report which attack method was used.
  • Report any configuration variables used to determine the behavior of the attack.

For example, you might report "We benchmarked the robustness of our method to adversarial attack using v2.1.0 of CleverHans. On a test set modified by the FastGradientMethod with a max-norm eps of 0.3, we obtained a test set accuracy of 71.3%."
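One way to keep such reports consistent is to generate them from the recorded settings. The helper below is a hypothetical sketch, not part of CleverHans. The function name format_benchmark and the parameter dictionary are assumptions made for illustration; it simply assembles the four items listed above into one line.

```python
def format_benchmark(version, attack, params, accuracy):
    """Build a one-line benchmark summary.

    version: CleverHans release string, e.g. "v2.1.0"
    attack: name of the attack method used
    params: dict of configuration variables passed to the attack
    accuracy: accuracy on the adversarial test set, as a fraction in [0, 1]
    """
    # Sort parameters so the same configuration always prints the same way.
    param_text = ", ".join("%s=%r" % (k, params[k]) for k in sorted(params))
    return ("CleverHans %s; attack: %s; parameters: %s; "
            "adversarial test accuracy: %.1f%%"
            % (version, attack, param_text, accuracy * 100))


if __name__ == "__main__":
    # Values taken from the example sentence above; "ord" is illustrative.
    print(format_benchmark("v2.1.0", "FastGradientMethod",
                           {"eps": 0.3, "ord": "inf"}, 0.713))
```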

Contributing

Contributions are welcomed! To speed the code review process, we ask that:

Bug fixes can be initiated through GitHub pull requests.

Citing this work

If you use CleverHans for academic research, you are highly encouraged (though not required) to cite the following paper:

@article{papernot2018cleverhans,
  title={Technical Report on the CleverHans v2.1.0 Adversarial Examples Library},
  author={Nicolas Papernot and Fartash Faghri and Nicholas Carlini and
  Ian Goodfellow and Reuben Feinman and Alexey Kurakin and Cihang Xie and
  Yash Sharma and Tom Brown and Aurko Roy and Alexander Matyasko and
  Vahid Behzadan and Karen Hambardzumyan and Zhishuai Zhang and
  Yi-Lin Juang and Zhi Li and Ryan Sheatsley and Abhibhav Garg and 
  Jonathan Uesato and Willi Gierke and Yinpeng Dong and David Berthelot and
  Paul Hendricks and Jonas Rauber and Rujun Long},
  journal={arXiv preprint arXiv:1610.00768},
  year={2018}
}

About the name

The name CleverHans is a reference to a presentation by Bob Sturm titled "Clever Hans, Clever Algorithms: Are Your Machine Learnings Learning What You Think?" and the corresponding publication, "A Simple Method to Determine if a Music Information Retrieval System is a 'Horse'." Clever Hans was a horse that appeared to have learned to answer arithmetic questions, but had in fact only learned to read social cues that enabled him to give the correct answer. In controlled settings where he could not see people's faces or receive other feedback, he was unable to answer the same questions. The story of Clever Hans is a metaphor for machine learning systems that may achieve very high accuracy on a test set drawn from the same distribution as the training data, but that do not actually understand the underlying task and perform poorly on other inputs.

Authors

This library is managed and maintained by Ian Goodfellow (Google Brain), Nicolas Papernot (Pennsylvania State University), and Ryan Sheatsley (Pennsylvania State University).

The following authors contributed 100 lines or more (ordered according to the GitHub contributors page):

  • Nicolas Papernot (Pennsylvania State University, Google Brain intern)
  • Fartash Faghri (University of Toronto, Google Brain intern)
  • Nicholas Carlini (UC Berkeley)
  • Ian Goodfellow (Google Brain)
  • Reuben Feinman (Symantec)
  • Alexey Kurakin (Google Brain)
  • Cihang Xie (Johns Hopkins)
  • Yash Sharma (The Cooper Union)
  • Tom Brown (Google Brain)
  • Aurko Roy (Google Brain)
  • Alexander Matyasko (Nanyang Technological University)
  • Vahid Behzadan (Kansas State)
  • Karen Hambardzumyan (YerevaNN)
  • Zhishuai Zhang (Johns Hopkins)
  • Yi-Lin Juang (NTUEE)
  • Zhi Li (University of Toronto)
  • Ryan Sheatsley (Pennsylvania State University)
  • Abhibhav Garg (IIT Delhi)
  • Jonathan Uesato (MIT)
  • Willi Gierke (Hasso Plattner Institute)
  • Yinpeng Dong (Tsinghua University)
  • David Berthelot (Google Brain)
  • Paul Hendricks (NVIDIA)
  • Jonas Rauber (IMPRS)
  • Rujun Long (0101.AI)

Copyright

Copyright 2018 - Google Inc., OpenAI and Pennsylvania State University.

License:MIT License

