mlxyz/insynth

InSynth

Robustness testing of Keras models using domain-specific input generation in Python

Table of Contents

About The Project
- Built With
Getting Started
- Prerequisites
- Installation
Usage
Reproduce Experimental Results
Contributing
License
Contact

About The Project

The robustness of machine learning models is crucial to their safe and reliable operation in real-world applications. However, conducting robustness tests is hard as it requires evaluating the model under test repeatedly on different datasets.

InSynth provides an easy-to-use, efficient and reliable framework for conducting robustness tests.

It works by applying a set of domain-specific input generation techniques (image, audio or text) to the seed dataset and evaluating the model under test on the generated inputs. A set of coverage criteria is then evaluated to determine how well each dataset covers the model. Finally, a report is generated that compares the model's performance and coverage across the generated datasets.

Built With

Getting Started

This section describes how to get started with InSynth.

Prerequisites

Before installing InSynth, make sure you have the following software applications installed and updated to the latest version.

Installation

Installing InSynth takes a single step. Run the following command to install the Python package from PyPI:
```
pip install insynth
```

Usage

InSynth can be used in a variety of ways depending on the goal you are trying to achieve.

For a complete end-to-end robustness testing example, see the docs/robustness_test_example.ipynb notebook.

Data Generation

To mutate an existing dataset using any of the perturbators provided in the framework, follow the steps below.

Import the perturbator (e.g. the ImageNoisePerturbator) from the respective module.
```
from insynth.perturbators.image import ImageNoisePerturbator
```
Create an instance of the perturbator.
```
perturbator = ImageNoisePerturbator()
```
Create a PIL image object from a file stored on disk and apply the perturbator to it.
```
from PIL import Image

seed_image = Image.open('path/to/image.jpg')
mutated_image = perturbator.apply(seed_image)
```
For audio perturbators, the same procedure applies, except that the seed audio is loaded with the librosa.load method. Similarly, text perturbators expect the seed text to be provided as a string.

Save the mutated image to disk or display it.

```
mutated_image.save('path/to/mutated_image.jpg')
mutated_image.show()
```
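
To mutate a whole dataset rather than a single sample, the steps above can be wrapped in a small loop. The following is a minimal sketch and not part of InSynth: the `mutate_dataset` helper and the two text perturbator classes are hypothetical stand-ins, chosen so the example runs without InSynth installed. The only thing taken from the steps above is that a perturbator exposes an `apply` method.

```python
from typing import Iterable, Iterator, Tuple


class WordSwapPerturbator:
    """Hypothetical stand-in (not an InSynth class): swaps the first two words."""

    def apply(self, text: str) -> str:
        words = text.split()
        if len(words) >= 2:
            words[0], words[1] = words[1], words[0]
        return " ".join(words)


class UppercasePerturbator:
    """Hypothetical stand-in (not an InSynth class): upper-cases the text."""

    def apply(self, text: str) -> str:
        return text.upper()


def mutate_dataset(perturbators: Iterable, seeds: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Lazily yield (perturbator name, mutated sample) for every seed sample."""
    perturbators = list(perturbators)  # allow re-use for every seed
    for seed in seeds:
        for perturbator in perturbators:
            yield type(perturbator).__name__, perturbator.apply(seed)


seeds = ["hello world"]
results = list(mutate_dataset([WordSwapPerturbator(), UppercasePerturbator()], seeds))
for name, mutated in results:
    print(name, "->", mutated)
```

Because `mutate_dataset` is a generator, mutated samples are produced one at a time, which keeps memory usage low for large seed datasets. With InSynth installed, the stand-in classes would presumably be replaced by perturbators such as the ImageNoisePerturbator shown above.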

Coverage Criteria Calculation

To calculate the coverage criteria for a model, follow the steps below.

Import the coverage calculator (e.g. the StrongNeuronActivationCoverageCalculator) from the respective module.
```
from insynth.calculators.neuron import StrongNeuronActivationCoverageCalculator
```
Create an instance of the coverage calculator and pass the model to be tested to the constructor.
```
coverage_calculator = StrongNeuronActivationCoverageCalculator(model)
```
If applicable, run the update_neuron_bounds method to determine the neuron bounds of the model.
```
coverage_calculator.update_neuron_bounds(training_dataset)
```
Run the update_coverage method to update model coverage for the given input.
```
coverage_calculator.update_coverage(input_data)
```
Run the get_coverage method to retrieve the current model coverage.
```
coverage = coverage_calculator.get_coverage()
```
Print the coverage to the console.
```
print(coverage)
```
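
To make the sequence of calls above more concrete, here is a toy sketch of the general idea behind strong neuron activation coverage as it is commonly described in the testing literature: a neuron counts as covered once some test input pushes its activation above the maximum observed on the training data. This is an illustration only, not InSynth's implementation. The `ToyStrongActivationCoverage` class is hypothetical, it works on plain lists of activation values instead of a Keras model, and returning a single float from `get_coverage` is an assumption.

```python
from typing import Iterable, List, Sequence


class ToyStrongActivationCoverage:
    """Toy illustration only; mirrors the method names used above."""

    def __init__(self, num_neurons: int) -> None:
        # Highest activation seen per neuron on the training data.
        self.upper_bounds: List[float] = [float("-inf")] * num_neurons
        # Whether a test input has exceeded that bound for each neuron.
        self.covered: List[bool] = [False] * num_neurons

    def update_neuron_bounds(self, training_activations: Iterable[Sequence[float]]) -> None:
        for vector in training_activations:
            for i, value in enumerate(vector):
                self.upper_bounds[i] = max(self.upper_bounds[i], value)

    def update_coverage(self, activations: Sequence[float]) -> None:
        for i, value in enumerate(activations):
            if value > self.upper_bounds[i]:
                self.covered[i] = True

    def get_coverage(self) -> float:
        # Fraction of neurons pushed beyond their training-time maximum.
        return sum(self.covered) / len(self.covered)


calc = ToyStrongActivationCoverage(num_neurons=4)
calc.update_neuron_bounds([[0.1, 0.5, 0.2, 0.9], [0.3, 0.4, 0.6, 0.7]])
calc.update_coverage([0.2, 0.8, 0.1, 0.95])  # neurons 1 and 3 exceed their bounds
calc.update_coverage([0.4, 0.1, 0.1, 0.1])   # neuron 0 exceeds its bound
print(calc.get_coverage())  # 0.75
```

The sketch also shows why update_neuron_bounds has to run before update_coverage for this kind of criterion: without training-time bounds there is nothing to compare the test activations against.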

Robustness Testing

The previous two sections describe how to generate a mutated dataset and how to calculate the coverage criteria for a model. Both are prerequisites for testing the robustness of a model. To conduct a full end-to-end robustness test, InSynth provides runner classes.

Import the runner class from the respective module.
```
from insynth.runners import BasicImageRunner
```
Create an instance of the runner class and pass the list of perturbators, the list of coverage calculators and the model to the constructor in addition to the dataset inputs and target variables.
```
runner = BasicImageRunner(list_of_perturbators, list_of_coverage_calculators, dataset_x, dataset_y, model)
```
Note that the dataset_x parameter should be a callable that returns a Python generator iterating over all samples. This enables the processing of large datasets which do not fit into memory.
```
dataset_x = lambda: (x for x in dataset)
```
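
A plausible reason for passing a callable instead of the generator itself is that a generator can only be consumed once, whereas a runner that evaluates several perturbators needs to iterate over the samples repeatedly. This reasoning is an inference and is not taken from the InSynth documentation. The short sketch below uses a hypothetical three-element dataset to show the difference.

```python
dataset = [1, 2, 3]  # hypothetical in-memory dataset for illustration

# A bare generator can only be consumed once.
one_shot = (x for x in dataset)
first_pass = list(one_shot)
second_pass = list(one_shot)  # the generator is already exhausted

# A callable returning a generator hands out a fresh iterator on every call.
dataset_x = lambda: (x for x in dataset)
fresh_first = list(dataset_x())
fresh_second = list(dataset_x())

print(first_pass, second_pass)    # [1, 2, 3] []
print(fresh_first, fresh_second)  # [1, 2, 3] [1, 2, 3]
```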
Run the run method to conduct the end-to-end robustness test.
```
report, robustness_score = runner.run()
```
Use the report variable to analyse the test results or use the robustness_score variable to retrieve a single robustness measure of the model.
```
print(report)
print(robustness_score)
```
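
Conceptually, an end-to-end run evaluates the model on the original samples and on each perturbed variant, then compares the results. The sketch below illustrates that comparison with a toy accuracy report. It is not how InSynth's runners are implemented, and it does not reproduce InSynth's report format or robustness score. The names `toy_model`, `UppercasePerturbator`, `accuracy` and `toy_robustness_report` are hypothetical, and passing the labels as a plain list is an assumption.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def toy_model(text: str) -> int:
    """Hypothetical stand-in classifier: predicts 1 if the text contains 'good'."""
    return 1 if "good" in text else 0


class UppercasePerturbator:
    """Hypothetical stand-in (not an InSynth class): upper-cases the text."""

    def apply(self, text: str) -> str:
        return text.upper()


def accuracy(model: Callable[[str], int], inputs: Iterable[str], labels: List[int]) -> float:
    predictions = [model(x) for x in inputs]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def toy_robustness_report(
    perturbators: List,
    dataset_x: Callable[[], Iterator[str]],
    dataset_y: List[int],
    model: Callable[[str], int],
) -> Dict[str, float]:
    """Accuracy on the original data and on each perturbed copy of it."""
    report = {"original": accuracy(model, dataset_x(), dataset_y)}
    for perturbator in perturbators:
        # dataset_x() is called again to obtain a fresh generator per perturbator.
        mutated = (perturbator.apply(x) for x in dataset_x())
        report[type(perturbator).__name__] = accuracy(model, mutated, dataset_y)
    return report


texts = ["good movie", "bad movie", "really good", "awful"]
labels = [1, 0, 1, 0]
report = toy_robustness_report([UppercasePerturbator()], lambda: (t for t in texts), labels, toy_model)
print(report)  # {'original': 1.0, 'UppercasePerturbator': 0.5}
```

In this toy example the case-sensitive model loses half of its accuracy once the inputs are upper-cased, which is the kind of degradation a robustness test is meant to surface.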

If you want to apply all available perturbators and coverage calculators for a given domain, utilize the respective ComprehensiveRunner classes.

For more examples, please refer to the documentation.

Reproduce Experimental Results

The experimental results from the thesis can be reproduced by running the corresponding scripts in the experiments directory.

The performance comparison experiment is conducted in the reproduce_coverage_speed_comparison.py script.

The performance and coverage experiment is conducted in the reproduce_imagenet_results.py, reproduce_speaker_recognition_results.py and reproduce_sentiment_analysis_results.py scripts. To generate the speaker recognition and sentiment analysis models, the generate_model_speaker_recognition.py and generate_model_sentiment_analysis.py scripts have to be run first.

The perturbation strength experiment is conducted in the reproduce_imagenet_sensitivity_results.py, reproduce_speaker_recognition_sensitivity_results.py and reproduce_sentiment_analysis_sensitivity_results.py scripts.

Lastly, the diagrams used in the thesis can be generated by running the result_analysis.ipynb notebook.

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!