Reilover/Eq-MARL

Overview and Running the Code
Citation

If you use the code or derive code from it-- Cite the MIT License, Cite our work

Overview and Running the Code

The code learns movement policies in three MTD domain. The conda environment with all the prerequisite dependencies can be found at:

./conda_env.yml

Running the BSS-Q (but not the baselines) requires one to have a Gurobi installed in their machine alongwith a Liscence Key (it is free for academic researchers). More information on this can be found here. We will now briefly discuss how to use the existing data-files that are used in the paper and also how to generate all the data from scratch.

Moving Target Defense for Placement of Intrusion Detection Systems

Note that this domain does not need the full generality offered by Bayesian Stackelberg Markov Games and can be handled via MARL approaches for Markov Games.

Experimental data shown in the paper can be found at:

Hyperparameters -- ./markov_games/outputs/mtd-ids/hyperparameters.ini

Saved Rewards -- ./markov_games/outputs/mtd-ids/exp_data_ <learner name> Learner.pickle

Saved Plot -- ./markov_games/images/mtd-ids/state_rewards.png

Reproducing plot with existing data.

cd ./markov_games

# Copy saved data files
cp ./outputs/mtd-ids/*.ini ./

# Run code to plot graph
python plot_values.py

# View output
eog ./images/mtd-ids/state_rewards.png

Training RL agent and generating graph from scratch

cd ./markov_games

# Edit hyperparameters
vi hyperparameters.ini

# Run code (takes significant amount of time)
python run.py

# plot using generated data (stored in output folder)
python plot_values.py

# View output
eog ./images/mtd-ids/state_rewards.png

Moving Target Defense for Web-Application Security

Switching costs factored into the rewards of the BSMG

Experimental data shown in the paper can be found at:

Hyperparameters -- ./bayesian_stackelberg_markov_games/outputs/mtd_wa/paper/hyperparameters.ini

Saved Rewards -- ./bayesian_stackelberg_markov_games/outputs/mtd_wa/paper/exp_data_ <learner name> Learner.pickle

Saved Plot -- ./bayesian_stackelberg_markov_games/images/mtd_wa/paper/state_rewards.png

Reproducing plot with existing data.

cd ./bayesian_stackelberg_markov_games

# Copy saved data files
cp ./outputs/mtd_wa/paper/*.ini ./

# Run code to plot graph
python plot_values.py

# View output
eog ./images/mtd_wa/paper/state_rewards.png

Training RL agent and generating graph from scratch

cd ./bayesian_stackelberg_markov_games

# Edit hyperparameters
vi hyperparameters.ini

# Run code (takes significant amount of time)
python run.py

# plot using generated data (stored in output folder)
python plot_values.py

# View output
eog ./images/mtd_wa/paper/state_rewards.png

We have data from other runs that is available by simply replacing paper by run_5_30.

Switching factored into the transition dynamics of BSMG

Experimental data shown in the paper can be found at:

Hyperparameters -- ./bayesian_stackelberg_markov_games/outputs/mtd_wa_stochastic/hyperparameters.ini

Saved Rewards -- ./bayesian_stackelberg_markov_games/outputs/mtd_wa_stochastic/exp_data_ <learner name> Learner.pickle

Saved Plot -- ./bayesian_stackelberg_markov_games/images/mtd_wa_stochastic/state_rewards.png

Reproducing plot with existing data.

cd ./bayesian_stackelberg_markov_games

# Copy saved data files
cp ./outputs/mtd_wa_stochastic/*.ini ./

# Copy saved hyperparameters
cp ./outputs/mtd_wa_stochastic/*.pickle ./outputs/

# Run code to plot graph
python plot_values.py

# View output
eog ./images/mtd_wa_stochastic/state_rewards.png

Training RL agent and generating graph from scratch

cd ./bayesian_stackelberg_markov_games

# Edit hyperparameters
vi hyperparameters.ini

# Run code (takes significant amount of time)
python run.py

# plot using generated data (stored in output folder)
python plot_values.py

# View output
eog ./images/mtd_wa_stochastic/state_rewards.png

Citation

@article{sengupta2020multi,
  title={Multi-agent reinforcement learning in bayesian stackelberg markov games for adaptive moving target defense},
  author={Sengupta, Sailik and Kambhampati, Subbarao},
  journal={NeurIPS Workshop on Real-world Reinforcement Learning},
  year={2020}
}

Reilover / Eq-MARL

Table of Contents