dreadl0ck / masterthesis

Master thesis experiment code

Masterthesis Code

This repository contains execution logs and scripts to reproduce the experiments from our master thesis "Good || Evil: Defending Infrastructure at Scale with Anomaly and Classification Based Network Intrusion Detection".

TODOs

  • update presentation slides
  • add CSV dataset files to git lfs
  • add detailed step by step docs for experiment reproduction

Structure

The dnn folder contains the execution logs of the two deep neural network experiments, v1 and v2.

Each experiment has a folder that contains the model from the last execution, as well as sequentially numbered folders with the TensorBoard logs for each run.

The core experiment logic is in dnn/v1/experiment1.sh and dnn/v1/experiment2.sh respectively.

The screenlog.0 file contains the raw output from the experiment script execution. The v1.csv and v2.csv files contain a generated spreadsheet for human analysis.

Step by step instructions are located in dnn/README.md.

CIC IDS 2018 Attack Descriptions

dnn/cic-ids2018-attacks.yml contains the attacks mapped onto the audit records.

Throughput measurement

dnn/pps-line.png is a line chart of the packet ingestion performance, measured in packets per second, while generating the connection audit records used in our experiments.

Connection Audit Record Generation

The dnn/process-pcaps.sh script processes the raw pcap files to generate and label the audit records for the experiments.

Please make sure to use at least netcap version v0.6.0 for the reproduction of our experiment results, and v0.6.1+ if you want to generate the throughput measurement chart automatically.
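The version requirement above can be checked in a script before starting a long run. The following is a minimal sketch, not part of the repository: version_ge is a hypothetical helper, it relies on GNU sort -V for version ordering, and the installed netcap version string has to be supplied by the caller, since the exact command to query it is not stated here.

```shell
# version_ge HAVE WANT: succeed if version HAVE is at least version WANT.
# Hypothetical helper; a leading "v" is stripped from both arguments.
# Assumes GNU "sort -V" (version sort) is available.
version_ge() {
    have="${1#v}"
    want="${2#v}"
    # After a version sort the smaller version comes first,
    # so HAVE >= WANT exactly when WANT is the first line.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}
```

For example, version_ge "$installed" v0.6.1 || echo "throughput chart needs netcap v0.6.1+" would warn before the throughput measurement is attempted, where $installed holds the version string reported by the local netcap installation.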

Pcap preprocessing

The pcaps for the CIC IDS 2018 dataset are provided for each day, as individual capture files per host. We merged these into a single pcap file for each day of the dataset.

After merging we noticed that some merged capture files contained traffic from multiple days, and therefore decided to clean them to ensure each file would only contain traffic for the desired day. The dnn/clean-days.sh script is used for this purpose.
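The two preprocessing steps can be sketched with the mergecap and editcap tools that ship with Wireshark. This is an illustration under assumptions, not the content of dnn/clean-days.sh: the function names and file names are hypothetical, GNU date is assumed for the date arithmetic, and editcap interprets the timestamps in the local time zone, which may need adjusting for the dataset.

```shell
# merge_day OUT IN...: merge the per-host capture files of one day into a
# single file, ordered by packet timestamp (hypothetical helper).
merge_day() {
    out="$1"
    shift
    mergecap -w "$out" "$@"
}

# clean_day SRC DST DAY: keep only packets captured on DAY (YYYY-MM-DD).
# -A keeps packets at or after the start time, -B keeps packets before the
# stop time. Times are interpreted in the local time zone.
clean_day() {
    src="$1"
    dst="$2"
    day="$3"
    next="$(date -d "$day + 1 day" +%Y-%m-%d)"   # GNU date syntax
    editcap -A "$day 00:00:00" -B "$next 00:00:00" "$src" "$dst"
}
```

A call such as merge_day merged.pcap host1.pcap host2.pcap followed by clean_day merged.pcap cleaned.pcap 2018-02-14 would produce a capture limited to that day; all file names here are placeholders.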

Extract generated plots

The experiment code generates several plots. To extract them all into a single directory, use the dnn/extract-plots.sh script.
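The idea of gathering plots into one directory can be sketched as follows. This is not the content of dnn/extract-plots.sh: collect_plots is a hypothetical helper, and it assumes the plots are PNG files and that the destination lies outside the source tree.

```shell
# collect_plots SRC DEST: copy every PNG file below SRC into the flat
# directory DEST (hypothetical helper, assumes plots are *.png).
# The relative path is folded into the file name, so plots with the same
# name from different runs do not overwrite each other.
collect_plots() {
    src="${1%/}"    # drop a trailing slash so the prefix strip below works
    dest="$2"
    mkdir -p "$dest"
    find "$src" -type f -name '*.png' | while IFS= read -r f; do
        rel="${f#"$src"/}"
        cp "$f" "$dest/$(printf '%s' "$rel" | tr '/' '_')"
    done
}
```

With this naming scheme a file at runs/2/loss.png ends up as 2_loss.png in the destination directory when collect_plots runs plots-out is called; the directory names are placeholders.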

Experiments on Netflow data from the dataset authors

TODO: add step by step guide

Download and install the anomaly tool: https://github.com/ppartarr/anomaly

Experiment scripts:

  • runAudit.sh
  • runExperiments.sh
  • runNetflow.sh

LFS

Some files in this repository are too big for GitHub and are therefore provided via the Git Large File Storage (LFS) extension.

To install the git lfs extension, use:

Apt/deb: sudo apt-get install git-lfs
Yum/rpm: sudo yum install git-lfs
MacOS: brew install git-lfs

Afterwards just clone the repository as usual:

git clone git@github.com:dreadl0ck/masterthesis.git
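If the repository was cloned before git-lfs was installed, large files are checked out as small text pointer stubs instead of their real contents. Running git lfs install once and then git lfs pull inside the repository normally fetches the real files. The sketch below is a hypothetical helper, not part of the repository, that detects such stubs by the header line every LFS pointer file starts with.

```shell
# is_lfs_pointer FILE: succeed if FILE is still a Git LFS pointer stub.
# Pointer files begin with the 42-byte spec line checked below.
# Hypothetical helper; "head -c" is assumed to be available.
is_lfs_pointer() {
    [ "$(head -c 42 "$1" 2>/dev/null)" = "version https://git-lfs.github.com/spec/v1" ]
}

# Possible follow-up inside the cloned repository (left commented out):
#   git lfs install   # one-time setup of the LFS filters
#   git lfs pull      # download the real contents of pointer files
```

For example, is_lfs_pointer data.csv && echo "run git lfs pull first" flags a file that has not been downloaded yet; data.csv is a placeholder name.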
