ogarrett / FlowMeter

⭐ ⭐ Use ML to classify flows and packets as benign or malicious. ⭐ ⭐


FlowMeter

FlowMeter is an experimental utility built to analyse and classify packets by looking at packet headers. We use FlowMeter internally to quickly analyse and label packets.

Primary design goals:

FlowMeter has two major aims:

  • Classify packets and flows as benign or malicious with a high true-positive (TP) rate and a low false-positive (FP) rate.

  • Use the labeled data to reduce the amount of traffic requiring deeper analysis.
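
The TP and FP goals above can be measured as rates on a labeled test set. The following is a minimal sketch, assuming labels are encoded as 1 for malicious and 0 for benign; the function name `rates` and the label encoding are illustrative and are not taken from FlowMeter.

```python
def rates(y_true, y_pred):
    """Return (true-positive rate, false-positive rate).

    Assumed encoding: 1 = malicious, 0 = benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malicious, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malicious, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, passed
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr
```

A high TP rate means few malicious flows are missed; a low FP rate means little benign traffic is sent on for deeper analysis.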

Deepfence FlowMeter also groups packets into flows and reports a rich set of flow data and statistics.

Flowmeter-flows
FlowMeter takes packets and returns a file with flow statistics.
Flowmeter-flowsClassification
FlowMeter takes packets, returns a file with flow statistics, and classifies packets as benign or malicious.

Datasets:

FlowMeter takes packets as input, derives a rich set of features, constructs flows on the basis of these features, and uses machine learning to classify the resulting flows as malicious or benign.
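
To illustrate the flow-construction step, here is a minimal sketch that groups packets into bidirectional flows. It assumes a flow is identified by protocol plus the two (IP, port) endpoints, and that the first packet seen defines the forward direction. Both are assumptions made for illustration; FlowMeter's own keying and direction logic may differ (for example, it has an `-ifLocalIPKnown` option). The `Packet` type, `flow_key`, and `build_flows` are hypothetical names.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    ts_us: int      # capture timestamp in microseconds
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str
    size: int       # packet length in bytes


def flow_key(pkt):
    """Direction-independent key: both directions map to the same flow."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    return (pkt.proto,) + (a + b if a <= b else b + a)


def build_flows(packets):
    """Group packets into flows, split into forward and backward lists."""
    flows = {}
    for pkt in sorted(packets, key=lambda p: p.ts_us):
        key = flow_key(pkt)
        if key not in flows:
            # Assumption: the sender of the first packet is the initiator.
            flows[key] = {"initiator": (pkt.src_ip, pkt.src_port),
                          "fwd": [], "bwd": []}
        flow = flows[key]
        is_fwd = (pkt.src_ip, pkt.src_port) == flow["initiator"]
        flow["fwd" if is_fwd else "bwd"].append(pkt)
    return flows
```

Sorting by timestamp first keeps each direction's packet list in arrival order, which is what per-direction timing features need.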

FlowMeter can capture live packets or analyze previously captured packets offline.

To replicate the tests shown in this repo, download the following pcap datasets.

  • Benign:
wget https://deepfence-public.s3.amazonaws.com/pcap-datasets/benign_2017-05-02_kali-normal22.pcap
  • Malicious:
wget https://deepfence-public.s3.amazonaws.com/pcap-datasets/webgoat.pcap

One can also use sample data from various sources like the datasets mentioned below, or gather packet captures using PacketStreamer or other pcap tools.

Data analysis and choice of features:

FlowMeter extracts the features listed below from packets and constructs flows. It uses these features to differentiate between malicious and benign flows.

  • Inter-arrival time

    • Forward inter-arrival time per microsecond
    • Backward inter-arrival time per microsecond
    • Forward inter-arrival time mean
    • Backward inter-arrival time mean
    • Forward inter-arrival time standard deviation
    • Backward inter-arrival time standard deviation
    • Forward inter-arrival time minimum
    • Backward inter-arrival time minimum
    • Forward inter-arrival time maximum
    • Backward inter-arrival time maximum
  • Packet size

    • Total (forward + backward) packet size per microsecond
    • Forward packet size per microsecond
    • Backward packet size per microsecond
    • Forward packet size mean
    • Backward packet size mean
    • Forward packet size standard deviation
    • Backward packet size standard deviation
    • Forward packet size minimum
    • Backward packet size minimum
    • Forward packet size maximum
    • Backward packet size maximum
  • Flow length

    • Total flow length per microsecond
    • Forward flow length per microsecond
    • Backward flow length per microsecond
  • Flow duration
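
As a rough illustration of how the per-direction statistics in this list could be derived, the sketch below computes inter-arrival time and packet size summaries for one flow. It assumes each direction is a list of `(timestamp_us, size_bytes)` tuples and uses the population standard deviation. The names `fwdIATMean`, `fwdPacketSizeMax`, and `fwdPacketSizeTotal` follow the figure names in this README; the remaining output names, the helper functions, and the choice of standard deviation are assumptions. The "per microsecond" rate features are not covered here.

```python
import statistics


def summarize(values):
    """Mean, population standard deviation, minimum, and maximum."""
    if not values:
        return {"mean": 0.0, "std": 0.0, "min": 0.0, "max": 0.0}
    return {"mean": statistics.fmean(values),
            "std": statistics.pstdev(values),
            "min": min(values),
            "max": max(values)}


def inter_arrival_times(timestamps_us):
    """Gaps between consecutive packets, in microseconds."""
    ts = sorted(timestamps_us)
    return [later - earlier for earlier, later in zip(ts, ts[1:])]


def flow_features(fwd, bwd):
    """fwd and bwd are lists of (timestamp_us, size_bytes) for one flow."""
    all_ts = [t for t, _ in fwd + bwd]
    features = {"flowDuration": max(all_ts) - min(all_ts) if all_ts else 0}
    for name, pkts in (("fwd", fwd), ("bwd", bwd)):
        iat = summarize(inter_arrival_times([t for t, _ in pkts]))
        size = summarize([s for _, s in pkts])
        for stat in ("mean", "std", "min", "max"):
            features[f"{name}IAT{stat.capitalize()}"] = iat[stat]
            features[f"{name}PacketSize{stat.capitalize()}"] = size[stat]
        features[f"{name}PacketSizeTotal"] = sum(s for _, s in pkts)
    return features
```

A direction with fewer than two packets has no inter-arrival times, so this sketch reports zeros for it rather than raising an error.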

The following visual examples show how these metrics help differentiate between benign and malicious traffic.

fwdPacketSizeMax
Profiles of the maximum forward packet size show a clear distinction between benign and malicious flow data.
fwdPacketSizeTotal
Profiles of the forward flow length (total forward packet size) show a clear distinction between benign and malicious flow data.
fwdIATMean
Profiles of the forward inter-arrival time mean show a clear distinction between benign and malicious flow data.
bwdIATMean
Profiles of the backward inter-arrival time mean show a clear distinction between benign and malicious flow data.

Architecture:

FlowMeter observes packets, obtains a rich set of features from them, constructs flows and generates output csv files for these flows.

Using these output csv files, an ML model can be trained to classify packets as benign or malicious.
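
Before training, the flow CSVs need labels. The sketch below shows one way to label and combine them, assuming each CSV has a header row and one numeric column per feature. The column names used in the usage note come from the figure names in this README; the actual CSV layout produced by FlowMeter is not shown here, so treat the layout and the helper names `load_labeled` and `to_matrix` as assumptions.

```python
import csv
import io


def load_labeled(csv_text, label):
    """Parse flow CSV text and attach a numeric label to every row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["label"] = label  # assumed encoding: 1 = malicious, 0 = benign
    return rows


def to_matrix(rows, feature_names):
    """Turn labeled rows into a feature matrix X and a label vector y."""
    X = [[float(row[name]) for name in feature_names] for row in rows]
    y = [row["label"] for row in rows]
    return X, y
```

For example, after reading the benign and malicious CSV contents into strings, `to_matrix(load_labeled(benign_text, 0) + load_labeled(malicious_text, 1), ["fwdPacketSizeMax", "fwdIATMean"])` yields inputs suitable for a classifier.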

Weights obtained from the trained ML model can be fed into FlowMeter, which can then classify packets as malicious or benign.
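
To make the idea of exporting weights concrete, here is a minimal sketch of a linear classifier trained with logistic regression in pure Python. This README does not state which model the project's ML script uses, so logistic regression is an assumption chosen because its output is just a weight vector and an intercept. The sketch also assumes features have been scaled to comparable ranges; all function names are illustrative.

```python
import math


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(X, y, lr=0.5, epochs=1000):
    """Fit weights and intercept by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def classify(x, w, b, threshold=0.5):
    """Apply exported weights to one feature vector: 1 = malicious, 0 = benign."""
    score = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return 1 if score >= threshold else 0
```

Only `w` and `b` need to be carried over to the classifying side, which then needs nothing more than a dot product and a threshold per flow.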

How to run:

Clone FlowMeter from the GitHub repository below.

Generate csv for flows:

git clone https://github.com/deepfence/FlowMeter.git
cd FlowMeter/pkg

# Install libpcap package.
# Ubuntu/Debian:  sudo apt-get install libpcap0.8-dev
# RHEL/CentOS:    sudo yum install libpcap-devel
go build -o flowmeter .

# Download pcap files.
mkdir packets

wget https://deepfence-public.s3.amazonaws.com/pcap-datasets/webgoat.pcap -P packets
wget https://deepfence-public.s3.amazonaws.com/pcap-datasets/benign_2017-05-02_kali-normal22.pcap -P packets

./flowmeter -ifLiveCapture=false -fname=webgoat -maxNumPackets=40000000 -ifLocalIPKnown=false
./flowmeter -ifLiveCapture=false -fname=benign_2017-05-02_kali-normal22 -maxNumPackets=40000000 -ifLocalIPKnown=false

Generate ML parameters and classify packets:

cd FlowMeter/assets

python Deepfence_ML_flowmeter.py

cd ../pkg/

./flowmeter -ifLiveCapture=false -fname=webgoat -maxNumPackets=40000000 -ifLocalIPKnown=false
./flowmeter -ifLiveCapture=false -fname=benign_2017-05-02_kali-normal22 -maxNumPackets=40000000 -ifLocalIPKnown=false

The following is example output from FlowMeter. It reports a rich set of flow features derived from packet data and classifies packets as benign or malicious.

flowmeter

Get in touch

Thank you for using FlowMeter.

Security and Support

For any security-related issues in the FlowMeter project, contact productsecurity at deepfence dot io.

Please file GitHub issues as needed, and join the Deepfence Community Slack channel.

License

The Deepfence FlowMeter project (this repository) is offered under the Apache2 license.

Contributions to the Deepfence FlowMeter project are similarly accepted under the Apache2 license, as per GitHub's inbound=outbound policy.
