MartinPJorge / ITU-ML5G-PS-013

Solution for problem statement 13 in ITU ML 5G challenge


This repository contains the solution of the ML 5G contest organized by ITU:

In particular, it corresponds to problem statement 13. Participants have to forecast the throughput of BSs and STAs in an 802.11 deployment:

The slides with the problem statement presentation are available in this repo:

[Image: UPF image with routers and mobiles]


This repository already contains the dataset provided at:

  • output-simulator: contains the output files of the Komondor simulator;
  • input-node-files: contains the CSVs of the different scenarios;
  • output-simulator-parsed: contains the parsed output files in JSON format.

Processing procedure

By running

./ output-simulator output-simulator-parsed/ 0

we parse the TXT files under output-simulator and store in a JSON the output of each scenario.
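
As a rough illustration of the parsing step, the sketch below extracts the scenario, deployment, and seed from a simulator header line and derives the name of the JSON file for that deployment. The header format and the JSON naming follow the examples given later in this README; the regular expression and the function name `parse_header` are assumptions made for illustration, not the repository's actual parser.

```python
import re

# Header format as it appears in the simulator output, e.g.
# KOMONDOR SIMULATION 'sim_input_nodes_sce0t_deployment_000.csv' (seed 1992)
HEADER_RE = re.compile(
    r"KOMONDOR SIMULATION 'sim_input_nodes_(?P<scenario>sce\w+?)"
    r"_deployment_(?P<deployment>\d+)\.csv' \(seed (?P<seed>\d+)\)"
)


def parse_header(line):
    """Return scenario, deployment, seed and JSON file name, or None if no match."""
    match = HEADER_RE.search(line)
    if match is None:
        return None
    scenario = match.group("scenario")
    deployment = match.group("deployment")
    return {
        "scenario": scenario,
        "deployment": deployment,
        "seed": int(match.group("seed")),
        # Hypothetical helper output: mirrors the JSON naming used in this README
        "json_name": f"sim_output_nodes_{scenario}_deployment_{deployment}.json",
    }
```

Lines that are not headers return `None`, so a caller could use this to split a TXT file into per-deployment chunks.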

During the parsing of the data, we encountered the following errors:

  • output-simulator/script_output_sce1b.txt: scenario 028 is missing the RSSI data of two STAs, so we fill it with the last throughput value;
  • output-simulator/script_output_sce1a.txt: scenario 014 is missing the RSSI data of two STAs, so we fill it with the last throughput value;
  • output-simulator/script_output_sce1c.txt: scenario 024 is missing the RSSI data of two STAs, so we fill it with the last throughput value.

The files referred to above are sanitized as indicated in this repository.

v4 data

input-node-files-v4 and output-node-files-v4 contain version 4 of the dataset files. These files include the SINR of the STAs, so they need different processing. It is enough to change the flag passed as the last argument from 0 to 1 when running ./


The solution creates a dataset gossip-dataset.csv with per-STA information. Each row corresponds to a STA, and the dataset is created using all STAs present in the different scenarios.

The dataset has the following columns (idx is the STA identifier):

| field | description |
|---|---|
| wlan_code | WLAN code the STA is attached to |
| scenario | scenario id, e.g., 'sce1c' |
| deployment | number of the scenario deployment, e.g., '010' |
| node_code | STA node code, e.g., 'STA_A10' |
| node_x | STA x coordinate |
| node_y | STA y coordinate |
| node_z | STA z coordinate |
| ap_x | x coordinate of the attached AP |
| ap_y | y coordinate of the attached AP |
| ap_z | z coordinate of the attached AP |
| primary_channel_neighs | number of neighbors in the primary channel |
| primary_channel_0 | 1 if channel 0 is the primary channel, 0 otherwise |
| primary_channel_1 | 1 if channel 1 is the primary channel, 0 otherwise |
| primary_channel_2 | 1 if channel 2 is the primary channel, 0 otherwise |
| primary_channel_3 | 1 if channel 3 is the primary channel, 0 otherwise |
| primary_channel_4 | 1 if channel 4 is the primary channel, 0 otherwise |
| primary_channel_5 | 1 if channel 5 is the primary channel, 0 otherwise |
| primary_channel_6 | 1 if channel 6 is the primary channel, 0 otherwise |
| primary_channel_7 | 1 if channel 7 is the primary channel, 0 otherwise |
| allowed_channel_0 | 1 if the STA is allowed to transmit over channel 0, 0 otherwise |
| allowed_channel_1 | 1 if the STA is allowed to transmit over channel 1, 0 otherwise |
| allowed_channel_2 | 1 if the STA is allowed to transmit over channel 2, 0 otherwise |
| allowed_channel_3 | 1 if the STA is allowed to transmit over channel 3, 0 otherwise |
| allowed_channel_4 | 1 if the STA is allowed to transmit over channel 4, 0 otherwise |
| allowed_channel_5 | 1 if the STA is allowed to transmit over channel 5, 0 otherwise |
| allowed_channel_6 | 1 if the STA is allowed to transmit over channel 6, 0 otherwise |
| allowed_channel_7 | 1 if the STA is allowed to transmit over channel 7, 0 otherwise |
| rssi | RSSI level of the AP providing the connection |
| q1_rssi | 1st quantile of the neighbors' RSSI to the AP |
| q2_rssi | 2nd quantile of the neighbors' RSSI to the AP |
| q3_rssi | 3rd quantile of the neighbors' RSSI to the AP |
| q4_rssi | 4th quantile of the neighbors' RSSI to the AP |
| sinr | SINR level of the AP providing the connection |
| q1_sinr | 1st quantile of the neighbors' SINR to the AP |
| q2_sinr | 2nd quantile of the neighbors' SINR to the AP |
| q3_sinr | 3rd quantile of the neighbors' SINR to the AP |
| q4_sinr | 4th quantile of the neighbors' SINR to the AP |
| agg_interference | aggregated interference perceived by the AP |
| channel_0_interference | interference perceived by the AP in channel 0 |
| channel_1_interference | interference perceived by the AP in channel 1 |
| channel_2_interference | interference perceived by the AP in channel 2 |
| channel_3_interference | interference perceived by the AP in channel 3 |
| channel_4_interference | interference perceived by the AP in channel 4 |
| channel_5_interference | interference perceived by the AP in channel 5 |
| channel_6_interference | interference perceived by the AP in channel 6 |
| channel_7_interference | interference perceived by the AP in channel 7 |
| throughput | STA throughput |
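
The column list above is regular enough to be generated programmatically, which can help when checking that a dataset CSV has the expected fields. The sketch below is only an illustration: it assumes the CSV is comma-separated and that its header uses exactly the field names listed above, and the helper names `expected_columns` and `missing_columns` are hypothetical, not part of the repository.

```python
import csv
import io

CHANNELS = range(8)  # channels 0..7, as in the column list above


def expected_columns():
    """Build the field names in the order they are listed above."""
    cols = ["wlan_code", "scenario", "deployment", "node_code",
            "node_x", "node_y", "node_z", "ap_x", "ap_y", "ap_z",
            "primary_channel_neighs"]
    cols += [f"primary_channel_{c}" for c in CHANNELS]
    cols += [f"allowed_channel_{c}" for c in CHANNELS]
    for metric in ("rssi", "sinr"):
        cols.append(metric)
        cols += [f"q{q}_{metric}" for q in range(1, 5)]
    cols.append("agg_interference")
    cols += [f"channel_{c}_interference" for c in CHANNELS]
    cols.append("throughput")
    return cols


def missing_columns(csv_text):
    """Return the expected fields that are absent from the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [c for c in expected_columns() if c not in header]
```

For a file on disk, `missing_columns(open("gossip-dataset.csv").read())` would return an empty list if every listed field is present.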

Create a new dataset for Gossip

Using the output JSONs extracted to output-simulator-parsed/ and the input files under input-node-files/, we can create a dataset to feed the Gossip solution:

python3 50 \
    --new_dataset gossip-dataset-v4.csv\
    --input_dir input-node-files-v4\
    --parsed_output_dir output-simulator-v4-parsed

This is how the gossip-dataset-v4.csv in this repo was created (ignore the 50 parameter).

Train a Gossip model

To train a Gossip model you have to execute the following line:

python3 50 --dataset gossip-dataset.csv --model /tmp/gossip-trained-model --train --episodes 100

This line trains a model and stores it under /tmp/gossip-trained-model. The training consists of 100 episodes in which gradient descent is performed over batches of size 50.
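
To make the training description concrete, here is a minimal sketch of mini-batch gradient descent with 100 episodes and batches of 50 samples. It fits a one-variable linear model in plain Python and is not the Gossip model or its training code. The function name, the learning rate, and the reading of an episode as one pass over the dataset are all assumptions for illustration.

```python
import random


def train_minibatch(xs, ys, episodes=100, batch_size=50, lr=0.3, seed=0):
    """Fit y ~ w*x + b by mini-batch gradient descent on the mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(episodes):
        # Assumption: one episode is one shuffled pass over the whole dataset
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradients of the MSE computed over the current batch only
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Toy data (hypothetical): 200 noise-free samples of y = 3x + 1
xs = [i / 200 for i in range(200)]
ys = [3 * x + 1 for x in xs]
w, b = train_minibatch(xs, ys)
```

With 200 samples and a batch size of 50, each episode performs four parameter updates.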

Use a Gossip model

To forecast the throughput of every STA and AP in a scenario, first one must create a dataset of the corresponding scenario.

Let's assume the test corresponds to a single deployment in a given scenario named 0t, and that we have the files input_nodes_sce0t_deployment_000.csv and script_output_sce0t.txt, with the latter containing that single scenario:

cat script_output_sce0t.txt

KOMONDOR SIMULATION 'sim_input_nodes_sce0t_deployment_000.csv' (seed 1992)

Then one creates dedicated directories for the JSON extraction and the dataset creation as follows:

mkdir -p input-node-test-files/sce0t
cp /path/to/input_nodes_sce0t_deployment_000.csv input-node-test-files/sce0t
mkdir output-test
cp /path/to/script_output_sce0t.txt output-test

# Create the JSON output-test-parsed/sim_output_nodes_sce0t_deployment_000.json
./ output-test output-test-parsed/

# Change the gossip location of files to create the dataset
# inside :
#  INPUT_DIR='input-node-test-files'
#  OUT_PARSED_DIR='output-test-parsed'
python3 50 --new_dataset gossip-test-dataset.csv

With the code above, gossip-test-dataset.csv holds the dataset that corresponds to the test deployment, with all the information Gossip needs to forecast the per-STA and per-AP throughput as follows:

python3 100 --dataset gossip-test-dataset.csv --model /tmp/gossip-trained-model

= STA throughput forecast =
node_code real_throughput forecast_throughput
STA_A1 1.32 3.7261900901794434
STA_L9 0.23 2.867366075515747
= AP throughput =
wlan_code real_throughput forecast_throughput
A 106.98 74.58921813964844
L 83.11 73.48725891113281
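
One way to summarize how far the forecasts are from the real values is to compute an error metric over the rows printed above. The sketch below computes the mean absolute error (MAE) and root mean squared error (RMSE) for the STA and AP rows of this example. The source does not state which metric the contest uses, so both are shown only as common choices, and the helper names are hypothetical.

```python
import math

# Rows copied from the example output above: (name, real, forecast)
STA_ROWS = [("STA_A1", 1.32, 3.7261900901794434),
            ("STA_L9", 0.23, 2.867366075515747)]
AP_ROWS = [("A", 106.98, 74.58921813964844),
           ("L", 83.11, 73.48725891113281)]


def mae(rows):
    """Mean absolute error between the real and forecast columns."""
    return sum(abs(real - pred) for _, real, pred in rows) / len(rows)


def rmse(rows):
    """Root mean squared error between the real and forecast columns."""
    return math.sqrt(sum((real - pred) ** 2 for _, real, pred in rows) / len(rows))


for label, rows in (("STA", STA_ROWS), ("AP", AP_ROWS)):
    print(f"{label}: MAE={mae(rows):.3f} RMSE={rmse(rows):.3f}")
```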

Test data

The contest test data is in the input-node-files-test and output-simulator-test directories, which hold the simulator input and output data, respectively.

First of all we need to parse output data using

rm -rf output-simulator-test-parsed/*json; # remove previous versions
./ "output-simulator-test/*" output-simulator-test-parsed

this will generate all the parsed outputs in .json files under output-simulator-test-parsed directory.

Create test Gossip dataset

Once the output files are parsed into the expected .json format, a new dataset must be created to perform predictions:

python3 50 --new_dataset gossip-dataset-test.csv\
    --input_dir input-node-files-test\
    --parsed_output_dir output-simulator-test-parsed

and we obtain the gossip-dataset-test.csv to derive the predictions of all scenarios at once.
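
Since the test dataset holds the STAs of every deployment in a single CSV, per-deployment results require grouping rows by the `scenario` and `deployment` columns. The sketch below shows one way to do that with the standard library. It assumes a comma-separated file with those two columns in its header; the function name and the sample values are hypothetical and do not come from the contest data.

```python
import csv
import io
from collections import defaultdict


def group_by_deployment(csv_text):
    """Map (scenario, deployment) to the list of STA rows of that deployment."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[(row["scenario"], row["deployment"])].append(row)
    return dict(groups)


# Hypothetical sample with only a few of the dataset columns
SAMPLE = """scenario,deployment,node_code,throughput
test_1,000,STA_A1,1.0
test_1,000,STA_A2,2.0
test_1,001,STA_A1,3.0
"""
groups = group_by_deployment(SAMPLE)
```

Each value in `groups` is a list of dictionaries, one per STA, keyed by column name.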

Derive all predictions

Once the dataset for the test is created, the user may execute

# Arguments: test input, parsed test output, TF trained model, predictions directory
./ "input-node-files-test/*" \
    output-simulator-test-parsed \
    /tmp/gossip-trained-model-v4 \
    /tmp/predictions

and the user will find under /tmp/predictions the forecasted throughput of both STAs and APs of every deployment:

├── test_1
│   ├── all_throughput_000.csv
│   ├── stas_throughput_000.csv
│   ├── throughput_000.csv
└── test_2

with the last two files containing the STAs' and the APs' throughput, respectively.
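
To inspect the generated predictions, one can walk the output directory and list the CSV files found for each test scenario. The sketch below assumes the layout shown in the tree above, with one sub-directory per test scenario holding CSV files; the function name `list_predictions` is hypothetical.

```python
from pathlib import Path


def list_predictions(root):
    """Map each test directory name to the sorted CSV file names it contains."""
    root = Path(root)
    return {
        test_dir.name: sorted(p.name for p in test_dir.glob("*.csv"))
        for test_dir in sorted(root.iterdir())
        if test_dir.is_dir()
    }
```

Calling `list_predictions("/tmp/predictions")` after the prediction step would return a dictionary keyed by directory names such as `test_1` and `test_2`.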



