NeighborFL is an individualized real-time federated learning scheme for traffic prediction that incorporates haversine distance as a heuristic and error-driven local models grouping from the aspect of each individual traffic device. This approach allows NeighborFL to create location-aware and tailored prediction models for each client while fostering collaborative learning. It can be seen as an extension of our prior work, BFRT (paper), which uses a conventional FL aggregation method and equates to NaiveFL in this codebase.
Please refer to the NeighborFL paper for detailed explanations of the framework.
We recommend using Google Colab to run this codebase, as it simplifies the setup process compared to running it locally. With the built-in resume functionality (explained later), the free version of Colab can handle simulations with a large number of specified communication rounds.
We have provided a sample Colab file that demonstrates the steps for running the code. Detailed instructions for all the functionalities explained later in this readme file can be found in the Colab file.
If you prefer to set up and run the code on your local machine, we assume you are already familiar with setting up libraries such as numpy, pandas, and TensorFlow for your OS. In that case, you can use the same commands provided in the following sections to run the code.
The gateway to running simulations is main.py. It supports simulating four methods: Central, NaiveFL, r-NaiveFL, and NeighborFL, which can be manually enabled or disabled using the -lm
argument (explained later). The program scans a folder containing your traffic data (csv) files, identifies each file as a traffic device, and processes the chosen feature (e.g., volume, speed, occupancy) as input for the selected model (e.g., LSTM, GRU, or your customized model). The data processing logic can be found in process_data.py. After loading the data files, the program allows each device to assign its candidate neighbors within a specified radius if NeighborFL is enabled, and then the simulation begins. Unless the number of communication rounds is set via the command line (using -c
), the code calculates the maximum possible rounds based on the number of data points in the provided files and stops after the last possible round. To assess NeighborFL's performance against the baseline methods, the code includes two plotting scripts for generating average prediction error tables, prediction and error curve plots as shown in Section IV of the paper.
main.py will look for the traffic data (csv) files inside of the data
directory residing at the root of the downloaded repo. By default, we have included the data files we used for our experiments shown in the paper, with each file named after its sensor ID. Additionally, there is another csv file containing the GPS locations of these 26 sensors.
If you have your own dataset, you may either replace the files in the data
with your data files or specify the absolute path to your dataset folder using the -dp
argument (as explained later). Please make sure that you include a csv file containing the GPS locations of the traffic devices, if NeighborFL and/or r-NaiveFL will be simulated. This file should have "location" in its file name, as indicated by our example file. The other csv files containing traffic data should be named after the corresponding traffic device's ID and must contain the learning/prediction feature (e.g., speed, volume, occupancy) specified using the -f
command line argument.
For dataset formatting, you can refer to the datasets we derived from the publicly available PEMS-BAY dataset used in our paper. Our datasets contain speed data from our selected 26 sensors, ranging from January 2017 to June 2017.
Sample running command -
$ python NeighborFL/main.py -c 250 -lm 1111 -m LSTM -ms 72 -r 1 -f Speed -et MSE -ks 3 -nu 1 -e 5 -b 1 -I 12 -O 1 -seed 40
Arguments explanation:
-
-c 250
: Runs the simulation for 250 communication rounds. -
-lm 1111
: Enables the simulation for (1) Central, (2) NaiveFL, (3) r-NaiveFL, and (4) NeighborFL. -
-m LSTM
: Configures devices to use an LSTM model for centralized/federated learning. -
-ms 72
: Sets the maximum data size to 72 while learning, denoted as MaxDataSize in the paper. -
-r 1
: Specifies a radius of 1 for all devices to search for their candidate favorite neighbors. -
-f Speed
: Selected feature used for training and inferencing set to Speed -
-et MSE
: Specifies the error type used to evaluate candidates in NeighborFL, in this case, mean squared error. This corresponds to the CalcError method in Algorithm 6 of the paper. -
-rs 3
: Sets the remove strategy to "remove the last added one," referring to Algorithm 8 in the paper. -
-nu 1
: Defines the$\nu$ value as 1, meaning a device removes the last added favorite neighbor whenever the real-time prediction error increases. -
-e 5
: Local learning epochs set to 5. -
-b 1
: Local learning batch size set to 1. -
-I 12
: Specifies the input length of the model, indicating the number of input neurons for the chosen LSTM model. -
-O 1
: Sets the output length of the model, which also represents the prediction horizon. -
-sd 40
: Set up random seed to a fixed integer, in this case 40, for reproduction and comparison.
In a nutshell, this command configures each device (identified by its individual csv data file) to create its candidate favorite neighbors set within a 1-mile radius, centered around itself. It employs a randomly initialized LSTM model (currently identical across all devices) with 12 input neurons and 1 output neuron for learning, with 5 local epochs, a batch size of 1, and a maximum data size set to 72. The simulation will run for 250 communication rounds for Central, NaiveFL, r-NaiveFL, and NeighborFL methods. In the case of NeighborFL, the error metric used to assess candidates is the mean squared error, and the favorite neighbor removal mechanism follows the "remove the last added one" approach whenever the real-time prediction error increases. This command corresponds to the Central, NaiveFL, r-NaiveFL, and NeighborFL L1 methods without pretrained models outlined in the paper.
To provide pretrained or customized models as initial models for Central and/or federated methods, please store these pretrained model files (in .h5 format) in a designated folder and specify the folder's path using the -pp
argument to main.py
. Each device should have its own pretrained model file with a name identical to its corresponding dataset file. You can refer to the pretrained model files we used in our experiment for an example. Our pretrained models were generated using the provided pretrain.py.
Sample run of pretrain.py
-
$ python NeighborFL/pretrain.py -dp /path/to/datasets -si 0 -ei 2016 -e 5 -f Speed -m LSTM -b 1 -I 12 -O 1 -sp /pretrained/models/save/path
Explanation of the arguments:
-dp
: Specifies the path to the dataset csv files. If not provided, the program will search for datasets in thedata
directory at the root of the downloaded repository.-si 0
: Sets the start data index for pretraining to 0, which corresponds to the first data point in the csv file.-ei 2016
: Defines the end data index for pretraining as 2015 (inclusive). Note that since datasets are zero-based, the data point at index 2016 is not included.-si 0
and-ei 2016
together select the first 2016 data points in the csv file(s) as the pretraining dataset.-e 5
: Sets the number of local training epochs to 5. Each local epoch covers a device's entire dataset once.-f Speed -m LSTM -b 1 -I 12 -O 1
: Functions similarly tomain.py
.-sp
: Specifies the save path for pretrained models. If not provided, the models will be saved in thepretrained_models
directory located at the root of the downloaded repository.
The program will save the generated pretrained models in the folder specified using the -sp
argument once it completes execution. The program will also save the arguments used to run the program to recall the generated model structure conveniently.
To make these models available to main.py
, please specify this path using the -pp
argument to main.py
. Additionally, it's advisable to set the -si
argument to main.py
to indicate the starting data index from which devices should begin collecting data for real-time learning and inference. If you've used the aforementioned arguments with pretrain.py
, a straightforward way to seamlessly transition from pretraining to real-time training in main.py
is to:
- Keep the
-sp
argument inpretrain.py
andmain.py
the same, ensuring both programs use the same folder for storing and accessing pretrained models. - Keep the
-ei
argument inpretrain.py
and the-si
argument inmain.py
the same. This way, devices can start real-time training immediately after the pretraining period ends.
Here's a sample command for running main.py
with the pretrained models generated by the previously used command for pretrain.py
. This command initiates the simulation with the data index immediately following the pretraining end index -
$ python NeighborFL/main.py -pp /pretrained/models/save/path -si 2016 -c 250 -lm 1111 -m LSTM -ms 72 -r 1 -et MSE -ks 3 -nu 1 -e 5 -b 1 -I 12 -O 1 -sd 40
This command corresponds to the Central, NaiveFL, r-NaiveFL, and NeighborFL L1 methods with pretrained models outlined in the paper.
main.py
, including aspects such as input and output length, the number of hidden layers, neurons, and so on.
Each time a new simulation is executed, a log folder is created under the path specified by -lb
to main.py
. If -lb
is not specified, the folder is created within the logs
directory located at the root of the downloaded repository. The log folder is named with the simulation execution date and time as the prefix, accompanied by customizable argument indicators configured in main.py
. For instance, a log folder named 09022023_202131_lstm_mds_72_I_12_O_3
indicates that the program was run on September 2nd, 2023, at 20:21:31, based on the local time zone of the running environment, with an LSTM network featuring 12 input neurons and 3 output neurons, with a max data size of 72 for the simulation methods.
A specific log folder has a text file recording the command line and default arguments while executing the program the 1st time. The log folder also has a subdolder named check_point
, which includes a file (config_vars.pkl
) recording these arguments in order to resume a simulation later, and another file (all_device_predicts.pkl
) recording the prediction values categorized by communication rounds by all the devices for the enabled simulation methods. This file is useful when plotting the prediction curves and analyzing model performances. Another useful file (fav_neighbors_by_round.pkl
) records the favorite neighbors set of each device categorized by communication rounds, which may be useful in our future work.
Note that the program will print out the path to its log folder upon execution for your convenience.
The program supports resuming the simulation process in case the simulation is interrupted due to time/resource limitation of colab or your running environment. As mentioned above, each new simulation will create a log folder. To resume a simulation, the only argument you need is -rp
. A sample command is
$ python NeighborFL/main.py -rp NeighborFL/logs/09022023_202131_lstm_mds_72_I_12_O_3
rp
stands for "resume path". This is the path to the log folder of the interrupted simulation. The program will automatically resume from the interrupted communication round, utilizing the latest models stored in the specified log folder and continuing with the remaining communication rounds. This command can be resued again and again until all the specified communication rounds are exhausted.
You may also overwrite some arguments while resuming, such as providing a higher value for -c
to extend the simulation with more communication rounds than initially configured. The overwritten arguments will also be updated in config_vars.pkl
for resuming next time. However, please note that some arguments may not support being overwritten, like altering the network model structure by providing -I
and/or -O
values different from the models stored in the log folder. Doing so will result in runtime errors, unless you replace the model files in the log folder with the structure of the models that align with the new argument values. The -dp
argument for the dataset path does not support overwriting, as the processed dataset is stored in the log folder for convenient resumption.
All available arguments to pretrain.py and main.py
Argument | Type | Default Value | Description |
---|---|---|---|
-dp , --dataset_path |
str | 'NeighborFL/data/' | Path to the traffic datasets. |
-sd , --seed |
int | 40 | Random seed value specified for reproducibility. |
-m , --model |
str | 'LSTM' | Choose our predefined LSTM or GRU model for Central/Federated simulations. You may also write your own model in create_model.py . |
-I , --input_length |
int | 12 | Input length for the chosen model, i.e., our predefined GRU/LSTM or your customized model. |
-O , --output_length |
int | 1 | Output length for the chosen model, i.e., our predefined GRU/LSTM or your customized model. |
-hn , --hidden_neurons |
int | 128 | Number of neurons in one of each 2 hidden layers in our predefined GRU/LSTM network. |
-op , --optimizer |
str | 'rmsprop' | Optimizer for training. |
-b , --batch |
int | 1 | Batch number for training. |
-lo , --loss |
str | 'mse' | Loss function used for training, refer to TensorFlow Losses. |
-f , --feature |
str | 'Speed' | Feature used for training (and prediction in main.py ), depending on your traffic dataset, usually speed, volume, occupancy. |
Argument | Type | Default | Description |
---|---|---|---|
-me , --metric |
str | 'mse' | Metric used for evaluation by the model. Refer to TensorFlow Model Compilation and TensorFlow Metrics for more details. |
-e , --epochs |
int | 5 | Local epoch number. One complete epoch goes through the entire pretraining dataset once. |
-pp , --pretrain_percent |
float | 1.0 | The percentage of the provided dataset files allocated for pretraining. |
-si , --pretrain_start_index |
int | 0 | The starting data index (row in pandas dataframe) for pretraining |
-ei , --pretrain_end_index |
int | None | End data index (not included) for pretraining. If this is provided, overwrite -pp |
-sp , --model_save_path |
str | 'NeighborFL/pretrained_models/' | The path to save the pretrained models |
Argument | Type | Default | Description |
---|---|---|---|
-lb , --logs_base_folder |
str | 'NeighborFL/logs/' | Base folder path to store running logs and h5 files for plotting and resuming |
-pm , --preserve_historical_models |
int | 1 | Whether to preserve models from old communication rounds. This is useful when you run the program on colab and have limited Google Drive storage. Set to 1 to only preserve the latest models that are enough for resumption of the simulation. 0 to store all historical models. |
Argument | Type | Default | Description |
---|---|---|---|
-rp , --resume_path |
str | None | Provide the interrupted log folder path to continue the simulation. Note that the program will print out this folder path upon executing |
Argument | Type | Default | Description |
---|---|---|---|
-c , --comm_rounds |
int | None | Number of communication rounds, default aims to run until all data is exhausted. |
-ms , --max_data_size |
int | 72 | Maximum data length for training in each communication round, simulating the memory size the devices have. |
-tau1 , --tau1 |
int | 24 | Number of data point collections and prediction amount in round 1, rule of thumb: 2 * --tau2p . |
-tau2p , --tau2p |
int | 12 | Number of data points collected, and also the number of predictions made in round 2 and beyond, rule of thumb - same as -I . |
-lm , --learning_methods |
str | 1111 | Enable learning methods for simulation. Use 1 to enable and 0 to disable. The indices correspond to: 1. Baseline centralized, 2. Pure FedAvg NaiveFL, 3. RadiusFL, 4. NeighborFL. |
Argument | Type | Default | Description |
---|---|---|---|
-me , --metric |
str | 'mse' | Evaluation metric by the local models during training and testing. This is essentially the same as the -me used in pretrain.py, but it should not be confused with -et . |
-e , --epochs |
int | 5 | Number of epochs per communication round for the enabled simulation methods. Each epoch processes the latest MaxDataSize data points once. |
-dlp , --data_load_percent |
float | 1.0 | The percentage of data to load. Note that the program calculates the maximum number of communication rounds based on the number of data points provided. If the specified -c is larger than the calculated maximum communication rounds, -c will be overwritten. |
-si , --start_train_index |
int | 0 | Start the simulation at this data index, usually to accommodate pretrained models. |
-pp , --pretrained_models_path |
str | None | Pretrained models path. If not provided, the program will create an initial model for all devices and train from scratch. |
Argument | Type | Default | Description |
---|---|---|---|
-r , --radius |
float | 1 | Specifies the radius within which participants are treated as candidate favorite neighbors. Unit - mile. |
-et , --error_type |
str | MSE | Defines the error type used to evaluate candidate neighbors. Not to be confused with -me . |
-nt , --num_neighbors_try |
int | 1 | Determines how many new neighbors to evaluate in each communication round. |
-ah , --add_heuristic |
int | 1 | Sets the heuristic for adding favorite neighbors: 1 - by distance from close to far, 2 - randomly. |
-rt , --remove_trigger |
int | 2 | Specifies the removal trigger: 0 - never remove; 1 - trigger by probability (set by -ep); 2 - trigger by consecutive rounds of error increase (set by -nu). |
-ep , --epsilon |
float | 0.2 | Used when -rt is set to 1; devices have a probability to remove their worst neighbors to explore new ones. |
-nu , --remove_rounds |
int | 3 | Triggers removal if the error of a device's NeighborFL aggregated model has increased for -nu rounds. If -nu is set to 1, removal occurs as soon as the error increases. |
-rn , --remove_num |
int | 1 | Specifies how many favorite neighbors to remove based on their reputation. |
-rs , --remove_strategy |
int | 1 | Removal strategy: 1 - remove by worst reputation; 2 - remove randomly; 3 - always remove the last added one. |
The code for plotting the experimental results and generating the tables in the paper are provided in the plotting_code folder. plot_prediction_curves_save_err_table.py was used to plot Figure 3 and create Table III, IV and V. plot_error_curves.py was used to plot Figure 4 in the paper. The code will look for config_vars.pkl
and all_device_predicts.pkl
files in the check_point
folder under the specified log folder for these tasks.
Sample command:
$ python NeighborFL/plotting_code/plot_prediction_curves_save_err_table.py -lp /NeighborFL/logs/08282023_233406_lstm_I_12_O_3_lm_1111 lp2 /NeighborFL/logs/09022023_202131_lstm_I_12_O_3_lm_0001 -NFLConfig L1 -r 400760_N -et MSE -pr 24 -Oseqs 13 -row 5
Arguments explanation:
-lp
: the path of the desired log folder for plotting.-lp2
: this is an optional second log path that allows you to overwrite NeighborFL predictions in-lp
. It comes in handy when you're running multiple simulations of NeighborFL and want to compare with the same baseline methods. For example, in simulation A, you might run all three baseline methods and one NeighborFL configuration, while in simulation B, you want to compare a different NeighborFL configuration with the baseline methods from simulation A. For instance, let's say the simulation run on 08282023_233406 (provided in-lp
in the command) includes all three baseline methods and a NeighborFL configuration (indicated by-lm
set to 1111, as referenced in the arguments inmain.py
). On the other hand, the simulation run on 09022023_202131 (provided in-lp2
) only runs a NeighborFL simulation (indicated by-lm
set to 0001). As a result, the NeighborFL predictions in 09022023_202131 will overwrite the NeighborFL predictions in 08282023_233406 for plots, tables, and various types of analysis while comparing with the predictions from the baseline methods in 08282023_233406. Notice that if-lp2
is provided, the plots and tables will be saved under the plot folder within the path indicated in-lp2
rather than-lp
.-NFLConfig
: the NeighborFL configuration name to show on plots and tables, such as R1, R3, L1, L3.-r
denotes "representative", allows you to specify a particular traffic device to be showcased in a large figure, i.e., spans the top row of the figures, such as in Figure 3 and 4 of the paper.-et
: Specify the error type for analysis to make the error tables within the specified round range.-pr
determines the number of the latest rounds of the simulation to include in the plots. Using-pr 24
will generate plots for the most recent 24 rounds of predictions, sourced from theall_device_predicts.pkl
file.-Oseqs
specifies which output sequences to plot for the prediction instances with a prediction horizon greater than 1. For example, when theoutput_length
is 4, and you have two consecutive prediction instances in two continuous communication rounds A and B, like [50.72, 51.77, 49.75, 66.02] and [66.92, 49.24, 45.05, 55.12], setting-Oseqs 13
will generate figures for the 1st predictions, taking 50.72 from round A and 66.92 from round B, and another figure for the 3rd predictions, taking 49.75 from round A and 45.05 from round B. This is useful for evaluating prediction performance in scenarios where long-term forecasting is required, such as predicting the impacts of non-recurrent events.-row
specifies how many rows to arrange the plots for traffic devices except the representative. The column number will be automatically determined. For instance, if the log has 26 devices and -row set to 5, each row will have 5 plots based on ⌈(26 - 1) / 5⌉ = 5, then the subplots will have 5 columns.
Suppose we have 26 devices running for 250 communication rounds, this sample command will plot the most recent 24 rounds (round 227 to round 250, both inclusive) of real-time prediction values for all three baseline methods, as recorded in the 08282023_233406
simulation, and the NeighborFL method, as recorded in the 09022023_202131
simulation, for output horizon 1 and 3. The program will plot for all 26 devices, with the traffic device 400760_N as the representative, and render the rest of the devices into 5*5 subplots, just like Figure 3 in the paper. If you have a different number of devices, you can adjust the -row
value as needed. The resulting figures will be saved in a folder named prediction_curves
within the plots
directory located under the specified log folder. If -lp2
is provided, the figures are saved in the plots
directory within the path of -lp2
, otherwise -lp
.
Additionally, the program will generate a table named error.csv
in the plots
folder, which calculates individual device MSE values over the specified round range consistent to the plotting range, akin to Table IV in the paper. Furthermore, the program will print (in the console) the average device MSE values across all devices for the specified round range, which we used to create Tables III and V in the paper.
Sample command:
$ python NeighborFL/plotting_code/plot_error_curves.py -lp /NeighborFL/logs/08282023_233406_lstm_I_12_O_3_lm_1111 lp2 /NeighborFL/logs/09022023_202131_lstm_I_12_O_3_lm_0001 -NFLConfig L1 -r 400760_N -et MSE -Oseqs 13 -row 5 -ei 24
Arguments explanation:
-
-ei
represents error interval, unit is communication rounds. This is used to create smoother error curves, especially when dealing with a large number of communication rounds, as seen in Figure 4 of the paper. In this example,-ei 24
is used to achieve a smoothed representation of errors across the entire 250 rounds while maintaining the overall error trend. -
All remaining arguments operate in the same manner as they do for
plot_prediction_curves_save_err_table.py
.
In a nutshell, this sample command will plot the MSE value trends across the entire simulation for the real-time learning process using the same data (all_device_predicts.pkl
) and plot style as demonstrated in the previous example for plotting prediction curves. The resulting plot is akin to Figure 4 in the paper. The code also calculates and shows the surpass percentage between NeighborFL and available baselines on the figure and labels a star if NeighborFL surpasses NaiveFL for 50% or more. The generated figures will be saved in the plots/error_curves
directory, which can be found under either the plot
folder within the -lp2
specified log folder or under the plot
folder within the -lp
specified log folder if -lp2
is not provided.
All available arguments to plot_prediction_curves_save_err_table.py and plot_error_curves.py
Argument | Type | Default Value | Description |
---|---|---|---|
-lp , --logs_dirpath |
str | None | The path to the log directory where all_device_predicts.pkl resides. |
-lp2 , --logs_dirpath2 |
str | None | An optional log path for overwriting NeighborFL predictions in the -lp file. A use case is running NeighborFL simulations with different remove strategies. |
-et , --error_type |
str | 'MSE' | The type of error metric to use to generate error tables, which can be MAE, MSE, RMSE, or MAPE. |
-r , --representative |
str | None | The device ID to be featured as the representative figure. If not specified, no representative figure will be generated. |
-row , --row |
int | 1 | The number of rows in the subplots. Columns will be automatically determined based on the number of available devices. |
-NFLConfig , --NFLConfig |
str | '' | The NeighborFL configuration name to display on plots and tables, such as R1, R3, L1, L3. |
-Oseqs , --output_sequences |
str | '1' | For predictions with an output_length greater than 1, specify the output sequences to plot. |
Argument | Type | Default Value | Description |
---|---|---|---|
-sr , --starting_comm_round |
int | None | The round number to start plotting. |
-pr , --plot_rounds |
int | 24 | The number of the communication rounds to plot. If starting_comm_round is not specified, it plots the most recent plot_rounds rounds in the simulations. |
-tr , --time_resolution |
int | 5 | The time resolution of the data, commonly 5 minutes. |
-xi , --xticks_interval |
int | 12 | Specify the number of points to plot within a communication round to ensure accurate round labeling on the x-axis for the representative plots, typically set to the same value as -tau2p . The program is hardcoded to create three xticks for subplots. |
-f , --feature |
str | 'Speed' | The feature name of the predictions used to set up the y-axis. |
Argument | Type | Default | Description |
---|---|---|---|
-sr, --start_round |
int | 1 | Starting round to plot and calculate error. Default to the 1st round of the simulation. |
-er, --end_round |
int | None | Ending round to plot and calculate error. Default to to the last round of the simulation. |
-ei, --error_interval |
int | 24 | The unit is communication rounds. Used to smooth out error curves, especially in simulations with a large number of rounds to improve plot clarity. |
To replicate our experimental results and plots, you could refer to the provided Colab file section 6 Reproduce our experiments and plots, which contains direct code for recreating our plots, tables and simulation results.
We've refactored the code for improved clarity in variable names and some writing styles of the algorithms after obtaining those running logs. Therefore, to reproduce our experiments, please check out to the commit 41a6735
by doing this first
$ git clone https://github.com/hanglearning/NeighborFL.git
$ cd NeighborFL
$ git checkout 41a6735
You can find the logs for the eight simulations used in our paper in this folder.
To recreate Figure 3, Table IV, and the MSE values of the three baselines and NeighborFL L1 under the Pretrain column in Table III, execute the following commands:
$ git checkout 41a6735
$ pip install xlsxwriter
$ python NeighborFL/plotting_code/plot_realtime_prediction_curves_and_save_err_table.py -lp "logs_on_paper/pretrain/R3" -lp2 "logs_on_paper/pretrain/L1" -sr 227 -pr 24 -r 400760_N -row 5
The resulting plots and Table IV will be saved under logs_on_paper/pretrain/L1/plots/realtime_learning_curves_all_detectors
, and the Average Device MSE values of the last 24 rounds for the three baselines and NeighborFL L1 under the Pretrain setting in Table III will be displayed in the console.
To get all the values in Table III and Table V, execute the following command:
$ git checkout 41a6735
$ python NeighborFL/plotting_code/plot_realtime_prediction_curves_and_save_err_table.py -r 400760_N -lp <see_below> -sr <see_below> -pr <see_below>
Replace -lp <see_below>
with the appropriate value based on the configuration you want to reproduce. Refer to the table below for the correct value:
Avg MSE value of Config | -lp value |
---|---|
Pretrain NeighborFL L1 | "logs_on_paper/pretrain/L1" |
Pretrain NeighborFL L3 | "logs_on_paper/pretrain/L3" |
Pretrain NeighborFL R1 | "logs_on_paper/pretrain/R1" |
Pretrain All Baselines + R3 | "logs_on_paper/pretrain/R3" |
Non-Pretrain NeighborFL L1 | "logs_on_paper/no_pretrain/L1" |
Non-Pretrain NeighborFL L3 | "logs_on_paper/no_pretrain/L3" |
Non-Pretrain NeighborFL R1 | "logs_on_paper/no_pretrain/R1" |
Non-Pretrain All Baselines + R3 | "logs_on_paper/no_pretrain/R3" |
For Table III, use -sr 227
and -pr 24
. For Table V, use -sr 1
and -pr 250
.
Please note the -r
argument is required but can have any random device ID due to a bug in the code.
To recreate Figure 4, execute the following command:
$ git checkout 41a6735
$ python NeighborFL/plotting_code/plot_realtime_errors_and_save_err_interval_table.py -lp "NeighborFL/logs_on_paper/pretrain/R3" -lp2 "NeighborFL/logs_on_paper/pretrain/L1" -ei 24 -row 5 -r 400922_S
The resulting plots will be saved under logs_on_paper/pretrain/L1/plots/realtime_errors_interval
. Note that there is also a csv file saved and some console output, but they are not used in the paper.
Execute the following command to run the non-pretrain experiments:
$ git checkout 7019541
$ python NeighborFL/main.py -dp <path/to/data> -lb "NeighborFL/logs" -c 250 -lm <lm> -ms 72 -r 1 -et MSE -ks <ks> -kr <kr> -si 2016
Replace <path/to/data>
with the absolute path to the data
folder where you've placed the downloaded dataset. Modify -lm <lm> -ks <ks> -kr <kr>
according to the following table:
Running Config | -lm |
-ks |
-kr |
---|---|---|---|
Central | 100000 | ||
NaiveFL | 010000 | ||
r-NaiveFL | 000100 | ||
NeighborFL L1 | 000001 | 3 | 1 |
NeighborFL L3 | 000001 | 3 | 3 |
NeighborFL R1 | 000001 | 1 | 1 |
NeighborFL R3 | 000001 | 1 | 3 |
1. Reproduce the Pretrained Models
Execute the following command to generate our pretrained models:
$ # git checkout 7019541
$ # python NeighborFL/pretrain.py -dp <path/to/data> -ei 2016 -sp <path/to/save/models>
Replace <path/to/data>
with the absolute path to the data
folder where you've placed the downloaded dataset. For -sp
, specify any desired path, such as NeighborFL/pretrained_models
.
It appears that running the above command on Google Colab may not produce the exact same pretrained models as those created on our local machine. We are concerned that this command may not reproduce the same models in other environments either. To reproduce our experiments precisely, please download the pretrained models we used for our experiments.
2. Reproduce Experiments with the Pretrained Models
Execute the following command to run experiments with the pretrained models:
$ git checkout 7019541
$ python NeighborFL/main.py -dp <path/to/data> -lb "NeighborFL/logs" -c 250 -lm <lm> -ms 72 -r 1 -et MSE -ks <ks> -kr <kr> -si 2016 -pp <path/to/pretrained_models>
The only difference between this command and the one in the non-pretrain experiments is the extra -pp
argument. Replace <path/to/pretrained_models>
with the path specified in -sp
in the previous step or the path containing our provided pretrained models.
To run different configurations, refer to the -lm <lm> -ks <ks> -kr <kr>
table provided in the non-pretrain experiments section.
3. Resuming a Simulation
If you need to resume a simulation in commit 41a6735
, use the log path obtained from the console output when the simulation was first executed through main.py
. Direct the -rp
argument to this path as follows:
$ git checkout 7019541
$ python NeighborFL/main.py -rp <resume/log/path>
Please note that this old commit doesn't support argument override.
(1) Our federated learning code is extended from the LSTM and GRU centralized learning methods found in this xiaochus's TrafficFlowPrediction repo. Thank you!
(2) This document was composed for free at StackEdit.