yuchenzhao / iotdi22-mmfl

This repository contains the source code and data of the paper Multimodal Federated Learning on IoT Data.

Directory structure

  • src: source code
  • data: pre-processed data
  • config: configuration files for experiments
  • sub: PBS files for job submission to clusters
  • results: experimental results
  • plots: data visualizations of results

Prerequisites

To fully reproduce the results and plots in the paper, use the provided configuration files to run replicates of individual experiments in parallel on a cluster that supports MPI. Your cluster needs the following modules available to support the experiments (a quick version check is sketched after the list).

  • Anaconda3
  • Python 3.7+
  • PyTorch 1.8+
  • Torchvision 0.9+
  • NumPy 1.19+
  • Matplotlib 3.3+
  • SciPy 1.4+
  • mpi4py (MPI for Python) 3.0+
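One quick way to verify that these packages are available in your environment is a short version check; a minimal sketch (the script name check_env.py is only an illustration):

    # check_env.py -- print the versions of the packages required by the experiments
    import torch, torchvision, numpy, matplotlib, scipy, mpi4py

    for mod in (torch, torchvision, numpy, matplotlib, scipy, mpi4py):
        print(f"{mod.__name__}: {mod.__version__}")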

Pre-processed data

The data directory contains the pre-processed data for the experiments. You can download the data as .mat files from https://drive.google.com/drive/folders/1rWJYkfMavGs1F-H0jykJ5V0fIiwrQdJV?usp=sharing. Alternatively, you can generate the pre-processed data from the original datasets, as described below.
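If you just want to inspect a downloaded file, scipy.io can read the .mat format directly; a minimal sketch (the variable names stored in each file depend on the dataset and are not documented here):

    # inspect_mat.py -- list the variables stored in a pre-processed .mat file
    from scipy.io import loadmat

    data = loadmat("data/opp/opp.mat")
    for key, value in data.items():
        if not key.startswith("__"):  # skip .mat header metadata
            print(key, getattr(value, "shape", type(value)))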

For the Opp dataset

  1. Download the dataset, which is publicly available at http://www.opportunity-project.eu/system/files/Challenge/OpportunityChallengeLabeled.zip.
  2. Extract all the .dat files into the data/opp directory.
  3. Uncomment the gen_opp("data") line in src/utils.py and run python3 src/utils.py (this takes a few minutes). A script that runs all three generators at once is sketched after the UR Fall steps.
  4. Once it completes, you should find a generated opp.mat file in data/opp/.

For the mHealth dataset

  1. Download the dataset, which is publicly available at https://archive.ics.uci.edu/ml/machine-learning-databases/00319/MHEALTHDATASET.zip.
  2. Extract all the .log files into the data/mhealth directory.
  3. Uncomment the gen_mhealth("data") line in src/utils.py and run python3 src/utils.py (this takes a few minutes).
  4. Once it completes, you should find a generated mhealth.mat file in data/mhealth/.

For the UR Fall dataset

  1. Uncomment the download_UR_fall() and gen_ur_fall("data") lines in src/utils.py.
  2. Run python3 src/utils.py to download the raw data and generate the pre-processed data (this takes a few minutes).
  3. Once it completes, you should find a generated ur_fall.mat file in data/ur_fall/.
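Instead of editing src/utils.py once per dataset, you can also call the generator functions directly from a short script; a minimal sketch, assuming the functions are importable from src/utils.py and that the raw Opp and mHealth files are already in place:

    # gen_all.py -- run all three pre-processing steps in one go (from the repo root)
    import sys
    sys.path.append("src")   # make utils.py importable
    from utils import gen_opp, gen_mhealth, download_UR_fall, gen_ur_fall

    gen_opp("data")          # needs the .dat files in data/opp/
    gen_mhealth("data")      # needs the .log files in data/mhealth/
    download_UR_fall()       # fetches the UR Fall raw data
    gen_ur_fall("data")      # writes the pre-processed ur_fall.mat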

Configuration files

The config/ directory contains the configuration files for the experiments. Each file describes the parameters of one individual experiment; config/config_example explains all of the parameters.

Job submission files

The sub/ directory contains the job submission files for HPC clusters. Each file uses mpirun to run 64 replicates of a simulation with a specific configuration file. You may need to adjust the walltime, the number of nodes and CPUs, and the memory size (i.e., the first 3 lines) according to your cluster's policy. Each file uses module load to load an Anaconda3 module named "anaconda3/personal" and an MPI module named "mpi" for the simulations; change these to suit your cluster's environment if necessary. $PBS_O_WORKDIR is the absolute path of the working directory from which qsub was invoked.
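Each of the 64 processes launched by mpirun runs one independent replicate, distinguished by its MPI rank; a minimal sketch of the pattern (the actual entry point and arguments in src/ may differ):

    # replicate.py -- one process per replicate when launched with mpirun -n 64
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # 0..63; e.g. seed one replicate per rank
    size = comm.Get_size()   # 64 in the provided job files

    print(f"replicate {rank} of {size} starting")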

Instructions

  1. Create a conda environment (with Python 3.7+) named deep-learning on your HPC cluster.
  2. Install pytorch, torchvision, numpy, matplotlib, scipy, and mpi4py in the deep-learning environment.
  3. Use the provided .sh files to run groups of experiments. For example, exp_opp.sh uses qsub to submit all the PBS files of the Opp experiments to your cluster's queue system.
  4. With the provided configuration files, the results are written to the results/ directory.
  5. Once all experiments have completed, run python3 src/analysis.py to output data visualizations of the results into the plots/ directory.

License

MIT License

