Fed-BioMed demonstrator for the EUCAIM project M9 milestone
First, clone the fedbiomed repository:

```shell
git clone --branch master git@github.com:fedbiomed/fedbiomed.git
export FEDBIOMED_DIR=$PWD/fedbiomed
```
You may find deployment instructions for Fed-BioMed at the following link. Specifically, you should only follow the instructions under the heading "Deploy on the node side". They amount to:

- building one docker image (you only need to build the `node` image, and may skip the `gui` image)
- copying a VPN configuration file to a predefined path/name in the docker container
- generating a public VPN key

To obtain the VPN configuration file, please contact francesco.cremonesi@inria.fr. Once the public key has been generated, please send it to francesco.cremonesi@inria.fr so it can be registered on the VPN server.
Some additional information:

- the documentation says to check out the `master` branch of fedbiomed. You may alternatively check out the `v4.4.4` tag (or any higher available tag, if we push more in the meantime). In any case, the `master` branch is fine, since it is up to date with the latest hotfixes required for correctly building the docker images
- using the GUI is fully optional; if you don't want it, there is obviously no need to build the corresponding image
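Pinning to a release tag is a one-line checkout inside the clone. The sketch below illustrates it on a throwaway local repository so it can be tried offline; on the real clone you would simply run `git checkout v4.4.4` inside the fedbiomed directory:

```shell
# Throwaway repository with a single tagged commit, used only to illustrate
# checking out a pinned tag (the real command targets the fedbiomed clone)
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial commit"
git -C "$repo" tag v4.4.4

# Checking out a tag leaves the working tree in a detached-HEAD state at that release
git -C "$repo" checkout -q v4.4.4
git -C "$repo" describe --tags   # prints: v4.4.4
rm -rf "$repo"
```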
Copy the node's configuration provided in the repo to the fedbiomed directory:

```shell
data_provider=bsc # other options: ub, forth
cp etc/${data_provider}.ini fedbiomed/envs/vpn/docker/node/run_mounts/etc
```
Clone the data directory:

```shell
git clone git@github.com:EUCAIM/demo_ml_data.git fedbiomed/envs/vpn/docker/node/run_mounts/data/demo_ml_data
```
Copy the dataset configuration files to the appropriate location (each data provider has a specific file):

```shell
data_provider=bsc # other options: ub, forth
cp ../demo_ml/${data_provider}.json fedbiomed/envs/vpn/docker/node/run_mounts/data/demo_ml_data/
```
Then start your node:

```shell
cd ${FEDBIOMED_DIR}/envs/vpn/docker
docker-compose exec -u $(id -u) node bash -ci 'export MPSPDZ_IP=$VPN_IP && export MPSPDZ_PORT=14001 && export MQTT_BROKER=10.220.0.2 && export MQTT_BROKER_PORT=1883 && export UPLOADS_URL="http://10.220.0.3:8000/upload/" && export PYTHONPATH=/fedbiomed && export FEDBIOMED_NO_RESET=1 && eval "$(conda shell.bash hook)" && conda activate fedbiomed-node && bash'
```
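For reference, the long `exec` line above only sets the following environment variables before activating the `fedbiomed-node` conda environment; laid out one per line, they are:

```shell
# Environment expected by the node, exactly as set in the exec command above
export MPSPDZ_IP=$VPN_IP                              # MP-SPDZ bound to the node's VPN address
export MPSPDZ_PORT=14001
export MQTT_BROKER=10.220.0.2                         # MQTT broker, reachable over the VPN
export MQTT_BROKER_PORT=1883
export UPLOADS_URL="http://10.220.0.3:8000/upload/"   # file upload endpoint on the VPN
export PYTHONPATH=/fedbiomed
export FEDBIOMED_NO_RESET=1
```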
This will open a shell in the container. From that shell you can add a dataset (make sure `data_provider` is also defined inside the container shell, since it is not inherited from the host):

```shell
./scripts/fedbiomed_run node config ${data_provider}.ini --add-dataset-from-file /data/demo_ml_data/${data_provider}.json
```
Start the node in the background:

```shell
nohup ./scripts/fedbiomed_run node config ${data_provider}.ini start >./fedbiomed_node.out &
```
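To check that the backgrounded node is still alive, the usual `kill -0`/log-tail pattern applies. In this sketch a `sleep` stands in for the long-running `fedbiomed_run` process:

```shell
# Stand-in for the long-running node process started above
nohup sleep 30 >./fedbiomed_node.out 2>&1 &
node_pid=$!

# kill -0 probes for existence without delivering a signal
if kill -0 "$node_pid" 2>/dev/null; then
    echo "node process $node_pid is running"
fi

# On a real node, follow the log to watch start-up messages
tail -n 5 ./fedbiomed_node.out

kill "$node_pid" && rm -f ./fedbiomed_node.out   # cleanup for this sketch only
```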
For the deep learning demo, clone the data directory (same repository, different target folder):

```shell
git clone git@github.com:EUCAIM/demo_ml_data.git fedbiomed/envs/vpn/docker/node/run_mounts/data/demo_dl_data
```
Download the dataset from kaggle, and extract its contents. Then you need to change the directory structure:

```shell
PATH_TO_DOWNLOADED_DATA=<input here the path to where you extracted the archived data>
CSV_FILE_ID=2_1 # 2_1 for bsc, 2_2 for forth

# create the target folders if they do not exist yet
mkdir -p fedbiomed/envs/vpn/docker/node/run_mounts/data/chest_xray/PNEUMONIA
mkdir -p fedbiomed/envs/vpn/docker/node/run_mounts/data/chest_xray/NORMAL

while read -r img; do
    mv "$PATH_TO_DOWNLOADED_DATA/chest_xray/train/PNEUMONIA/$img" "fedbiomed/envs/vpn/docker/node/run_mounts/data/chest_xray/PNEUMONIA/$img"
done < <(tail -n +2 fedbiomed/envs/vpn/docker/node/run_mounts/data/demo_dl_data/data_ids/two_datasites_scenario/train.pnm.${CSV_FILE_ID}.csv)

while read -r img; do
    mv "$PATH_TO_DOWNLOADED_DATA/chest_xray/train/NORMAL/$img" "fedbiomed/envs/vpn/docker/node/run_mounts/data/chest_xray/NORMAL/$img"
done < <(tail -n +2 fedbiomed/envs/vpn/docker/node/run_mounts/data/demo_dl_data/data_ids/two_datasites_scenario/train.nrm.${CSV_FILE_ID}.csv)
```
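The pattern in these loops (skip the CSV header row with `tail -n +2`, then move every listed file) can be tried safely on a synthetic fixture. All paths and file names below are throwaway temporaries, not the real dataset layout; an equivalent pipe form of the loop is used here:

```shell
# Throwaway fixture: two fake images listed in a CSV with a header row
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
touch "$work/src/img_001.jpeg" "$work/src/img_002.jpeg"
printf 'filename\nimg_001.jpeg\nimg_002.jpeg\n' > "$work/train.csv"

# tail -n +2 drops the header line; the loop moves each listed image
tail -n +2 "$work/train.csv" | while read -r img; do
    mv "$work/src/$img" "$work/dst/$img"
done

ls "$work/dst"   # lists img_001.jpeg and img_002.jpeg
rm -rf "$work"
```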
Copy the dataset configuration file to the appropriate location (the same file for all data providers):

```shell
cp ../demo_dl/dataset.json fedbiomed/envs/vpn/docker/node/run_mounts/data/demo_dl_data/
```
Then start your node:

```shell
cd ${FEDBIOMED_DIR}/envs/vpn/docker
docker-compose exec -u $(id -u) node bash -ci 'export MPSPDZ_IP=$VPN_IP && export MPSPDZ_PORT=14001 && export MQTT_BROKER=10.220.0.2 && export MQTT_BROKER_PORT=1883 && export UPLOADS_URL="http://10.220.0.3:8000/upload/" && export PYTHONPATH=/fedbiomed && export FEDBIOMED_NO_RESET=1 && eval "$(conda shell.bash hook)" && conda activate fedbiomed-node && bash'
```
This will open a shell in the container. From that shell you can add a dataset (make sure `data_provider` is also defined inside the container shell, since it is not inherited from the host):

```shell
./scripts/fedbiomed_run node config ${data_provider}.ini --add-dataset-from-file /data/demo_dl_data/dataset.json
```
Start the node in the background:

```shell
nohup ./scripts/fedbiomed_run node config ${data_provider}.ini start >./fedbiomed_node.out &
```
First, clone fedbiomed if you haven't done so yet:

```shell
git clone --branch master git@github.com:fedbiomed/fedbiomed.git
export FEDBIOMED_DIR=$PWD/fedbiomed
cd ${FEDBIOMED_DIR}/envs/vpn/docker
```
Build the container:

```shell
${FEDBIOMED_DIR}/scripts/fedbiomed_vpn build researcher
```
Copy the VPN configuration file that was provided to you via email to the appropriate location:

```shell
cp config.env ./researcher/run_mounts/config/config.env
```
Retrieve the public key:

```shell
docker-compose exec researcher wg show wg0 public-key | tr -d '\r' >/tmp/publickey-researcher
```
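The `tr -d '\r'` matters because `docker-compose exec` may allocate a TTY that terminates lines with CRLF; a stray carriage return inside the key file would break registration. A minimal illustration of the stripping step (the key string is a hypothetical stand-in, not a real WireGuard key):

```shell
# Simulate CRLF output as produced by a TTY; $raw ends with a carriage return
raw=$(printf 'hypothetical-key==\r\n')

# tr -d '\r' removes the carriage return, leaving a clean single-line key
clean=$(printf '%s\n' "$raw" | tr -d '\r')
printf '%s\n' "$clean"   # prints: hypothetical-key==
```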
Send the public key via email to francesco.cremonesi@inria.fr.
Create the fedbiomed configuration for the researcher:

```shell
docker-compose exec -u $(id -u) researcher bash -ci 'export MPSPDZ_IP=$VPN_IP && export MPSPDZ_PORT=14001 && export MQTT_BROKER=10.220.0.2 && export MQTT_BROKER_PORT=1883 && export UPLOADS_URL="http://10.220.0.3:8000/upload/" && export PYTHONPATH=/fedbiomed && export FEDBIOMED_NO_RESET=1 && eval "$(conda shell.bash hook)" && conda activate fedbiomed-researcher && ./scripts/fedbiomed_run researcher configuration create'
```
Copy the training files to the appropriate location:

```shell
mkdir -p ${FEDBIOMED_DIR}/envs/vpn/docker/researcher/run_mounts/samples/demo_ml/
mkdir -p ${FEDBIOMED_DIR}/envs/vpn/docker/researcher/run_mounts/samples/demo_dl/
cp ${EUCAIM_DEMO_DIR}/demo_ml/federated_training.py ${FEDBIOMED_DIR}/envs/vpn/docker/researcher/run_mounts/samples/demo_ml/
cp ${EUCAIM_DEMO_DIR}/demo_dl/federated_training.py ${FEDBIOMED_DIR}/envs/vpn/docker/researcher/run_mounts/samples/demo_dl/
```
Finally, open a shell in the researcher container:

```shell
docker-compose exec -u $(id -u) researcher bash -ci 'export MPSPDZ_IP=$VPN_IP && export MPSPDZ_PORT=14000 && export MQTT_BROKER=10.220.0.2 && export MQTT_BROKER_PORT=1883 && export UPLOADS_URL="http://10.220.0.3:8000/upload/" && export PYTHONPATH=/fedbiomed && export FEDBIOMED_NO_RESET=1 && eval "$(conda shell.bash hook)" && conda activate fedbiomed-researcher && bash'
```
From within the container, you may run the federated training (replace `demo_ml` with `demo_dl` for the deep learning demo):

```shell
python /fedbiomed/notebooks/samples/demo_ml/federated_training.py
```