twoentartian / DFL

This project has been moved to DFL2

DFL

What is DFL?

DFL is a blockchain framework specially optimized for, and integrated with, federated machine learning. In DFL, all contributions are reflected in improvements of model accuracy, and the blockchain database works as a proof of contribution rather than as a distributed ledger.

Toolchain

Here are two tested toolchain configurations.

General x86-64

  • Ubuntu 20
  • GCC 9.3.0
  • CMake 3.16
  • Boost 1.76
  • CUDA 10.2 (optional)

Arm AArch64: Nvidia Jetson Nano 2GB

  • Ubuntu 18 (Official Jetson image)
  • GCC 9.4.0
  • CMake 3.16
  • Boost 1.76
  • CUDA 10.2 (under testing)
  • CuDNN 8 (under testing)

Dependency

  • Caffe: DFL uses Caffe as the machine learning backend; CUDA support is still under testing.
  • Boost 1.76
  • nlohmann/json
  • RocksDB
  • Lz4
  • OpenSSL

Getting started

For deployment

  1. Install CMake and GCC with C++17 support.

  2. You can install the above dependencies by executing the shell scripts in the shell folder. In most cases, you should execute these scripts in the following order:

    If you are going to deploy DFL to a Jetson Nano, you must execute two additional scripts:

  3. Compile the DFL executable (the source code is in DFL.cpp; you can find everything you need in the CMake configuration), which will start a node in the DFL network. There are several tools that we recommend building; they are listed below:

    • Keys generator: generates the private and public keys used in the configuration file.
  4. Compile your own "reputation algorithm", which defines how ML models are updated and how other nodes' reputations are updated. This implementation is critical for handling different dataset distributions and malicious-node ratios. We provide four sample "reputation algorithms" here.

  5. Run the DFL executable; it will generate a sample configuration file for you.

  6. Modify the configuration file as you wish: for example, the peers, node address, private key, and public key. Note that batch_size and test_batch_size must be identical to the Caffe solver's configuration. Here is an explanation file for the configuration (a configuration-loading sketch follows this list).

  7. DFL receives the ML dataset over the network. There is an executable called data_injector for the MNIST dataset; use it to inject the dataset into DFL. The current version of data_injector only supports I.I.D. dataset injection (an I.I.D. split sketch follows this list).

  8. DFL trains the model once it has received enough data for training, and sends the trained model as a transaction to other nodes. A node generates a block once it has produced enough transactions, and performs FedAvg once it has received enough models from other nodes (a FedAvg sketch follows this list).
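
A minimal sketch of how the configuration in step 6 might be loaded, assuming a JSON file (nlohmann/json is a dependency). The file name and all field names except batch_size and test_batch_size are assumptions for illustration only; consult the explanation file for the real keys.

```cpp
// Illustrative only: reads a hypothetical DFL node configuration with
// nlohmann/json. Field names such as "node_address" and "peers" are
// assumptions; only batch_size / test_batch_size are named in this README,
// and they must match the Caffe solver's configuration.
#include <fstream>
#include <iostream>
#include <nlohmann/json.hpp>

int main() {
    std::ifstream in("dfl_config.json");              // hypothetical file name
    nlohmann::json cfg = nlohmann::json::parse(in);

    std::cout << "node address:    " << cfg.at("node_address") << '\n'
              << "peer count:      " << cfg.at("peers").size() << '\n'
              << "batch_size:      " << cfg.at("batch_size") << '\n'
              << "test_batch_size: " << cfg.at("test_batch_size") << '\n';
}
```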
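
For step 7, "I.I.D. dataset injection" means every node receives a statistically identical slice of the data. The sketch below is not data_injector's actual implementation, just an illustration of an I.I.D. split: shuffle the sample indices and deal them out evenly across nodes.

```cpp
// Illustrative I.I.D. split: shuffle sample indices, then deal them evenly
// to the nodes so each node's slice follows the same distribution.
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

std::vector<std::vector<std::size_t>> iid_split(std::size_t n_samples,
                                                std::size_t n_nodes) {
    std::vector<std::size_t> idx(n_samples);
    std::iota(idx.begin(), idx.end(), 0);
    std::shuffle(idx.begin(), idx.end(), std::mt19937{42});

    std::vector<std::vector<std::size_t>> parts(n_nodes);
    for (std::size_t i = 0; i < n_samples; ++i)
        parts[i % n_nodes].push_back(idx[i]);
    return parts;
}
```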
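
For step 8, the averaging step of FedAvg can be sketched as below, assuming each received model is flattened into a std::vector<float>; DFL's real implementation works on Caffe models rather than raw float vectors.

```cpp
// Minimal FedAvg sketch: element-wise average of the models received from
// peers, each flattened into a std::vector<float>. Purely illustrative.
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<float> fed_avg(const std::vector<std::vector<float>>& models) {
    assert(!models.empty());
    std::vector<float> avg(models.front().size(), 0.0f);
    for (const auto& m : models) {
        assert(m.size() == avg.size());
        for (std::size_t i = 0; i < avg.size(); ++i)
            avg[i] += m[i];
    }
    for (float& w : avg)
        w /= static_cast<float>(models.size());
    return avg;
}
```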

For simulation

  1. Perform steps 1, 2 and 4 of the deployment instructions.

  2. Compile DFL_Simulator_mt (source file: simulator_mt.cpp). This version has multi-threading optimization.

    Some tools:

  3. Run the simulator; it generates a sample configuration file and starts the simulation immediately. Use Ctrl+C to exit.

  4. Modify the configuration file with the help of this explanation file.

  5. The simulator automatically creates an output folder, named with the current time, in the executable's path. The configuration file and reputation dll are also copied to the output folder so the output can be easily reproduced.

We provide a sample simulation output folder here; you can reuse the reputation dll and the configuration. This configuration contains 5 nodes (1 observer), all of which use an I.I.D. dataset. Please note that this configuration uses HalfFedAvg (output model = 50% previous model + 50% FedAvg output) because there are no malicious nodes.
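
Under the same flattened-weights assumption as the FedAvg sketch above, HalfFedAvg is simply a 50/50 blend of the previous model and the FedAvg result:

```cpp
// HalfFedAvg as described above: output = 0.5 * previous + 0.5 * FedAvg result.
// Purely illustrative; weight layout and types are assumptions.
#include <cstddef>
#include <vector>

std::vector<float> half_fed_avg(const std::vector<float>& previous,
                                const std::vector<float>& fedavg_output) {
    std::vector<float> out(previous.size());
    for (std::size_t i = 0; i < previous.size(); ++i)
        out[i] = 0.5f * previous[i] + 0.5f * fedavg_output[i];
    return out;
}
```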

Reputation algorithm SDK API:

Please refer to this link for a sample reputation algorithm. The SDK API is not written yet.
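
Because the SDK API is not written yet, the sketch below is only a guess at the shape of a reputation-algorithm plugin: a class compiled into the "reputation dll" that decides how to merge peers' models and how to adjust their reputations. Every name and signature here is hypothetical; see the sample reputation algorithms for the real interface.

```cpp
// Hypothetical reputation-algorithm plugin interface; the real DFL SDK API is
// not documented yet, so every name and signature here is an assumption.
#include <string>
#include <unordered_map>
#include <vector>

using model_weights = std::vector<float>;

class reputation_algorithm {
public:
    virtual ~reputation_algorithm() = default;

    // Merge the locally trained model with models received from peers,
    // weighting each peer's contribution by its current reputation.
    virtual model_weights update_model(
        const model_weights& local_model,
        const std::unordered_map<std::string, model_weights>& peer_models,
        const std::unordered_map<std::string, double>& reputation) = 0;

    // Adjust each peer's reputation, e.g. based on how its model changed
    // accuracy on a locally held test set.
    virtual void update_reputation(
        std::unordered_map<std::string, double>& reputation,
        const std::unordered_map<std::string, double>& measured_accuracy) = 0;
};

// A plugin would typically be built as a shared library and expose a factory
// function such as:
//   extern "C" reputation_algorithm* create_algorithm();
```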

Future work:

  • A tool for large-scale DFL deployment (50+ nodes) is on its way (50% complete). The Introducer is under testing.

For more details

https://arxiv.org/pdf/2110.15457.pdf

Languages

C++ 91.0%, CMake 4.0%, Python 3.1%, C 1.1%, Shell 0.8%