dtac-dev

This is the repository for Distributed Task-aware Compression (dtac).

Link to paper: Task-aware Distributed Source Coding under Dynamic Bandwidth
Link to blog: Blog

TLDR
Results
Installation
- Packages
- Dataset
Usage
Citation

TLDR

We design a distributed compression framework that learns low-rank task representations and efficiently distributes bandwidth among sensors to provide a trade-off between performance and bandwidth.

Abstract

Efficient compression of correlated data is essential to minimize communication overload in multi-sensor networks. In such networks, each sensor independently compresses the data and transmits them to a central node due to limited communication bandwidth. A decoder at the central node decompresses and passes the data to a pre-trained machine learning-based task to generate the final output. Thus, it is important to compress features that are relevant to the task. Additionally, the final performance depends heavily on the total available bandwidth. In practice, it is common to encounter varying availability in bandwidth, and higher bandwidth results in better performance of the task. We design a novel distributed compression framework composed of independent encoders and a joint decoder, which we call neural distributed principal component analysis (NDPCA). NDPCA flexibly compresses data from multiple sources to any available bandwidth with a single model, reducing computing and storage overhead. NDPCA achieves this by learning low-rank task representations and efficiently distributing bandwidth among sensors, thus providing a graceful trade-off between performance and bandwidth. Experiments show that NDPCA improves the success rate of multi-view robotic arm manipulation by 9% and the accuracy of object detection tasks on satellite imagery by 14% compared to an autoencoder with uniform bandwidth allocation.

System Overview

Task-aware distributed source coding with NDPCA: $X_1, \dots, X_k$ are correlated data sources. Neural encoders $E_1, \dots, E_k$ independently compress data to latent representations $Z_1, \dots, Z_k$. The proposed DPCA module, which is a linear matrix, allocates bandwidth among the sources based on their importance to the task $\Phi$.
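As a rough illustration of how a fixed bandwidth budget could be split among sources, the sketch below pools per-source singular values and keeps the largest ones. This is a minimal sketch under stated assumptions, not the code in DPCA_torch.py: the greedy rule, the function name `allocate_bandwidth`, and the example values are all hypothetical.

```python
import heapq


def allocate_bandwidth(singular_values, total_dims):
    """Greedily split total_dims latent dimensions among sources.

    singular_values: dict mapping a source name to a list of singular
    values of that source's latent representation (hypothetical input).
    Returns a dict mapping each source name to its number of kept dimensions.
    """
    # Pool every (value, source) pair, then keep the total_dims largest values.
    pooled = [
        (value, name)
        for name, values in singular_values.items()
        for value in values
    ]
    kept = heapq.nlargest(total_dims, pooled, key=lambda pair: pair[0])

    # Count how many kept dimensions belong to each source.
    allocation = {name: 0 for name in singular_values}
    for _, name in kept:
        allocation[name] += 1
    return allocation


# Made-up singular values for two views; not measured results.
example = {"side": [5.0, 2.0, 0.5], "arm": [9.0, 4.0, 3.0]}
print(allocate_bandwidth(example, 4))  # {'side': 1, 'arm': 3}
```

With these made-up values, the view with larger singular values receives more dimensions, which mirrors the unequal allocation described in the Results section.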

The framework uses neural encoders and their corresponding neural decoder to minimize the task loss, by measuring the task-performance on the reconstructed data, $\phi(\hat{X})$, and on uncompressed data, $\phi(X)$.
The neural autoencoders are encouraged to generate low-rank representations $Z$, which helps achieve a systematic trade-off between latent dimension and performance with a single model.
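
One way to write the objective described above, as a sketch only: the discrepancy measure $d$, the low-rank regularizer $R$, its weight $\lambda$, and the symbol $D$ for the joint decoder are notation assumed here for illustration and are not defined in this README.

$$
\mathcal{L} = d\big(\phi(\hat{X}),\, \phi(X)\big) + \lambda\, R(Z),
\qquad \hat{X} = D(Z_1, \dots, Z_k), \quad Z_i = E_i(X_i)
$$

The first term compares the task output on reconstructed and uncompressed data; the second stands in for whatever mechanism encourages low-rank representations $Z$.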

Results

Top: Performance Comparison for 3 different tasks. Our method achieves equal or higher performance than other methods. Bottom: Distribution of total available bandwidth (latent space) among the two views for NDPCA (ours). The unequal allocation highlights the difference in the importance of the views for a given task.

Installation

Packages

The required packages are listed in setup.py and requirements.txt. To install them, run:

pip install -r requirements.txt

Then, to install the dtac package itself in editable mode, run:

pip install -e .
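
To confirm that the editable install worked, a quick check like the following may help. It is a minimal sketch that assumes the package is importable under the name `dtac`; the helper `is_importable` is hypothetical and not part of the repo.

```python
import importlib.util


def is_importable(package_name):
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


# "dtac" is the assumed import name of the package installed above.
print("dtac importable:", is_importable("dtac"))
```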

Dataset

Locate and lift

The locate and lift experiment needs the gym package (see setup.py or requirements.txt) and mujoco. To install mujoco, see install mujoco.
To collect the demonstration dataset used to train the RL agent and the autoencoders, run the following command:

python collect_lift_hardcode.py

Airbus

The Airbus experiment needs the Airbus dataset. To download the dataset, see Airbus Aircraft Detection.
After downloading the dataset, place it in the "./airbus_dataset" folder and run "./airbus_scripts/aircraft-detection-with-yolov8.ipynb".
Then, put the output of the notebook in the folders "./airbus_dataset/224x224_overlap28_percent0.3_/train" and "./airbus_dataset/224x224_overlap28_percent0.3_/val".
Finally, augment the dataset with mosaic at "./mosaic/":

python main.py --width 224 --height 224 --scale_x 0.4 --scale_y 0.6 --min_area 500 --min_vi 0.3
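
Since the Airbus steps depend on a specific folder layout, a small check such as the one below can catch a misplaced folder early. It is a sketch that assumes it is run from the repository root; the helper `missing_dirs` is hypothetical, and only the two folder paths come from the instructions above.

```python
from pathlib import Path

# Folder paths taken from the dataset instructions above.
EXPECTED_DIRS = (
    "airbus_dataset/224x224_overlap28_percent0.3_/train",
    "airbus_dataset/224x224_overlap28_percent0.3_/val",
)


def missing_dirs(root="."):
    """Return the expected dataset folders that do not exist under root."""
    return [d for d in EXPECTED_DIRS if not (Path(root) / d).is_dir()]


missing = missing_dirs(".")
print("missing folders:", missing if missing else "none")
```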

Usage

dtac package

The dtac package contains the following modules:

ClassDAE.py: Class of Autoencoders
DPCA_torch.py: Functions of DPCA

and other common utility functions.

Locate and lift

To train an RL agent, run the following command:

python train_behavior_cloning_lift.py -v

where -v selects the view(s) given to the agent: "side", "arm", or "2image".

To train the locate and lift NDPCA, run the following command:

python train_awaDAE.py -args

See the -args examples in the main function of train_awaDAE.py file.

To evaluate autoencoder models, run the following command in the "./PnP_scripts" folder:

python eval_DAE.py -args

See the -args examples in the main function of eval_DAE.py file.

Airbus

To train the object detection (YOLO) model, run the following command:

python airbus_scripts/yolov1_train_detector.py

To train the Airbus NDPCA, run the following command in the "./airbus_scripts" folder:

python train_od_awaAE.py -args

See the -args examples in the main function of train_od_awaAE.py file.

To evaluate autoencoder models, run the following command in the "./airbus_scripts" folder:

python dpca_od_awaDAE.py -args

See the -args examples in the main function of dpca_od_awaDAE.py file.

Citation

If you find this repo useful, please cite our paper:

@inproceedings{Li2023taskaware,
      title={Task-aware Distributed Source Coding under Dynamic Bandwidth},
      author={Po-han Li and Sravan Kumar Ankireddy and Ruihan Zhao and Hossein Nourkhiz Mahjoub and Ehsan Moradi-Pari and Ufuk Topcu and Sandeep Chinchali and Hyeji Kim},
      booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
      year={2023},
      url={https://openreview.net/forum?id=EJo8lMC5cY}
}

