SDNet — Revisiting Sparse Convolutional Model for Visual Recognition

This repository contains the official PyTorch implementation of the paper: Xili Dai*, Mingyang Li*, Pengyuan Zhai, Shengbang Tong, Xingjian Gao, Shao-Lun Huang, Zhihui Zhu, Chong You, Yi Ma. "Revisiting Sparse Convolutional Model for Visual Recognition". NeurIPS 2022.

Introduction

Despite strong empirical performance for image classification, deep neural networks are often regarded as "black boxes" that are difficult to interpret. On the other hand, sparse convolutional models, which assume that a signal can be expressed by a linear combination of a few elements from a convolutional dictionary, are powerful tools for analyzing natural images with good theoretical interpretability and biological plausibility. However, such principled models have not demonstrated competitive performance when compared with empirically designed deep networks. This paper revisits sparse convolutional modeling for image classification and bridges the gap between the good empirical performance of deep learning and the good interpretability of sparse convolutional models. Our method uses differentiable optimization layers, defined from convolutional sparse coding, as drop-in replacements for standard convolutional layers in conventional deep neural networks. We show that such models have equally strong empirical performance on CIFAR-10, CIFAR-100, and ImageNet when compared to conventional neural networks. By leveraging the stable recovery property of sparse modeling, we further show that such models can be much more robust to input corruptions as well as adversarial perturbations at test time, through a simple trade-off between the sparse regularization and data reconstruction terms.
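
To make the idea concrete, below is a minimal sketch of such an optimization layer: convolutional sparse coding solved by a few unrolled ISTA (iterative shrinkage-thresholding) steps in PyTorch. All names here (CSCLayer, lam, num_steps) are illustrative, not this repository's actual API; see Lib/models for the real implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CSCLayer(nn.Module):
    """Illustrative convolutional sparse coding layer (unrolled ISTA).

    Approximately solves  min_z 0.5*||x - D^T z||^2 + lam*||z||_1,
    where D is a learnable convolutional dictionary, and returns the
    sparse code z as the layer output (a drop-in conv replacement).
    Names and defaults are hypothetical, not the repository's API.
    """

    def __init__(self, in_channels, code_channels, kernel_size=3, num_steps=2, lam=0.1):
        super().__init__()
        self.D = nn.Parameter(0.01 * torch.randn(code_channels, in_channels, kernel_size, kernel_size))
        self.num_steps = num_steps
        self.lam = lam
        self.step = 0.1  # ISTA step size; in practice tied to the dictionary norm

    def forward(self, x):
        pad = self.D.shape[-1] // 2
        z = F.conv2d(x, self.D, padding=pad)  # analysis pass initializes the code
        for _ in range(self.num_steps):
            recon = F.conv_transpose2d(z, self.D, padding=pad)  # synthesis: D^T z
            grad = F.conv2d(recon - x, self.D, padding=pad)     # gradient of the reconstruction term
            v = z - self.step * grad
            z = torch.sign(v) * torch.clamp(v.abs() - self.step * self.lam, min=0.0)  # soft-threshold
        return z

layer = CSCLayer(in_channels=3, code_channels=64)
codes = layer(torch.randn(8, 3, 32, 32))  # -> (8, 64, 32, 32) sparse feature map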

Reproducing Results

Installation for Reproducibility

For ease of reproducibility, we suggest you install Miniconda (or Anaconda if you prefer) before executing the following commands.

git clone https://github.com/Delay-Xili/SDNet
cd SDNet
conda create -y -n sdnet
conda activate sdnet
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
mkdir data logs
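
As a quick sanity check that the environment is set up correctly (not part of the repository's scripts), the following one-liner should report the PyTorch version, CUDA 10.2 per the install above, and True on a GPU machine:

python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"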

Training

To retrain the networks from scratch on your own machine, execute the following commands:

SDNet (CIFAR-10 & CIFAR-100 & ImageNet)

CUDA_VISIBLE_DEVICES=0 python train.py --cfg experiments/cifar10.yaml --dir_phase cifar10_sdnet18 LOG_DIR path/to/your/current/project
CUDA_VISIBLE_DEVICES=0 python train.py --cfg experiments/cifar100.yaml --dir_phase cifar100_sdnet18 LOG_DIR path/to/your/current/project
CUDA_VISIBLE_DEVICES=0 python train.py --cfg experiments/ImageNet.yaml --dir_phase imagenet_sdnet18 LOG_DIR path/to/your/current/project

Some hyper-parameters can be changed directly in the corresponding xxx.yaml file. We ran the experiments on an NVIDIA RTX 3090 with 24 GB of memory. Adjust CUDA_VISIBLE_DEVICES based on your available GPUs.

Note: this codebase supports multi-GPU training, but it is not stable in that setting. If you want to train on multiple GPUs, be careful with the hyper-parameters and re-run the grid search.

SDNet-all (CIFAR-10 & CIFAR-100 & ImageNet)

CUDA_VISIBLE_DEVICES=0 python train.py --cfg experiments/cifar10.yaml --dir_phase cifar10_sdnet18_all LOG_DIR path/to/your/current/project MODEL.NAME sdnet18_all
CUDA_VISIBLE_DEVICES=0 python train.py --cfg experiments/cifar100.yaml --dir_phase cifar100_sdnet18_all LOG_DIR path/to/your/current/project MODEL.NAME sdnet18_all
CUDA_VISIBLE_DEVICES=0 python train.py --cfg experiments/ImageNet.yaml --dir_phase imagenet_sdnet18_all LOG_DIR path/to/your/current/project MODEL.NAME sdnet18_all TRAIN.BATCH_SIZE 256

The trained SDNet-all models can be used to visualize the hidden features (see the command in the Visualizing the hidden features section below). Note: there is also a no-shortcut version of SDNet-18-all; you can re-train it by appending MODEL.SHORTCUT False to the SDNet-all commands above.

Pre-trained Models

You can download our trained models from the following links:

| Datasets    | Models       | Shortcut | Accuracy | Results |
|-------------|--------------|----------|----------|---------|
| CIFAR-10    | SDNet-18     | yes      | 95.34%   | link    |
| CIFAR-10    | SDNet-18-all | yes      | 95.02%   | link    |
| CIFAR-10    | SDNet-18-all | no       | 94.89%   | link    |
| CIFAR-100   | SDNet-18     | yes      | 78.45%   | link    |
| CIFAR-100   | SDNet-18-all | yes      | 77.89%   | link    |
| CIFAR-100   | SDNet-18-all | no       | 77.28%   | link    |
| ImageNet-1k | SDNet-18     | yes      | 69.78%   | link    |
| ImageNet-1k | SDNet-18-all | yes      | TBD      |         |
| ImageNet-1k | SDNet-18-all | no       | 69.03%   | link    |

Each link contains the corresponding results, consisting of checkpoint files, training log files, and TensorBoard files.
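
The checkpoints are standard PyTorch files, so they can be inspected before use. A hedged sketch (the exact checkpoint layout, e.g. whether the weights sit under a 'state_dict' key, may differ from this assumption):

import torch

ckpt = torch.load("path/to/the/ckpt/model_best.pth.tar", map_location="cpu")
state = ckpt.get("state_dict", ckpt)  # fall back if the weights are stored flat
print(len(state), "tensors; first key:", next(iter(state)))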

Robustness Testing on CIFAR-10-C

CUDA_VISIBLE_DEVICES=0 python robust_inference.py --cfg experiments/cifar10.yaml --dir_phase cifar10_sdnet18 LOG_DIR path/to/your/current/project TRAIN.MODEL_FILE path/to/the/ckpt/model_best.pth.tar

The robust inference results for CIFAR-10 can be found in the Results link in the first row of the table above.
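
The mechanism behind this robustness is the trade-off noted in the Introduction: raising the sparsity penalty relative to the reconstruction term at test time yields more conservative codes under corruption. A hedged illustration, reusing the CSCLayer sketch from the Introduction (the real script exposes this through its config, under names that likely differ):

import torch
import torch.nn as nn

# Assumes the illustrative CSCLayer defined in the Introduction sketch above.
model = nn.Sequential(CSCLayer(3, 64), nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10))
for m in model.modules():
    if isinstance(m, CSCLayer):
        m.lam *= 2.0  # trade reconstruction fidelity for sparsity at test time
logits = model(torch.randn(4, 3, 32, 32))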

Visualizing the hidden features

CUDA_VISIBLE_DEVICES=0 python visualize.py --cfg experiments/cifar10.yaml --dir_phase cifar10_sdnet18_all_no_shortcut/viz MODEL.NAME sdnet18_all MODEL.SHORTCUT False TRAIN.MODEL_FILE logs/cifar10_sdnet18_all_no_shortcut/model_best.pth.tar

You can also directly download the visualization results of the hidden features from the Results link in the third row of the table above.

Sparse Coding layer for 1D Case

We also offer a 1D version of the sparse coding layer for applications such as audio or language:

cd Lib/models
python msc_layer.py
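
The 1D layer follows the same unrolled scheme with Conv1d in place of Conv2d. A minimal hedged illustration of one ISTA step on a waveform-shaped input (names and constants hypothetical; see msc_layer.py for the actual implementation):

import torch
import torch.nn.functional as F

x = torch.randn(2, 1, 16000)        # batch of mono signals
D = 0.01 * torch.randn(64, 1, 9)    # hypothetical 1D convolutional dictionary
z = F.conv1d(x, D, padding=4)       # analysis pass initializes the code
grad = F.conv1d(F.conv_transpose1d(z, D, padding=4) - x, D, padding=4)
v = z - 0.1 * grad
z = torch.sign(v) * torch.clamp(v.abs() - 0.01, min=0.0)  # soft-threshold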

Citation

If you find SDNet useful in your research, please consider citing:

@article{dai2022revisiting,
  title={Revisiting Sparse Convolutional Model for Visual Recognition},
  author={Dai, Xili and Li, Mingyang and Zhai, Pengyuan and Tong, Shengbang and Gao, Xingjian and Huang, Shao-Lun and Zhu, Zhihui and You, Chong and Ma, Yi},
  journal={arXiv preprint arXiv:2210.12945},
  year={2022}
}

License

This project is released under the MIT License; see LICENSE for details.
