Deep Connected Attention Networks (DCANet)

By anonymous ECCV 2020 submission. Paper ID: 4574

Illustration

Figure 1. Illustration of our DCANet. We visualize intermediate feature activations using Grad-CAM. Vanilla SE-ResNet50 shifts its focus dramatically between stages. In contrast, our DCA-enhanced SE-ResNet50 adjusts its focus progressively and recursively, and pays close attention to the target object.

Approach

Figure 2. An overview of our Deep Connected Attention Network. We connect the output of the transformation module in the previous attention block to the output of the extraction module in the current attention block. When an attention block has multiple attention dimensions, we connect the attentions along each dimension separately. Here we show an example with two attention dimensions; the scheme extends to more dimensions.
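The connection described in Figure 2 can be sketched for a single attention dimension with an SE-style channel attention block. This is an illustrative sketch, not the code in this repository: the class name `DCASEBlock` and the arguments `prev_channels` and `reduction` are hypothetical, and merging by element-wise addition after a linear channel-matching layer is an assumption about how the two outputs are combined.

```python
import torch
import torch.nn as nn


class DCASEBlock(nn.Module):
    """SE-style channel attention with a hypothetical DCA connection.

    Assumption: the previous block's transformation output is merged with the
    current extraction output by element-wise addition. The repository may
    combine them differently (for example, with learned weights).
    """

    def __init__(self, channels, prev_channels=None, reduction=16):
        super().__init__()
        # Extraction: global average pooling over the spatial dimensions.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Transformation: bottleneck MLP over the channel descriptor.
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Optional projection so the previous signal matches the channel count.
        self.match = None
        if prev_channels is not None and prev_channels != channels:
            self.match = nn.Linear(prev_channels, channels)

    def forward(self, x, prev=None):
        b, c, _, _ = x.shape
        g = self.pool(x).view(b, c)          # extraction output, shape (B, C)
        if prev is not None:
            if self.match is not None:
                prev = self.match(prev)      # align channel dimension
            g = g + prev                     # connect previous transformation output
        t = self.fc(g)                       # transformation output, passed onward
        out = x * torch.sigmoid(t).view(b, c, 1, 1)  # rescale feature channels
        return out, t


# Chain two blocks: the second receives the first block's transformation output.
block1 = DCASEBlock(64)
block2 = DCASEBlock(128, prev_channels=64)
y1, t1 = block1(torch.randn(2, 64, 8, 8))
y2, t2 = block2(torch.randn(2, 128, 4, 4), prev=t1)
```

Returning the transformation output alongside the rescaled features lets the next block consume it without global state. Whether the pre-sigmoid or post-sigmoid tensor should be passed on is also an assumption here.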

Implementation

In this repository, all models are implemented in PyTorch.

We use the same standard data augmentation strategies as ResNet.

To reproduce our DCANet results, please refer to Usage.md.

Trained Models

😊 All trained models and training log files have been uploaded to an anonymous Google Drive.

😊 We provide the corresponding links in the "Download" columns.

Table 1. Single-crop classification accuracy (%) on the ImageNet validation set. We re-train the models using the PyTorch framework and report the results in the "Re-Implement" columns. The corresponding DCANet variants are presented in the "DCANet" columns. The best results are marked in bold. "-" means no experiment was run, since our DCA module is designed to enhance attention blocks, which do not exist in the base networks.

	Re-Implement					DCANet
	Top1	Top5	FLOPs(G)	Params	Download	Top1	Top5	FLOPs(G)	Params	Download
ResNet50	75.90	92.72	4.12	25.56M	model log	-	-	-	-	-
SE-ResNet50	77.29	93.65	4.13	28.09M	model log	77.55	93.77	4.13	28.65M	model log
SK-ResNet50	77.79	93.76	5.98	37.12M	model log	77.94	93.90	5.98	37.48M	model log
GEθ-ResNet50	76.24	92.98	4.13	25.56M	model log	76.75	93.36	4.13	26.12M	model log
GC-ResNet50	74.90	92.28	4.13	28.11M	model log	75.42	92.47	4.13	28.63M	model log
CBAM-ResNet50	77.28	93.60	4.14	28.09M	model log	77.83	93.72	4.14	30.90M	model log
Mnas1_0	71.72	90.32	0.33	4.38M	model log	-	-	-	-	-
SE-Mnas1_0	69.69	89.12	0.33	4.42M	model log	71.76	90.40	0.33	4.48M	model log
GEθ-Mnas1_0	72.72	90.87	0.33	4.38M	model log	72.82	91.18	0.33	4.48M	model log
CBAM-Mnas1_0	69.13	88.92	0.33	4.42M	model log	71.00	89.78	0.33	4.56M	model log
MobileNetV2	71.03	90.07	0.32	3.50M	model log	-	-	-	-	-
SE-MobileNetV2	72.05	90.58	0.32	3.56M	model log	73.24	91.14	0.32	3.65M	model log
SK-MobileNetV2	74.05	91.85	0.35	5.28M	model log	74.45	91.85	0.36	5.91M	model log
GEθ-MobileNetV2	72.28	90.91	0.32	3.50M	model log	72.47	90.68	0.32	3.59M	model log
CBAM-MobileNetV2	71.91	90.51	0.32	3.57M	model log	73.04	91.18	0.34	3.65M	model log
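For readers unfamiliar with the Top1 and Top5 columns of Table 1, the following framework-free sketch shows how top-k accuracy is usually computed. The function name `topk_accuracy` and the toy scores are made up for illustration and are not taken from this repository or its results.

```python
def topk_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in topk:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy example: 3 samples, 4 classes (illustrative values only).
scores = [
    [0.1, 0.7, 0.1, 0.1],  # highest score: class 1
    [0.4, 0.3, 0.2, 0.1],  # highest score: class 0, runner-up: class 1
    [0.1, 0.2, 0.3, 0.4],  # highest score: class 3
]
labels = [1, 1, 0]

top1 = topk_accuracy(scores, labels, k=1)  # one of three samples is correct
top2 = topk_accuracy(scores, labels, k=2)  # two of three samples are correct
```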

Table 2. Detection performance (%) with different backbones on the MS-COCO validation set. We employ two state-of-the-art detectors, RetinaNet and Cascade R-CNN, in our detection experiments.

Detector	Backbone	AP(50:95)	AP(50)	AP(75)	AP(s)	AP(m)	AP(l)	Download
RetinaNet	ResNet50	36.2	55.9	38.5	19.4	39.8	48.3	model log
RetinaNet	SE-ResNet50	37.4	57.8	39.8	20.6	40.8	50.3	model log
RetinaNet	DCA-SE-ResNet50	37.7	58.2	40.1	20.8	40.9	50.4	model log
Cascade R-CNN	ResNet50	40.6	58.9	44.2	22.4	43.7	54.7	model log
Cascade R-CNN	GC-ResNet50	41.1	59.7	44.6	23.6	44.1	54.3	model log
Cascade R-CNN	DCA-GC-ResNet50	41.4	60.2	44.7	22.8	45.0	54.2	model log
