FxSR

Flexible Style Image Super-Resolution using Conditional Objective

Seung Ho Park, Young Soo Moon, and Nam Ik Cho

Abstract

Recent studies have significantly enhanced the performance of single-image super-resolution (SR) using convolutional neural networks (CNNs). While there can be many high-resolution (HR) solutions for a given input, most existing CNN-based methods do not explore alternative solutions during the inference. A typical approach to obtaining alternative SR results is to train multiple SR models with different loss weightings and exploit the combination of these models. Instead of using multiple models, we present a more efficient method to train a single adjustable SR model on various combinations of losses by taking advantage of multi-task learning. Specifically, we optimize an SR model with a conditional objective during training, where the objective is a weighted sum of multiple perceptual losses at different feature levels. The weights vary according to given conditions, and the set of weights is defined as a style controller. Also, we present an architecture appropriate for this training scheme, which is the Residual-in-Residual Dense Block equipped with spatial feature transformation layers. At the inference phase, our trained model can generate locally different outputs conditioned on the style control map. Extensive experiments show that the proposed SR model produces various desirable reconstructions without artifacts and yields comparable quantitative performance to state-of-the-art SR methods.

Usage:

Environments

PyTorch 1.10.0
CUDA 11.5
Python 3.8

Quick usage on your data:

You can choose any value in the range [0, 1] for t.

python test.py -opt options/test/test_FxSR_PD_4x.yml -t 0.8
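To compare several styles on the same inputs, the command above can be repeated with different values of t. The following is a minimal sketch that only builds and prints the command lines, assuming the script name, option file, and `-t` flag are exactly as shown above. The helper name `build_sweep_commands` and the chosen t values are illustrative and not part of the repository.

```python
import shlex


def build_sweep_commands(t_values, opt_path="options/test/test_FxSR_PD_4x.yml"):
    """Return one argv list per t value (hypothetical helper, not in the repo)."""
    commands = []
    for t in t_values:
        # The usage note above restricts t to the range [0, 1].
        if not 0.0 <= t <= 1.0:
            raise ValueError(f"t must lie in [0, 1], got {t}")
        commands.append(["python", "test.py", "-opt", opt_path, "-t", str(t)])
    return commands


if __name__ == "__main__":
    for argv in build_sweep_commands([0.0, 0.25, 0.5, 0.75, 1.0]):
        # Printed only; pass argv to subprocess.run(argv, check=True) to execute.
        print(shlex.join(argv))
```

Printing rather than executing keeps the sketch safe to run outside the repository; where each run writes its results depends on the option file.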

Test models

Download the pretrained FxSR-PD 4x model from OneDrive Link
Download the pretrained FxSR-PD 8x model from OneDrive Link
Download the pretrained FxSR-DS 4x model from OneDrive Link
Download the pretrained FxSR-DS 8x model from OneDrive Link

Brief Description of Our Proposed Method

TARGETED PERCEPTUAL LOSS

The effect of choosing different layers when estimating perceptual losses on different regions, e.g., on edge and texture regions, where the losses correspond to MSE, ReLU 2-2 (VGG22), and ReLU 4-4 (VGG44) of the VGG-19 network.

PROPOSED SR WITH FLEXIBLE STYLE

The proposed flexible SR model is optimized with a conditional objective, which is a weighted sum of several perceptual losses corresponding to different feature levels, where each weight changes depending on the style map.
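The idea of a conditional objective can be illustrated with a small sketch: each loss term gets a weight that is a function of the condition t, and the objective is the weighted sum. The two linear weight schedules below are placeholders chosen for readability, not the schedules used to train the released models, and `conditional_objective` is a hypothetical name.

```python
def conditional_objective(losses, weight_fns, t):
    """Weighted sum of loss values, with each weight computed from t.

    losses     : already-computed loss values, one per feature level
    weight_fns : one function per loss, mapping t in [0, 1] to a weight
    """
    if len(losses) != len(weight_fns):
        raise ValueError("need exactly one weight function per loss")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return sum(w(t) * loss for w, loss in zip(weight_fns, losses))


# Assumed schedules: shift emphasis linearly from the first loss to the second.
example_weight_fns = [lambda t: 1.0 - t, lambda t: t]

if __name__ == "__main__":
    low_level_loss, high_level_loss = 2.0, 4.0  # made-up loss values
    for t in (0.0, 0.5, 1.0):
        value = conditional_objective(
            [low_level_loss, high_level_loss], example_weight_fns, t
        )
        print(t, value)
```

During training, t would be sampled per image or per pixel (as a style map), so a single model sees many weightings of the same losses.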

PROPOSED NETWORK ARCHITECTURE

The architecture of our proposed flexible SR network. We use the RRDB equipped with SFT as a basic block. The condition branch takes a style map for reconstruction style as input. This map is used to control the recovery styles of edges and textures for each region through SFT layers.
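A spatial feature transformation applies a per-position scale and shift, both derived from the condition, to a feature map. The sketch below shows that modulation on plain nested lists. The function `toy_condition_branch` is a hypothetical stand-in for the learned condition branch, and its linear mapping and constants are assumptions for illustration only.

```python
def sft_modulate(features, scale, shift):
    """Element-wise affine modulation: out = features * scale + shift."""
    return [
        [f * g + b for f, g, b in zip(f_row, g_row, b_row)]
        for f_row, g_row, b_row in zip(features, scale, shift)
    ]


def toy_condition_branch(style_map, gain=0.5, bias=0.25):
    """Map a style map to (scale, shift); a stand-in for learned layers."""
    scale = [[1.0 + gain * t for t in row] for row in style_map]
    shift = [[bias * t for t in row] for row in style_map]
    return scale, shift


if __name__ == "__main__":
    style_map = [[0.0, 1.0]]   # left pixel t = 0, right pixel t = 1
    features = [[2.0, 2.0]]    # identical features at both positions
    scale, shift = toy_condition_branch(style_map)
    # Same input features, different outputs because t differs per position.
    print(sft_modulate(features, scale, shift))
```

Because the scale and shift vary per position, two regions with identical features can be reconstructed differently when their style values differ.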

The proposed Basic Block (RRDB equipped with SFT layer)

Experimental Results

Visual Evaluation of Flexible SR for Perception-Distortion (FxSR-PD)

Changes in the result of FxSR-PD 4x SR according to t on the DIV2K validation set.

Visual comparison with state-of-the-art perception-driven SR methods on the DIV2K validation set.

Quantitative Evaluation of Flexible SR for Perception-Distortion (FxSR-PD)

Visual Evaluation of Flexible SR for Diverse Styles (FxSR-DS)

Changes in the result of FxSR-DS 4x SR according to t on the DIV2K validation set.

Per-pixel Style Control

Examples of local reconstruction style control.
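Local control amounts to giving different regions of the style map different t values. A minimal sketch of building such a map follows, assuming a rectangular region of interest. The helper `make_style_map` is hypothetical, and how a map is passed to the test script is not covered here.

```python
def make_style_map(height, width, base_t, regions=()):
    """Build a height x width map of t values.

    regions: iterable of (top, left, bottom, right, t); bottom and right
    are exclusive, and later regions overwrite earlier ones.
    """
    if not 0.0 <= base_t <= 1.0:
        raise ValueError("base_t must lie in [0, 1]")
    style_map = [[base_t] * width for _ in range(height)]
    for top, left, bottom, right, t in regions:
        if not 0.0 <= t <= 1.0:
            raise ValueError("region t must lie in [0, 1]")
        # Clip the rectangle to the map so out-of-range boxes are tolerated.
        for y in range(max(0, top), min(height, bottom)):
            for x in range(max(0, left), min(width, right)):
                style_map[y][x] = t
    return style_map


if __name__ == "__main__":
    # Background at t = 0.2, one 2x2 region at t = 0.9.
    for row in make_style_map(3, 4, 0.2, [(1, 1, 3, 3, 0.9)]):
        print(row)
```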

Comparison of the SR results of the conventional method and the FxSR-PD method.

The conventional method

The FxSR-PD method

Depth-adaptive FxSR

The T-map is a modified version of the depth map of an image from the Make3D dataset.

An example of applying a user-created depth map to enhance the sense of perspective: the foreground becomes sharper and more richly textured, while the background shows less camera noise than the ground truth.
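One plausible way to turn a depth map into a T-map is to normalize depth and interpolate between a near value and a far value of t. The sketch below does this under two assumptions that are not stated in this README: that larger t yields sharper, more textured output, and that a linear mapping is adequate. The exact modification applied to the Make3D depth maps is not specified here, and `depth_to_t_map` is a hypothetical name.

```python
def depth_to_t_map(depth, t_near=1.0, t_far=0.0):
    """Linearly map depth values to t: nearest -> t_near, farthest -> t_far."""
    d_min = min(min(row) for row in depth)
    d_max = max(max(row) for row in depth)
    span = d_max - d_min
    if span == 0:
        # Flat depth map: no near/far distinction, so use t_near everywhere.
        return [[t_near for _ in row] for row in depth]
    return [
        [t_near + (t_far - t_near) * (d - d_min) / span for d in row]
        for row in depth
    ]


if __name__ == "__main__":
    depth = [[1.0, 3.0], [5.0, 3.0]]  # made-up depth values
    for row in depth_to_t_map(depth):
        print(row)
```

Swapping `t_near` and `t_far` reverses the effect, which makes it easy to check which direction suits a given model.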

Focusing objects

Examples of naturally focusing on foreground objects without artifacts. (Experiments with FxSR-PD 4x on the DIV8K validation dataset)

(red circle: over-enhanced and unnatural areas)

Ablation Study

Convergence of the diversity curve of the proposed FxSR-PD model as the number of training iterations increases.

NTIRE 2021 Learning the Super-Resolution Space Challenge Link

We participated in the NTIRE 2021 challenge under the team name SSS. FxSR-DS ranked first in LPIPS for both 4x and 8x, 8th in diversity score, and 3rd in MOR (Mean Opinion Rank) Link.

Citation

@ARTICLE{9684919,
  author={Park, Seung Ho and Moon, Young Su and Cho, Nam Ik},
  journal={IEEE Access}, 
  title={Flexible Style Image Super-Resolution Using Conditional Objective}, 
  year={2022},
  volume={10},
  number={},
  pages={9774-9792},
  doi={10.1109/ACCESS.2022.3144406}}

Acknowledgement

Our work and implementations are inspired by and based on BasicSR [site]
