A question about the framework
yzcv opened this issue
Hi, @JingyunLiang
I appreciate your fabulous work, but I have a question about the framework. Did you ever try a UNet-like or encoder-decoder framework for the Deep Feature Extraction Block (the whole transformer block)? Since your framework stacks identical RSTB blocks, I am wondering whether the encoder-decoder idea would help improve performance.
Thank you very much.
We mainly focused on the SR problem at the beginning, which is why we chose this design. A UNet-like framework should also work. You can find similar ideas here and here.
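To make the two layouts being compared concrete, here is a minimal sketch that only traces feature-map shapes through each design. The function names, the 64x64 input size, the channel count, and the "halve resolution, double channels" rule are illustrative assumptions, not details taken from this repository.

```python
def constant_resolution_shapes(height, width, channels, num_blocks):
    """Shapes after each block in a stack of identical blocks (no resampling)."""
    return [(height, width, channels) for _ in range(num_blocks)]


def unet_like_shapes(height, width, channels, depth):
    """Shapes through a hypothetical encoder-decoder with `depth` 2x steps.

    Assumption: each encoder step halves H and W and doubles the channels;
    each decoder step reverses that.
    """
    h, w, c = height, width, channels
    shapes = [(h, w, c)]
    for _ in range(depth):  # encoder: downsample
        h, w, c = h // 2, w // 2, c * 2
        shapes.append((h, w, c))
    for _ in range(depth):  # decoder: upsample
        h, w, c = h * 2, w * 2, c // 2
        shapes.append((h, w, c))
    return shapes


flat = constant_resolution_shapes(64, 64, 64, num_blocks=6)
unet = unet_like_shapes(64, 64, 64, depth=2)
print(flat)  # every entry is (64, 64, 64)
print(unet)  # resolution shrinks to 16x16 in the middle, then recovers
```

In the stacked design every block sees the full spatial grid of the input, while the encoder-decoder design passes through a coarser grid in the middle.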
So do you mean that your framework is more suitable for the SR task than a UNet-like or autoencoder-like transformer-based framework?
Thanks.
Possibly yes, based on my intuition, but we would need experiments to confirm it.
Thanks very much for the patient reply.
Hi,
I now understand why this framework works well. Tasks such as segmentation and detection need to extract high-level semantic information, so a UNet or autoencoder can help there. SR, by contrast, aims to use all of the information in the LR image, and we do not want to lose any low-level detail. As such, your framework is intuitively better suited to SR.
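The intuition about losing low-level information can be illustrated with a toy example. The sketch below is an assumption-laden stand-in, not anything from the SwinIR code: it uses fixed 2x average pooling and nearest-neighbour upsampling in place of learned layers, and it models only the bottleneck path. A real UNet has learned resampling and skip connections that can carry fine detail around the bottleneck.

```python
def avg_pool_2x(image):
    """Downsample by averaging each non-overlapping 2x2 patch."""
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, len(image[0]), 2)
        ]
        for r in range(0, len(image), 2)
    ]


def upsample_nearest_2x(image):
    """Upsample by repeating every pixel into a 2x2 patch."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def max_abs_error(a, b):
    """Largest per-pixel absolute difference between two equal-sized images."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


# A 4x4 checkerboard: the highest-frequency pattern the grid can hold.
checkerboard = [[float((r + c) % 2) for c in range(4)] for r in range(4)]

# Bottleneck path: downsample, then upsample back to the original size.
through_bottleneck = upsample_nearest_2x(avg_pool_2x(checkerboard))

print(max_abs_error(checkerboard, through_bottleneck))  # 0.5: the pattern is flattened
print(max_abs_error(checkerboard, checkerboard))        # 0.0: full-resolution path keeps it
```

Every 2x2 patch of the checkerboard averages to 0.5, so the fine pattern cannot be recovered after the bottleneck, whereas a path that stays at full resolution never discards it.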