clovaai / rexnet

Official PyTorch implementation of ReXNet (Rank eXpansion Network) with pretrained models

Comparison with RegNet

chris-ha458 opened this issue

Edit: (I separated this question from a previous issue, #3.)

This paper considers network design spaces in a way similar to the approach taken in the recent RegNet paper (Designing Network Design Spaces). Are the principles from that paper congruent with yours?

The discussion of design methodology in this paper is reminiscent of the earlier RegNet paper from FAIR (Designing Network Design Spaces); how does your approach compare with that paper's?

Thank you again for your interest in our work. Let me first briefly recap RegNet's main points.

As you know, in the RegNet paper the authors propose a method that progressively refines design spaces: each step shrinks the current design space while preserving or improving the sampled networks' error EDF (empirical distribution function), which reflects the models' performance. Starting from a ResNet-style modularized network (a stem-body-head architecture built from bottleneck blocks with shortcuts and consisting of 4 stages), they validated each new design space against the previous one by sampling 500 models per space and training each for 10 epochs. After useful design principles were obtained through these progressive refinement steps, RegNetX (and RegNetY) emerged at the final step and showed prominent performance.
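
For intuition, here is a minimal sketch of that refinement loop (not taken from the RegNet codebase); `sample_config`, `build_model`, and `quick_train_eval` are hypothetical placeholders for drawing design parameters, instantiating the stem-body-head network, and the short 10-epoch training run:

```python
# Hypothetical sketch of RegNet-style design-space refinement.
# sample_config / build_model / quick_train_eval are placeholders,
# not real functions from the RegNet codebase.
import numpy as np

def error_edf(errors, thresholds):
    """Empirical distribution function: fraction of sampled models
    whose error falls below each threshold."""
    errors = np.asarray(errors)
    return np.array([(errors < t).mean() for t in thresholds])

def evaluate_design_space(space, n_models=500, epochs=10):
    errors = []
    for _ in range(n_models):
        cfg = space.sample_config()      # draw design parameters
        model = build_model(cfg)         # ResNet-style stem-body-head net
        errors.append(quick_train_eval(model, epochs=epochs))
    return errors

# A refined (shrunken) design space is accepted when its error EDF
# dominates the previous space's EDF, i.e., more mass at low error.
```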

Notice that, from the first step to the last, all the design principles (represented by design parameters) are progressively tested and established at the stage level (i.e., they concern the design of each stage of a network) through experiments with bootstrapping. To sum up, the proposed design principles were found empirically from specific design spaces and applied to each stage of a network, refining the network at the stage level.

Our ReXNet has a totally different goal from RegNet's: simply to reduce the representational bottleneck at each layer of a network, inspired by the softmax bottleneck in language modeling. To this end, we started with a theoretical investigation of the matrix rank of a layer's output feature and of how to expand it. Based on this, we provided a remedy that cures the representational bottleneck problem at the layer level: expand the input channel size and follow it with a proper nonlinearity. We then provided a design guide for an entire network that makes a model suffer less from the bottleneck problem, by increasing the number of expand layers so that individual layers avoid representational bottlenecks.
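
For concreteness, a minimal sketch of such an expand layer (not the exact block from this repository; the 1x1 convolution and the expansion ratio of 6 are assumed example choices) might look like:

```python
import torch.nn as nn

def expand_layer(in_ch: int, expand_ratio: int = 6) -> nn.Sequential:
    """Expand the input channel size, then apply a proper nonlinearity,
    so the layer's output feature can retain a higher rank."""
    out_ch = in_ch * expand_ratio
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),  # channel expansion
        nn.BatchNorm2d(out_ch),
        nn.SiLU(inplace=True),  # smoother than ReLU6; loses less rank
    )
```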

Our preliminary experiments with a set of randomly generated networks provide empirical evidence of how the rank expands, supporting the theory and showing the impact of the number of expand layers. Additionally, as a toy experiment, we showed the relationship between trained networks' rank and accuracy, which supports the validity of the design principle. Finally, our proposed ReXNet_V1-1.0x is a model that shows how promising these design principles are when reflected in an architecture; better architectures that follow our design principles may well exist.
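
For reference, the rank of a layer's output feature over a batch could be estimated along these lines; this is a sketch under assumed details (the hook-based capture and the soft-rank threshold `eps` are illustrative choices, not the paper's exact protocol):

```python
import torch

@torch.no_grad()
def feature_rank(model, layer, inputs, eps=1e-3):
    """Estimate the (normalized) rank of `layer`'s output over a batch:
    count singular values above eps * the largest one."""
    feats = {}

    def hook(module, inp, out):
        feats["f"] = out.detach()

    handle = layer.register_forward_hook(hook)
    model(inputs)
    handle.remove()

    f = feats["f"].flatten(1)            # (batch, channels * H * W)
    s = torch.linalg.svdvals(f)          # singular values of the feature matrix
    rank = int((s > eps * s.max()).sum())
    return rank / min(f.shape)           # normalized rank in (0, 1]
```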

We believe that our design principles, combined with the design-space refinement method from the RegNet paper, would harmonize well to produce an even better network architecture. Finally, it was a pleasure to read such a good paper as RegNet and to leave this comment about it.