choasma / HSIC-bottleneck

The HSIC Bottleneck: Deep Learning without Back-Propagation

Home Page: https://arxiv.org/abs/1908.01580


Clarification about Format Training

Yocodeyo opened this issue

Hi Ma, may I clarify how you did the format training? From my understanding, you take the one-hot-like output from HSIC-bottleneck training and pass it into a simple layer to do the classification. However, based on your code, it seems that you train the HSIC model first, load its weights as the initial weights, then combine the HSIC model and the vanilla model and train them together as a whole? Thank you very much!

That's correct. We intentionally separated the training into two stages: unformatted training (or HSIC training), and formatted training on the vanilla model. The goal is for the unformatted training to retain sufficient information about the training target while forgetting the input, so that the optimized representation lets the formatted training (a simple classifier) do its best job. The HSIC-solve you asked about before is a very special case that can solve classification directly, since the HSIC model's output dimension is 10 and each output unit shows an activation peak for one class (up to a permutation), which can be read off visually.
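For concreteness, here is a minimal PyTorch sketch of the per-layer HSIC-bottleneck objective described above (small HSIC with the input, large HSIC with the one-hot target). The function names, the kernel bandwidth `sigma`, and the balance weight `beta` are illustrative placeholders, not taken from the repository's code.

```python
import torch

def gaussian_kernel(x, sigma=5.0):
    # RBF kernel matrix built from pairwise squared distances.
    x = x.flatten(start_dim=1)
    d2 = torch.cdist(x, x) ** 2
    return torch.exp(-d2 / (2 * sigma ** 2))

def hsic(kx, ky):
    # Biased empirical HSIC estimator: tr(Kx H Ky H) / (m - 1)^2.
    m = kx.size(0)
    h = torch.eye(m, device=kx.device) - torch.ones(m, m, device=kx.device) / m
    return torch.trace(kx @ h @ ky @ h) / (m - 1) ** 2

def hsic_bottleneck_loss(x, y_onehot, hidden_acts, beta=100.0):
    # Sum over hidden activations Z_i of HSIC(X, Z_i) - beta * HSIC(Y, Z_i):
    # forget the input, remember the target.
    kx, ky = gaussian_kernel(x), gaussian_kernel(y_onehot)
    loss = x.new_zeros(())
    for z in hidden_acts:
        kz = gaussian_kernel(z)
        loss = loss + hsic(kx, kz) - beta * hsic(ky, kz)
    return loss
```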

Back to the subject: we first do unformatted training on the HSIC model. Then we load and fix those weights and use the output of the HSIC model for the vanilla (formatted) training. So neither stage requires back-propagation through the full network during training.
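A rough sketch of the two stages, reusing the `gaussian_kernel` and `hsic` helpers from the sketch above. The layer sizes, optimizers, and the 100.0 weight are placeholders; the repository's actual scripts combine the HSIC model and the vanilla model into one module after loading the weights, as noted in the question.

```python
import torch
import torch.nn as nn

# Three-layer HSIC network with a 10-dimensional output (placeholder sizes).
hsic_net = nn.ModuleList([
    nn.Sequential(nn.Linear(784, 256), nn.ReLU()),
    nn.Sequential(nn.Linear(256, 256), nn.ReLU()),
    nn.Sequential(nn.Linear(256, 10), nn.ReLU()),
])

# Stage 1: unformatted (HSIC) training. Each layer has its own optimizer and
# its own local HSIC objective, so no gradient crosses layer boundaries.
opts = [torch.optim.Adam(layer.parameters(), lr=1e-3) for layer in hsic_net]

def unformatted_step(x, y_onehot):
    kx, ky = gaussian_kernel(x), gaussian_kernel(y_onehot)
    z = x
    for layer, opt in zip(hsic_net, opts):
        z = layer(z.detach())   # detach blocks back-propagation into earlier layers
        kz = gaussian_kernel(z)
        loss = hsic(kx, kz) - 100.0 * hsic(ky, kz)
        opt.zero_grad()
        loss.backward()
        opt.step()

# Stage 2: formatted training. Freeze the HSIC network and fit a single
# output layer on its (already 10-dimensional) last-layer representation.
for p in hsic_net.parameters():
    p.requires_grad_(False)

classifier = nn.Linear(10, 10)
clf_opt = torch.optim.SGD(classifier.parameters(), lr=0.1)
ce = nn.CrossEntropyLoss()

def formatted_step(x, y):
    with torch.no_grad():       # no gradients flow into the frozen HSIC model
        z = x
        for layer in hsic_net:
            z = layer(z)
    loss = ce(classifier(z), y)
    clf_opt.zero_grad()
    loss.backward()
    clf_opt.step()
```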

Okay, thanks for the clear explanation! :)

No worries. Always let me know if you have any questions about our project. Good luck