Question regarding architecture

GRIGORR opened this issue 5 years ago

Hi. A small question regarding the architecture. Do I understand correctly that at each iteration the candidate networks are trained separately, then each of them is added to AdaNet, the weighting of the logits is learned, and then the best one is chosen? Thanks in advance.

dawg · Answer 1 · Wed Feb 26 2020 23:19:58 GMT+0800 (China Standard Time)

I understand it that way too.

Charles Weill · Answer 2 · Wed Apr 08 2020 00:43:20 GMT+0800 (China Standard Time)

@GRIGORR That is correct: all the candidate subnetworks (and their associated ensemble) are trained in parallel in the same TensorFlow graph. At the end of each iteration, the best subnetwork is chosen based on its performance within the ensemble.

dawg · Answer 3 · Wed Apr 08 2020 00:55:24 GMT+0800 (China Standard Time)

> @GRIGORR That is correct: all the candidate subnetworks (and their associated ensemble) are trained in parallel in the same TensorFlow graph. At the end of each iteration, the best subnetwork is chosen based on its performance within the ensemble.

From what I gather from the 0.8.0 docs, it sounds to me like each AdaNet iteration actually selects one complete ensemble and discards the others. Could it be said that each ensemble in the candidate ensemble set differs from all the other candidate ensembles in the subnetwork that was added to it in the current iteration?
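
To make the iteration discussed in this thread concrete, here is a toy sketch in plain Python. It is not AdaNet's implementation and does not use the adanet API. Everything in it is an assumption for illustration: the names `CANDIDATES`, `fit_mixture_weights`, and `adanet_like_iteration` are hypothetical, the "subnetworks" are fixed functions rather than trained networks, the mixture weights are one scalar per member, and selection uses plain mean squared error with no complexity regularization. It only shows the shape of the idea: each candidate ensemble is the previous ensemble plus one new subnetwork, mixture weights are learned per candidate ensemble, and the best ensemble is kept.

```python
import random


def make_data(n=200, seed=0):
    """Toy regression data: y = 2x + x^2 on x in [-1, 1]."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + x * x for x in xs]
    return xs, ys


# Hypothetical stand-ins for candidate subnetworks (fixed, not trained).
CANDIDATES = {
    "linear": lambda x: x,
    "quadratic": lambda x: x * x,
    "constant": lambda x: 1.0,
}


def fit_mixture_weights(members, xs, ys, steps=500, lr=0.1):
    """Learn one scalar weight per member by gradient descent on MSE."""
    n = len(xs)
    outs = [[m(x) for x in xs] for m in members]
    w = [0.0] * len(members)

    def predict(i):
        return sum(w[j] * outs[j][i] for j in range(len(members)))

    for _ in range(steps):
        errs = [predict(i) - ys[i] for i in range(n)]
        grads = [
            sum(2.0 * errs[i] * outs[j][i] for i in range(n)) / n
            for j in range(len(members))
        ]
        for j in range(len(members)):
            w[j] -= lr * grads[j]

    loss = sum((predict(i) - ys[i]) ** 2 for i in range(n)) / n
    return w, loss


def adanet_like_iteration(previous_members, xs, ys):
    """Build one candidate ensemble per subnetwork and keep the best one."""
    results = {}
    for name, subnetwork in CANDIDATES.items():
        # Candidate ensembles differ only in the newly added subnetwork.
        members = previous_members + [subnetwork]
        results[name] = fit_mixture_weights(members, xs, ys)
    best = min(results, key=lambda k: results[k][1])
    return best, results


if __name__ == "__main__":
    xs, ys = make_data()
    ensemble = []
    for iteration in range(2):
        best, results = adanet_like_iteration(ensemble, xs, ys)
        ensemble.append(CANDIDATES[best])
        print(iteration, best, results[best][1])
```

In this sketch the losing candidate ensembles are simply dropped after each iteration, and the winner becomes the starting point for the next one, which matches the "select a complete ensemble and discard the others" reading above.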