Loss Functions: Paper vs Code
Ceralu opened this issue
I've noticed a discrepancy between the loss function described in the paper and the loss functions implemented in the code.
For the supervised loss, I understand from the code that minimizing loss_lab drives the softmax probability of the correct label, i.e. T.exp(correct logit) / T.sum(T.exp(output_before_softmax_lab)), towards 1, so that D(x_lab) assigns the correct label with probability 1.
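For concreteness, here is my NumPy translation of that term (variable names and shapes are my own assumptions, not the repo's code): loss_lab as written in the repo is exactly the standard softmax cross-entropy.

```python
import numpy as np

def log_sum_exp(x, axis=1):
    # numerically stable logsumexp, mirroring nn.log_sum_exp in the repo
    m = x.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))).squeeze(axis)

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))      # stand-in for output_before_softmax_lab
labels = rng.integers(0, 10, size=4)

# loss_lab as in the code: -mean(correct logit) + mean(logsumexp(logits))
loss_lab = -np.mean(logits[np.arange(4), labels]) + np.mean(log_sum_exp(logits))

# standard softmax cross-entropy: -mean(log softmax(logits)[label])
log_softmax = logits - log_sum_exp(logits)[:, None]
xent = -np.mean(log_softmax[np.arange(4), labels])

assert np.allclose(loss_lab, xent)
```

So minimizing loss_lab pushes the softmax probability of the correct label towards 1, which is what I meant above.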
However, what I don't understand is the expression for loss_unl. How is it equivalent to the loss L_unsupervised in the paper, which aims to make the discriminator predict class K+1 when the data is fake, and any class other than K+1 when the data is unlabelled?
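To make my question concrete, I tried translating both expressions into NumPy and comparing them numerically (identifiers are my own stand-ins; the repo additionally scales each term of loss_unl by 0.5, which I've dropped here since it doesn't change the minimizer). If I use the paper's parameterization D(x) = Z(x) / (Z(x) + 1) with Z(x) = sum_k exp(l_k(x)) and the K+1 logit fixed at 0, the two seem to agree, but I'd like to confirm this is the intended reading:

```python
import numpy as np

def log_sum_exp(x, axis=1):
    # numerically stable logsumexp, mirroring nn.log_sum_exp in the repo
    m = x.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def softplus(x):
    # log(1 + e^x), computed stably; stand-in for T.nnet.softplus
    return np.logaddexp(0.0, x)

rng = np.random.default_rng(1)
logits_unl = rng.normal(size=(4, 10))   # stand-in: logits on real unlabelled data
logits_gen = rng.normal(size=(4, 10))   # stand-in: logits on generated data

l_unl = log_sum_exp(logits_unl)   # log Z(x) on unlabelled data
l_gen = log_sum_exp(logits_gen)   # log Z(x) on generated data

# loss_unl as in the code (without the repo's 0.5 scaling)
loss_unl = -np.mean(l_unl) + np.mean(softplus(l_unl)) + np.mean(softplus(l_gen))

# paper: with the K+1 logit fixed at 0, D(x) = Z(x) / (Z(x) + 1)
D_unl = np.exp(l_unl) / (np.exp(l_unl) + 1.0)
D_gen = np.exp(l_gen) / (np.exp(l_gen) + 1.0)

# L_unsupervised = -E[log D(x)] - E[log(1 - D(G(z)))]
L_unsup = -np.mean(np.log(D_unl)) - np.mean(np.log(1.0 - D_gen))

assert np.allclose(loss_unl, L_unsup)
```

The algebra behind the check: log D(x) = log Z - log(Z + 1) = l - softplus(l), and log(1 - D(G(z))) = -softplus(l_gen), which recovers the three terms of loss_unl. Is that the intended correspondence?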
Edit: I accidentally clicked submit before I had finished writing the issue.
Edit: This is similar to issue #14, which never received an answer.
Could anyone answer this?