mafda / generative_adversarial_networks_101

Keras implementations of Generative Adversarial Networks. GANs, DCGAN, CGAN, CCGAN, WGAN and LSGAN models with MNIST and CIFAR-10 datasets.


LSGAN loss function

NANBFTOP5 opened this issue · comments


Hi there,

I have a question related to your awesome work.
Based on the LSGAN paper, the loss functions might look like this in a TensorFlow version:

D_loss = 0.5 * (tf.reduce_mean((D_real - 1)**2) + tf.reduce_mean(D_fake**2))
G_loss = 0.5 * tf.reduce_mean((D_fake - 1)**2)

I checked your loss function, which is MSE. Where do you define the a, b, c from the LSGAN paper?
d_g.compile(optimizer=optimizer, loss='mse', metrics=['binary_accuracy'])

Thank you,

Hi,

Thank you for your feedback.

You are right to raise this point.

However, from my understanding, the implementation I made is equivalent to the one proposed in the paper.

The MSE loss function is defined as,

mse = tf.reduce_mean(tf.square(y_pred - y))

In my code, I define the variables real and fake as,

real = np.ones(shape=(batch_size, 1))
fake = np.zeros(shape=(batch_size, 1))

Then, the variables real and fake are equivalent to the paper's variables b and a, respectively, using the 0-1 binary coding scheme.
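
For reference, these are the objectives from the LSGAN paper, written with a as the label for fake data, b as the label for real data, and c as the value the generator wants the discriminator to output for fake data:

\min_D V(D) = \frac{1}{2}\,\mathbb{E}_{x \sim p_{\text{data}}(x)}\big[(D(x) - b)^2\big] + \frac{1}{2}\,\mathbb{E}_{z \sim p_z(z)}\big[(D(G(z)) - a)^2\big]

\min_G V(G) = \frac{1}{2}\,\mathbb{E}_{z \sim p_z(z)}\big[(D(G(z)) - c)^2\big]

With the 0-1 binary coding scheme, a = 0 and b = c = 1, so real plays the role of b (and of c when training the combined model) and fake plays the role of a.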

The discriminator is trained in two steps,

# Real samples
X_batch = X_train[i*batch_size:(i+1)*batch_size]
d_loss_real = discriminator.train_on_batch(x=X_batch, y=real)

# Fake Samples
z = np.random.normal(loc=0, scale=1, size=(batch_size, latent_dim))
X_fake = generator.predict_on_batch(z)
        
d_loss_fake = discriminator.train_on_batch(x=X_fake, y=fake)
         
# Discriminator loss
d_loss_batch = 0.5 * (d_loss_real[0] + d_loss_fake[0])

This would be equivalent to,

d_loss_real = tf.reduce_mean(tf.square(discriminator(X_real) - real))
d_loss_fake = tf.reduce_mean(tf.square(discriminator(X_fake) - fake))
d_loss_batch = 0.5 * (d_loss_real + d_loss_fake)

As far as I understand, since the real and fake batches have the same size, the two implementations are equivalent.
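
As a quick sanity check (a minimal NumPy sketch, not code from the repository; d_real and d_fake stand in for hypothetical discriminator outputs), averaging the two per-batch MSE values gives the same number as the MSE over the concatenated batch whenever both halves have the same size:

import numpy as np

rng = np.random.default_rng(0)
batch_size = 32

# stand-ins for discriminator outputs on real and fake samples
d_real = rng.uniform(size=(batch_size, 1))
d_fake = rng.uniform(size=(batch_size, 1))

real = np.ones(shape=(batch_size, 1))   # plays the role of b
fake = np.zeros(shape=(batch_size, 1))  # plays the role of a

# two-step version: average of the per-batch MSE values
mse_real = np.mean(np.square(d_real - real))
mse_fake = np.mean(np.square(d_fake - fake))
two_step = 0.5 * (mse_real + mse_fake)

# single-batch version: MSE over the concatenated predictions and targets
single = np.mean(np.square(np.concatenate((d_real, d_fake)) - np.concatenate((real, fake))))

print(np.isclose(two_step, single))  # True: equal halves make the two forms identical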

To make the code easier to read and more consistent with the paper, I propose the following modifications:

@@ -5,6 +5,7 @@
-real = np.ones(shape=(batch_size, 1))
-fake = np.zeros(shape=(batch_size, 1))
+a = np.zeros(shape=(batch_size//2, 1))
+b = np.ones(shape=(batch_size//2, 1))
+c = np.ones(shape=(batch_size, 1))
@@ -13,20 +14,21 @@
         # Real samples
-        X_batch = X_train[i*batch_size:(i+1)*batch_size]
-        d_loss_real = discriminator.train_on_batch(x=X_batch, y=real * (1 - smooth))
+        X_real = X_train[i*batch_size//2:(i+1)*batch_size//2]
         
         # Fake Samples
-        z = np.random.normal(loc=0, scale=1, size=(batch_size, latent_dim))
+        z = np.random.normal(loc=0, scale=1, size=(batch_size//2, latent_dim))
         X_fake = generator.predict_on_batch(z)
-        d_loss_fake = discriminator.train_on_batch(x=X_fake, y=fake)

         # Discriminator loss
-        d_loss_batch = 0.5 * (d_loss_real[0] + d_loss_fake[0])
+        d_loss_batch = discriminator.train_on_batch(
+            x=np.concatenate((X_fake, X_real), axis=0),
+            y=np.concatenate((a, b), axis=0)
+        )
@@ -32,32 +33,34 @@        
-        d_g_loss_batch = d_g.train_on_batch(x=z, y=real)
+        z = np.random.normal(loc=0, scale=1, size=(batch_size, latent_dim))
+        d_g_loss_batch = d_g.train_on_batch(x = z, y = c)
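
Putting the hunks together, one training step of the proposed version would look roughly like this (a sketch only; X_train, generator, discriminator, d_g, batch_size, latent_dim, and the batch index i are the names already used in the notebooks):

# Paper notation: a = label for fake data, b = label for real data,
# c = value the generator wants the discriminator to output on fake data
a = np.zeros(shape=(batch_size // 2, 1))
b = np.ones(shape=(batch_size // 2, 1))
c = np.ones(shape=(batch_size, 1))

# Real samples (half a batch)
X_real = X_train[i * batch_size // 2:(i + 1) * batch_size // 2]

# Fake samples (the other half)
z = np.random.normal(loc=0, scale=1, size=(batch_size // 2, latent_dim))
X_fake = generator.predict_on_batch(z)

# Discriminator step on a single mixed batch
# (train_on_batch returns [loss, binary_accuracy] because of the compiled metric)
d_loss_batch = discriminator.train_on_batch(
    x=np.concatenate((X_fake, X_real), axis=0),
    y=np.concatenate((a, b), axis=0)
)

# Generator step through the combined model, targeting c
z = np.random.normal(loc=0, scale=1, size=(batch_size, latent_dim))
d_g_loss_batch = d_g.train_on_batch(x=z, y=c)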

I think this implementation is easier to compare with the paper.

What do you think? Do you think I should include those modifications in the repository?

Again, thank you very much for your feedback. I will be waiting for your reply.