ibab / tensorflow-wavenet

A TensorFlow implementation of DeepMind's WaveNet paper

Weights in one layer do not change

joe-antognini opened this issue

This implementation creates an extra set of weights in the last residual block that never get trained, because their output never contributes to the model's prediction. (These are the weights of the 1x1 convolution that would serve as the input to the next residual block, if there were one.)
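For context, here is a minimal sketch of the wiring (not the repo's actual code; residual_block and the layer shapes are simplified assumptions). Each block emits a "dense" 1x1 output that feeds the next block and a "skip" 1x1 output that feeds the postprocessing stack. Every block except the last gets gradients through both paths; the last block's dense output is simply dropped:

import tensorflow as tf

def residual_block(x, dilation, channels=4):
    # Dilated convolution, then two 1x1 convolutions: a "dense" output
    # for the residual path and a "skip" output for the output stack.
    conv = tf.layers.conv1d(x, channels, 2, dilation_rate=dilation,
                            padding='same', activation=tf.tanh)
    dense = tf.layers.conv1d(conv, channels, 1)
    skip = tf.layers.conv1d(conv, channels, 1)
    return skip, x + dense

x = tf.placeholder(tf.float32, [2, 16000, 4])
current = x
skips = []
for dilation in [1, 2, 4]:
    skip, current = residual_block(current, dilation)
    skips.append(skip)

# `current` from the final block is dropped here: only the skip path
# reaches the prediction, so the final block's dense 1x1 weights
# receive no gradient.
output = tf.nn.relu(tf.add_n(skips))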

Minimal reproducing example:

import numpy as np
import tensorflow as tf
from wavenet import WaveNetModel

features = tf.placeholder(tf.float32, [2, 16000])
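# Positional args (per the repo's WaveNetModel signature): batch_size,
# dilations, filter_width, residual_channels, dilation_channels, skip_channels.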
wavenet = WaveNetModel(4, [1, 2, 4], 2, 4, 4, 4)
loss = wavenet.loss(features)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train_op = optimizer.minimize(loss)

feed_dict = {features: np.random.uniform(-1, 1, size=(2, 16000))}
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    vars_init = sess.run(tf.trainable_variables())

    # Take one step and see what the variables are now.
    sess.run(train_op, feed_dict=feed_dict)
    vars_final = sess.run(tf.trainable_variables())

# At least one assertion in this loop will fail, because the last
# residual block's unused dense weights never change.
for v_init, v_final in zip(vars_init, vars_final):
    assert not np.allclose(v_init, v_final)
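A quick way to confirm which variable is disconnected (a sketch, assuming the same TF 1.x graph built above) is to ask tf.gradients directly; variables with no path to the loss come back as None:

grads = tf.gradients(loss, tf.trainable_variables())
for var, grad in zip(tf.trainable_variables(), grads):
    if grad is None:
        print('no gradient reaches', var.name)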