Different results when running tdb.debug() vs. session.run()
vojtajina opened this issue
I'm running the code from the TensorFlow/Udacity course (3_regularization.ipynb, Problem 1).
Here is the code:
%%javascript
Jupyter.utils.load_extensions('tdb_ext/main')

import tensorflow as tf
import tdb

# image_size, num_labels, the datasets/labels and accuracy() come from
# earlier cells of the notebook.
batch_size = 128

graph = tf.Graph()
with graph.as_default():
    # Input data.
    # For the training data, we use a placeholder that will be fed at run time
    # with a training minibatch.
    tf_train_dataset = tf.placeholder(tf.float32,
                                      shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)

    # Hidden layer
    hidden_layer_size = 1024
    weights_h = tf.Variable(
        tf.truncated_normal([image_size * image_size, hidden_layer_size]))
    biases_h = tf.Variable(tf.zeros([hidden_layer_size]))
    hidden = tf.nn.relu(tf.matmul(tf_train_dataset, weights_h) + biases_h)

    # Output layer
    weights_o = tf.Variable(
        tf.truncated_normal([hidden_layer_size, num_labels]))
    biases_o = tf.Variable(tf.zeros([num_labels]))
    logits = tf.matmul(hidden, weights_o) + biases_o

    # Loss: mean cross-entropy plus L2 regularization on all weights and biases.
    _loss = tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels)
    regularizers = (tf.nn.l2_loss(weights_h) + tf.nn.l2_loss(biases_h) +
                    tf.nn.l2_loss(weights_o) + tf.nn.l2_loss(biases_o))
    loss = tf.reduce_mean(_loss) + 5e-4 * regularizers

    # Optimizer.
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

    # Predictions for the training, validation, and test data.
    train_prediction = tf.nn.softmax(logits)
    valid_hidden = tf.nn.relu(tf.matmul(tf_valid_dataset, weights_h) + biases_h)
    valid_logits = tf.matmul(valid_hidden, weights_o) + biases_o
    valid_prediction = tf.nn.softmax(valid_logits)
    test_hidden = tf.nn.relu(tf.matmul(tf_test_dataset, weights_h) + biases_h)
    test_logits = tf.matmul(test_hidden, weights_o) + biases_o
    test_prediction = tf.nn.softmax(test_logits)

# TRAIN
num_steps = 3001

with tf.Session(graph=graph) as session:
    tf.initialize_all_variables().run()
    print("Initialized")
    for step in range(num_steps):
        # Pick an offset within the training data, which has been randomized.
        # Note: we could use better randomization across epochs.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        # Generate a minibatch.
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        # Prepare a dictionary telling the session where to feed the minibatch.
        # The key of the dictionary is the placeholder node of the graph to be fed,
        # and the value is the numpy array to feed to it.
        feed_dict = {tf_train_dataset: batch_data, tf_train_labels: batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_prediction],
                                        feed_dict=feed_dict)
        if (step % 500 == 0):
            print("Minibatch loss at step %d: %f" % (step, l))
            print("Minibatch accuracy: %.1f%%" % accuracy(predictions, batch_labels))
            print("Validation accuracy: %.1f%%" % accuracy(
                valid_prediction.eval(), valid_labels))
    print("Test accuracy: %.1f%%" % accuracy(test_prediction.eval(), test_labels))
Here is the output when using session.run():
Initialized
Minibatch loss at step 0: 484.110291
Minibatch accuracy: 11.7%
Validation accuracy: 17.3%
Minibatch loss at step 500: 134.830811
Minibatch accuracy: 84.4%
Validation accuracy: 61.2%
Minibatch loss at step 1000: 99.932526
Minibatch accuracy: 82.8%
Validation accuracy: 64.1%
Minibatch loss at step 1500: 74.190613
Minibatch accuracy: 82.8%
Validation accuracy: 62.0%
Minibatch loss at step 2000: 57.145329
Minibatch accuracy: 80.5%
Validation accuracy: 64.2%
Minibatch loss at step 2500: 44.857193
Minibatch accuracy: 83.6%
Validation accuracy: 64.0%
Minibatch loss at step 3000: 34.841984
Minibatch accuracy: 89.8%
Validation accuracy: 66.2%
Test accuracy: 90.7%
And when I switch to tdb.debug():

status, result = tdb.debug([optimizer, loss, train_prediction],
                           feed_dict=feed_dict, session=session, breakpoints=[])
l = result[1]
predictions = result[2]

the output is:
Initialized
Minibatch loss at step 0: 597.969238
Minibatch accuracy: 12.5%
Validation accuracy: 9.9%
Minibatch loss at step 500: 582.064575
Minibatch accuracy: 7.0%
Validation accuracy: 9.9%
Minibatch loss at step 1000: 599.943359
Minibatch accuracy: 7.8%
Validation accuracy: 9.9%
Minibatch loss at step 1500: 579.468689
Minibatch accuracy: 11.7%
Validation accuracy: 9.9%
Minibatch loss at step 2000: 582.900635
Minibatch accuracy: 6.2%
Validation accuracy: 9.9%
Minibatch loss at step 2500: 605.257935
Minibatch accuracy: 5.5%
Validation accuracy: 9.9%
Minibatch loss at step 3000: 605.155579
Minibatch accuracy: 5.5%
Validation accuracy: 9.9%
Test accuracy: 7.6%
I would expect the same output. What am I doing wrong?
This is running on a MacBook Pro (10.11.1 15B35a), tensorflow 0.5.0, tfdebugger 0.1.1.
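If it helps narrow things down, here is a check I could add inside the loop (a sketch, untested) to see whether the forward pass alone already disagrees between the two paths:

# Sketch: evaluate only the loss (no optimizer step) through both paths
# on the same feed_dict; with no variable updates involved, the two values
# should match if TDB evaluates the forward pass correctly.
status, result = tdb.debug([loss], feed_dict=feed_dict,
                           session=session, breakpoints=[])
loss_tdb = result[0]
loss_ref = session.run(loss, feed_dict=feed_dict)
print("loss via tdb.debug(): %f, via session.run(): %f" % (loss_tdb, loss_ref))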
Thanks for bringing this to my attention. It's probably not your fault; I suspect a bug in how TDB deals with Operations on tf.Variables. That would explain why training causes accuracy to plummet. I'll take a look, hopefully sometime this weekend.
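One way to test that hypothesis (a minimal sketch, reusing the graph, session, and feed_dict from above) is to check whether a step driven by tdb.debug() actually changes the variables the optimizer is supposed to update:

import numpy as np

# Sketch: snapshot a variable, run one optimizer step through tdb.debug(),
# and see whether the variable actually changed. If the delta is zero,
# TDB is evaluating the graph without applying the gradient-descent update.
w_before = session.run(weights_h)
tdb.debug([optimizer], feed_dict=feed_dict, session=session, breakpoints=[])
w_after = session.run(weights_h)
print("max |delta| in weights_h: %g" % np.max(np.abs(w_after - w_before)))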
still different =\