Different results when running tdb.debug() vs. session.run()
vojtajina opened this issue
I'm running the code from the TensorFlow/Udacity course (3_regularization.ipynb, Problem 1).
Here is the code:
%%javascript
Jupyter.utils.load_extensions('tdb_ext/main')

import tensorflow as tf
import tdb

# image_size, num_labels, the datasets/labels and accuracy() come from
# earlier cells of the notebook.
batch_size = 128

graph = tf.Graph()
with graph.as_default():
    # Input data.
    # For the training data, we use a placeholder that will be fed at run time
    # with a training minibatch.
    tf_train_dataset = tf.placeholder(tf.float32,
                                      shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)

    # Hidden layer
    hidden_layer_size = 1024
    weights_h = tf.Variable(
        tf.truncated_normal([image_size * image_size, hidden_layer_size]))
    biases_h = tf.Variable(tf.zeros([hidden_layer_size]))
    hidden = tf.nn.relu(tf.matmul(tf_train_dataset, weights_h) + biases_h)

    # Output layer
    weights_o = tf.Variable(
        tf.truncated_normal([hidden_layer_size, num_labels]))
    biases_o = tf.Variable(tf.zeros([num_labels]))
    logits = tf.matmul(hidden, weights_o) + biases_o

    # Loss: mean cross-entropy plus L2 regularization on all weights and biases.
    _loss = tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels)
    regularizers = (tf.nn.l2_loss(weights_h) + tf.nn.l2_loss(biases_h) +
                    tf.nn.l2_loss(weights_o) + tf.nn.l2_loss(biases_o))
    loss = tf.reduce_mean(_loss) + 5e-4 * regularizers

    # Optimizer.
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

    # Predictions for the training, validation, and test data.
    train_prediction = tf.nn.softmax(logits)
    valid_hidden = tf.nn.relu(tf.matmul(tf_valid_dataset, weights_h) + biases_h)
    valid_logits = tf.matmul(valid_hidden, weights_o) + biases_o
    valid_prediction = tf.nn.softmax(valid_logits)
    test_hidden = tf.nn.relu(tf.matmul(tf_test_dataset, weights_h) + biases_h)
    test_logits = tf.matmul(test_hidden, weights_o) + biases_o
    test_prediction = tf.nn.softmax(test_logits)

# TRAIN
num_steps = 3001

with tf.Session(graph=graph) as session:
    tf.initialize_all_variables().run()
    print("Initialized")
    for step in range(num_steps):
        # Pick an offset within the training data, which has been randomized.
        # Note: we could use better randomization across epochs.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        # Generate a minibatch.
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        # Prepare a dictionary telling the session where to feed the minibatch.
        # The key of the dictionary is the placeholder node of the graph to be fed,
        # and the value is the numpy array to feed to it.
        feed_dict = {tf_train_dataset: batch_data, tf_train_labels: batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_prediction],
                                        feed_dict=feed_dict)
        if (step % 500 == 0):
            print("Minibatch loss at step %d: %f" % (step, l))
            print("Minibatch accuracy: %.1f%%" % accuracy(predictions, batch_labels))
            print("Validation accuracy: %.1f%%" % accuracy(
                valid_prediction.eval(), valid_labels))
    print("Test accuracy: %.1f%%" % accuracy(test_prediction.eval(), test_labels))
Here is the output when using session.run():
Initialized
Minibatch loss at step 0: 484.110291
Minibatch accuracy: 11.7%
Validation accuracy: 17.3%
Minibatch loss at step 500: 134.830811
Minibatch accuracy: 84.4%
Validation accuracy: 61.2%
Minibatch loss at step 1000: 99.932526
Minibatch accuracy: 82.8%
Validation accuracy: 64.1%
Minibatch loss at step 1500: 74.190613
Minibatch accuracy: 82.8%
Validation accuracy: 62.0%
Minibatch loss at step 2000: 57.145329
Minibatch accuracy: 80.5%
Validation accuracy: 64.2%
Minibatch loss at step 2500: 44.857193
Minibatch accuracy: 83.6%
Validation accuracy: 64.0%
Minibatch loss at step 3000: 34.841984
Minibatch accuracy: 89.8%
Validation accuracy: 66.2%
Test accuracy: 90.7%
And when I switch to tdb.debug():

status, result = tdb.debug([optimizer, loss, train_prediction],
                           feed_dict=feed_dict, session=session, breakpoints=[])
l = result[1]
predictions = result[2]

the output is:
Initialized
Minibatch loss at step 0: 597.969238
Minibatch accuracy: 12.5%
Validation accuracy: 9.9%
Minibatch loss at step 500: 582.064575
Minibatch accuracy: 7.0%
Validation accuracy: 9.9%
Minibatch loss at step 1000: 599.943359
Minibatch accuracy: 7.8%
Validation accuracy: 9.9%
Minibatch loss at step 1500: 579.468689
Minibatch accuracy: 11.7%
Validation accuracy: 9.9%
Minibatch loss at step 2000: 582.900635
Minibatch accuracy: 6.2%
Validation accuracy: 9.9%
Minibatch loss at step 2500: 605.257935
Minibatch accuracy: 5.5%
Validation accuracy: 9.9%
Minibatch loss at step 3000: 605.155579
Minibatch accuracy: 5.5%
Validation accuracy: 9.9%
Test accuracy: 7.6%
I would expect the same output. What am I doing wrong?
This is running on a MacBook Pro (10.11.1 15B35a), tensorflow 0.5.0, tfdebugger 0.1.1.
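If it helps narrow things down, here is a check I could add inside the loop (a sketch, untested) to see whether the forward pass alone already disagrees between the two paths:

# Sketch: evaluate only the loss (no optimizer step) through both paths
# on the same feed_dict; with no variable updates involved, the two values
# should match if TDB evaluates the forward pass correctly.
status, result = tdb.debug([loss], feed_dict=feed_dict,
                           session=session, breakpoints=[])
loss_tdb = result[0]
loss_ref = session.run(loss, feed_dict=feed_dict)
print("loss via tdb.debug(): %f, via session.run(): %f" % (loss_tdb, loss_ref))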
Thanks for bringing this to my attention. It's probably not your fault; I suspect a bug in how TDB deals with Operations on tf.Variables. That would explain why training causes accuracy to plummet. I'll take a look, hopefully sometime this weekend.
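One way to test that hypothesis (a minimal sketch, reusing the graph, session, and feed_dict from above) is to check whether a step driven by tdb.debug() actually changes the variables the optimizer is supposed to update:

import numpy as np

# Sketch: snapshot a variable, run one optimizer step through tdb.debug(),
# and see whether the variable actually changed. If the delta is zero,
# TDB is evaluating the graph without applying the gradient-descent update.
w_before = session.run(weights_h)
tdb.debug([optimizer], feed_dict=feed_dict, session=session, breakpoints=[])
w_after = session.run(weights_h)
print("max |delta| in weights_h: %g" % np.max(np.abs(w_after - w_before)))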
still different =\