gabrieleangeletti / Deep-Learning-TensorFlow

Ready to use implementations of various Deep Learning algorithms using TensorFlow.

Home Page: http://blackecho.github.io

Using custom data fails (errors with placeholders and feed data sizes)

aloerch opened this issue

I'm using the run_stacked_autoencoder_unsupervised.py script with custom data in npy format, and I'm getting a failure after the data is loaded, at the point where it is fed into the training step:

tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0 1 2
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0:   Y Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 1:   Y Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 2:   Y Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 970, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 970, pci bus id: 0000:02:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:2) -> (device: 2, name: GeForce GTX 970, pci bus id: 0000:01:00.0)
Tensorboard logs dir for this run is /home/oem/.yadlt/logs/un_sdae/dae-1/run2
Traceback (most recent call last):
  File "command_line/run_stacked_autoencoder_unsupervised.py", line 185, in <module>
    encoded_X, encoded_vX = sdae.pretrain(trX, vlX)
  File "/home/oem/Documents/phdCaseStudy/tensorflow27/local/lib/python2.7/site-packages/yadlt/models/autoencoder_models/deep_autoencoder.py", line 142, in pretrain
    validation_set=validation_set)
  File "/home/oem/Documents/phdCaseStudy/tensorflow27/local/lib/python2.7/site-packages/yadlt/core/model.py", line 180, in pretrain_procedure
    layer_graphs[l])
  File "/home/oem/Documents/phdCaseStudy/tensorflow27/local/lib/python2.7/site-packages/yadlt/core/model.py", line 197, in _pretrain_layer_and_gen_feed
    validation_set, validation_set, graph=graph)
  File "/home/oem/Documents/phdCaseStudy/tensorflow27/local/lib/python2.7/site-packages/yadlt/core/unsupervised_model.py", line 48, in fit
    train_set, train_ref, validation_set, validation_ref)
  File "/home/oem/Documents/phdCaseStudy/tensorflow27/local/lib/python2.7/site-packages/yadlt/models/autoencoder_models/denoising_autoencoder.py", line 74, in _train_model
    self._run_train_step(train_set)
  File "/home/oem/Documents/phdCaseStudy/tensorflow27/local/lib/python2.7/site-packages/yadlt/models/autoencoder_models/denoising_autoencoder.py", line 101, in _run_train_step
    self.tf_session.run(self.train_step, feed_dict=tr_feed)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 717, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 894, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (25, 128, 128, 3) for Tensor u'x-input:0', which has shape '(?, 128)'

My image tiles (saved as an npy batch) are 128x128 with 3 channels (RGB). The error seems to indicate that the script sees the shape of the incoming data, but that the u'x-input:0' placeholder has a shape of (?, 128); presumably n_features is being taken from the second axis of the (25, 128, 128, 3) array. I've tried to find where to fix this, without much luck. Using np.reshape I've only been able to change the shape of the (25, 128, 128, 3) array, not the shape of u'x-input:0'...
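For reference, the mismatch can be reproduced in isolation; a minimal sketch, assuming TensorFlow 1.x and the placeholder shape from the traceback:

import numpy as np
import tensorflow as tf

# Placeholder built for flat 128-feature rows, as in the model above
x = tf.placeholder(tf.float32, [None, 128], name='x-input')

# A batch of 25 RGB tiles, still in image shape
batch = np.zeros((25, 128, 128, 3), dtype=np.float32)

with tf.Session() as sess:
    # Raises the same ValueError: cannot feed (25, 128, 128, 3)
    # into a tensor of shape (?, 128)
    sess.run(tf.reduce_mean(x), feed_dict={x: batch})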

Any help on this would be greatly appreciated!

I can confirm this same behavior using run_dbn.py.

Ok, I'm happy to report that I've made some progress here that you might want to fold into the existing code. First, I'll start with denoising_autoencoder.py:

Under line 147, in def _create_placeholders(self, n_features):

original:

self.input_data_orig = tf.placeholder(
            tf.float32, [None, n_features], name='x-input')
        self.input_data = tf.placeholder(
            tf.float32, [None, n_features], name='x-corr-input')

With [None, n_features] as above, applied to my custom dataset with a batch_size of 25 and shape (25, 128, 128, 3), x-input ends up with shape (?, 128). To correct this by hardcoding, I changed n_features to 49152, which is 128x128x3, so both x-input and x-corr-input become size (batch_size, 49152).

The correction I used is:

self.input_data_orig = tf.placeholder(
            tf.float32, [None, 49152], name='x-input')
        self.input_data = tf.placeholder(
            tf.float32, [None, 49152], name='x-corr-input')

Now, in the run_stacked_autoencoder_unsupervised.py script, it is necessary to flatten the data being fed to the model down to two dimensions.
The original code, starting from line 127:

        trX, trRef = load_from_np(FLAGS.train_dataset), load_from_np(FLAGS.train_ref)
        vlX, vlRef = load_from_np(FLAGS.valid_dataset), load_from_np(FLAGS.valid_ref)
        teX, teRef = load_from_np(FLAGS.test_dataset), load_from_np(FLAGS.test_ref)

trX, vlX, and teX all get encoded on line 183, but if they retain their original shape, they are encoded as (2100, 128, 128, 3).

So, I reshape them by adding after line 129:

        trX = np.reshape(trX, (-1, 49152))
        vlX = np.reshape(vlX, (-1, 49152))
        teX = np.reshape(teX, (-1, 49152))
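
As a side note, the 49152 doesn't have to be hardcoded here either; the same reshape can be computed from the arrays' own shapes (a sketch, assuming the batch axis comes first and trX, vlX, teX are the arrays loaded above):

import numpy as np

# Flatten everything after the batch axis: 128 * 128 * 3 = 49152
n_features = int(np.prod(trX.shape[1:]))
trX = trX.reshape(trX.shape[0], n_features)
vlX = vlX.reshape(vlX.shape[0], -1)
teX = teX.reshape(teX.shape[0], -1)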

Finally, one last error appears:

Traceback (most recent call last):
  File "command_line/run_stacked_autoencoder_unsupervised.py", line 185, in <module>
    encoded_X, encoded_vX = sdae.pretrain(np.reshape(trX, (-1, 49152), np.reshape(vlX, -1,49152)))
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 225, in reshape
    return reshape(newshape, order=order)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

That pertains to lines 131-135, which I fixed:

        if not np.all(trRef):
            trRef = trX
        if not np.all(vlRef):
            vlRef = vlX
        if not np.all(teRef):
            teRef = teX
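
One caveat with the np.all check: a genuine reference array that happens to contain zeros would also fail it and be silently replaced. An explicit None check is a safer sketch, assuming load_from_np returns None when no file path is given:

# hypothetical variant, assuming the loader returns None for an empty path
if trRef is None:
    trRef = trX
if vlRef is None:
    vlRef = vlX
if teRef is None:
    teRef = teX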

A cursory glance at other files in this project (run_dbn.py comes to mind) suggests that these issues and fixes may apply to more than just the two files I've listed. For the hard-coded dimensions, you might consider re-coding how n_features is generated so that hard-coded values aren't needed, and the same goes for the reshaping of trX, vlX, etc.

Hi @aloerch, thanks for this detailed report!
I think this error is due to the fact that the pretrain() and the fit() methods expect an array of shape (N, 49152) in your case. So, instead of hardcoding the number of features in the code, if you reshape the .npy files before passing them to run_stacked_autoencoder_unsupervised, then n_features would be 49152 and everything should work as expected.
Or maybe I could add a little function to detect if the dataset has more than two dimensions, and reshape it if necessary.
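Something along these lines, perhaps (just a sketch; flatten_if_needed is a hypothetical name):

import numpy as np

def flatten_if_needed(data):
    # Collapse any dataset with more than two dimensions
    # into shape (n_samples, n_features)
    if data is not None and data.ndim > 2:
        return data.reshape(data.shape[0], -1)
    return data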
Let me know your thoughts!