Using a CNN with Grated recurrent units and Time distributed wrappers over MNIST data
I am using MNIST data here to test this new Convolutional Architecture that uses : 4 Convolutional layers Time wrapped 3 GRU layers
I am using 3 epochs and running it 3 times each resulting in total of 9 epochs.
The total Loss calculated here is 3.0515 which is lowest you might get. you can edit the epochs and layers to experiment further.