muupan / dqn-in-the-caffe

An implementation of Deep Q-Network using Caffe

eltwise_layer shape check failed

omgteam opened this issue

Hi, when trying to construct the solver, the following error occurs, which suggests the shape of the filter blob (32, 18, 1, 1) does not match that of q_values (32, 18). Any idea? Thank you very much!

I0916 20:53:52.315197 2321 net.cpp:127] Top shape: 32 18 (576)
I0916 20:53:52.315212 2321 layer_factory.hpp:74] Creating layer eltwise_layer
I0916 20:53:52.315240 2321 net.cpp:90] Creating Layer eltwise_layer
I0916 20:53:52.315249 2321 net.cpp:410] eltwise_layer <- q_values
I0916 20:53:52.315259 2321 net.cpp:410] eltwise_layer <- filter
I0916 20:53:52.315270 2321 net.cpp:368] eltwise_layer -> filtered_q_values
I0916 20:53:52.315284 2321 net.cpp:120] Setting up eltwise_layer
F0916 20:53:52.315300 2321 eltwise_layer.cpp:35] Check failed: bottom[i]->shape() == bottom[0]->shape()

It does not matter; just replace the inner-product layer with a 1x1 kernel convolutional layer!
Thank you!
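For anyone hitting the same check, here is a rough sketch of that substitution (not copied from this repo's dqn.prototxt; the layer/blob names and the newer prototxt syntax are assumptions). The underlying issue is that an InnerProduct layer in recent Caffe produces a 2-axis blob (32, 18), while the filter blob fed into eltwise_layer has 4 axes (32, 18, 1, 1); a convolution keeps all 4 axes. Note that the convolution's bottom must itself be a 4-axis blob, so any preceding InnerProduct layer needs the same treatment (with a kernel covering the full spatial extent of its input).

# Before (assumed): fully-connected output layer -> 2-axis blob (32, 18)
# layer {
#   name: "ip2_layer"
#   type: "InnerProduct"
#   bottom: "ip1"
#   top: "q_values"
#   inner_product_param { num_output: 18 }
# }

# After: 1x1 convolution -> 4-axis blob (32, 18, 1, 1),
# matching the shape of the "filter" bottom of eltwise_layer.
layer {
  name: "ip2_layer"
  type: "Convolution"
  bottom: "ip1"          # must itself be 4-axis, e.g. (32, 512, 1, 1)
  top: "q_values"
  convolution_param {
    num_output: 18       # one Q value per ALE action
    kernel_size: 1
    stride: 1
  }
}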

Were you able to get Pong or Breakout to train with the change you made to the model? I made a different change that doesn't work; I think I have a bit to learn about how Caffe works.

I'm getting this problem too.
How can I fix it?

I tried adding a Reshape layer to dqn.prototxt. I didn't think that would affect the update, because of how the target and filter arrays are being set, but I'm getting enormous Q values once training starts. I don't have much experience with reinforcement learning, but I haven't seen Q values above 5 or so in my other tasks. Any ideas?
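For reference, a Reshape layer of the kind described might look roughly like the sketch below (this is not the actual edit; the blob names come from the log above). It flattens the 4-axis filter blob down to 2 axes so it matches q_values, with eltwise_layer then reading filter_2d instead of filter:

layer {
  name: "reshape_filter"
  type: "Reshape"
  bottom: "filter"
  top: "filter_2d"
  reshape_param {
    # keep the batch axis, flatten everything else: (32, 18, 1, 1) -> (32, 18)
    shape { dim: 0 dim: -1 }
  }
}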

I put together a version that works over at

https://github.com/watts4speed/fast-dqn-caffe

It also has instructions on how to get it going, and I made some speed improvements as well.

Sorry for the problems. I haven't run my code for a while and don't know whether it works with the latest Caffe and ALE.

@watts4speed nice work!

Ya, there are a couple of things. The Caffe developers changed their API for solver creation; see what I did over on my branch. Then, somewhere maybe after September, something changed in Caffe that broke the training altogether, and I've never been able to figure out what it was. I put a commit number for Caffe that I know works in the README at the link above; it should also work for the version here. Let me know if you figure out what the issue is when running with the head of caffe/master :-)
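For what it's worth, the solver-creation change referred to here is most likely the removal of caffe::GetSolver in favor of the solver registry. A minimal sketch of the difference (the solver prototxt path is a placeholder, and this is not a quote of either repo's code):

// solver_sketch.cpp -- assumes the Caffe headers and libraries are available
#include <caffe/caffe.hpp>

int main() {
  caffe::SolverParameter solver_param;
  // Placeholder path to a solver definition such as the one shipped with the repo.
  caffe::ReadProtoFromTextFileOrDie("dqn_solver.prototxt", &solver_param);

  // Older Caffe (roughly pre-October 2015):
  //   boost::shared_ptr<caffe::Solver<float> > solver(
  //       caffe::GetSolver<float>(solver_param));

  // Newer Caffe: solvers are created through the solver registry instead.
  boost::shared_ptr<caffe::Solver<float> > solver(
      caffe::SolverRegistry<float>::CreateSolver(solver_param));

  solver->Solve();  // or drive it step-by-step with solver->Step(1) in the DQN loop
  return 0;
}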

I also worked through the code and found what looks like an error when resetting the filters and targets for each forward pass. The filters and targets do not hold the correct values after calling reset and doing one step. Is there a reason for this, or am I misunderstanding something?

After 50 iterations, the computed targets for each of the 32 minibatch entries (this looks reasonable):
0.173218 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0.177312 0 0 0 0 0 0
0 0 0 0 0.175913 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0.176915 0 0 0 0 0 0
0 0 0 0 0.176081 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0.177312 0 0 0 0 0 0
0 0 0 0 0.17177 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0.173809 0 0 0 0 0 0 0 0 0 0 0 0 0
0.173392 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0.176585 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0.171213 0 0 0 0 0
0 0 0 0.173022 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0.174964 0 0 0 0 0
0 0 0 0 0.17177 0 0 0 0 0 0 0 0 0 0 0 0 0
0.174004 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0.174004 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0.173392 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0.17581 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0.174467 0 0 0 0 0
0 0 0 0.174032 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0.171213 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0.172245 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0.171213 0 0 0 0 0
0 0 0 0 0.173349 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0.172245 0 0 0 0 0
0 0 0 0 0.175626 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0.177733 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0.173809 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0.168742 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0.175043 0 0 0 0 0
0 0.171759 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0.177312 0 0 0 0 0 0

The corresponding filters (this also looks correct as the positions match):
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0

After doing one step, though, the data in the filter and target blobs are different.
Targets post step:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0.177312 0 0 0 0 0 0 0 0
0 0 0.175913 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0.176915 0 0 0 0 0 0 0 0
0 0 0.176081 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0.177312 0 0 0 0 0 0 0 0
0 0 0.17177 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0.173809 0 0 0 0 0 0 0 0 0 0 0 0 0 0.173392 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.176585
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0.171213 0 0 0 0 0 0 0
0 0.173022 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0.174964 0 0 0 0 0 0 0
0 0 0.17177 0 0 0 0 0 0 0 0 0 0 0 0 0 0.174004 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.174004 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.173392 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0.17581 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0.174467 0 0 0 0 0 0 0
0 0.174032 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0.171213 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0.172245 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0.171213 0 0 0 0 0 0 0
0 0 0.173349 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0.172245 0 0 0 0 0 0 0
0 0 0.175626 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0.177733 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0.173809 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0.168742 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0.175043 0 0 0 0 0 0 0.171759
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0.177312 0 0 0 0 0 0 1 0

Filters post step:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0

If you look closely, all the values are shifted two places to the left. This results in incorrect filtered Q values, since some actions may no longer have a corresponding filter entry; some rows are all zeros and some rows have two targets.

Filtered q values:
0 0 -0 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 0.00185883 -0 -0 -0 -0 -0 0 -0 0
0 0 -0.0725774 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 0.00195366 -0 -0 -0 -0 -0 0 -0 0
0 0 -0.0720046 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 0.00185883 -0 -0 -0 -0 -0 0 -0 0
0 0 -0.0729674 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0 -0.0723516 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0.0462732 0
0 0 -0 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0.186157
0 0 -0 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 0 -0.0137082 -0 -0 -0 -0 0 -0 0
0 0.0343856 -0 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 -0 -0.0133068 -0 -0 -0 -0 0 -0 0
0 0 -0.0729674 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0.0423569 0
0 0 -0 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0.0423569 0
0 0 -0 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0.0462732 0
0 0 -0 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0.0324558 -0 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 0 -0.014098 -0 -0 -0 -0 0 -0 0
0 0.0291945 -0 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 0 -0.0137082 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 0 -0.0111855 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 0 -0.0137082 -0 -0 -0 -0 0 -0 0
0 0 -0.069723 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 0 -0.0111855 -0 -0 -0 -0 0 -0 0
0 0 -0.0663748 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0.0305264 -0 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0 -0.0723516 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 -0.00127689 -0 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 0 -0.0107717 -0 -0 -0 -0 0 -0 0.180422
0 0 -0 0 0 -0 0 -0 0 0 -0 -0 -0 -0 -0 0 -0 0
0 0 -0 0 0 -0 0 -0 0 0.00185883 -0 -0 -0 -0 -0 0 4.17635e-38 0
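In case it helps anyone reproduce this, a quick way to dump those blobs is to read them straight out of the net before and after solver->Step(1). This is only a sketch; the blob names "target" and "filter" are the ones discussed above and may differ from the actual prototxt:

#include <caffe/caffe.hpp>
#include <iostream>
#include <string>

// Print a named blob, one minibatch entry per row (18 values per row here).
void DumpBlob(const caffe::Net<float>& net, const std::string& name) {
  const boost::shared_ptr<caffe::Blob<float> > blob = net.blob_by_name(name);
  const float* data = blob->cpu_data();
  const int per_entry = blob->count() / blob->shape(0);
  for (int n = 0; n < blob->shape(0); ++n) {
    for (int i = 0; i < per_entry; ++i)
      std::cout << data[n * per_entry + i] << " ";
    std::cout << "\n";
  }
}

// Usage inside the training loop, e.g.:
//   DumpBlob(*solver->net(), "target");
//   DumpBlob(*solver->net(), "filter");
//   solver->Step(1);
//   DumpBlob(*solver->net(), "target");
//   DumpBlob(*solver->net(), "filter");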

Hi Trevor,

Are you using the head of caffe/master?

For me, things work with the version at the link above and a caffe/master from around Sept 2015. The head of master doesn't, and I'd love to know why. If you start with that working version and then move up to the head of caffe/master, figuring out what breaks would be really helpful.

How to save the model?

Hi,

In the file models/fast_dqn_solver.prototxt there are the lines:

# snapshot intermediate results
snapshot: 1000000
snapshot_prefix: "model/dqn"

Currently the model is saved every 1M steps; you can change the snapshot value to save it at some other interval.
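If you later want to load a saved snapshot, a rough sketch using standard Caffe calls (the file paths are placeholders built from the prefix above) is:

#include <caffe/caffe.hpp>

int main() {
  caffe::SolverParameter solver_param;
  caffe::ReadProtoFromTextFileOrDie("models/fast_dqn_solver.prototxt", &solver_param);

  boost::shared_ptr<caffe::Solver<float> > solver(
      caffe::SolverRegistry<float>::CreateSolver(solver_param));

  // Resume training with the full solver state (iteration count, momentum, ...):
  solver->Restore("model/dqn_iter_1000000.solverstate");

  // Or load only the learned weights into the net, e.g. for evaluation:
  // solver->net()->CopyTrainedLayersFrom("model/dqn_iter_1000000.caffemodel");

  solver->Solve();
  return 0;
}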

-Peter

I have got the same problem: Check failed: bottom[0]->shape() == bottom[i]->shape() bottom[0]: 1 1 100 20 20 (40000), bottom[1]: 1 1 20 20 20 (8000)

What do you mean by an inner-product layer? How can I solve this? @omgteam