chuckcho / video-caffe

Video-friendly caffe -- comes with the most recent version of Caffe (as of Jan 2019), a video reader, 3D(ND) pooling layer, and an example training script for C3D network and UCF-101 data

Why are some channels of the input data all zeros?

fxing328 opened this issue

Hi,

When I tried to train this model on my own dataset, I found that the input image or video data is not read completely when I inspect the intermediate output. Some of the channels are all zeros; in particular, every frame from the 3rd frame onward has two channels that are all zeros. Please see the following details:

(Pdb) solver.net.blobs['data'].data[1,:,2,:,:]
array([[[ 151., 151., 151., ..., 149., 149., 150.],
    [ 160., 163., 167., ..., 151., 151., 151.],
    [ 150., 150., 150., ..., 157., 158., 159.],
    ...,
    [ 156., 156., 157., ..., 159., 159., 159.],
    [ 160., 160., 160., ..., 156., 156., 156.],
    [ 164., 164., 164., ..., 160., 160., 160.]],

   [[   0.,    0.,    0., ...,    0.,    0.,    0.],
    [   0.,    0.,    0., ...,    0.,    0.,    0.],
    [   0.,    0.,    0., ...,    0.,    0.,    0.],
    ..., 
    [   0.,    0.,    0., ...,    0.,    0.,    0.],
    [   0.,    0.,    0., ...,    0.,    0.,    0.],
    [   0.,    0.,    0., ...,    0.,    0.,    0.]],

   [[   0.,    0.,    0., ...,    0.,    0.,    0.],
    [   0.,    0.,    0., ...,    0.,    0.,    0.],
    [   0.,    0.,    0., ...,    0.,    0.,    0.],
    ..., 
    [   0.,    0.,    0., ...,    0.,    0.,    0.],
    [   0.,    0.,    0., ...,    0.,    0.,    0.],
    [   0.,    0.,    0., ...,    0.,    0.,    0.]]], dtype=float32)
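
A check like the one above can be automated across the whole batch rather than inspecting one frame at a time. The following is a minimal sketch, assuming the data blob is laid out as (clip, channel, frame, height, width), which is consistent with the indexing used above but is not confirmed in this thread. The helper name `find_zero_planes` and the synthetic `demo` array are hypothetical; in a real session the array would come from `solver.net.blobs['data'].data`.

```python
import numpy as np

def find_zero_planes(blob):
    """Return (clip, channel, frame) triples whose H x W plane is all zeros.

    Assumes a 5-D blob ordered as (clip, channel, frame, height, width).
    """
    blob = np.asarray(blob)
    if blob.ndim != 5:
        raise ValueError("expected a 5-D blob, got %d-D" % blob.ndim)
    # Reduce over the two spatial axes: True where every pixel in the plane is 0.
    is_zero = ~blob.any(axis=(3, 4))
    return [tuple(int(i) for i in idx) for idx in np.argwhere(is_zero)]

# Synthetic stand-in for the data blob (hypothetical values and shape).
demo = np.full((2, 3, 4, 8, 8), 150.0, dtype=np.float32)
demo[1, 1:, 2:, :, :] = 0.0  # zero channels 1 and 2 from frame index 2 onward

zero_planes = find_zero_planes(demo)
print(zero_planes)
```

If the returned list is empty, every channel of every frame contains at least one non-zero pixel. A non-empty list shows exactly which clips, channels, and frames are affected, which helps tell a reader bug apart from genuinely black input.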

I also saw this output when training on the UCF101 dataset, where I made sure everything was set up correctly; it looks the same. But when I trained AlexNet and looked at the data output, all three channels had values.
Is this a problem? Or does it mean that only the data of the first channel is used to train this 3D network?

Also, have these values had the mean subtracted? I would expect the values after normalization to be small floating-point values, both positive and negative.
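
One rough way to test the mean-subtraction question is to look at the range and sign of the values. This is only a sketch: the helper `summarize` is hypothetical, the sample values are copied from the first column of the printed rows above, and subtracting a single scalar mean is an illustrative assumption, since the preprocessing this repository actually applies is not shown in this thread.

```python
import numpy as np

def summarize(blob):
    """Report basic statistics that hint at whether a mean was subtracted."""
    data = np.asarray(blob, dtype=np.float64)
    return {
        "min": float(data.min()),
        "max": float(data.max()),
        "mean": float(data.mean()),
        "has_negative": bool((data < 0).any()),
    }

# A few pixel values taken from the non-zero channel printed above.
raw = np.array([[151.0, 160.0, 150.0], [156.0, 160.0, 164.0]])
# Illustrative only: centre the values with a scalar mean.
centered = raw - raw.mean()

raw_stats = summarize(raw)
centered_stats = summarize(centered)
print(raw_stats)
print(centered_stats)
```

Values that are all positive and sit in the 0 to 255 range, like the ones printed above, are what raw pixel intensities would look like. Mean-subtracted data would normally contain both signs and average close to zero.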

Thanks for your input. Could you check if this issue persists with the latest fix -- 7fd2230?
I confirmed that UCF101 clip accuracy (training from scratch) reaches 45% as in the original paper Fig 2.

This issue is fixed with the recent commit -- a6c5c1b