wojzaremba / lstm

In the bp function, we should set ds:fill(1), not ds:zero()

BigNewbiePlus opened this issue

I am confused by the backpropagation process. In the bp function, when i equals params.seq_length, shouldn't ds be 1 rather than 0? (Your reset_ds() sets ds to 0.) As we know, when differentiating a composite function the outermost seed should be 1: dy = 1 * d(x^2) = 2x, whereas seeding with 0 gives dy = 0 * d(x^2) = 0. Is this a bug? (A small seeding check follows the snippet below.)

  reset_ds()                                  -- zeroes every tensor in model.ds: no gradient reaches the final state
  for i = params.seq_length, 1, -1 do
    state.pos = state.pos - 1
    local x = state.data[state.pos]           -- input at step i
    local y = state.data[state.pos + 1]       -- target at step i
    local s = model.s[i - 1]                  -- state entering step i
    local derr = transfer_data(torch.ones(1)) -- seed dL/derr = 1 for this step's error
    local tmp = model.rnns[i]:backward({x, y, s},
                                       {derr, model.ds})[3]  -- gradient w.r.t. the incoming state s
    g_replace_table(model.ds, tmp)            -- becomes ds for step i - 1
    cutorch.synchronize()
  end
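To make the seeding point concrete, here is a tiny check (it assumes a working torch/nn install; nn.Square standing in for y = x^2 is just my illustration, not code from this repo):

  require 'torch'
  require 'nn'

  local square = nn.Square()                 -- y = x^2, elementwise
  local x = torch.Tensor{3.0}

  square:forward(x)                          -- y = 9
  print(square:backward(x, torch.ones(1)))   -- seed dL/dy = 1  ->  dL/dx = 2x = 6
  print(square:backward(x, torch.zeros(1)))  -- seed dL/dy = 0  ->  0

Seeding the outermost backward with 0 kills the whole gradient, which is why I expected ds to start at 1.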

The unrolled RNN optimizes the loss function, which is independent of the last hidden state. Therefore, its derivative with respect to that state is zero.
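In symbols (using the names from the snippet above, with T = params.seq_length; writing the loss as a sum of per-step errors is my reading of the code, not something stated in this thread):

$$
L = \sum_{t=1}^{T} \mathrm{err}_t(x_t, y_t, s_{t-1}), \qquad
\frac{\partial L}{\partial \mathrm{err}_t} = 1 \;(\texttt{derr}), \qquad
\frac{\partial L}{\partial s_T} = 0 \;(\texttt{reset\_ds}).
$$

So the seed of 1 you are looking for is already there as derr. model.ds carries a different quantity, the gradient arriving at each state from later steps; within a truncated window nothing lies after step T, so it starts at zero, and for t < T it is exactly the tmp copied in by g_replace_table.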