hjmshi / PyTorch-LBFGS

A PyTorch implementation of L-BFGS.

Some confusion about the code.

yangorwell opened this issue

When I use Powell damping to preserve the PSD property of the metric matrix, I see that the Bs update is written as "Bs.copy_(g_Sk.mul(-t))". Is this right?

Hi, thanks for taking a look at the code!

Powell damping is actually applied by modifying y, in this part of the code:

    # perform Powell damping
    if damping == True and ys < eps*sBs:
        if debug:
            print('Applying Powell damping...')
        theta = ((1-eps)*sBs)/(sBs - ys)
        y = theta*y + (1-theta)*Bs
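To see what this modification of y achieves, here is a minimal standalone sketch of the same rule. It uses plain Python lists instead of PyTorch tensors so it runs without dependencies. The helper names `dot` and `powell_damp`, the sample vectors, and the value `eps=0.2` are all made up for illustration and are not taken from the repository.

```python
def dot(a, b):
    """Inner product of two equal-length sequences of floats."""
    return sum(x * y for x, y in zip(a, b))


def powell_damp(s, y, Bs, eps=0.2):
    """Apply the damping rule quoted above; return (y_damped, theta).

    s, y and Bs are plain lists standing in for flattened tensors.
    eps=0.2 is an illustrative value, not necessarily the repository default.
    """
    ys = dot(y, s)      # curvature estimate y^T s
    sBs = dot(s, Bs)    # s^T B s, using the precomputed product B s
    if ys < eps * sBs:
        theta = ((1 - eps) * sBs) / (sBs - ys)
        y_damped = [theta * yi + (1 - theta) * bi for yi, bi in zip(y, Bs)]
        return y_damped, theta
    return list(y), 1.0  # curvature is already large enough; y is unchanged


# Hypothetical curvature pair with negative curvature (y^T s < 0).
s = [1.0, 0.0]
y = [-1.0, 0.5]
Bs = [2.0, 0.0]  # stands in for B_k s_k

y_damped, theta = powell_damp(s, y, Bs, eps=0.2)
print("theta:", theta)
print("s^T y before:", dot(s, y))
print("s^T y after: ", dot(s, y_damped), "(eps * s^T B s =", 0.2 * dot(s, Bs), ")")
```

Whenever damping triggers, the damped vector satisfies s^T y = eps * s^T B s, which is positive as long as s^T B s is. That restored curvature condition is what keeps the BFGS update positive definite.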

Bs is computed explicitly by noting that $x_{k + 1} = x_k - t H_k g_{S_k}$, so, since $H_k = B_k^{-1}$, we have $B_k s_k = B_k (x_{k + 1} - x_k) = -t B_k H_k g_{S_k} = -t g_{S_k}$. This is why the code sets Bs to -t times g_Sk. Note that we never explicitly compute $B_k$.
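The identity can be checked numerically on a small example. The sketch below is only an illustration: it uses a hand-picked 2x2 symmetric positive definite matrix for B, its exact inverse for H, and arbitrary values for the gradient, step size, and iterate. The real code never forms B or H as matrices.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Illustrative 2x2 symmetric positive definite B and its exact inverse H.
B = [[2.0, 1.0],
     [1.0, 3.0]]
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
H = [[B[1][1] / det, -B[0][1] / det],
     [-B[1][0] / det, B[0][0] / det]]

g = [0.5, -1.5]   # stands in for the gradient g_{S_k}
t = 0.25          # step size
x = [1.0, 2.0]    # current iterate x_k

# Quasi-Newton step: x_{k+1} = x_k - t * H_k * g
Hg = matvec(H, g)
x_new = [xi - t * hi for xi, hi in zip(x, Hg)]

# s_k = x_{k+1} - x_k, then B_k s_k computed two ways.
s = [a - b for a, b in zip(x_new, x)]
Bs_explicit = matvec(B, s)          # needs B as a matrix
Bs_shortcut = [-t * gi for gi in g]  # only needs the gradient and step size

print("explicit:", Bs_explicit)
print("shortcut:", Bs_shortcut)
```

Both computations agree up to floating-point rounding, which is why the product can be obtained from the gradient alone.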

Please let me know if this helps clarify things!

Thanks for your prompt response! This solves my problem!