OptimoJoe / Optizelle

Optizelle [op-tuh-zel] is an open source software library designed to solve general purpose nonlinear optimization problems.

Home Page: www.optimojoe.com/products/optizelle

Reduce the number of Krylov iterations spent on trust-region algorithms on a rejected step

josyoun opened this issue · comments

When a trust-region method rejects a step, we stay at the current point, reduce the size of the trust region, and then recompute the Krylov solve. Unless the trust region starts cutting into the Cauchy point, it turns out that many of these iterations are exactly the same as they were before. Now, for small problems, this is wasted compute time, but not a big deal. However, for something like a reduced-space method for parameter estimation, this is two forward solves and two adjoint solves per iteration. For Gauss-Newton, this is one forward solve and one adjoint solve per iteration. For PDE solves, this is very expensive and a complete waste.
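To make the waste concrete, here's a minimal sketch of the rejection loop described above. A crude subproblem solver (steepest descent scaled to the trust-region boundary) stands in for the truncated Krylov method, and `n_solves` counts the repeated subproblem work. All names here are hypothetical, not Optizelle's actual API.

```python
import numpy as np

def trust_region_solve_once(f, grad, hess, x, delta, eta=0.1):
    """Run one outer trust-region iteration until a step is accepted.
    Each rejected step shrinks the radius and re-runs the subproblem
    solve from scratch; n_solves counts that repeated work."""
    n_solves = 0
    while True:
        g, H = grad(x), hess(x)
        # Crude stand-in for truncated CG: steepest descent to boundary
        p = -(delta / np.linalg.norm(g)) * g
        n_solves += 1
        pred = -(g @ p + 0.5 * p @ H @ p)   # model-predicted reduction
        ared = f(x) - f(x + p)              # actual reduction
        if pred > 0 and ared / pred >= eta:
            return x + p, delta, n_solves   # accept the step
        delta *= 0.5                        # reject: shrink, re-solve
```

On f(x) = x² starting from x = 1 with an oversized radius delta = 10, this loop rejects and fully re-solves three times before accepting; a real truncated-CG solve would be repeating most of its iterations on each of those rejections.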

In order to fix this, we could just checkpoint the Krylov solve. This would save compute time at the cost of memory. In theory, we could make this work, but I'm not all that keen on trying to maintain restart infrastructure in the Krylov methods.

As an alternative, I'm pretty sure that adding a dogleg safeguard would fix the problem. Basically, we compute the Cauchy point and then we compute the result from the truncated Krylov method. Then, we do a dogleg step from the truncated-Krylov step back to the Cauchy point. Due to how truncated CG works, the dogleg path is monotonically increasing in norm and the model is monotonically nonincreasing along it. As such, we can just run a standard dogleg algorithm on it. This means that we could cut back the computed step in a much more efficient manner, since we'd avoid new Krylov iterations. Eventually, we'd retreat to the Cauchy point, which means that the convergence results would still hold. In addition, this shouldn't interfere with our high-order convergence results since, in theory, the trust region wouldn't be active when we're close to the solution.

In any case, this is pretty easy to do for the unconstrained algorithms. For the composite-step SQP algorithm, it's trickier since we may want to cut back the normal step and tangential step independently. I'm pretty sure that's workable, but it requires some thought.
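The cutback itself is cheap: since the step norm grows monotonically along the segment from the Cauchy point to the truncated-Krylov step, finding where it crosses the shrunken radius is just a scalar quadratic. A minimal sketch, assuming we've cached both points and that ||p_cauchy|| ≤ delta ≤ ||p_tcg|| (the function name is mine, not Optizelle's):

```python
import numpy as np

def dogleg_cutback(p_cauchy, p_tcg, delta):
    """Return the point on the segment from the Cauchy point to the
    truncated-CG step whose norm equals the trust-region radius delta.
    No new Krylov iterations are needed; only a few dot products."""
    d = p_tcg - p_cauchy
    a = d @ d
    b = p_cauchy @ d
    c = p_cauchy @ p_cauchy - delta**2
    # Positive root of a*tau^2 + 2*b*tau + c = 0, i.e. where
    # ||p_cauchy + tau*d||^2 = delta^2 for tau in [0, 1]
    tau = (-b + np.sqrt(b * b - a * c)) / a
    return p_cauchy + tau * d
```

As delta shrinks toward ||p_cauchy||, tau goes to zero and the cut-back step collapses onto the Cauchy point, which is why the standard convergence results still apply.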

As another note, I still think the dogleg approach would work, but not for problems with equality constraints. Unless we skip the quasinormal step, the tangential subproblem will occur at a different location once the trust region shrinks, which necessitates a new computation.

I added the dogleg approach with commit aa49e44. Overall, I think it works well. That said, I don't think the performance is vastly different from what we had before. Likely, it'll depend on the problem.