Memory footprints of ActiveSubspace and POD for large problems.

Question

tomoleary opened this issue 3 years ago · comments

The way these decompositions are currently computed (and the way data is generated) is memory intensive, since both require storing many vectors simultaneously.

Reducing the memory footprint for data generation is easy, and is already mostly fixed on the tom_dev branch.

Reducing the memory footprint for the active subspace decomposition requires refactoring all the way from the randomized eigensolver down. The strategy is to iterate over samples one by one: solve the forward problem, set the linearization point, and then apply the operator to the entire Gaussian random matrix for that sample, updating the target matrix in place and scaling accordingly for the averaging operation. The issue is that MatMVMult in hippylib performs a matrix-matrix product by applying a given operator to one vector after another, whereas we want the sample loop on the outside, with MatMVMult inside it. We therefore need a custom averagedMatMVMult function that applies the entire matrix-matrix action for one sample and then iterates over samples until the full averaged action is complete.
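The loop ordering described above can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the hippylib or tom_dev implementation: `averaged_mat_mv_mult`, `toy_sample_operator`, and the dense toy Jacobians are hypothetical stand-ins for the forward solve and linearization at each sample.

```python
import numpy as np


def averaged_mat_mv_mult(sample_operator, n_samples, Omega):
    """Accumulate Y = (1/N) * sum_i A_i @ Omega with the sample loop outermost.

    sample_operator(i) is a hypothetical callback that returns a matrix-free
    function x -> A_i @ x for sample i. Only one sample's operator is alive
    at any time; Y is updated in place.
    """
    Y = np.zeros_like(Omega, dtype=float)
    for i in range(n_samples):
        # Stand-in for: solve the forward problem, set the linearization point.
        apply_op = sample_operator(i)
        # Apply this sample's operator to every column of the random matrix.
        for j in range(Omega.shape[1]):
            Y[:, j] += apply_op(Omega[:, j]) / n_samples
    return Y


def toy_sample_operator(i, n_out=4, n_param=5):
    """Toy operator A_i = J_i^T J_i built from a small random dense J_i."""
    J = np.random.default_rng(100 + i).standard_normal((n_out, n_param))
    return lambda x: J.T @ (J @ x)


# Gaussian random test matrix with 3 columns in a 5-dimensional parameter space.
Omega = np.random.default_rng(0).standard_normal((5, 3))
Y = averaged_mat_mv_mult(toy_sample_operator, n_samples=6, Omega=Omega)
```

With this ordering the peak storage is one sample's state plus the two matrices `Omega` and `Y`, at the cost of re-doing the per-sample setup each time the averaged action is requested.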

Additionally, because the doublePass algorithm calls this operation twice, the samples must be exactly the same across successive calls to the operator. We also want the option of not saving the samples directly, so we need a deterministic sampling strategy (this requires a custom instance of Random with deterministic burn-ins). This raises a question about how parRandom behaves when it is on a communicator mesh grid.
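One way to get repeatable samples without storing them is to regenerate each sample from a seed. The sketch below uses NumPy's `default_rng` seeded per sample index, which is a swapped-in technique rather than the custom Random with burn-ins mentioned above. `draw_sample` and `streaming_pass` are hypothetical names, and the sketch does not address the parRandom question on a communicator mesh grid.

```python
import numpy as np


def draw_sample(base_seed, i, dim):
    """Regenerate sample i on demand from (base_seed, i); nothing is stored."""
    rng = np.random.default_rng([base_seed, i])
    return rng.standard_normal(dim)


def streaming_pass(base_seed, n_samples, dim):
    """Stream over the samples once, keeping only a running mean."""
    mean = np.zeros(dim)
    for i in range(n_samples):
        mean += draw_sample(base_seed, i, dim) / n_samples
    return mean


# Two successive passes (as in a double-pass algorithm) see identical samples.
first_pass = streaming_pass(base_seed=42, n_samples=8, dim=5)
second_pass = streaming_pass(base_seed=42, n_samples=8, dim=5)
```

Seeding by sample index also lets any single sample be reproduced in isolation, which a sequential burn-in scheme would only achieve by replaying the generator up to that point.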

tomoleary · Answer 1 · Sat Oct 09 2021 07:06:08 GMT+0800 (China Standard Time)

Resolved in October 8 PR.

tomoleary · Answer 2 · Wed Oct 13 2021 02:53:14 GMT+0800 (China Standard Time)

Resolved in October 8 PR.