styvon / cPCG

R package for solving systems of linear equations using the (preconditioned) conjugate gradient algorithm, with improved efficiency via the C++ Armadillo linear algebra library and flexibility for user-specified preconditioning methods.

Allow initial coefficient values (vector x) to be passed as an argument?

tomwenseleers opened this issue · comments

I was just wondering whether it would be a lot of trouble to allow the initial coefficient estimates x to be passed as an argument in cgsolve, pcgsolve and the two sparse versions, with rep(0, ncol(A)) as the default. It's just that I am dealing with an application in which I already have reasonable initial estimates available, so I would expect a speedup if I started from there...

Hi Tom, thanks for the suggestion! Will work on this one in the next release.

Many thanks for that! I just tried adding it, and the timings did not go down quite as much as I had hoped (they only decreased by 10-20% for problem sizes of n=200, p=100 and n=2000, p=1000) - i.e. even if I passed a pre-calculated solution as initial values, it would still take more time than I expected to return an updated solution. Would you happen to know where the bottleneck could be? I had thought it would then return a solution near-instantaneously, but maybe I'm missing something... Or is there anything else I could pass between two subsequent cgsolve fits, aside from the coefficient starting values?

PS: Checking further, it seems the bottleneck was that I was using cgsolve to solve a ridge regression, via

beta_cgsolve <- cgsolve(A = crossprod(X) + lambda * diag(ncol(X)), b = crossprod(X, y),
                        start = beta_solve, tol = 1e-6, maxIter = 1000)

and the main bottleneck, of course, is then forming A = crossprod(X) + lambda*diag(ncol(X)) itself. I am now trying Eigen::LeastSquaresConjugateGradient in Rcpp together with solveWithGuess to pass initial values (https://eigen.tuxfamily.org/dox/classEigen_1_1LeastSquaresConjugateGradient.html), which has the advantage that it doesn't require forming crossprod(X) - I just need to row-augment my covariate matrix X with sqrt(lambda)*diag(ncol(X)) first to solve the ridge regression problem...

What does your symmetric positive-definite matrix A look like? If it has a large condition number, the algorithm might take a long time to converge. Have you tried using preconditioners other than Jacobi?

The condition number is 29.52542, so that should be OK, right? It's not sparse, and it looks like this:
(image: Rplot)

cgsolve doesn't have any option to specify other preconditioners, right? For pcgsolve, Jacobi was the fastest for my problem...