cvxgrp / CVXR

An R modeling language for convex optimization problems.

Home Page:https://cvxr.rbind.io/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Inaccurate timing results

dbackenroth opened this issue · comments

Thank you for developing the great CVXR package. The timing stats provided by CVXR seem to be drastically underestimating the amount of time required for problem solution. 

Here is my script:

library(CVXR)
library(MASS)

Test <- function(n) {
  n_ds <- 1
  r_sq <- runif(n_ds, min = 0.05, max = 0.15)
  prev <- sample(c(0.005, 0.01, 0.05, 0.1), size = n_ds, replace = T)
  C <- GetCostMatrix(prev = prev, r_sq = r_sq, n)
  P <- GetPMatrix(C = C, m = round(n * 2/3))
}

GetPMatrix <- function(C, m) {
  n <- nrow(C)
  P <- Variable(rows = n, cols = n, name = "P", symmetric = T)
  ones <- matrix(rep(1, n))
  constraints <- list(
    P <= 1,
    0 <= P,
    P %*% ones == m * ones
  )
  objective <- Minimize(sum(C * P))
  problem <- Problem(objective, constraints = constraints)
  a <- Sys.time()
  result <- psolve(problem, solver = "ECOS", verbose = F)
  print("Sys.time timing:")
  print(Sys.time() - a)
  print("psolve timing:")
  print(result$solve_time + result$setup_time)
  P_res <- result$getValue(P)
  return(P_res)
}

GetCostMatrix <- function(prev, r_sq, n) {
  n_ds <- length(prev)
  z_k <- qnorm(prev, lower.tail = F)
  scores <- mvrnorm(n = n, mu = rep(0, n_ds), Sigma = diag(r_sq, nrow = length(r_sq)))
  grid <- expand.grid(m = 1:n, w = 1:n)
  mean_scores <- (scores[grid$m, , drop = F] + scores[grid$w, , drop = F]) / 2
  # 1 - Phi((z_k - c) / sqrt(1 - r^2_ps / 2))
  
  p_d <- matrix(NA, nrow = nrow(grid), ncol = n_ds)
  for (i in 1:n_ds) {
    p_d[, i] <- 1 - pnorm((z_k[i] - mean_scores[, i, drop = F]) / sqrt(1 - r_sq[i] / 2))
  }
  p_no_d <- apply(1 - p_d, 1, prod)
  no_d_matrix <- matrix(p_no_d, nrow = n, ncol = n, byrow = F)
}

I print out the time using both Sys.time() and also using the solve_time and setup_time variables returned by CVXR::psolve. It appears that time is both much higher with Sys.time() than with solve_time and setup_time, and it also increases at a much faster rate with problem size (# rows in symmetric matrix which we want to optimize) as measured by Sys.time() then by using solve_time and setup_time. 

Is there some other operation done by CVXR::psolve other than solve and setup, and is there anything that can reduce how long that operation takes?

Test(100)
[1] "Sys.time timing:"
Time difference of 1.458917 secs
[1] "psolve timing:"
[1] 0.05601097
Test(200)
[1] "Sys.time timing:"
Time difference of 16.15535 secs
[1] "psolve timing:"
[1] 0.3218943
Test(400)
[1] "Sys.time timing:"
Time difference of 3.830424 mins
[1] "psolve timing:"
[1] 1.831462

This is a valid point. The timings returned by the psolve refer to the solver setup and solver solution times. It does not tell anything about CVXR setup time etc. which can substantial as that is pure R code.