`standardize` creates implicit intercept even if `intercept=FALSE`

huisaddison opened this issue 2 years ago · comments

Summary:

standardize creates an implicit intercept through the centering step even if intercept=FALSE. This is because the standardization step subtracts the mean from each (non-intercept) column.
In least squares with an intercept, mean-centering the columns changes neither the fitted values nor the non-intercept coefficients; only the intercept coefficient changes. After fitting, the intercept term "on the original scale" can be recovered via this step.
When intercept=TRUE, an intercept term is formally included in the quantile lasso problem for each quantile level. This is different from the intercept that is introduced if standardize=TRUE, intercept=FALSE.
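
To make the least-squares claim concrete, here is a minimal base-R sketch. It uses `lm` rather than quantgen, and the simulated data and variable names are made up for illustration only. It fits the same model on raw and on mean-centered columns, then undoes the shift to recover the original-scale intercept.

```r
# Base-R illustration with least squares (not quantgen code).
set.seed(1)
n <- 100; p <- 3
X <- matrix(rnorm(n * p), ncol = p)
y <- drop(X %*% c(1, -2, 0.5)) + 3 + rnorm(n)

# Fit with an explicit intercept on the raw columns.
fit_raw <- lm(y ~ X)

# Fit with an explicit intercept on mean-centered columns.
Xc <- scale(X, center = TRUE, scale = FALSE)
fit_ctr <- lm(y ~ Xc)

slopes_raw <- unname(coef(fit_raw)[-1])
slopes_ctr <- unname(coef(fit_ctr)[-1])

# Centering leaves the slopes unchanged; only the intercept moves.
# The original-scale intercept is recovered by undoing the shift:
# y = b0c + (X - m) b = (b0c - m'b) + X b.
b0_recovered <- unname(coef(fit_ctr)[1]) - sum(colMeans(X) * slopes_ctr)
```

Here `slopes_raw` and `slopes_ctr` agree up to numerical tolerance, and `b0_recovered` matches the intercept of `fit_raw`.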

How to resolve?

Option 1: The documentation of quantile_lasso can be updated to clarify this distinction (and to fix the incorrect statement that when intercept=FALSE, the beta matrix will only have p rows).

Option 2: Decline to center the columns as part of the standardization step when intercept=FALSE. This is what glmnet does (see these lines). In glmnet,

intercept toggles mean centering, and standardize toggles unit variance scaling. Thus, if intercept=FALSE, standardize=TRUE, no mean centering is performed.
Then, in the actual fitting, intercept further toggles whether an intercept is actually fit (see these lines).
Therefore for glmnet there is no "implicit intercept" introduced when standardize=TRUE.
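
The glmnet behavior described above could be sketched roughly as follows. `preprocess` is a hypothetical helper written for this issue, not glmnet's code, and scaling by the sample standard deviation is an assumption of the sketch (glmnet's exact scaling formula may differ).

```r
# Hypothetical helper, not glmnet's implementation.
preprocess <- function(X, intercept = TRUE, standardize = TRUE) {
  p <- ncol(X)
  # Center only when an intercept will be fit.
  center <- if (intercept) colMeans(X) else rep(0, p)
  # Scale only when standardization is requested
  # (sample sd is an assumption of this sketch).
  scale_by <- if (standardize) apply(X, 2, sd) else rep(1, p)
  Xt <- sweep(sweep(X, 2, center, "-"), 2, scale_by, "/")
  list(X = Xt, center = center, scale = scale_by)
}

set.seed(1)
X <- matrix(rnorm(50 * 3, mean = 5), ncol = 3)
out <- preprocess(X, intercept = FALSE, standardize = TRUE)

# Columns are rescaled to unit sd but keep their nonzero means,
# so no constant term is introduced when mapping back.
colMeans(out$X)
```

With this split, `intercept=FALSE` never shifts the columns, which is what would avoid the implicit intercept.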

Explanation and MWE

Consider the example:

library(quantgen)
set.seed(1)
n = 100
p = 5
beta = rnorm(p)
beta0 = rnorm(1)
X = matrix(rnorm(p*n), ncol=p)
y = X%*%beta + beta0 + rnorm(n)

> qr1$beta
6 x 1 sparse Matrix of class "dgCMatrix"
     tau=0.5, lam=0
[1,]     0.09643698
[2,]    -0.69192820
[3,]     0.29832938
[4,]    -0.92821645
[5,]     1.55401742
[6,]     0.29139470
> qr2$beta
6 x 1 sparse Matrix of class "dgCMatrix"
     tau=0.5, lam=0
[1,]     -0.9557202
[2,]     -0.7180836
[3,]      0.1623037
[4,]     -0.9758548
[5,]      1.4705162
[6,]      0.3999571

Both qr1 and qr2 have p+1 coefficients, despite the ?quantile_lasso documentation, which states that,

[...]
Value:

     A list with the following components:

    beta: Matrix of lasso coefficients, of dimension = (number of
          features + 1) x (number of quantile levels) assuming
          ‘intercept=TRUE’, else (number of features) x (number of
          quantile levels). Note: these coefficients will always be on
          the appropriate scale; they are always on the scale of
          original features, even if ‘standardize=TRUE’ 
[...]

The "extra" coefficient that appears even when intercept=FALSE comes from the default standardize=TRUE: centering the columns adds an implicit intercept once the coefficients are mapped back to the original scale.
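
A rough sketch of where the extra row can come from, using least squares (`lm.fit`) as a stand-in for the quantile loss. The back-transform below is an assumption about how coefficients are returned "on the scale of original features"; it is consistent with the output above but has not been checked against the quantgen source. The data are simulated for illustration.

```r
# Least-squares stand-in for the quantile loss; the back-transform
# is an assumption, not quantgen's verified implementation.
set.seed(1)
n <- 100; p <- 5
X <- matrix(rnorm(n * p, mean = 2), ncol = p)
y <- drop(X %*% rnorm(p)) + rnorm(1) + rnorm(n)

mx <- colMeans(X)
sx <- apply(X, 2, sd)
Xs <- sweep(sweep(X, 2, mx, "-"), 2, sx, "/")

# No intercept column is passed to the solver: p coefficients are fit.
b_std <- unname(coef(lm.fit(Xs, y)))

# Mapping back to the original features:
# Xs %*% b_std == X %*% b - sum(mx * b),
# so a constant term appears even though none was fit.
b <- b_std / sx
b0_implicit <- -sum(mx * b)
beta_original <- c(b0_implicit, b)
length(beta_original)  # p + 1
```

The constant `b0_implicit` is fully determined by the slopes and the column means, so it is not a free parameter the way the intercept is when intercept=TRUE.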

If one turns off both intercept and standardize, then the model indeed has only p coefficients:

> quantgen::quantile_lasso(X, y, tau=0.5, lambda=0, intercept=FALSE, standardize=FALSE)$beta
5 x 1 sparse Matrix of class "dgCMatrix"
     tau=0.5, lam=0
[1,]     -0.8404342
[2,]      0.1153272
[3,]     -0.8874969
[4,]      1.4669617
[5,]      0.3537983