`standardize` creates implicit intercept even if `intercept=FALSE`
huisaddison opened this issue · comments
Summary:
- `standardize` creates an implicit intercept through the centering step even if `intercept=FALSE`. This is because the standardization step removes the means of each (non-intercept) column.
  - In least squares, mean-centering does not affect the final fit (nor the non-intercept coefficients), only the intercept coefficient. After fitting, the intercept term "on the original scale" can be recovered via this step.
- When `intercept=TRUE`, an intercept term is formally included in the quantile lasso problem for each quantile level. This is different from the intercept that is introduced if `standardize=TRUE, intercept=FALSE`.
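The least-squares claim in the summary can be checked numerically. The following is a minimal base-R sketch using `lm` on simulated data (the data and all variable names are made up for illustration and are not taken from quantgen): it fits with an intercept on raw and on centered columns, then recovers the original-scale intercept from the centered fit.

```r
set.seed(1)
n <- 100; p <- 5
X <- matrix(rnorm(n * p, mean = 2), ncol = p)   # columns with nonzero means
y <- drop(X %*% rnorm(p)) - 1 + rnorm(n)

# Least squares with an intercept, on the original columns.
fit_raw <- lm(y ~ X)

# Same model after subtracting each column's mean.
xbar <- colMeans(X)
Xc <- sweep(X, 2, xbar)
fit_ctr <- lm(y ~ Xc)

b_raw <- unname(coef(fit_raw))
b_ctr <- unname(coef(fit_ctr))

# Slopes and fitted values are unchanged by centering.
slopes_equal <- isTRUE(all.equal(b_raw[-1], b_ctr[-1]))
fits_equal <- isTRUE(all.equal(unname(fitted(fit_raw)), unname(fitted(fit_ctr))))

# Only the intercept differs; the original-scale intercept is recovered as
# (centered intercept) - sum(column mean * slope).
recovered <- b_ctr[1] - sum(xbar * b_ctr[-1])
intercept_recovered <- isTRUE(all.equal(b_raw[1], recovered))
```

The same recovery formula, applied when no intercept was fit at all, is what produces a nonzero constant term after centering.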
How to resolve?
Option 1: The documentation of `quantile_lasso` can be updated to clarify this distinction (and to fix the incorrect statement that when `intercept=FALSE`, the `beta` matrix will only have `p` rows).
Option 2: Decline to center the columns as part of the standardization step when `intercept=FALSE`. This is what glmnet does (see these lines).
- In glmnet, `intercept` toggles mean centering, and `standardize` toggles unit variance scaling. Thus, if `intercept=FALSE, standardize=TRUE`, no mean centering is performed.
- Then, in the actual fitting, `intercept` further toggles whether an intercept is actually fit (see these lines).
- Therefore for glmnet there is no "implicit intercept" introduced when `standardize=TRUE`.
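As a rough illustration of Option 2, here is a sketch of a preprocessing step in which `intercept` controls centering and `standardize` controls scaling. `prep_columns` is a hypothetical helper written for this example. It is not code from quantgen or glmnet, and the use of the sample standard deviation as the scale is an assumption.

```r
# Hypothetical helper, not part of quantgen or glmnet.
prep_columns <- function(X, intercept = TRUE, standardize = TRUE) {
  p <- ncol(X)
  # Center only when an intercept will actually be fit.
  center <- if (intercept) colMeans(X) else rep(0, p)
  # Scale only when standardization is requested (sample sd is an assumption).
  scale <- if (standardize) apply(X, 2, sd) else rep(1, p)
  Xs <- sweep(sweep(X, 2, center, "-"), 2, scale, "/")
  list(X = Xs, center = center, scale = scale)
}

set.seed(1)
X <- matrix(rnorm(100 * 5, mean = 2), ncol = 5)
out <- prep_columns(X, intercept = FALSE, standardize = TRUE)

# Mapping coefficients b (fit on out$X without an intercept) back to the
# original scale gives slopes b / out$scale and a constant term
# -sum(out$center * b / out$scale), which is 0 here because nothing was centered.
b <- rnorm(5)  # stand-in coefficients, for illustration only
implied_const <- -sum(out$center * b / out$scale)
```

With this arrangement, `intercept=FALSE, standardize=TRUE` rescales the columns but leaves their means alone, so un-standardizing the coefficients cannot introduce a constant term.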
Explanation and MWE
Consider the example:
library(quantgen)
set.seed(1)
n = 100
p = 5
beta = rnorm(p)
beta0 = rnorm(1)
X = matrix(rnorm(p*n), ncol=p)
X = matrix(rnorm(p*n), ncol=p)
y = X%*%beta + beta0 + rnorm(n)
> qr1$beta
6 x 1 sparse Matrix of class "dgCMatrix"
tau=0.5, lam=0
[1,] 0.09643698
[2,] -0.69192820
[3,] 0.29832938
[4,] -0.92821645
[5,] 1.55401742
[6,] 0.29139470
> qr2$beta
6 x 1 sparse Matrix of class "dgCMatrix"
tau=0.5, lam=0
[1,] -0.9557202
[2,] -0.7180836
[3,] 0.1623037
[4,] -0.9758548
[5,] 1.4705162
[6,] 0.3999571
Both `qr1` and `qr2` have `p+1` coefficients, despite the `?quantile_lasso` documentation, which states that,
[...]
Value:
A list with the following components:
beta: Matrix of lasso coefficients, of dimension = (number of
features + 1) x (number of quantile levels) assuming
‘intercept=TRUE’, else (number of features) x (number of
quantile levels). Note: these coefficients will always be on
the appropriate scale; they are always on the scale of
original features, even if ‘standardize=TRUE’
[...]
The "extra" coefficient that appears even when `intercept=FALSE` arises from the default argument `standardize=TRUE`, which adds an implicit intercept.
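To spell out where that constant comes from, assume (this is not confirmed against the quantgen source) that each column is transformed as $\tilde{x}_j = (x_j - m_j)/s_j$, where $m_j$ and $s_j$ are the column mean and scale, and that coefficients $\tilde{\beta}_j$ are fit without an intercept and then mapped back to the original scale. The fitted value expands as:

$$
\sum_{j=1}^{p} \tilde{\beta}_j \, \frac{x_j - m_j}{s_j}
= \underbrace{-\sum_{j=1}^{p} \frac{m_j}{s_j}\,\tilde{\beta}_j}_{\beta_0}
+ \sum_{j=1}^{p} \underbrace{\frac{\tilde{\beta}_j}{s_j}}_{\beta_j} \, x_j
$$

Under that assumption, the original-scale model has slopes $\beta_j = \tilde{\beta}_j / s_j$ and a constant $\beta_0 = -\sum_j m_j \beta_j$, which is generally nonzero whenever the column means are nonzero. That constant would account for the additional row in `beta`.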
If one turns off both `intercept` and `standardize`, then the model indeed has only `p` coefficients:
> quantgen::quantile_lasso(X, y, tau=0.5, lambda=0, intercept=FALSE, standardize=FALSE)$beta
5 x 1 sparse Matrix of class "dgCMatrix"
tau=0.5, lam=0
[1,] -0.8404342
[2,] 0.1153272
[3,] -0.8874969
[4,] 1.4669617
[5,] 0.3537983