yaglm is a modern, comprehensive and flexible Python package for fitting and tuning penalized generalized linear models and other supervised M-estimators. It supports a wide variety of losses (linear, logistic, quantile, etc.) combined with penalties and/or constraints. Beyond the basic lasso/ridge, yaglm supports structured sparsity penalties such as the nuclear norm and the group, exclusive, graph fused, and generalized lasso. It also supports the more accurate adaptive and non-convex (e.g. SCAD) flavors of these penalties, which typically come with strong statistical guarantees at limited additional computational expense.
Parameter tuning methods including cross-validation, generalized cross-validation, and information criteria (e.g. AIC, BIC, EBIC, HBIC) come built-in. BIC-like information criteria are important for analysts interested in model selection (Zhang et al., 2010). We also provide built-in linear regression noise variance estimators (Reid et al., 2016; Yu and Bien, 2019; Liu et al., 2020). Tuning parameter grids are automatically created from the data whenever possible.
yaglm comes with a computational backend based on FISTA with adaptive restarts, an augmented ADMM algorithm, cvxpy, and the LLA algorithm for non-convex penalties. Path algorithms and parallelization for fast tuning are supported. It is straightforward to supply your favorite, state-of-the-art optimization algorithm to the package.
yaglm follows an sklearn-compatible API, is highly customizable and was inspired by many existing packages including sklearn, lightning, statsmodels, pyglmnet, celer, andersoncd, picasso, tick, PyUNLocBoX, regreg, grpreg, ncvreg, and glmnet.
A manuscript describing this package and the broader GLM ecosystem can be found on arXiv.
Beware: this is a preliminary release of version 0.3.1. Not all features have been fully added and the package has not yet been rigorously tested.
yaglm can be installed via GitHub:
git clone https://github.com/yaglm/yaglm.git
python setup.py install
yaglm should feel a lot like sklearn -- particularly LassoCV. The major difference is that we make extensive use of config objects to specify the loss, penalty, penalty flavor, constraint, and solver.
from yaglm.toy_data import sample_sparse_lin_reg
from yaglm.GlmTuned import GlmCV, GlmTrainMetric
from yaglm.config.loss import Huber
from yaglm.config.penalty import Lasso, GroupLasso
from yaglm.config.flavor import Adaptive, NonConvex
from yaglm.metrics.info_criteria import InfoCriteria
from yaglm.infer.Inferencer import Inferencer
from yaglm.infer.lin_reg_noise_var import ViaRidge
# sample sparse linear regression data
X, y, _ = sample_sparse_lin_reg(n_samples=100, n_features=10)
# fit a lasso penalty tuned via cross-validation with the 1se rule
GlmCV(loss='lin_reg',
penalty=Lasso(), # specify penalty with config object
select_rule='1se'
).fit(X, y)
# fit an adaptive lasso tuned via cross-validation
# initialized with a lasso tuned with cross-validation
GlmCV(loss='lin_reg',
penalty=Lasso(flavor=Adaptive()),
initializer='default'
).fit(X, y)
# fit an adaptive lasso tuned via EBIC
# estimate the noise variance via a ridge-regression method
GlmTrainMetric(loss='lin_reg',
penalty=Lasso(flavor=Adaptive()),
inferencer=Inferencer(scale=ViaRidge()), # noise variance estimator
scorer=InfoCriteria(crit='ebic') # Info criteria
).fit(X, y)
# fit a huber loss with a group SCAD penalty
# both the huber knot parameter and the SCAD penalty parameter are tuned with CV
# the LLA algorithm is initialized with a group Lasso penalty tuned via cross-validation
groups = [range(5), range(5, 10)]
GlmCV(loss=Huber().tune(knot=range(1, 5)),
penalty=GroupLasso(groups=groups,
flavor=NonConvex()),
lla=True, # we use the LLA algorithm by default. If lla=False, we would use FISTA
).fit(X, y)
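Constraints are specified in the same way as the other config objects, via the constraint argument. Here is a minimal sketch of a non-negative lasso; note that the Positive config class and its yaglm.config.constraint location are assumptions made for illustration, so check the constraint config documentation for the actual names.
from yaglm.config.constraint import Positive  # assumed module/class name
# fit a non-negative lasso: a lasso penalty plus a positivity constraint on the coefficients
GlmCV(loss='lin_reg',
      penalty=Lasso(),
      constraint=Positive()
      ).fit(X, y)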
We can use the basic penalties as building blocks to create new ones, e.g. via overlapping or separable sums of penalties. For example, we might want to penalize some features while leaving others unpenalized.
from yaglm.config.penalty import OverlappingSum, SeparableSum, \
FusedLasso, NoPenalty
from yaglm.pen_seq import get_sequence_decr_max
# Sometimes we want to put different penalties on different sets of features
# this can be accomplished with the SeparableSum() class
groups = {'no_pen': range(5), # don't penalize the first 5 features!
'sparse': range(5, 10)
}
est = GlmCV(penalty=SeparableSum(groups=groups,
no_pen=NoPenalty(),
sparse=Lasso(flavor=NonConvex())
)
).fit(X, y)
# Fit an adaptive sparse-fused lasso using the OverlappingSum() class
# note we have to manually specify the tuning sequence for the fused lasso
pen_val_seq = get_sequence_decr_max(max_val=1, num=10)
fused_config = FusedLasso(flavor=Adaptive()).tune(pen_val_seq=pen_val_seq)
est = GlmCV(penalty=OverlappingSum(fused=fused_config,
sparse=Lasso(flavor=Adaptive())
)
).fit(X, y)
You can employ your favorite state-of-the-art optimization algorithm by wrapping it in a solver config object. These objects can also be used to specify optimization parameters (e.g. the maximum number of iterations).
from yaglm.solver.FISTA import FISTA # or your own solver!
# supply your favorite optimization algorithm!
solver = FISTA(max_iter=100) # specify optimization parameters in the solver's init
GlmCV(loss='lin_reg', penalty='lasso', solver=solver).fit(X, y)
See the docs/ folder for additional examples in Jupyter notebooks (if they don't load on GitHub, try nbviewer.jupyter.org/).
Additional documentation, examples and code revisions are coming soon. For questions, issues or feature requests please reach out to Iain: idc9@cornell.edu.
We welcome contributions to make this a stronger package: data examples, bug fixes, spelling corrections, new features, etc.
If you use this package please cite our arXiv manuscript:
@article{carmichael2021yaglm,
title={yaglm: a Python package for fitting and tuning generalized linear models that supports structured, adaptive and non-convex penalties},
author={Carmichael, Iain and Keefe, Thomas and Giertych, Naomi and Williams, Jonathan P},
journal={arXiv preprint arXiv:2110.05567},
year={2021}
}
Some of yaglm's solvers wrap solvers implemented by other software packages. We kindly ask that you also cite these underlying packages if you use their solver (see the solver config documentation).
Zou, H., 2006. The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), pp.1418-1429.
Zou, H. and Li, R., 2008. One-step sparse estimates in nonconcave penalized likelihood models. Annals of Statistics, 36(4), p.1509.
Beck, A. and Teboulle, M., 2009. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1), pp.183-202.
Zhang, Y., Li, R. and Tsai, C.L., 2010. Regularization parameter selections via generalized information criterion. Journal of the American Statistical Association, 105(489), pp.312-323.
Fan, J., Xue, L. and Zou, H., 2014. Strong oracle optimality of folded concave penalized estimation. Annals of Statistics, 42(3), p.819.
Loh, P.L. and Wainwright, M.J., 2017. Support recovery without incoherence: A case for nonconvex regularization. The Annals of Statistics, 45(6), pp.2455-2482.
Reid, S., Tibshirani, R. and Friedman, J., 2016. A study of error variance estimation in lasso regression. Statistica Sinica, pp.35-67.
Zhu, Y., 2017. An augmented ADMM algorithm with application to the generalized lasso problem. Journal of Computational and Graphical Statistics, 26(1), pp.195-204.
Yu, G. and Bien, J., 2019. Estimating the error variance in a high-dimensional linear model. Biometrika, 106(3), pp.533-546.
Liu, X., Zheng, S. and Feng, X., 2020. Estimation of error variance via ridge regression. Biometrika, 107(2), pp.481-488.
Carmichael, I., Keefe, T., Giertych, N. and Williams, J.P., 2021. yaglm: a Python package for fitting and tuning generalized linear models that supports structured, adaptive and non-convex penalties. arXiv preprint arXiv:2110.05567.