celer is a fast solver for Lasso-like problems, based on dual extrapolation. The package currently handles the following problems: Lasso, sparse logistic regression, Group Lasso and Multitask Lasso. The estimators follow the scikit-learn API, come with automated cross-validation, and support sparse and dense data with feature centering and normalization. The solvers scale to large problems with millions of features.
Please visit https://mathurinm.github.io/celer/ for the latest version of the documentation.
Assuming you have a working Python environment, e.g. with Anaconda, you can install celer with pip.
From a console or terminal, run:
pip install -U celer
To install from source instead, clone the repository and install celer in editable mode from a console or terminal:
git clone https://github.com/mathurinm/celer.git
cd celer/
pip install -e .
To build the documentation you will need to run:
pip install -U sphinx_gallery sphinx_bootstrap_theme
cd doc/
make html
The documentation includes examples on the Leukemia dataset (comparison with scikit-learn) and on the Finance/log1p dataset (a more significant benchmark, but it takes time to download the data, preprocess it, and compute the path).
All dependencies are listed in the ./setup.py file.
If you use this code, please cite:
@InProceedings{pmlr-v80-massias18a,
  title = {Celer: a Fast Solver for the Lasso with Dual Extrapolation},
  author = {Massias, Mathurin and Gramfort, Alexandre and Salmon, Joseph},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages = {3321--3330},
  year = {2018},
  volume = {80},
}

@article{massias2019dual,
  title = {Dual Extrapolation for Sparse Generalized Linear Models},
  author = {Massias, Mathurin and Vaiter, Samuel and Gramfort, Alexandre and Salmon, Joseph},
  journal = {arXiv preprint arXiv:1907.05830},
  year = {2019}
}
ArXiv links: