mathurinm / celer

Fast solver for L1-type problems: Lasso, sparse Logistic regression, Group Lasso, weighted Lasso, Multitask Lasso, etc.

Home Page: https://mathurinm.github.io/celer/

Repository from GitHub: https://github.com/mathurinm/celer

celer

celer is a Python package that solves Lasso-like problems and provides estimators that follow the scikit-learn API. Thanks to a tailored implementation, celer provides a fast solver that tackles large-scale datasets with millions of features up to 100 times faster than scikit-learn.
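
For readers new to the Lasso, here is a minimal pure-Python sketch of the problem being solved: minimize 1/(2n) * ||y - Xw||^2 + alpha * ||w||_1, here by plain cyclic coordinate descent with soft-thresholding. This is an illustration only and not celer's implementation, which relies on the techniques described in the papers cited below. The function names `soft_threshold`, `lasso_objective` and `lasso_cd` are hypothetical helpers, and the tiny dataset is made up for the example.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t (proximal operator of t * |.|)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_objective(X, y, w, alpha):
    """Value of 1/(2n) * ||y - Xw||^2 + alpha * ||w||_1."""
    n = len(y)
    residuals = [y[i] - sum(X[i][j] * w[j] for j in range(len(w))) for i in range(n)]
    return sum(r * r for r in residuals) / (2 * n) + alpha * sum(abs(v) for v in w)


def lasso_cd(X, y, alpha, n_iter=100):
    """Cyclic coordinate descent for the Lasso (illustrative, unoptimized)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residuals = list(y)  # equals y - Xw, since w starts at zero
    col_sq_norms = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq_norms[j] == 0.0:
                continue  # an all-zero feature never enters the model
            old = w[j]
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(X[i][j] * (residuals[i] + X[i][j] * old) for i in range(n)) / n
            w[j] = soft_threshold(rho, alpha) / col_sq_norms[j]
            if w[j] != old:
                for i in range(n):
                    residuals[i] -= X[i][j] * (w[j] - old)
    return w


# Made-up data: y depends strongly on feature 0 and only weakly on feature 1.
X = [[1, 1, 1], [1, -1, 1], [-1, 1, 1], [-1, -1, 1]]
y = [2.05, 1.95, -1.95, -2.05]
w = lasso_cd(X, y, alpha=0.1)
print([round(v, 6) for v in w])  # approximately [1.9, 0.0, 0.0]
```

The L1 penalty sets the weak coefficients exactly to zero and shrinks the strong one (from 2 to about 1.9), which is the sparsity-inducing behavior that Lasso-like estimators share.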

Currently, the package handles the following problems:

Problem Support Weights Native cross-validation
Lasso
ElasticNet
Group Lasso
Multitask Lasso
Sparse Logistic regression

If you are interested in other models, such as non-convex penalties (SCAD, MCP), sparse group Lasso, group logistic regression, Poisson regression, or Tweedie regression, have a look at our companion package skglm.

Cite

celer is licensed under the BSD 3-Clause license, so you are free to use it. If you do so, please cite:

@InProceedings{pmlr-v80-massias18a,
  title     = {Celer: a Fast Solver for the Lasso with Dual Extrapolation},
  author    = {Massias, Mathurin and Gramfort, Alexandre and Salmon, Joseph},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {3321--3330},
  year      = {2018},
  volume    = {80},
}

@article{massias2020dual,
  author  = {Mathurin Massias and Samuel Vaiter and Alexandre Gramfort and Joseph Salmon},
  title   = {Dual Extrapolation for Sparse GLMs},
  journal = {Journal of Machine Learning Research},
  year    = {2020},
  volume  = {21},
  number  = {234},
  pages   = {1-33},
  url     = {http://jmlr.org/papers/v21/19-587.html}
}

Why celer?

celer is designed specifically for Lasso-like problems, which is what makes it fast on them. In particular, it comes with tools such as:

  • automated parallel cross-validation
  • support of sparse and dense data
  • optional feature centering and normalization
  • unpenalized intercept fitting

celer also provides easy-to-use estimators, as it follows the scikit-learn API.

Get started

To get started, install celer via pip:

pip install -U celer

In your Python console, run the following commands to fit a Lasso estimator on a toy dataset.

>>> from celer import Lasso
>>> from celer.datasets import make_correlated_data
>>> X, y, _ = make_correlated_data(n_samples=100, n_features=1000)
>>> estimator = Lasso()
>>> estimator.fit(X, y)

This is just a starter example. Make sure to browse the celer documentation to learn more about its features. To get familiar with the celer API, you can also explore the gallery of examples, which includes examples on real-life datasets as well as timing comparisons with other solvers.

Contribute to celer

celer is an open-source project and hence relies on community efforts to evolve. Your contribution is highly valuable and can come in three forms:

  • bug report: you may encounter a bug while using celer. Don't hesitate to report it in the issue section.
  • feature request: you may want to extend celer or add new features to it. You can use the issue section to make suggestions.
  • pull request: you may have fixed a bug or enhanced the documentation. You can submit a pull request and we will respond as soon as possible.

For the last form of contribution, here are the steps to help you set up celer on your local machine:

  1. Fork the repository, then run the following command to clone it on your local machine
git clone https://github.com/{YOUR_GITHUB_USERNAME}/celer.git
  2. cd to the celer directory and install it in editable mode by running
cd celer
pip install -e .
  3. To run the gallery examples and build the documentation, run the following
cd doc
pip install -e .[doc]
make html

Further links
