GjjvdBurg / SparseStep

SparseStep: Approximating the Counting Norm for Sparse Regularization

Home Page: https://arxiv.org/abs/1701.06967

Really great code, but do you have it coded in Python?

Sandy4321 opened this issue

Really great code, but do you have it coded in Python?

Also, will it work for big data, such as the us-used-cars-dataset (9 GB, 3 million rows, 66 features, target: price)?
https://www.kaggle.com/ananaymital/us-used-cars-dataset

Hi @Sandy4321,

Thanks for your kind words and your question. I don't have an equivalent package in Python, but the core algorithm is not too complex so perhaps you could consider coding it up yourself.

Regarding the dataset: in terms of features this shouldn't be a problem, but you'll likely run into memory issues due to the large number of rows. Perhaps you could consider separating the data into chunks and creating an ensemble model?

"Perhaps you could consider separating the data into chunks and creating an ensemble model?"

Great idea, thanks.
Could you please share a link to Python code that does this, for any ML algorithm, even plain regression or a random forest?
How do I divide the data into chunks and create an ensemble model?
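
As a rough illustration of the chunk-and-ensemble idea, here is a minimal sketch in Python. It is not SparseStep: ordinary least squares via NumPy stands in for the per-chunk model, and the function names `fit_chunked_ensemble` and `predict` as well as the synthetic data are made up for this example. Each chunk of rows gets its own model, and the ensemble averages the fitted coefficients, which for linear models gives the same predictions as averaging the per-chunk predictions.

```python
import numpy as np


def fit_chunked_ensemble(X, y, n_chunks):
    """Fit ordinary least squares on each row chunk and average the coefficients.

    Returns a vector of length n_features + 1; the last entry is the intercept.
    """
    coefs = []
    for X_chunk, y_chunk in zip(np.array_split(X, n_chunks),
                                np.array_split(y, n_chunks)):
        # Append a column of ones so every chunk model has its own intercept.
        A = np.column_stack([X_chunk, np.ones(len(X_chunk))])
        beta, *_ = np.linalg.lstsq(A, y_chunk, rcond=None)
        coefs.append(beta)
    # Averaging coefficients of linear models equals averaging their predictions.
    return np.mean(coefs, axis=0)


def predict(beta, X):
    """Predict with the averaged coefficients (intercept stored last)."""
    return X @ beta[:-1] + beta[-1]


# Synthetic data (an assumption for illustration, not the used-cars dataset).
rng = np.random.default_rng(0)
n_rows, n_features = 20_000, 5
true_beta = np.array([2.0, 0.0, -1.0, 0.0, 0.5])
X = rng.normal(size=(n_rows, n_features))
y = X @ true_beta + 3.0 + rng.normal(scale=0.1, size=n_rows)

beta_hat = fit_chunked_ensemble(X, y, n_chunks=10)
y_hat = predict(beta_hat, X)
```

For data that does not fit in memory, the same loop could consume chunks read from disk one at a time, for example with the `chunksize` argument of `pandas.read_csv`, so that only one chunk is held in memory at once. Shuffling the rows before chunking helps keep each chunk representative of the whole dataset.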

I see what you mean, but imagine that rows belonging to the same logical group end up spread across different chunks.
Then we have many weak learners.
It is like expecting many people with no musical training, even a million of them, to compose music like one Mozart.

For example:
https://thenewstack.io/the-big-data-debate-batch-processing-vs-streaming-processing/

I was thinking of something like warm_start, which reuses the solution of the previous call to fit as initialization:

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html#sklearn.linear_model.ElasticNet

I'm sorry, but I don't think this is related to SparseStep anymore. For general advice on fitting a machine learning model, please ask on places such as Cross Validated.

It is about SparseStep: how do I use SparseStep with big data?

This works for me, no ensemble necessary:

> # 3 million rows and 66 features, matching the dataset size discussed above
> X <- matrix(rnorm(3e6 * 66), nrow = 3e6, ncol = 66)
> y <- rnorm(3e6)
> library(sparsestep)
> fit <- path.sparsestep(X, y)