glm-tools / pyglmnet

Python implementation of elastic-net regularized generalized linear models

Home Page: http://glm-tools.github.io/pyglmnet/

Repository from GitHub: https://github.com/glm-tools/pyglmnet

ValueError: group should be (n_features,)

duemig opened this issue

image

I don't get this error.

It doesn't happen for me. Can you provide a full script to reproduce it instead of a screenshot? Here is what I tried:

import numpy as np
from pyglmnet import GLM

group_ids = np.random.random(36)
X_train_trans = np.random.random((42603, 36))
y_train = np.random.random(42603)

glm = GLM(distr="gaussian", group=group_ids, alpha=0.05, reg_lambda=0.2, max_iter=1000)
glm.fit(X=X_train_trans, y=y_train)

I found it

image

now it works.

It is due to the datatype (np.float32 vs np.float64)

Could you fix that?

Can I use sklearn's GridSearchCV to determine the parameters?

Thanks

Best,
David
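A possible workaround for the dtype observation above, in case float32 inputs really are the trigger (this thread does not confirm it): cast everything to float64 before building and fitting the model. This is only a sketch; `as_float64` is a hypothetical helper, not part of pyglmnet, and the random arrays are placeholders.

```python
import numpy as np


def as_float64(*arrays):
    """Return each input as a float64 NumPy array (hypothetical helper)."""
    return tuple(np.asarray(a, dtype=np.float64) for a in arrays)


# Placeholder data stored as float32, mimicking the reported situation.
rng = np.random.default_rng(0)
X32 = rng.random((100, 36)).astype(np.float32)
y32 = rng.random(100).astype(np.float32)

# Cast once, then pass X64 and y64 to glm.fit(...) instead of the originals.
X64, y64 = as_float64(X32, y32)
print(X64.dtype, y64.dtype)
```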

Can you modify my script to show me how I can make it fail? It works for me whether I use np.float32 or np.float64.

Yes, GridSearchCV used to work, but I am not sure whether it works with the latest version of sklearn.

import numpy as np
from pyglmnet import GLM

group_ids = np.float32(np.random.random(36))
X_train_trans = np.random.random((42603, 36))
y_train = np.random.random(42603)

glm = GLM(distr="gaussian", group=np.float32(group_ids), alpha=0.05, reg_lambda=0.2, max_iter=1000)
glm.fit(X=np.float32(X_train_trans), y=np.float32(y_train))

image

But with the sklearn GridSearchCV as well? So not GLMCV?

Can I use the package as a group lasso for penalizing the betas of a cubic spline representation?

Is there already an open issue for the following?

image

Or am I doing something wrong?

If I install pyglmnet, I get version 1.0.0.
image

image
image

Does not seem to work ;(

You need to use the development version for this. Unfortunately, a new release has been overdue for a long time. Can you try the development version in the meantime?

But with the sklearn GridSearchCV as well? So not GLMCV?

You can use both, depending on your application.
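For reference, the sklearn route looks roughly like this. The sketch uses sklearn's own ElasticNet as a stand-in estimator so that it runs without pyglmnet; that substitution is an assumption for illustration, as are the synthetic data and the grid values. With pyglmnet, the grid would presumably range over GLM parameters such as reg_lambda and alpha instead.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV

# Synthetic regression data (placeholder, not from this issue).
rng = np.random.default_rng(0)
X = rng.random((200, 5))
y = X @ np.array([1.0, 0.0, -2.0, 0.0, 0.5]) + 0.1 * rng.standard_normal(200)

# Stand-in grid: sklearn's `alpha` is the penalty strength and
# `l1_ratio` is the L1/L2 mix.
param_grid = {"alpha": [0.01, 0.1, 1.0], "l1_ratio": [0.05, 0.5, 0.95]}

search = GridSearchCV(ElasticNet(max_iter=1000), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```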

Can I use the package as a group lasso for penalizing the betas of a cubic spline representation?

Sorry, I don't know exactly what you are trying to do. But yes, we do support group lasso.
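One reading of the spline question, sketched under assumptions: each original predictor is expanded into several spline basis columns, and all columns from the same predictor share one group id. The group vector then has one entry per column of the expanded design matrix, which matches the (n_features,) shape named in the error message. `spline_group_ids` is a hypothetical helper, and integer ids starting at 1 are an assumption about what the group argument expects.

```python
import numpy as np


def spline_group_ids(n_predictors, n_basis):
    """One integer id per expanded column; the columns derived from the
    same predictor share an id (hypothetical helper)."""
    return np.repeat(np.arange(1, n_predictors + 1), n_basis)


# 3 predictors, each expanded into 4 spline basis columns -> 12 columns.
group = spline_group_ids(n_predictors=3, n_basis=4)
X_expanded = np.zeros((10, 12))  # placeholder for the spline design matrix

# The group vector needs exactly one entry per column.
assert group.shape == (X_expanded.shape[1],)
print(group)  # [1 1 1 1 2 2 2 2 3 3 3 3]
```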

Thank you for your answer.

I will try this tomorrow and let you know whether it works.

However, from the source code it seems that tscv, i.e. sklearn's TimeSeriesSplit (https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html), is not supported.

This would be super helpful for time series prediction tasks where k-fold etc. fail.

It would be nice for GLMCV to accept a cv object from sklearn, but nothing stops you from using your own cv with cross_val_score etc.
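A minimal sketch of that suggestion, passing a TimeSeriesSplit object to cross_val_score. sklearn's ElasticNet again stands in for the model (an assumption, so the snippet runs without pyglmnet); any estimator that follows the sklearn fit/predict/score convention should slot in the same way. The data are synthetic.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

# Synthetic data; rows are assumed to be in time order.
rng = np.random.default_rng(0)
X = rng.random((120, 4))
y = X @ np.array([1.0, -1.0, 0.5, 0.0]) + 0.1 * rng.standard_normal(120)

tscv = TimeSeriesSplit(n_splits=5)

# Every training fold ends before its test fold begins.
for train_idx, test_idx in tscv.split(X):
    assert train_idx.max() < test_idx.min()

# One score per split, using the estimator's default scorer (R^2 here).
scores = cross_val_score(ElasticNet(alpha=0.01), X, y, cv=tscv)
print(scores.shape)  # (5,)
```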

Hey,

Is there a reason why it becomes so slow when I use the Github version?

image

GridSearchCV seems to work
image

But it is super slow ;(

Any suggestions? For my purpose it's infeasible.

Just to be sure it's not a problem with the convergence criteria, can you set the max_iter lower and check the timings?
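Something along these lines would do for the timing check. It is only a sketch: `time_fits` is a hypothetical helper, the data are synthetic, and sklearn's ElasticNet is a stand-in so the snippet runs on its own. For the real check, the factory would build the GLM from the scripts above with a different max_iter each time.

```python
import time

import numpy as np
from sklearn.linear_model import ElasticNet


def time_fits(make_model, X, y, max_iters):
    """Fit one fresh model per max_iter value and record wall-clock seconds."""
    timings = {}
    for m in max_iters:
        model = make_model(m)
        start = time.perf_counter()
        model.fit(X, y)
        timings[m] = time.perf_counter() - start
    return timings


rng = np.random.default_rng(0)
X = rng.random((500, 36))
y = rng.random(500)

# Stand-in factory. With pyglmnet it would be something like:
# lambda m: GLM(distr="gaussian", group=group_ids, alpha=0.05,
#               reg_lambda=0.2, max_iter=m)
timings = time_fits(lambda m: ElasticNet(alpha=0.2, max_iter=m), X, y, [10, 100, 1000])
for m, seconds in timings.items():
    print(f"max_iter={m}: {seconds:.4f} s")
```

If the fit time grows roughly in step with max_iter, the solver is probably hitting the iteration cap rather than converging.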

Seems like the slowness arises from the same root cause (group lasso). Duplicated by #267.