solution dependence on regularization path #71
If I switch the order of the reg_lambdas ... So my guess is that it is working well in this particular example only for the higher values of reg_lambda.
maybe this could be a test?
Okay, we can do that too, but the option should not be in the public API. Maybe in a private function / method ...
An alternative to shutting down warm restarts is to just run it for each individual lambda at a time and compare it to when running them together. I haven't managed to get positive pseudo-R2s on the V4 neuron for the canonical link function in pyglmnet (these values are setting ...).
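A rough sketch of that comparison, reusing the GLM/fit/predict calls that appear in the reproduction script further down the thread. Treating predict's return value as one prediction array per reg_lambda is an assumption based on how that script iterates over yhat, and the toy data and lambda values are placeholders:

import numpy as np
from pyglmnet import GLM

# toy data as a stand-in for the V4 design matrix (placeholder values)
np.random.seed(0)
X = np.random.normal(0.0, 1.0, [1000, 20])
y = np.random.poisson(1.0, 1000)

reg_lambdas = [0.5, 0.1, 0.05, 0.01]  # decreasing, as glmnet recommends

# one fit over the whole path (warm restarts between consecutive lambdas)
path_model = GLM(distr='poissonexp', alpha=0.05, reg_lambda=reg_lambdas, verbose=False)
path_model.fit(X, y)
yhat_path = path_model.predict(X)  # assumed: one prediction array per reg_lambda

# one independent fit per lambda (no warm restart carried over)
for i, rl in enumerate(reg_lambdas):
    single = GLM(distr='poissonexp', alpha=0.05, reg_lambda=[rl], verbose=False)
    single.fit(X, y)
    yhat_single = single.predict(X)[0]
    # if warm restarts behave, the two solutions should agree at every lambda
    print(rl, np.max(np.abs(yhat_path[i] - yhat_single)))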
According to the paper, the logic of warm restarts doesn't work if we go from low to high lambda; they recommend fitting the largest lambda first and initializing the fit for the next lambda from that solution. Fix: the documentation should include this so that the user knows to specify lambdas in decreasing order. @hugoguh shall I close this issue and create a new issue for documentation-related changes?
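A sketch of the warm-restart logic being described, independent of pyglmnet's internals; solve_single_lambda is a hypothetical stand-in for whatever single-lambda solver is used:

import numpy as np

def fit_path(X, y, reg_lambdas, solve_single_lambda):
    """Fit a sequence of lambdas largest-first, warm-starting each fit
    from the previous solution (the glmnet-style regularization path)."""
    reg_lambdas = sorted(reg_lambdas, reverse=True)  # enforce decreasing order
    beta = np.zeros(X.shape[1])                      # heavy regularization -> near-zero start
    solutions = []
    for rl in reg_lambdas:
        beta = solve_single_lambda(X, y, rl, beta_init=beta)  # warm start from previous beta
        solutions.append(beta.copy())
    return reg_lambdas, solutions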
Still, it should converge for a single reg_lambda:
import numpy as np
import scipy.sparse as sps
from sklearn.preprocessing import StandardScaler
from pyglmnet import GLM
model = GLM(distr='poissonexp', verbose=False, alpha=0.05, reg_lambda=[0.001])
n_samples, n_features = 10000, 100
# coefficients
beta0 = np.random.normal(0.0, 1.0, 1)
beta = sps.rand(n_features, 1, 0.1)
beta = np.array(beta.todense())
# training data
Xr = np.random.normal(0.0, 1.0, [n_samples, n_features])
yr = model.simulate(beta0, beta, Xr)
# testing data
Xt = np.random.normal(0.0, 1.0, [n_samples, n_features])
yt = model.simulate(beta0, beta, Xt)
# fit Generalized Linear Model
scaler = StandardScaler().fit(Xr)
model.fit(scaler.transform(Xr), yr)
# we'll get .fit_params after .fit(), here we get one set of fit parameters
fit_param = model[-1].fit_
# we can use fitted parameters to predict
yhat = model.predict(scaler.transform(Xt))
print('reg_lambdas:', model.reg_lambda)
print('pseudo_R2s:', [model.pseudo_R2(yt, i, np.mean(yr)) for i in yhat])
Output:
I don't think this issue should be closed. It's not about documenting the order of the lambdas when using warm starts (though that should be documented). This issue was about the convergence, which still fails in this case, while R's glmnet doesn't.
@hugoguh I agree we should iron out the convergence issues urgently. Did you manage to trace the source of the bug?
@hugoguh agreed that we should make the outputs conform to sklearn and the R glmnet package for whichever use cases exist in those tools. I appreciate all the testing!
Let's aim to get to the bottom of the bug. In your bug report above, I wouldn't call it a convergence issue so much as a poor choice of regularization parameter. Note that the absolute value of the regularization parameter lambda cannot be directly compared across packages. Also, in your bug report above, the test-set pseudo-R2 being negative simply indicates that the model has overfit to the training set. This can be verified if you print both the training-set and the test-set pseudo-R2s; I would expect to see comparable values for a well-chosen lambda.
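For instance, continuing from the reproduction script above and reusing its pseudo_R2 call, the check might look like this (the per-lambda iteration over predict's output follows the same list-per-lambda convention assumed there):

# compare training-set and test-set pseudo-R2 for each reg_lambda;
# a large gap (high on training, near zero or negative on test) points to overfitting
yhat_train = model.predict(scaler.transform(Xr))
yhat_test = model.predict(scaler.transform(Xt))
for rl, yhr, yht in zip(model.reg_lambda, yhat_train, yhat_test):
    print('lambda=%g  train pseudo-R2=%.3f  test pseudo-R2=%.3f' % (
        rl,
        model.pseudo_R2(yr, yhr, np.mean(yr)),
        model.pseudo_R2(yt, yht, np.mean(yr))))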
@pavanramkumar
From here: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html
alpha : float
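For reference, a minimal sketch of how the two parameterizations line up; the pyglmnet side of the mapping is an assumption based on glmnet-style naming, not something confirmed in this thread:

from sklearn.linear_model import ElasticNet
from pyglmnet import GLM

# scikit-learn: alpha is the overall penalty strength, l1_ratio is the L1/L2 mix
sk_model = ElasticNet(alpha=0.001, l1_ratio=0.05)

# pyglmnet (assumed glmnet-style naming): reg_lambda is the penalty strength,
# alpha is the L1/L2 mix -- so "alpha" means different things in the two packages
pg_model = GLM(distr='poissonexp', reg_lambda=[0.001], alpha=0.05)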
Turns out this is mainly an issue with the regularization path. See the conversation in #76.
Let's just rename this issue so that all the conversation is in the same place?
I got negative pseudo-R2s on the V4 data, so I checked: running with the default reg_lambdas seems to work fine.
Output:
However, if I choose another range, even one that's not that different, here's what happens:
Output:
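A sketch of the two runs being compared, reusing the same constructor as the reproduction script earlier in the thread; the custom lambda values here are illustrative, not the ones actually used on the V4 data:

from pyglmnet import GLM

# run 1: leave reg_lambda at its default so the built-in regularization path is used
model_default = GLM(distr='poissonexp', alpha=0.05, verbose=False)

# run 2: a hand-picked, "not that different" range of lambdas
model_custom = GLM(distr='poissonexp', alpha=0.05,
                   reg_lambda=[0.1, 0.05, 0.01], verbose=False)

# both models are then fit and scored with pseudo_R2 exactly as in the script above;
# the observation is that run 1 gives positive pseudo-R2s while run 2 goes negative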