
[MRG] FIX problem of learning rate #226

Open
jasmainak wants to merge 2 commits into master from learning_rate

Conversation

@jasmainak (Member) commented on Nov 9, 2017:

closes #65

@pavanramkumar I am not sure if I got the Lipschitz constant for logistic regression right; we need to dig up the Lipschitz constants for the other distributions or see how the R code deals with it (see the sketch after the TODO list below).

TODO

  • Add test
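
For reference, here is a minimal sketch of the textbook Lipschitz bound for the gradient of the mean logistic loss (largest eigenvalue of X^T X divided by 4n). This is only an illustration of the standard constant, not necessarily what this PR implements, and the function name is made up:

```python
import numpy as np


def logistic_lipschitz(X):
    """Upper bound on the Lipschitz constant of the mean logistic-loss gradient.

    The Hessian is X.T @ diag(p * (1 - p)) @ X / n and p * (1 - p) <= 1/4,
    so the largest eigenvalue is at most ||X||_2^2 / (4 * n).
    """
    n_samples = X.shape[0]
    spectral_norm = np.linalg.norm(X, ord=2)  # largest singular value of X
    return spectral_norm ** 2 / (4.0 * n_samples)


# A gradient-descent step size of 1 / L is then guaranteed to decrease the
# unregularized logistic loss.
```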

@codecov-io commented on May 6, 2018:

Codecov Report

Merging #226 into master will decrease coverage by 18.42%.
The diff coverage is 100%.


@@             Coverage Diff             @@
##           master     #226       +/-   ##
===========================================
- Coverage   75.48%   57.05%   -18.43%     
===========================================
  Files           4        7        +3     
  Lines         673     1311      +638     
  Branches      148      263      +115     
===========================================
+ Hits          508      748      +240     
- Misses        128      494      +366     
- Partials       37       69       +32
Impacted Files                 | Coverage Δ
pyglmnet/pyglmnet.py           | 73.85% <100%> (-7.22%) ⬇️
pyglmnet/utils.py              | 32.55% <0%> (-10.58%) ⬇️
pyglmnet/externals/six.py      | 68.24% <0%> (ø)
pyglmnet/externals/funcsigs.py | 32.54% <0%> (ø)
pyglmnet/externals/__init__.py | 100% <0%> (ø)
pyglmnet/base.py               | 48.48% <0%> (+3.32%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b049151...1c39c8a.

@jasmainak force-pushed the learning_rate branch 6 times, most recently from 9cf9503 to 87bc267 on August 29, 2018 05:04
@jasmainak (Member Author) commented:
@pavanramkumar this PR might fix our learning_rate woes. It's close to ready ...

@jasmainak force-pushed the learning_rate branch 2 times, most recently from bfb5f46 to e8dc899 on August 29, 2018 05:24
@jasmainak (Member Author) commented:
Note also that in the community crime example, the R^2 for the grid-search method matches what we find with the regularization path.

@jasmainak changed the title from "[WIP] FIX problem of learning rate" to "[MRG] FIX problem of learning rate" on Aug 29, 2018
@jasmainak changed the title from "[MRG] FIX problem of learning rate" to "[WIP] FIX problem of learning rate" on Aug 29, 2018
@jasmainak changed the title from "[WIP] FIX problem of learning rate" to "[MRG] FIX problem of learning rate" on Aug 31, 2018
@jasmainak (Member Author) commented:
@pavanramkumar with the fix in #249, the group lasso example gives a reasonable R^2, so this fixes 2 or 3 examples. The only problem is that it slows down convergence. I don't know off the bat how to solve that; maybe profiling + Cython can help.

@jasmainak (Member Author) commented:
@pavanramkumar have you had time to look at this PR? It's related to our discussion today.

@pavanramkumar (Collaborator) commented:
@jasmainak I looked at it, but I don't know whether we can use this approach for all the distributions. Also, how much slower is it with a line search in each iteration? Can we write a test?

@jasmainak (Member Author) commented:
I think the line search does not make any assumption about convexity, so it should apply to all the distributions.

Sure, I can add tests and report back some benchmarks.
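
For context, here is a minimal sketch of the kind of backtracking (Armijo) line search being discussed, applied to a plain gradient step. The function and variable names are illustrative assumptions, not the PR's actual implementation:

```python
import numpy as np


def backtracking_step(loss, grad, beta, step=1.0, shrink=0.5, c=1e-4):
    """One gradient step whose size is chosen by backtracking line search.

    The step is shrunk until the sufficient-decrease (Armijo) condition
    loss(beta - step * g) <= loss(beta) - c * step * ||g||^2 holds.
    """
    g = grad(beta)
    f0 = loss(beta)
    while loss(beta - step * g) > f0 - c * step * g.dot(g):
        step *= shrink
        if step < 1e-12:  # bail out instead of looping forever
            break
    return beta - step * g, step


# Example on a simple quadratic loss:
A = np.diag([1.0, 10.0])
loss = lambda b: 0.5 * b.dot(A).dot(b)
grad = lambda b: A.dot(b)
beta, step = backtracking_step(loss, grad, np.array([1.0, 1.0]))
```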

@jasmainak (Member Author) commented:
@pavanramkumar I rebased this PR on master. It does solve some issues; see above.

But Travis needs to be made happy ...

@jasmainak (Member Author) commented:
@pavanramkumar here is the script to benchmark, and here is what I get on my computer for alpha=1.0:

reg_lambda | Learning rate | glm.n_iter_ | time (s)
   0.50000 |         0.001 |         286 | 9.154
   0.32374 |         0.001 |         286 | 9.737
   0.20961 |         0.001 |         286 | 8.534
   0.13572 |         0.001 |         286 | 9.067
   0.08788 |         0.001 |         286 | 8.560
   0.05690 |         0.001 |         286 | 9.609
   0.03684 |         0.001 |         286 | 9.310
   0.02385 |         0.001 |         286 | 11.780
   0.01544 |         0.001 |         286 | 8.841
   0.01000 |         0.001 |         294 | 8.102
   0.50000 |          0.01 |          91 | 2.278
   0.32374 |          0.01 |          91 | 2.298
   0.20961 |          0.01 |          91 | 2.396
   0.13572 |          0.01 |          91 | 2.277
   0.08788 |          0.01 |          94 | 2.388
   0.05690 |          0.01 |         103 | 2.793
   0.03684 |          0.01 |         153 | 3.907
   0.02385 |          0.01 |         164 | 4.228
   0.01544 |          0.01 |         259 | 6.560
   0.01000 |          0.01 |         277 | 6.567
   0.50000 |           0.1 |          19 | 0.455
   0.32374 |           0.1 |          33 | 0.776
   0.20961 |           0.1 |          32 | 1.315
   0.13572 |           0.1 |          65 | 2.160
   0.08788 |           0.1 |          89 | 2.927
   0.05690 |           0.1 |         132 | 5.334
   0.03684 |           0.1 |         179 | 5.463
   0.02385 |           0.1 |         219 | 6.056
/Users/mainak/Documents/github_repos/pyglmnet/pyglmnet/pyglmnet.py:885: UserWarning: Reached max number of iterations without convergence.
  "Reached max number of iterations without convergence.")
   0.01544 |           0.1 |        1000 | 35.344
   0.01000 |           0.1 |        1000 | 26.787
/Users/mainak/Documents/github_repos/pyglmnet/pyglmnet/pyglmnet.py:250: RuntimeWarning: invalid value encountered in true_divide
  grad_beta0 = np.sum(grad_mu) - np.sum(y * grad_mu / mu)
/Users/mainak/Documents/github_repos/pyglmnet/pyglmnet/pyglmnet.py:252: RuntimeWarning: invalid value encountered in true_divide
  np.dot((y * grad_mu / mu).T, X)).T)
/Users/mainak/Documents/github_repos/pyglmnet/pyglmnet/pyglmnet.py:652: RuntimeWarning: invalid value encountered in greater
  (np.abs(beta) > thresh)
/Users/mainak/Documents/github_repos/pyglmnet/pyglmnet/pyglmnet.py:100: RuntimeWarning: invalid value encountered in greater
  mu[z > eta] = z[z > eta] * np.exp(eta) + beta0
/Users/mainak/Documents/github_repos/pyglmnet/pyglmnet/pyglmnet.py:101: RuntimeWarning: invalid value encountered in less_equal
  mu[z <= eta] = np.exp(z[z <= eta])
/Users/mainak/Documents/github_repos/pyglmnet/pyglmnet/pyglmnet.py:117: RuntimeWarning: invalid value encountered in greater
  grad_mu[z > eta] = np.ones_like(z)[z > eta] * np.exp(eta)
/Users/mainak/Documents/github_repos/pyglmnet/pyglmnet/pyglmnet.py:118: RuntimeWarning: invalid value encountered in less_equal
  grad_mu[z <= eta] = np.exp(z[z <= eta])
   0.50000 |             3 |        1000 | 25.019
   0.32374 |             3 |        1000 | 25.106
   0.20961 |             3 |        1000 | 24.828
   0.13572 |             3 |        1000 | 25.927
   0.08788 |             3 |        1000 | 24.657
   0.05690 |             3 |        1000 | 25.544
   0.03684 |             3 |        1000 | 35.799
   0.02385 |             3 |        1000 | 43.277
   0.01544 |             3 |        1000 | 28.077
   0.01000 |             3 |        1000 | 26.254
   0.50000 |          auto |          36 | 3.604
   0.32374 |          auto |          30 | 3.045
   0.20961 |          auto |          34 | 3.298
   0.13572 |          auto |        1000 | 76.629
   0.08788 |          auto |        1000 | 93.815
   0.05690 |          auto |        1000 | 128.094
   0.03684 |          auto |         191 | 24.650
   0.02385 |          auto |         223 | 27.515
   0.01544 |          auto |         339 | 42.505
   0.01000 |          auto |        1000 | 129.048

Not exactly sure whether it's beneficial or not. It might be a bit risky to include in the release, but we might want to fix the plot_community_crime.py example in any case.
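
The linked benchmark script is not reproduced above. For readers without access to it, a rough sketch of such a sweep might look like the following; the data is synthetic, and the constructor arguments (distr, alpha, reg_lambda, learning_rate, max_iter) and the n_iter_ attribute are assumed from the table rather than checked against this branch:

```python
import time

import numpy as np

from pyglmnet import GLM

rng = np.random.RandomState(0)
X = rng.randn(1000, 100)
y = rng.poisson(np.exp(0.1 * X[:, :3].sum(axis=1)))

reg_lambdas = np.logspace(np.log10(0.5), np.log10(0.01), 10)
print("reg_lambda | Learning rate | glm.n_iter_ | time (s)")
for learning_rate in [1e-3, 1e-2, 1e-1, 3, "auto"]:
    for reg_lambda in reg_lambdas:
        glm = GLM(distr="poisson", alpha=1.0, reg_lambda=reg_lambda,
                  learning_rate=learning_rate, max_iter=1000)
        start = time.time()
        glm.fit(X, y)
        print("%10.5f | %13s | %11d | %.3f"
              % (reg_lambda, str(learning_rate), glm.n_iter_,
                 time.time() - start))
```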

Development

Successfully merging this pull request may close this issue: "larger training set deviance for smaller values of reg_lambda: bug in convergence criterion?"