box (or just positive) constraints on enet OLS #152

Open
adienes opened this issue Oct 3, 2023 · 4 comments

Comments

@adienes
Contributor

adienes commented Oct 3, 2023

what would it take to support box / positive constraints on the Lasso / ElasticNet solvers? is this compatible with the existing API, and if so where could I get started on contributing to the implementation?

@tlienart
Collaborator

tlienart commented Oct 3, 2023

so let's say you wanted to do a linear regression with positive coefficients: you could either project onto R+ at every step (easy to implement with the current code, but likely not guaranteed to converge and might be slow), or start in R+ and add a penalty that explodes (e.g. a logarithmic barrier) when coefficients get close to 0. The extension to box constraints is the same.

Note also that there is an LBFGS variant that supports box constraints (L-BFGS-B), but I don't think Optim.jl implements it; it might implement something similar though (e.g. this PR: JuliaNLSolvers/Optim.jl#584, I haven't dug into that).

So my suggestion would be to start with what Optim.jl has in store and see if it can be made to work; also check whether there are existing "standard" implementations of regression with positivity constraints in Python or elsewhere that could be used as a baseline.

Note: looks like fminbox implements a primal barrier that would amount to what I was suggesting in my first paragraph https://github.com/JuliaNLSolvers/Optim.jl/blob/5fa5d61a9f2ba9fc534b4e74b5659df45afd59c4/src/multivariate/solvers/constrained/fminbox.jl#L182-L206
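To make the primal-barrier idea concrete, here is a minimal sketch for least squares with positive coefficients. It is not a transcription of Optim.jl's `Fminbox` (which wraps a user-chosen inner optimizer); the function name `barrier_nonneg_ls`, the shrink schedule for `mu`, and the toy data are all assumptions made for illustration. The inner solver is exact coordinate minimisation, chosen only because each one-dimensional barrier subproblem has a closed-form positive root.

```python
import math

def barrier_nonneg_ls(X, y, mu=1.0, shrink=0.1, outer=10, sweeps=50):
    """Approximately minimise 0.5*||X w - y||^2 subject to w > 0 by
    minimising 0.5*||X w - y||^2 - mu * sum(log w_j) for decreasing mu.
    Hypothetical helper; assumes every column of X is non-zero."""
    n, p = len(X), len(X[0])
    w = [1.0] * p  # strictly feasible starting point
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(outer):
        for _ in range(sweeps):
            for j in range(p):
                # b = x_j^T (y - sum_{k != j} x_k w_k)
                b = sum(
                    X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                    for i in range(n)
                )
                a = col_sq[j]
                # Stationarity a*w - b - mu/w = 0 has exactly one positive root.
                w[j] = (b + math.sqrt(b * b + 4.0 * a * mu)) / (2.0 * a)
        mu *= shrink  # tighten the barrier
    return w

# Toy data (assumed): the unconstrained optimum has a negative second coefficient.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, -1.0, 0.5]
w = barrier_nonneg_ls(X, y)
```

On this toy problem the constrained optimum is `w = (0.75, 0)`; the barrier iterate stays strictly positive and approaches that point as `mu` shrinks.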

@adienes
Contributor Author

adienes commented Oct 3, 2023

is projecting to R+ at each step what sklearn does here ? https://github.com/scikit-learn/scikit-learn/blob/286f0c9d17019e52f532d63b5ace9f8e1beb5fe5/sklearn/linear_model/_cd_fast.pyx#L568C5-L568C33 it looks like it but I'm not entirely sure. I guess they are doing coordinate descent rather than gradient descent anyway
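For reference, one common way to impose positivity inside coordinate descent is to clamp each coordinate's soft-thresholded update at zero. The sketch below illustrates that idea for the lasso; it is not a line-by-line reading of `_cd_fast.pyx`, and the name `nonneg_lasso_cd`, the objective scaling, and the toy data are assumptions.

```python
def nonneg_lasso_cd(X, y, alpha, sweeps=100):
    """Coordinate descent for 0.5*||y - X w||^2 + alpha*sum(w_j), w >= 0.
    Hypothetical helper for illustration only."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    r = list(y)  # residual y - X w, with w = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot change the fit
            # Correlation of column j with the residual that excludes w_j.
            rho = sum(X[i][j] * r[i] for i in range(n)) + col_sq[j] * w[j]
            # Soft-threshold, then clamp at zero to keep w_j nonnegative.
            new = max(rho - alpha, 0.0) / col_sq[j]
            delta = new - w[j]
            if delta != 0.0:
                for i in range(n):
                    r[i] -= X[i][j] * delta
                w[j] = new
    return w

# Toy data (assumed).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, -1.0, 0.5]
w = nonneg_lasso_cd(X, y, alpha=0.5)
```

Because each update solves its one-dimensional constrained subproblem exactly, the clamp here is part of the coordinate minimiser rather than a separate projection after a gradient step.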

in particular I have this algorithm in mind for the constrained lasso https://arxiv.org/pdf/1611.01511.pdf

@tlienart
Collaborator

tlienart commented Oct 4, 2023

> is projecting to R+ at each step what sklearn does here ? https://github.com/scikit-learn/scikit-learn/blob/286f0c9d17019e52f532d63b5ace9f8e1beb5fe5/sklearn/linear_model/_cd_fast.pyx#L568C5-L568C33 it looks like it but I'm not entirely sure. I guess they are doing coordinate descent rather than gradient descent anyway

oof that's not an easy one to read, could be used as an example of why people should move to Julia... yeah it's coordinate descent, I don't fully understand what they're doing in there so would rather not comment too much. Projected gradient descent is more like:

  1. take a normal GD step (for which we have code), or in fact any admissible step btw, just that GD makes sense and has a better chance to lead where you want
  2. project orthogonally onto R+ (for the nonnegative orthant this amounts to setting negative entries to zero)
  3. do that again
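The three steps above can be sketched as follows for least squares with nonnegative coefficients. This is a plain-Python illustration, not code from this package: `projected_gradient_nnls`, the fixed step size, and the toy data are assumptions. For the nonnegative orthant, the orthogonal projection is an elementwise clamp at zero.

```python
def projected_gradient_nnls(X, y, step, n_iter=500):
    """Minimise 0.5*||X w - y||^2 subject to w >= 0 by projected
    gradient descent with a fixed step. Hypothetical helper."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        # Residual r = X w - y.
        r = [sum(X[i][j] * w[j] for j in range(p)) - y[i] for i in range(n)]
        # Gradient g = X^T r.
        g = [sum(X[i][j] * r[i] for i in range(n)) for j in range(p)]
        # Step 1: plain gradient step. Step 2: project onto R+ (clamp at 0).
        w = [max(0.0, w[j] - step * g[j]) for j in range(p)]
    return w

# Toy data (assumed). The largest eigenvalue of X^T X is 3, so a step
# of 0.2 is below the usual 1/L bound.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, -1.0, 0.5]
w = projected_gradient_nnls(X, y, step=0.2)
```

On this small convex example the iterates settle at `w = (0.75, 0)`, with the second coefficient pinned at the boundary; behaviour on larger or badly conditioned problems depends heavily on the step size.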

I think this will be ok in simple cases but might be pretty bad in others (lots of non-admissible steps that get projected, each followed by a next step that looks very similar) and is not guaranteed to converge afaik. Might still be good to implement as a comparison.

> in particular I have this algorithm in mind for the constrained lasso https://arxiv.org/pdf/1611.01511.pdf

right so just as a note, ADMM tends (in my experience) to be a pretty poor algorithm: it's hard to set up, and even when set up right it can be beaten pretty handily by simpler methods. People thought it was cool because it can be somewhat parallelisable, and lots of papers followed up on it, but I don't think it's as good as its fame suggests.

QP might be interesting but typically needs a dedicated QP solver. From a quick search there are a few in Julia, but it would come at the cost of an additional dependency, which would need to be justified (e.g. if the plan is to only use the dependency for CLasso then I'm not super highly in favour).

To me it would be quite interesting to just try fminbox on the algorithms that are here, see if that works well, try to compare with standard-ish implementations in python / R to see if we have something that's as good or better and then move on from there; it might be that it leads to something that is vastly superior to the article you quote (which wouldn't surprise me).

@adienes
Contributor Author

adienes commented Oct 4, 2023

> if the plan is to only use the dependency for CLasso then I'm not super highly in favour

fair enough --- although if eventually going that route, I will say I've had good experiences using Clarabel.jl

I'll start to play with fminbox a bit. with the gramian training ready, box constraints are definitely the next most important thing for me!

Projects
Status: priority low / involved