offset and/or weights for survival analysis #221
Comments
hmm ... interesting. So, if we provide an
I'm interested, but I wish I had a better sense of how this algorithm scales. There are a couple dozen half-implemented group lasso solvers that sort of work on small datasets, and I'm looking for the one really good one that can scale to my problem (millions of rows and thousands of predictors).
@shearerp it's always great to hear about concrete use cases in the context of feature requests. have a look at our README page, where a basic set of benchmarks is published comparing runtimes for 1000 samples x 100 features against scikit-learn, statsmodels, and R. we're slightly slower than scikit-learn and faster than statsmodels, primarily because we didn't want to prematurely optimize (cythonize) our solvers. if you'd like to run benchmarks against larger datasets, have a look at
in general, as far as scalability goes, we may be able to support in-memory computations, but we currently may not have enough resources to extend to distributed / streaming data use cases.
Feel free to open a pull request. We are open to contributions.
Poisson regression is very commonly used for survival analysis. In this context, it is necessary to include the exposure time as a log-offset or via weighting. It appears that currently pyglmnet has neither option; the package would be much more widely useful for Poisson regression if it included one or both of these options.
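To make the request concrete, here is a minimal sketch of what a log-offset and a weighting formulation look like for a Poisson model of event counts with exposure time. It does not use pyglmnet or reflect its API: `fit_poisson` is a hypothetical helper written in plain Python (Newton's method, intercept plus one covariate, no regularization), and the data are made up for illustration.

```python
import math

def fit_poisson(x, y, offset=None, weights=None, n_iter=50, tol=1e-10):
    """Hypothetical helper: fit log(mu_i) = offset_i + b0 + b1 * x_i by
    Newton's method, maximizing the weighted Poisson log-likelihood."""
    n = len(y)
    offset = offset if offset is not None else [0.0] * n
    weights = weights if weights is not None else [1.0] * n
    # Start from the pooled rate so the first Newton step is well behaved.
    pooled = sum(w * yi for w, yi in zip(weights, y)) / sum(
        w * math.exp(o) for w, o in zip(weights, offset))
    b0, b1 = math.log(pooled), 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, oi, wi in zip(x, y, offset, weights):
            mu = math.exp(oi + b0 + b1 * xi)
            r = wi * (yi - mu)          # weighted score contribution
            g0 += r
            g1 += r * xi
            h00 += wi * mu              # weighted information matrix
            h01 += wi * mu * xi
            h11 += wi * mu * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Illustrative data (not from the issue): binary covariate, events, exposure.
x = [0, 0, 0, 1, 1, 1]
events = [2, 5, 1, 4, 6, 14]
exposure = [10.0, 20.0, 5.0, 8.0, 12.0, 30.0]

# Option 1: counts as the response, log(exposure) as an offset.
b0_off, b1_off = fit_poisson(x, events, offset=[math.log(t) for t in exposure])

# Option 2: rates (events / exposure) as the response, exposure as weights.
rates = [e / t for e, t in zip(events, exposure)]
b0_w, b1_w = fit_poisson(x, rates, weights=exposure)
```

Both calls maximize the same likelihood up to a constant, so they should return the same coefficients; with a single binary covariate, `exp(b0)` is the event rate in the `x = 0` group (8 events over 35 units of exposure) and `exp(b1)` is the rate ratio between the two groups.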