Expand docs to include cut and paste examples for different Regressors #150

I think we could reduce learning curves by including some cut and paste examples for the various regressors. It would also be good to include some discussion of when one regressor model might be more appropriate than another.

Comments

I don't disagree with this, but I would add that MLJLinearModels was initially meant to be used mainly through MLJ by average users, and the MLJ tutorials include examples of the various common regressors. The latter part (when one regressor is more appropriate than another) is pretty tricky and debatable beyond fairly generic advice. I don't think you'll find an opinionated view on whether some model with regularisation is better than some other model without it, or vice versa. Typically people should get a sense of the problem they're facing (e.g. big outliers), then put the models they think might address it through hyperparameter tuning and pick the one they believe generalises best based on some metric. The current philosophy has been to follow sklearn, with minimal documentation explaining the loss function and letting users figure out whether that matches what they need. But as always, if someone here would like to edit the docs to make them better for users who want a bit more, PRs are always welcome.
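As a rough illustration of that tuning workflow through MLJ, here is a minimal sketch; the choice of HuberRegressor, the lambda range, and the rms measure are illustrative assumptions, not recommendations:
using MLJ
# load the MLJ interface to MLJLinearModels' Huber regressor
HuberRegressor = @load HuberRegressor pkg=MLJLinearModels
X, y = make_regression(200, 3)   # toy data (table of features, target vector)
model = HuberRegressor()
# tune the regularisation strength lambda over a log-spaced grid with 5-fold CV
r = range(model, :lambda, lower=1e-3, upper=1e2, scale=:log10)
tuned = TunedModel(model=model, ranges=r, tuning=Grid(resolution=20),
                   resampling=CV(nfolds=5), measure=rms)
mach = machine(tuned, X, y)
fit!(mach)
fitted_params(mach).best_model   # the model whose lambda generalised best under CV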
What about something like this?

using MLJ
using MLJLinearModels
using Plots
using Random
# create data
t = 1:0.01:10;
n = length(t);
gaussian_noise = randn(n) * 3;
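# sparse outliers: each draw below is usually zero, occasionally a large positive or negative spike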
outliers = rand((zeros(round(Int64, n/20))..., 6, -8, 100, -200, 178, -236, 77, -129, -50, -100, -45, -33, -114, -1929, -2000), n);
# measurement y
y = 10 .+ 10 * sin.(t) .+ 5 * t .+ gaussian_noise .+ outliers;
# design matrix
X = hcat(ones(length(t)), sin.(t), t);
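# X already contains a column of ones, so the intercept is not fitted separately;
# scale_penalty_with_samples=false keeps the penalty strength independent of the number of samples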
scale_penalty = false
fit_intercept = false
begin
scatter(t, y;
markerstrokecolor=:match,
markerstrokewidth=0,
label = "observations",
ylim = (-70, 70),
legend = :outerbottom,
color = :grey,
size = (700, 900)
)
# Base LSQ model fit
println("Base Julia Linear Least Squares")
@time θ = X \ y;
plot!(t, X * θ, label="Base Julia Linear Least Squares", linewidth=2)
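# MLJLinearModels ordinary least squares (L2 loss, no penalty); should match the backslash solution above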
regressor = LinearRegression(fit_intercept=fit_intercept);
println(typeof(regressor))
@time θ = fit(regressor, X, y)
plot!(t, X * θ, label=typeof(regressor), linewidth = 2)
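# Huber loss (quadratic near zero, linear in the tails) with an L2 penalty; robust to moderate outliers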
regressor = HuberRegression(scale_penalty_with_samples=scale_penalty, fit_intercept=fit_intercept);
println(typeof(regressor))
@time θ = fit(regressor, X, y)
plot!(t, X * θ, label=typeof(regressor), linewidth = 2)
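# squared loss with an L2 (ridge) penalty; shrinks coefficients without zeroing them out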
regressor = RidgeRegression(scale_penalty_with_samples=scale_penalty, fit_intercept=fit_intercept);
println(typeof(regressor))
@time θ = fit(regressor, X, y)
plot!(t, X * θ, label=typeof(regressor), linewidth = 2)
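# squared loss with an L1 (lasso) penalty; encourages sparse coefficients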
regressor = LassoRegression(scale_penalty_with_samples=scale_penalty, fit_intercept=fit_intercept);
println(typeof(regressor))
@time θ = fit(regressor, X, y)
plot!(t, X * θ, label=typeof(regressor), linewidth = 2)
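# squared loss with a mix of L1 and L2 penalties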
regressor = ElasticNetRegression(scale_penalty_with_samples=scale_penalty, fit_intercept=fit_intercept);
println(typeof(regressor))
@time θ = fit(regressor, X, y)
plot!(t, X * θ, label=typeof(regressor), linewidth = 2)
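# quantile (pinball) loss; the default quantile of 0.5 fits the conditional median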
regressor = QuantileRegression(scale_penalty_with_samples=scale_penalty, fit_intercept=fit_intercept);
println(typeof(regressor))
@time θ = fit(regressor, X, y)
plot!(t, X * θ, label=typeof(regressor), linewidth = 2)
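# least absolute deviations; equivalent to quantile regression at the 0.5 quantile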
regressor = LADRegression(scale_penalty_with_samples=scale_penalty, fit_intercept=fit_intercept);
println(typeof(regressor))
@time θ = fit(regressor, X, y)
plot!(t, X * θ, label=typeof(regressor), linewidth = 2)
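# generic parent type; with default loss and penalty this reduces to ordinary least squares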
regressor = GeneralizedLinearRegression(scale_penalty_with_samples=scale_penalty, fit_intercept=fit_intercept);
println(typeof(regressor))
@time θ = fit(regressor, X, y)
plot!(t, X * θ, label=typeof(regressor), linewidth = 2)
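# robust loss (Huber by default) with an L2 penalty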
regressor = RobustRegression(scale_penalty_with_samples=scale_penalty, fit_intercept=fit_intercept);
println(typeof(regressor))
@time θ = fit(regressor, X, y)
plot!(t, X * θ, label=typeof(regressor), linewidth = 2)
end

Output:

Base Julia Linear Least Squares
0.000168 seconds (36 allocations: 97.016 KiB)
GeneralizedLinearRegression{L2Loss, NoPenalty}
0.000119 seconds (41 allocations: 118.719 KiB)
GeneralizedLinearRegression{RobustLoss{HuberRho{0.5}}, ScaledPenalty{L2Penalty}}
0.001772 seconds (525 allocations: 699.094 KiB)
GeneralizedLinearRegression{L2Loss, ScaledPenalty{L2Penalty}}
0.000100 seconds (8 allocations: 21.984 KiB)
GeneralizedLinearRegression{L2Loss, ScaledPenalty{L1Penalty}}
0.003497 seconds (2.40 k allocations: 2.931 MiB)
GeneralizedLinearRegression{L2Loss, CompositePenalty}
0.008676 seconds (4.13 k allocations: 4.338 MiB)
GeneralizedLinearRegression{RobustLoss{QuantileRho{0.5}}, ScaledPenalty{L2Penalty}}
0.000732 seconds (323 allocations: 240.594 KiB)
GeneralizedLinearRegression{RobustLoss{QuantileRho{0.5}}, ScaledPenalty{L2Penalty}}
0.000718 seconds (323 allocations: 240.594 KiB)
GeneralizedLinearRegression{L2Loss, NoPenalty}
0.000143 seconds (41 allocations: 118.719 KiB)
GeneralizedLinearRegression{RobustLoss{HuberRho{0.1}}, ScaledPenalty{L2Penalty}}
0.001428 seconds (493 allocations: 660.344 KiB)
I think that's very nice :) (some curves don't appear?). If you wanted to add a page of this sort to the docs, that would be great. Small notes to add: (1) 2D data is quite different from nD data, so the intuition you build in 2D might not tell you what works best in nD; when in doubt, it's better to just try. (2) Hyperparameter tuning is essential for most of these models (in fact, a nice small addition would be a visual representation of what happens to a curve, say the L1 regression, as the strength of the regulariser is increased; see the sketch below). But generally speaking, if this helped you then no doubt it'll help others, and it should be in the docs :)
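Along the lines of point (2), a rough sketch of that visual, reusing t, X and y from the snippet above; the particular lambda values are arbitrary choices for illustration:
using MLJLinearModels, Plots
scatter(t, y; color=:grey, markerstrokewidth=0, label="observations", ylim=(-70, 70))
# increasing the L1 penalty shrinks the coefficients towards zero, flattening the fitted curve
for lambda in (1e0, 1e2, 1e4, 1e6)
    θ = fit(LassoRegression(lambda=lambda, fit_intercept=false, scale_penalty_with_samples=false), X, y)
    plot!(t, X * θ, linewidth=2, label="LassoRegression, λ = $lambda")
end
current()   # display the accumulated plot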