
Adagrad Optimizer Implementation #154

Merged
milancurcic merged 7 commits into modern-fortran:main on Aug 6, 2023
Conversation

@Spnetic-5 (Collaborator) commented on Jul 28, 2023:

Reference: PyTorch Docs
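
For context, the Adagrad update as described in the referenced PyTorch documentation, written here as a paraphrase (not text from this PR), with learning rate $\gamma$, learning-rate decay $\eta$, weight decay $\lambda$, and a small $\epsilon$ for numerical stability:

$$
\begin{aligned}
\tilde{\gamma}_t &= \frac{\gamma}{1 + (t-1)\,\eta} \\
g_t &\leftarrow g_t + \lambda\,\theta_{t-1} \\
G_t &= G_{t-1} + g_t \odot g_t \\
\theta_t &= \theta_{t-1} - \tilde{\gamma}_t\,\frac{g_t}{\sqrt{G_t} + \epsilon}
\end{aligned}
$$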

@Spnetic-5 marked this pull request as ready for review on July 28, 2023, 05:14
Review comments on src/nf/nf_optimizers.f90 (all resolved)
@milancurcic (Member) commented:

Thanks @Spnetic-5. I believe it's correct now. In your original implementation, the L2 regularization was not accounted for in the accumulation of the squared gradients because you applied it later in the param update. The learning rate decay was also doubly accounted for because in each step the learning rate should be amortized relative to the original learning rate, not the one from the previous step. Subtle differences that weren't caught in the tests.

I'll go ahead and merge; please release v0.15.0 when you get a chance.
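
For readers without the diff in front of them, here is a minimal sketch of a single Adagrad step that reflects both points above. The names (`adagrad_step`, `sum_squares`, `lr_t`, etc.) are illustrative and may not match the actual code in src/nf/nf_optimizers.f90.

```fortran
! Sketch of one Adagrad parameter update (illustrative, not the merged code).
pure subroutine adagrad_step(params, gradient, sum_squares, t, &
    learning_rate, learning_rate_decay, weight_decay, epsilon)
  real, intent(in out) :: params(:), sum_squares(:)
  real, intent(in) :: gradient(:)
  integer, intent(in) :: t
  real, intent(in) :: learning_rate, learning_rate_decay, weight_decay, epsilon
  real :: g(size(gradient)), lr_t

  ! Fold the L2 penalty into the gradient first, so that it also
  ! contributes to the accumulated squared gradients.
  g = gradient + weight_decay * params

  ! Amortize the learning rate relative to the original learning rate,
  ! not the rate from the previous step.
  lr_t = learning_rate / (1 + (t - 1) * learning_rate_decay)

  sum_squares = sum_squares + g**2
  params = params - lr_t * g / (sqrt(sum_squares) + epsilon)

end subroutine adagrad_step
```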

@milancurcic merged commit b119194 into modern-fortran:main on Aug 6, 2023
2 checks passed