Merge pull request #139 from huggingface/bouteille/fix-weight-decay Add param group weight decay
fix readme
Merge pull request #54 from huggingface/xrsrke/feature_doremi_new_codebase [Feature] DoReMi
Merge pull request #71 from nopperl/topology-agnostic-loading Implement pipeline parallel size-agnostic optimizer state loading
Merge pull request #60 from huggingface/flexibility Lighteval naming