Tags: zirui/nanotron
Merge pull request huggingface#54 from huggingface/xrsrke/feature_doremi_new_codebase [Feature] DoReMi
Merge pull request huggingface#71 from nopperl/topology-agnostic-loading Implement pipeline parallel size-agnostic optimizer state loading
Merge pull request huggingface#60 from huggingface/flexibility Lighteval naming