[RLlib] Add on_checkpoint_loaded callback AND also store eval workers' policy_mapping_fn in algo state.
#40350
This PR fixes a problem with heavily customized eval WorkerSet setups, policy sets, and mapping functions.
Why are these changes needed?
When a user overrides the on_algorithm_init callback in order to set up special evaluation policies inside the evaluation worker set, including a new eval policy mapping function, then upon restoring this algorithm from a checkpoint, the eval policy_mapping_fn information would be overridden by the main policy_mapping_fn (because that one is the only one stored in the checkpoint).

To solve this problem and to add additional handles for users with such complex customization needs, this PR:
- Stores the eval workers' policy_mapping_fn in the algorithm state, in case this mapping is different from the main mapping function.
- Adds a new on_checkpoint_loaded() callback, called after the Algorithm was restored from a checkpoint (that is, after Algorithm.load_checkpoint() has completed).

Related issue number
Checks
- I've signed off every commit (git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- If I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
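
For reviewers who want a concrete picture of the restore problem described under "Why are these changes needed?", here is a minimal sketch. It does not use RLlib at all: ToyAlgorithm, the store_eval_fn flag, and the state keys "policy_mapping_fn" and "eval_policy_mapping_fn" are hypothetical stand-ins chosen for illustration, not the actual names or code in this PR. It only shows why storing a single mapping function loses the eval one on restore, and where a post-restore callback would fire.

```python
from typing import Callable, Dict, List, Optional

MappingFn = Callable[[str], str]


def main_mapping_fn(agent_id: str) -> str:
    return "main_policy"


def eval_mapping_fn(agent_id: str) -> str:
    return "eval_policy"


class ToyAlgorithm:
    """Hypothetical stand-in for an algorithm with train and eval workers."""

    def __init__(
        self,
        on_checkpoint_loaded: Optional[Callable[["ToyAlgorithm"], None]] = None,
    ) -> None:
        # By default, the eval workers share the main mapping function.
        self.train_mapping_fn: MappingFn = main_mapping_fn
        self.eval_mapping_fn: MappingFn = main_mapping_fn
        self.on_checkpoint_loaded = on_checkpoint_loaded

    def get_state(self, store_eval_fn: bool) -> Dict[str, MappingFn]:
        state = {"policy_mapping_fn": self.train_mapping_fn}
        # Only store the eval mapping if it differs from the main one.
        if store_eval_fn and self.eval_mapping_fn is not self.train_mapping_fn:
            state["eval_policy_mapping_fn"] = self.eval_mapping_fn
        return state

    def load_checkpoint(self, state: Dict[str, MappingFn]) -> None:
        self.train_mapping_fn = state["policy_mapping_fn"]
        # Without a stored eval mapping, the main one is used for eval too.
        self.eval_mapping_fn = state.get(
            "eval_policy_mapping_fn", state["policy_mapping_fn"]
        )
        # Fire the user hook only after the restore has completed.
        if self.on_checkpoint_loaded is not None:
            self.on_checkpoint_loaded(self)


# The user customizes the eval workers (as one might in on_algorithm_init).
algo = ToyAlgorithm()
algo.eval_mapping_fn = eval_mapping_fn

# Only the main mapping is stored: the eval mapping is lost on restore.
restored_old = ToyAlgorithm()
restored_old.load_checkpoint(algo.get_state(store_eval_fn=False))
old_result = restored_old.eval_mapping_fn("agent_0")

# The eval mapping is stored as well: it survives the restore.
loaded: List[ToyAlgorithm] = []
restored_new = ToyAlgorithm(on_checkpoint_loaded=loaded.append)
restored_new.load_checkpoint(algo.get_state(store_eval_fn=True))
new_result = restored_new.eval_mapping_fn("agent_0")

print(old_result, new_result, len(loaded))
```

In this toy setup, the callback gives the user one more place to re-apply custom eval settings after a restore, which is the kind of handle the on_checkpoint_loaded() callback is meant to provide.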