Worker fails if exit code = 1 (#400)
workingloong committed May 10, 2023
1 parent edc076b commit 7a166b3
Showing 1 changed file with 1 addition and 4 deletions.
5 changes: 1 addition & 4 deletions dlrover/python/master/node/worker.py
@@ -264,10 +264,7 @@ def remove_not_participated_workers(self, workers):
     def has_failed_worker(self):
         """Check whether there is failed worker except evicted workers."""
         for worker in self._nodes.values():
-            if worker.exit_reason in [
-                NodeExitReason.FATAL_ERROR,
-                NodeExitReason.UNKNOWN_ERROR,
-            ]:
+            if worker.exit_reason == NodeExitReason.FATAL_ERROR:
                 return True
         return False
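
The effect of this change can be sketched in isolation: after the commit, a worker whose exit reason is UNKNOWN_ERROR no longer makes has_failed_worker() return True, and only FATAL_ERROR does. The snippet below is a minimal, self-contained illustration and not the dlrover implementation. The NodeExitReason string values, the Worker dataclass, and the WorkerManager class are hypothetical stand-ins introduced only so the check can run on its own.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class NodeExitReason:
    # Hypothetical stand-in: these string values are assumed, not taken from dlrover.
    FATAL_ERROR = "fatal_error"
    UNKNOWN_ERROR = "unknown_error"


@dataclass
class Worker:
    # Hypothetical minimal node record; the real node class carries more state.
    exit_reason: Optional[str] = None


class WorkerManager:
    # Hypothetical container that mirrors only the `_nodes` mapping used in the diff.
    def __init__(self, nodes: Dict[int, Worker]):
        self._nodes = nodes

    def has_failed_worker(self) -> bool:
        """Post-commit check: only FATAL_ERROR counts as a failed worker."""
        for worker in self._nodes.values():
            if worker.exit_reason == NodeExitReason.FATAL_ERROR:
                return True
        return False


# One worker exited with UNKNOWN_ERROR, the other has no exit reason yet.
unknown_only = WorkerManager({0: Worker(NodeExitReason.UNKNOWN_ERROR), 1: Worker()})
# One worker exited with FATAL_ERROR.
with_fatal = WorkerManager({0: Worker(NodeExitReason.FATAL_ERROR), 1: Worker()})

print(unknown_only.has_failed_worker())  # False
print(with_fatal.has_failed_worker())  # True
```

Under the pre-commit condition, the first case would also have returned True, since UNKNOWN_ERROR was in the list of failing exit reasons.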
