forked from argoproj/argo-workflows
Commit
Allow containers to be retried. (argoproj#661)
* Allow containers to be retried.

  This commit allows `container` templates to be retried. The user simply adds a 'retries' section to the template. Currently, this section only has a 'limit' field, which sets the number of times the container should be retried in case of failure. Subsequent commits will add more features such as retry policies, retries on specific errors, etc.

  When a container template has a retries section, the workflow controller adds an intermediate node to the workflow. This node acts as the parent of all the retries. The argo command line tool's 'get' command correctly shows the intermediate node and the child nodes along with their pod names and statuses.

  Added unit tests that check the various state transitions of the intermediate node based on the status of the child nodes (which actually execute the pod).

  Testing Done.
  * Unit tests succeeded.
  * Ran a simple workflow that had retries. It completed and the status of the nodes was correct.
  * Ran workflows with parallel steps, each of which had containers with retries. The workflow completed correctly and the retries were applied to each individual container correctly.
  * Ran other workflows which did not have retries. They completed correctly.
  * `argo get <wf-name>` shows the retries and the pod names.

* Incorporate review comments.

* Rename example files for retries.

* Update autogenerated code to include RetryStrategy.

* Call processNodeRetries from executeTemplate.

  Earlier, this was getting called during podReconciliation. Calling it from executeTemplate is better since other non-leaf nodes are also processed in executeTemplate. podReconciliation should only process leaf nodes (which are based on the execution of pods).

* Reduce failure probabilities of retry examples.

  This will prevent spurious failures during e2e tests. Users wanting to experiment with retries will have to explicitly change the failure probability in the YAML files.
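
The intermediate-node behavior described in the commit message can be pictured with a small model. The following is a minimal sketch in Python, not the controller's Go code: `RetryNode`, `Phase`, and this `process_node_retries` are hypothetical names chosen for illustration, and the rule that `limit` counts retries beyond the first attempt is an assumption based on the message's wording ("the number of times the container should be retried"). A missing limit models `retryStrategy: {}`, which the infinite-retries example treats as unlimited.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Phase(Enum):
    RUNNING = "Running"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"


@dataclass
class RetryNode:
    """Hypothetical stand-in for the intermediate node that parents all attempts."""
    limit: Optional[int]  # None models `retryStrategy: {}` (assumed: no limit)
    children: List[Phase] = field(default_factory=list)  # one entry per attempt (pod)
    phase: Phase = Phase.RUNNING


def process_node_retries(node: RetryNode) -> bool:
    """Update node.phase from its children.

    Returns True when a new child attempt should be scheduled.
    """
    if not node.children:
        return True  # nothing has run yet: schedule the first attempt
    last = node.children[-1]
    if last is Phase.RUNNING:
        return False  # wait for the current attempt to finish
    if last is Phase.SUCCEEDED:
        node.phase = Phase.SUCCEEDED
        return False
    # The last attempt failed: retry unless the retry budget is exhausted.
    retries_used = len(node.children) - 1  # the first attempt is not a retry
    if node.limit is not None and retries_used >= node.limit:
        node.phase = Phase.FAILED
        return False
    return True
```

In this model the parent stays `Running` while any attempt is pending or a retry is still allowed, and only settles once a child succeeds or the budget runs out, which is the kind of state transition the unit tests in the commit are described as covering.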
Showing 9 changed files with 452 additions and 70 deletions.
@@ -0,0 +1,19 @@
+# This example demonstrates the use of infinite retries for running
+# the container to completion. It uses the `random-fail` container.
+# For more details, see
+# https://github.com/shrinandj/random-fail
+
+apiVersion: argoproj.io/v1alpha1
+kind: Workflow
+metadata:
+  generateName: container-retries-
+spec:
+  entrypoint: container-retries
+  templates:
+  - name: container-retries
+    retryStrategy: {}
+    container:
+      image: shrinand/random-fail
+      command: ["python"]
+      args: ["/run.py", "40"]
+
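
To get a feel for what unlimited retries cost in this example, the arithmetic can be sketched as below. This is an illustration under stated assumptions, not something taken from the `random-fail` source: it assumes the argument `"40"` is a failure percentage (the `failPct` parameter name used in the steps example suggests this) and that attempts fail independently. The function names are hypothetical.

```python
def expected_attempts(fail_pct: float) -> float:
    """Mean number of attempts until the first success with unlimited retries.

    Assumes each attempt fails independently with probability fail_pct / 100,
    so the attempt count follows a geometric distribution.
    """
    if not 0 <= fail_pct < 100:
        raise ValueError("fail_pct must be in [0, 100)")
    return 1.0 / (1.0 - fail_pct / 100.0)


def prob_all_fail(fail_pct: float, attempts: int) -> float:
    """Probability that `attempts` consecutive independent attempts all fail."""
    return (fail_pct / 100.0) ** attempts


if __name__ == "__main__":
    print(expected_attempts(40))   # mean attempts at a 40% failure rate
    print(prob_all_fail(40, 5))    # chance that five attempts in a row fail
```

Under these assumptions a 40% failure rate needs roughly 1.67 attempts on average, and a run of five consecutive failures happens in about 1% of cases.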
@@ -0,0 +1,19 @@
+# This example demonstrates the use of retries for a single container.
+# It uses the `random-fail` container. For more details, see
+# https://github.com/shrinandj/random-fail
+
+apiVersion: argoproj.io/v1alpha1
+kind: Workflow
+metadata:
+  generateName: container-retries-
+spec:
+  entrypoint: container-retries
+  templates:
+  - name: container-retries
+    retryStrategy:
+      limit: 4
+    container:
+      image: shrinand/random-fail
+      command: ["python"]
+      args: ["/run.py", "0"]
+
@@ -0,0 +1,41 @@
+# This example demonstrates the use of retries with steps.
+# It uses the `random-fail` container. For more details, see
+# https://github.com/shrinandj/random-fail
+
+apiVersion: argoproj.io/v1alpha1
+kind: Workflow
+metadata:
+  generateName: retry-with-steps-
+spec:
+  entrypoint: hello-hello-hello
+  templates:
+  - name: hello-hello-hello
+    steps:
+    - - name: hello1
+        template: random-fail
+        arguments:
+          parameters:
+          - name: failPct
+            value: "0"
+    - - name: hello2a
+        template: random-fail
+        arguments:
+          parameters:
+          - name: failPct
+            value: "0"
+      - name: hello2b
+        template: random-fail
+        arguments:
+          parameters:
+          - name: failPct
+            value: "0"
+  - name: random-fail
+    inputs:
+      parameters:
+      - name: failPct
+    retryStrategy:
+      limit: 4
+    container:
+      image: shrinand/random-fail
+      command: ["python"]
+      args: ["/run.py", "{{inputs.parameters.failPct}}"]
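
The steps example reuses one `random-fail` template for three steps, and the commit message notes that retries are applied to each container individually. A rough way to picture that is the simulation below. It is a sketch only: `run_step` and `run_workflow` are hypothetical helpers, the steps of a parallel group are run one after another for simplicity, and `limit` is assumed to allow one initial attempt plus `limit` retries per step.

```python
from typing import Callable, Dict, List, Tuple

# An attempt takes a step name and reports whether that run succeeded.
Attempt = Callable[[str], bool]


def run_step(name: str, attempt: Attempt, limit: int) -> Tuple[bool, int]:
    """Run one step with its own retry budget; return (succeeded, attempts used)."""
    for used in range(1, limit + 2):  # 1 initial attempt + `limit` retries
        if attempt(name):
            return True, used
    return False, limit + 1


def run_workflow(
    step_groups: List[List[str]], attempt: Attempt, limit: int
) -> Tuple[bool, Dict[str, int]]:
    """Run groups in order; every step in a group gets an independent budget."""
    attempts: Dict[str, int] = {}
    for group in step_groups:
        # Steps in one group are parallel in the workflow; run in turn here.
        results = [run_step(name, attempt, limit) for name in group]
        for name, (_, used) in zip(group, results):
            attempts[name] = used
        if not all(ok for ok, _ in results):
            return False, attempts  # a step exhausted its retries
    return True, attempts


if __name__ == "__main__":
    # Scripted outcomes stand in for the random container exit codes.
    script = {
        "hello1": iter([True]),
        "hello2a": iter([False, False, True]),
        "hello2b": iter([True]),
    }
    groups = [["hello1"], ["hello2a", "hello2b"]]
    print(run_workflow(groups, lambda n: next(script[n]), limit=4))
```

Here `hello2a` uses up two retries without affecting the budget of `hello2b`, mirroring the per-container behavior the testing notes describe.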