Resubmit/Retry with the same workflow ID #762

Closed
dvavili opened this issue Feb 22, 2018 · 5 comments

Comments

@dvavili
Contributor

dvavili commented Feb 22, 2018

Is this a BUG REPORT or FEATURE REQUEST?: FEATURE REQUEST

Add an argo retry command or an argo resubmit --retry option that resubmits a workflow without creating a new workflow object, so that the workflow can be retried from the failed step/task. argo resubmit --memoized already lets you retry a failed workflow, but the memoized steps in the resubmitted workflow refer back to the previous workflows' pods. This becomes a nightmare when a user wants to look at the logs of all the pods in a workflow. The previous workflows may also have been deleted, in which case their pods are gone as well.

Example of a workflow with multiple references:

argo get resubmit-b18gf
Name:             resubmit-b18gf
Namespace:        namespace1
ServiceAccount:   default
Status:           Succeeded
Created:          Wed Feb 21 16:13:37 -0800 (14 seconds ago)
Started:          Wed Feb 21 16:13:37 -0800 (14 seconds ago)
Finished:         Wed Feb 21 16:13:45 -0800 (6 seconds ago)
Duration:         8 seconds

STEP                 PODNAME                    DURATION  MESSAGE
 ✔ resubmit-b18gf
 ├-○ A                                                    original pod: resubmit-vqmml-2974682433
 ├-✔ B
 | ├-·-○ randfail1a                                       original pod: resubmit-jgwnb-182623956
 | | └-○ randfail1b                                       original pod: resubmit-32e2f-2053918229
 | └-·-○ randfail2a                                       original pod: resubmit-cabp0-1503908466
 |   ├-○ randfail2b                                       original pod: resubmit-jgwnb-175416469
 |   ├-✔ randfail2c  resubmit-b18gf-2152986244  2s
 |   └-○ randfail2d                                       original pod: resubmit-jgwnb-74750755
 ├-✔ D               resubmit-b18gf-3496078573  3s
 └-✔ C               resubmit-b18gf-3378635240  3s

Having this feature means that all the information in the workflow remains intact and traceable.
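To make the scattering concrete, here is a rough Python sketch (not part of Argo) that groups the memoized steps from the output above by the workflow that owns their pods. The rows are copied from the listing with the tree glyphs trimmed. It assumes a pod name is the workflow name followed by a numeric suffix, which matches the sample output but is not something I have checked against the controller code. The function name `group_memoized_steps` is made up for illustration.

```python
import re
from collections import defaultdict

# Rows copied from the `argo get` output above, tree glyphs trimmed.
ARGO_GET_ROWS = """\
A            original pod: resubmit-vqmml-2974682433
randfail1a   original pod: resubmit-jgwnb-182623956
randfail1b   original pod: resubmit-32e2f-2053918229
randfail2a   original pod: resubmit-cabp0-1503908466
randfail2b   original pod: resubmit-jgwnb-175416469
randfail2c   resubmit-b18gf-2152986244  2s
randfail2d   original pod: resubmit-jgwnb-74750755
D            resubmit-b18gf-3496078573  3s
C            resubmit-b18gf-3378635240  3s
"""

# Assumption: pod name = "<workflow name>-<numeric suffix>".
ORIGINAL_POD = re.compile(r"^(\S+)\s+original pod: (\S+)-\d+\s*$")


def group_memoized_steps(rows):
    """Map each earlier workflow name to the steps whose pods it owns."""
    owners = defaultdict(list)
    for line in rows.splitlines():
        match = ORIGINAL_POD.match(line)
        if match:
            step, workflow = match.groups()
            owners[workflow].append(step)
    return dict(owners)


owners = group_memoized_steps(ARGO_GET_ROWS)
for workflow, steps in sorted(owners.items()):
    print(f"{workflow}: {', '.join(steps)}")
```

For this one workflow the logs of the memoized steps are spread over four earlier workflows, any of which may already have been deleted.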

@dvavili
Contributor Author

dvavili commented Feb 22, 2018

Or perhaps change the behavior of argo resubmit --memoized so that it resubmits under the same workflow ID?

@jessesuen
Member

I agree with the use case. Another reason for using the same workflow name is if artifacts need to be placed together. What about:

argo resubmit WORKFLOW --preserve-name

Then this flag could be used in non-memoized and memoized mode. e.g.:

argo resubmit WORKFLOW --preserve-name --memoized

@dvavili
Contributor Author

dvavili commented Feb 22, 2018

That sounds good. I'm ok with the proposed change, but it seems to me that this behavior could be the default and would offer a better experience. Is there a specific reason to not have this as the default behavior and have a flag to resubmit under a different workflow name?

@jessesuen
Member

I think you are right. argo resubmit and argo retry may be the most common ways to "redo" a workflow and we should make those the most convenient to run. So here would be the three use cases to support, and their respective commands:

  1. Resubmit workflow. Create a new workflow and carry over only the spec from the previous submission. Do not carry over the status.
     argo resubmit workflowname
  2. Resubmit a memoized workflow. Create a new workflow and carry over only the spec, and just the successful steps from the previous submission.
     argo resubmit workflowname --memoized
  3. "Retry" the workflow. Preserve the workflow name. Erase any failed/errored state from the workflow status, and delete the failed pods. The controller will react as if the failures never happened, and begin operating from the last successful state. It is the user's responsibility to ensure that the workflow steps are idempotent.
     argo retry workflowname
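The difference between the three cases can be sketched on a toy model. This is Python over plain dicts, not Argo's Workflow object or controller code; the `name`/`spec`/`nodes` layout, the phase strings, and the function names `resubmit` and `retry` are all assumptions chosen for illustration.

```python
import copy

# Hypothetical phase names for steps that should be re-run on retry.
FAILED_PHASES = {"Failed", "Error"}


def resubmit(wf, new_name, memoized=False):
    """Cases 1 and 2: build a NEW workflow object from the old spec.

    Without memoized, no status is carried over. With memoized, only the
    succeeded nodes (and their references to the old pods) are copied.
    """
    nodes = {}
    if memoized:
        nodes = {
            step: copy.deepcopy(node)
            for step, node in wf["nodes"].items()
            if node["phase"] == "Succeeded"
        }
    return {"name": new_name, "spec": copy.deepcopy(wf["spec"]), "nodes": nodes}


def retry(wf):
    """Case 3: keep the workflow name and drop failed/errored nodes.

    Returns the cleaned workflow plus the pods that would need deleting.
    """
    kept = {
        step: copy.deepcopy(node)
        for step, node in wf["nodes"].items()
        if node["phase"] not in FAILED_PHASES
    }
    pods_to_delete = sorted(
        node["pod"]
        for node in wf["nodes"].values()
        if node["phase"] in FAILED_PHASES
    )
    cleaned = {"name": wf["name"], "spec": copy.deepcopy(wf["spec"]), "nodes": kept}
    return cleaned, pods_to_delete


# Made-up example workflow with one succeeded and two unsuccessful steps.
failed_wf = {
    "name": "wf-1",
    "spec": {"entrypoint": "main"},
    "nodes": {
        "A": {"phase": "Succeeded", "pod": "wf-1-111"},
        "B": {"phase": "Failed", "pod": "wf-1-222"},
        "C": {"phase": "Error", "pod": "wf-1-333"},
    },
}

fresh = resubmit(failed_wf, "wf-2")
memo = resubmit(failed_wf, "wf-3", memoized=True)
retried, doomed = retry(failed_wf)
```

Only `retry` keeps the original name, so every pod that remains in its status belongs to the same workflow. The memoized resubmit still points at pods owned by `wf-1`.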

@bklaubos

@jessesuen @lippertmarkus :
argo resubmit workflowname --memoized - will this work even if all the pods (regardless of status) have been deleted but the workflow's artifacts have been saved to S3?

argo retry - will this work even if all the pods (regardless of status) have been deleted but the workflow's artifacts have been saved to S3?

icecoffee531 pushed a commit to icecoffee531/argo-workflows that referenced this issue Jan 5, 2022
There was a word duplicated in docs. On running codegen I noticed that the mockery command was outdated as well