Invalid parameter when using exit hook #6948

Closed
aaronmell opened this issue Oct 15, 2021 · 6 comments

@aaronmell
Contributor

Summary

What happened/what you expected to happen?
In the example below, when a step/task has retries and an onExit hook, and the onExit hook takes the output of that step as its input, the input passed to the onExit handler is invalid.

image

What version of Argo Workflows are you running?
3.2.0 and 3.1.13

Diagnostics

Either a workflow that reproduces the bug, or paste your whole workflow YAML, including status, something like:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: exit-handler-with-param-
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: step-1
            template: output
            hooks:
              exit:
                template: exit
                arguments:
                  parameters:
                    - name: message
                      value: "{{steps.step-1.outputs.parameters.result}}"
    
    - name: output
      container:
        image: alpine:latest
        command: [sh, -c]
        args: ["echo -n hello world > /tmp/hello_world.txt; exit 1"]
      retryStrategy:
        limit: '1'
        backoff:
          duration: '1s'
        retryPolicy: Always
      outputs:
        parameters:
          - name: result
            valueFrom:
              default: "Foobar"   # Default value to use if retrieving valueFrom fails. If not provided workflow will fail instead
              path: /tmp/hello_world.txt
    
    - name: exit
      inputs:
        parameters:
          - name: message
            value: GoodValue
      script:
        image: alpine:latest
        command: [ sh, -c ]
        args: ["echo {{inputs.parameters.message}}"]

What Kubernetes provider are you using?

What executor are you running? Docker/K8SAPI/Kubelet/PNS/Emissary
Docker

# Logs from the workflow controller:
time="2021-10-15T19:13:06.912Z" level=info msg="Processing workflow" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:06.921Z" level=info msg="Updated phase  -> Running" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:06.922Z" level=info msg="Steps node exit-handler-with-param-w5cft initialized Running" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:06.922Z" level=info msg="StepGroup node exit-handler-with-param-w5cft-1417826313 initialized Running" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:06.922Z" level=info msg="Retry node exit-handler-with-param-w5cft-4240249269 initialized Running" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:06.923Z" level=info msg="Pod node exit-handler-with-param-w5cft-2660966932 initialized Pending" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:06.966Z" level=info msg="Created pod: exit-handler-with-param-w5cft[0].step-1(0) (exit-handler-with-param-w5cft-output-2660966932)" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:06.966Z" level=info msg="Workflow step group node exit-handler-with-param-w5cft-1417826313 not yet completed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:06.966Z" level=info msg="TaskSet Reconciliation" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:06.966Z" level=info msg=reconcileAgentPod namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:06.985Z" level=info msg="Workflow update successful" namespace=argo-workflows phase=Running resourceVersion=367148647 workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:16.968Z" level=info msg="Processing workflow" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:16.969Z" level=info msg="Pod failed: Error (exit code 1)" displayName="step-1(0)" namespace=argo-workflows pod=exit-handler-with-param-w5cft-output-2660966932 templateName=output workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:16.969Z" level=info msg="Updating node exit-handler-with-param-w5cft-2660966932 exit code 1" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:16.969Z" level=info msg="Setting node exit-handler-with-param-w5cft-2660966932 outputs: {\"parameters\":[{\"name\":\"result\",\"value\":\"hello world\",\"valueFrom\":{\"path\":\"/tmp/hello_world.txt\",\"default\":\"Foobar\"}}],\"artifacts\":[{\"name\":\"main-logs\",\"s3\":{\"key\":\"exit-handler-with-param-w5cft/exit-handler-with-param-w5cft-output-2660966932/main.log\"}}]}" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:16.969Z" level=info msg="Updating node exit-handler-with-param-w5cft-2660966932 status Pending -> Failed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:16.969Z" level=info msg="Updating node exit-handler-with-param-w5cft-2660966932 message: Error (exit code 1)" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:16.970Z" level=info msg="node has maxDuration set, setting executionDeadline to: Mon Jan 01 00:00:00 +0000 (a long while ago)" namespace=argo-workflows node="exit-handler-with-param-w5cft[0].step-1" workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:16.970Z" level=info msg="1 child nodes of exit-handler-with-param-w5cft[0].step-1 failed. Trying again..." namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:16.970Z" level=info msg="Pod node exit-handler-with-param-w5cft-2056825553 initialized Pending" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:17.005Z" level=info msg="Created pod: exit-handler-with-param-w5cft[0].step-1(1) (exit-handler-with-param-w5cft-output-2056825553)" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:17.005Z" level=info msg="Workflow step group node exit-handler-with-param-w5cft-1417826313 not yet completed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:17.005Z" level=info msg="TaskSet Reconciliation" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:17.005Z" level=info msg=reconcileAgentPod namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:17.020Z" level=info msg="Workflow update successful" namespace=argo-workflows phase=Running resourceVersion=367149161 workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:17.027Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo-workflows/exit-handler-with-param-w5cft-output-2660966932/labelPodCompleted
time="2021-10-15T19:13:27.005Z" level=info msg="Processing workflow" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.006Z" level=info msg="Pod failed: Error (exit code 1)" displayName="step-1(1)" namespace=argo-workflows pod=exit-handler-with-param-w5cft-output-2056825553 templateName=output workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.006Z" level=info msg="Updating node exit-handler-with-param-w5cft-2056825553 exit code 1" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.006Z" level=info msg="Setting node exit-handler-with-param-w5cft-2056825553 outputs: {\"parameters\":[{\"name\":\"result\",\"value\":\"hello world\",\"valueFrom\":{\"path\":\"/tmp/hello_world.txt\",\"default\":\"Foobar\"}}],\"artifacts\":[{\"name\":\"main-logs\",\"s3\":{\"key\":\"exit-handler-with-param-w5cft/exit-handler-with-param-w5cft-output-2056825553/main.log\"}}]}" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.006Z" level=info msg="Updating node exit-handler-with-param-w5cft-2056825553 status Pending -> Failed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.006Z" level=info msg="Updating node exit-handler-with-param-w5cft-2056825553 message: Error (exit code 1)" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.006Z" level=info msg="node has maxDuration set, setting executionDeadline to: Mon Jan 01 00:00:00 +0000 (a long while ago)" namespace=argo-workflows node="exit-handler-with-param-w5cft[0].step-1" workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.006Z" level=info msg="No more retries left. Failing..." namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.007Z" level=info msg="node exit-handler-with-param-w5cft-4240249269 phase Running -> Failed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.007Z" level=info msg="node exit-handler-with-param-w5cft-4240249269 message: No more retries left" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.007Z" level=info msg="node exit-handler-with-param-w5cft-4240249269 finished: 2021-10-15 19:13:27.007073079 +0000 UTC" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.007Z" level=info msg="Running OnExit handler" lifeCycleHook="&LifecycleHook{Template:exit,Arguments:Arguments{Parameters:[]Parameter{Parameter{Name:message,Default:nil,Value:*{{steps.step-1.outputs.parameters.result}},ValueFrom:nil,GlobalName:,Enum:[],},},Artifacts:[]Artifact{},},}" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.007Z" level=info msg="Pod node exit-handler-with-param-w5cft-167558300 initialized Pending" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.052Z" level=info msg="Created pod: exit-handler-with-param-w5cft[0].step-1.onExit (exit-handler-with-param-w5cft-exit-167558300)" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.052Z" level=info msg="Workflow step group node exit-handler-with-param-w5cft-1417826313 not yet completed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.052Z" level=info msg="TaskSet Reconciliation" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.052Z" level=info msg=reconcileAgentPod namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.072Z" level=info msg="Workflow update successful" namespace=argo-workflows phase=Running resourceVersion=367149669 workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:27.077Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo-workflows/exit-handler-with-param-w5cft-output-2056825553/labelPodCompleted
time="2021-10-15T19:13:37.055Z" level=info msg="Processing workflow" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.055Z" level=info msg="Updating node exit-handler-with-param-w5cft-167558300 exit code 0" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.055Z" level=info msg="Setting node exit-handler-with-param-w5cft-167558300 outputs: {\"artifacts\":[{\"name\":\"main-logs\",\"s3\":{\"key\":\"exit-handler-with-param-w5cft/exit-handler-with-param-w5cft-exit-167558300/main.log\"}}]}" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.055Z" level=info msg="Updating node exit-handler-with-param-w5cft-167558300 status Pending -> Succeeded" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.056Z" level=info msg="Running OnExit handler" lifeCycleHook="&LifecycleHook{Template:exit,Arguments:Arguments{Parameters:[]Parameter{Parameter{Name:message,Default:nil,Value:*{{steps.step-1.outputs.parameters.result}},ValueFrom:nil,GlobalName:,Enum:[],},},Artifacts:[]Artifact{},},}" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="Step group node exit-handler-with-param-w5cft-1417826313 deemed failed: child 'exit-handler-with-param-w5cft-4240249269' failed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="node exit-handler-with-param-w5cft-1417826313 phase Running -> Failed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="node exit-handler-with-param-w5cft-1417826313 message: child 'exit-handler-with-param-w5cft-4240249269' failed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="node exit-handler-with-param-w5cft-1417826313 finished: 2021-10-15 19:13:37.057065407 +0000 UTC" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="step group exit-handler-with-param-w5cft-1417826313 was unsuccessful: child 'exit-handler-with-param-w5cft-4240249269' failed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="Outbound nodes of exit-handler-with-param-w5cft-4240249269 is [exit-handler-with-param-w5cft-167558300]" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="Outbound nodes of exit-handler-with-param-w5cft is [exit-handler-with-param-w5cft-167558300]" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="node exit-handler-with-param-w5cft phase Running -> Failed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="node exit-handler-with-param-w5cft message: child 'exit-handler-with-param-w5cft-4240249269' failed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="node exit-handler-with-param-w5cft finished: 2021-10-15 19:13:37.057177859 +0000 UTC" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="Checking daemoned children of exit-handler-with-param-w5cft" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="TaskSet Reconciliation" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg=reconcileAgentPod namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="Updated phase Running -> Failed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="Updated message  -> child 'exit-handler-with-param-w5cft-4240249269' failed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="Marking workflow completed" namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.057Z" level=info msg="Checking daemoned children of " namespace=argo-workflows workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.084Z" level=info msg="Workflow update successful" namespace=argo-workflows phase=Failed resourceVersion=367150195 workflow=exit-handler-with-param-w5cft
time="2021-10-15T19:13:37.090Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo-workflows/exit-handler-with-param-w5cft-exit-167558300/labelPodCompleted


# Logs from your workflow's wait container, something like:
time="2021-10-15T19:13:32.764Z" level=info msg="listed containers" containers="map[init:{3010713a109bea0b8daff5eb3d71577f1f674ad38849de64be87fcfb1ce985ad Exited {0 63769922008 <nil>}} main:{c95eaf8291310f49e74c2cef6c87900dccc677ba4c0fed3f6339b1d14f56123b Exited {0 63769922010 <nil>}} wait:{8f6d1a4e09b6161887237094720d4d0873da9149fa878e3e4b588b66b4ec2c86 Up {0 63769922009 <nil>}}]"
time="2021-10-15T19:13:32.791Z" level=info msg="listed containers" containers="map[init:{3010713a109bea0b8daff5eb3d71577f1f674ad38849de64be87fcfb1ce985ad Exited {0 63769922008 <nil>}} main:{c95eaf8291310f49e74c2cef6c87900dccc677ba4c0fed3f6339b1d14f56123b Exited {0 63769922010 <nil>}} wait:{8f6d1a4e09b6161887237094720d4d0873da9149fa878e3e4b588b66b4ec2c86 Up {0 63769922009 <nil>}}]"
time="2021-10-15T19:13:32.791Z" level=info msg="Killing sidecars [\"init\"]"
time="2021-10-15T19:13:32.797Z" level=info msg="Get pods 200"
time="2021-10-15T19:13:32.798Z" level=info msg="docker kill --signal TERM 3010713a109bea0b8daff5eb3d71577f1f674ad38849de64be87fcfb1ce985ad"
time="2021-10-15T19:13:32.823Z" level=error msg="`docker kill --signal TERM 3010713a109bea0b8daff5eb3d71577f1f674ad38849de64be87fcfb1ce985ad` failed: Error response from daemon: Cannot kill container: 3010713a109bea0b8daff5eb3d71577f1f674ad38849de64be87fcfb1ce985ad: Container 3010713a109bea0b8daff5eb3d71577f1f674ad38849de64be87fcfb1ce985ad is not running\n"
time="2021-10-15T19:13:32.823Z" level=warning msg="Ignored error from 'docker kill --signal TERM': Error response from daemon: Cannot kill container: 3010713a109bea0b8daff5eb3d71577f1f674ad38849de64be87fcfb1ce985ad: Container 3010713a109bea0b8daff5eb3d71577f1f674ad38849de64be87fcfb1ce985ad is not running"
time="2021-10-15T19:13:32.823Z" level=info msg="[docker wait 3010713a109bea0b8daff5eb3d71577f1f674ad38849de64be87fcfb1ce985ad]"
time="2021-10-15T19:13:32.850Z" level=info msg="Containers [3010713a109bea0b8daff5eb3d71577f1f674ad38849de64be87fcfb1ce985ad] killed successfully"
time="2021-10-15T19:13:32.850Z" level=info msg="Alloc=6073 TotalAlloc=16986 Sys=74065 NumGC=6 Goroutines=12"
time="2021-10-15T19:13:21.446Z" level=info msg="sh -c docker cp -a 59eb86b794316adf436d29f9aff260795d474f98ccbbf977be9301a07da357af:/tmp/hello_world.txt - | tar -ax -O"
time="2021-10-15T19:13:21.480Z" level=info msg="listed containers" containers="map[main:{59eb86b794316adf436d29f9aff260795d474f98ccbbf977be9301a07da357af Exited {0 63769921999 <nil>}} wait:{5826854caae1594a515f1ae4f2df59e0f8d2457210c185e5147f09e661ba1ccb Up {0 63769921998 <nil>}}]"
time="2021-10-15T19:13:21.568Z" level=info msg="Successfully saved output parameter: result"
time="2021-10-15T19:13:21.568Z" level=info msg="No output artifacts"
time="2021-10-15T19:13:21.568Z" level=info msg="Annotating pod with output"
time="2021-10-15T19:13:21.600Z" level=info msg="Patch pods 200"
time="2021-10-15T19:13:21.615Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=argo-workflows --filter=label=io.kubernetes.pod.name=exit-handler-with-param-w5cft-output-2056825553"
time="2021-10-15T19:13:21.646Z" level=info msg="listed containers" containers="map[main:{59eb86b794316adf436d29f9aff260795d474f98ccbbf977be9301a07da357af Exited {0 63769921999 <nil>}} wait:{5826854caae1594a515f1ae4f2df59e0f8d2457210c185e5147f09e661ba1ccb Up {0 63769921998 <nil>}}]"
time="2021-10-15T19:13:21.646Z" level=info msg="Killing sidecars []"
time="2021-10-15T19:13:21.646Z" level=info msg="Alloc=5942 TotalAlloc=16864 Sys=74065 NumGC=6 Goroutines=12"
time="2021-10-15T19:13:11.494Z" level=info msg="Copying /tmp/hello_world.txt from base image layer"
time="2021-10-15T19:13:11.495Z" level=info msg="sh -c docker cp -a 2f2ee311df102c48aa541034b25527d0392a8ad378613c2909c3f384e6dfd440:/tmp/hello_world.txt - | tar -ax -O"
time="2021-10-15T19:13:11.619Z" level=info msg="Successfully saved output parameter: result"
time="2021-10-15T19:13:11.619Z" level=info msg="No output artifacts"
time="2021-10-15T19:13:11.619Z" level=info msg="Annotating pod with output"
time="2021-10-15T19:13:11.649Z" level=info msg="Patch pods 200"
time="2021-10-15T19:13:11.658Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=argo-workflows --filter=label=io.kubernetes.pod.name=exit-handler-with-param-w5cft-output-2660966932"
time="2021-10-15T19:13:11.687Z" level=info msg="listed containers" containers="map[main:{2f2ee311df102c48aa541034b25527d0392a8ad378613c2909c3f384e6dfd440 Exited {0 63769921989 <nil>}} wait:{8b0ae12b163e52264745de622a8c65f0cd245121d94a387125857ff220500879 Up {0 63769921988 <nil>}}]"
time="2021-10-15T19:13:11.687Z" level=info msg="Killing sidecars []"
time="2021-10-15T19:13:11.687Z" level=info msg="Alloc=7428 TotalAlloc=16883 Sys=74065 NumGC=5 Goroutines=12"

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@aaronmell
Contributor Author

It works correctly if the retries are removed. It also works correctly if the step with the exit handler doesn't fail.
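
For readers following along, here is a minimal Go sketch of why a hook argument can come through unresolved. It is not Argo's template engine: `resolve`, the regular expression, and the scope maps are all hypothetical stand-ins. The assumption, based on the fix discussed later in this thread, is that the scope used for the exit hook was built from a node that carried no outputs.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder matches simple {{key}} references. This is a simplified,
// hypothetical stand-in for Argo's template engine, not its real code.
var placeholder = regexp.MustCompile(`\{\{\s*([^{}\s]+)\s*\}\}`)

// resolve substitutes every {{key}} that is present in scope and reports
// the keys it could not resolve. Unresolved placeholders are left as-is.
func resolve(value string, scope map[string]string) (string, []string) {
	var missing []string
	out := placeholder.ReplaceAllStringFunc(value, func(m string) string {
		key := placeholder.FindStringSubmatch(m)[1]
		if v, ok := scope[key]; ok {
			return v
		}
		missing = append(missing, key)
		return m
	})
	return out, missing
}

func main() {
	arg := "{{steps.step-1.outputs.parameters.result}}"

	// Scope built from a node that carries the output parameter.
	withOutputs := map[string]string{
		"steps.step-1.outputs.parameters.result": "hello world",
	}
	// Scope built from a node with no recorded outputs (assumed to be
	// the situation for the failed retry node in this issue).
	withoutOutputs := map[string]string{}

	v, missing := resolve(arg, withOutputs)
	fmt.Printf("value=%q missing=%v\n", v, missing)

	v, missing = resolve(arg, withoutOutputs)
	fmt.Printf("value=%q missing=%v\n", v, missing)
}
```

In the second call the placeholder survives substitution, which is consistent with the hook arguments in the controller log above still showing the raw template string.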

@smile-luobin
Contributor

Trying to fix this bug in PR #6956.

@sarabala1979
Member

@smile-luobin The above PR will not fix this issue. I am working on a fix. If you would like to take my fix and merge it into your PR, I will create a draft PR for your reference.
Here is the fix in step.go at line 263 (tested, working):

			exitNode := &childNode
			if node.Type == wfv1.NodeTypeRetry {
				exitNode = getChildNodeIndex(&childNode, woc.wf.Status.Nodes, -1)
			}
			hasOnExitNode, onExitNode, err := woc.runOnExitNode(ctx, step.GetExitHook(woc.execWf.Spec.Arguments), childNode.Name, stepsCtx.boundaryID, stepsCtx.tmplCtx, "steps."+step.Name, exitNode.Outputs)
			if hasOnExitNode && (onExitNode == nil || !onExitNode.Fulfilled() || err != nil) {
				// The onExit node is either not complete or has errored out, return.
				completed = false
			}

Here is the fix in dag.go at line 248:

		if taskNode != nil && taskNode.Fulfilled() {
			if taskNode.Completed() {
				fmt.Println("tasknodeComplete", taskNode)
				exitNode := taskNode
				if node.Type == wfv1.NodeTypeRetry {
					exitNode = getChildNodeIndex(taskNode, woc.wf.Status.Nodes, -1)
				}
				fmt.Println("exitnode:  ", exitNode)
				// Run the node's onExit node, if any. Since this is a target task, we don't need to consider the status
				// of the onExit node before continuing. That will be done in assesDAGPhase
				_, _, err := woc.runOnExitNode(ctx, dagCtx.GetTask(taskName).GetExitHook(woc.execWf.Spec.Arguments), taskNode.Name, dagCtx.boundaryID, dagCtx.tmplCtx, "tasks."+taskName, exitNode.Outputs)
				if err != nil {
					return node, err
				}
			}
		}
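
The selection logic shared by both snippets can be sketched in isolation. This is only an illustration: `Node`, `NodeType`, `childAt`, and `exitOutputs` are simplified, hypothetical stand-ins, not the real `wfv1` types or the real `getChildNodeIndex`. It assumes that an index of -1 means "last child", which matches the later mention of lastChildNode in this thread.

```go
package main

import "fmt"

// NodeType and Node are simplified, hypothetical stand-ins for the wfv1
// types referenced in the thread; they are not the real API.
type NodeType string

const (
	NodeTypePod   NodeType = "Pod"
	NodeTypeRetry NodeType = "Retry"
)

type Node struct {
	ID       string
	Type     NodeType
	Children []string          // IDs of child nodes, in creation order
	Outputs  map[string]string // output parameter name -> value
}

// childAt returns the child at index i. A negative index counts from the
// end, so -1 is the last child. It returns nil when i is out of range.
func childAt(n *Node, nodes map[string]*Node, i int) *Node {
	if i < 0 {
		i += len(n.Children)
	}
	if i < 0 || i >= len(n.Children) {
		return nil
	}
	return nodes[n.Children[i]]
}

// exitOutputs picks the outputs an exit hook should see: for a retry node,
// the outputs of its last attempt; otherwise the node's own outputs.
func exitOutputs(n *Node, nodes map[string]*Node) map[string]string {
	if n.Type == NodeTypeRetry {
		if last := childAt(n, nodes, -1); last != nil {
			return last.Outputs
		}
	}
	return n.Outputs
}

func main() {
	// Shortened node IDs modelled on the controller log in this issue.
	nodes := map[string]*Node{
		"retry":     {ID: "retry", Type: NodeTypeRetry, Children: []string{"attempt-0", "attempt-1"}},
		"attempt-0": {ID: "attempt-0", Type: NodeTypePod, Outputs: map[string]string{"result": "hello world"}},
		"attempt-1": {ID: "attempt-1", Type: NodeTypePod, Outputs: map[string]string{"result": "hello world"}},
	}
	fmt.Println(exitOutputs(nodes["retry"], nodes)["result"])
}
```

Keeping the choice in one helper means the steps and DAG code paths cannot drift apart on which node's outputs the hook receives.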

@smile-luobin
Contributor

smile-luobin commented Oct 19, 2021

@sarabala1979 LGTM.
One more thing: I think this code could be moved into the function runOnExitNode in exit_handler.go:

                  exitNode := taskNode
                  if node.Type == wfv1.NodeTypeRetry {
                      exitNode = getChildNodeIndex(taskNode, woc.wf.Status.Nodes, -1)
                  }

PR #6956 may also fix another issue: should the outputs of a retry node's last child node (whether it failed or succeeded) be copied to the retry node's outputs or not?

@sarabala1979
Member

@smile-luobin Sounds good to me. We can pass the node object into runOnExitNode. You can combine both in one PR.

@smile-luobin
Contributor

smile-luobin commented Oct 20, 2021

@sarabala1979 OK, I will do it.

Should the outputs of a retry node's last child node (whether it failed or succeeded) be copied to the retry node's outputs or not? If the answer is yes, this will be done in PR #6956. Since the last child node's outputs would be copied to the retry node before runOnExitNode runs, I think that will fix this issue as well.
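
A rough sketch of that alternative, for illustration only: once a retry node finishes, copy the last attempt's outputs onto the retry node itself, so anything that reads the retry node afterwards (including an exit hook) sees them. `Node` and `finalizeRetry` are hypothetical names and do not reflect how PR #6956 is actually implemented.

```go
package main

import "fmt"

// Node is a simplified, hypothetical model of a workflow node; it is not
// the real NodeStatus type.
type Node struct {
	Name     string
	IsRetry  bool
	Children []*Node           // attempts, in the order they ran
	Outputs  map[string]string // output parameter name -> value
}

// finalizeRetry copies the last attempt's outputs onto the retry node,
// regardless of whether that attempt succeeded or failed. The map is
// copied rather than shared so later edits to one node do not leak into
// the other.
func finalizeRetry(retry *Node) {
	if !retry.IsRetry || len(retry.Children) == 0 {
		return
	}
	last := retry.Children[len(retry.Children)-1]
	retry.Outputs = make(map[string]string, len(last.Outputs))
	for k, v := range last.Outputs {
		retry.Outputs[k] = v
	}
}

func main() {
	attempt0 := &Node{Name: "step-1(0)", Outputs: map[string]string{"result": "hello world"}}
	attempt1 := &Node{Name: "step-1(1)", Outputs: map[string]string{"result": "hello world"}}
	retry := &Node{Name: "step-1", IsRetry: true, Children: []*Node{attempt0, attempt1}}

	finalizeRetry(retry)

	// The exit hook can now read the parameter from the retry node itself.
	fmt.Println(retry.Outputs["result"])
}
```

Compared with picking the last child at each call site, this keeps the callers unchanged, at the cost of deciding once and for all that a failed attempt's outputs count as the retry node's outputs.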

SunSparc pushed a commit to SunSparc/argo-workflows that referenced this issue Oct 22, 2021
…ixes argoproj#6948 (argoproj#6956)

* fix(controller): fix bugs when process retry node ouput

Signed-off-by: smile-luobin <[email protected]>

* fix(controller): fix runOnExitNode unable to get retryNode outputs

Signed-off-by: smile-luobin <[email protected]>
Signed-off-by: Jonathan Duncan <[email protected]>
kriti-sc pushed a commit to kriti-sc/argo-workflows that referenced this issue Oct 24, 2021
…ixes argoproj#6948 (argoproj#6956)

* fix(controller): fix bugs when process retry node ouput

Signed-off-by: smile-luobin <[email protected]>

* fix(controller): fix runOnExitNode unable to get retryNode outputs

Signed-off-by: smile-luobin <[email protected]>
Signed-off-by: kriti-sc <[email protected]>
sarabala1979 pushed a commit that referenced this issue Oct 26, 2021
…ixes #6948 (#6956)

* fix(controller): fix bugs when process retry node ouput

Signed-off-by: smile-luobin <[email protected]>

* fix(controller): fix runOnExitNode unable to get retryNode outputs

Signed-off-by: smile-luobin <[email protected]>