Workflow becomes completed before some template triggered by LifecycleHook Running is completed #11113

Closed
3 tasks done
toyamagu-2021 opened this issue May 21, 2023 · 2 comments · Fixed by #11176
Labels
P3 Low priority type/bug

Comments

@toyamagu-2021
Member

toyamagu-2021 commented May 21, 2023

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when tested with :latest
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

Description

  • The workflow is marked as completed before the templates triggered by the Running LifecycleHook have completed, and the hook node's status never changes from Running.

  • The workflow's Progress stays at 1/2 permanently.

Consideration

  • I'm not sure, but I think the root cause is the logic in executeWfLifeCycleHook.

  • The expression of the Running lifecycle hook is "workflow.status == \"Running\"", so argoexpr.EvalBool returns false once the main template finishes and workflow.status becomes Succeeded.

    • As a result, hookNode is not added to hookNodes, and executeWfLifeCycleHook returns true even if some nodes created by the hook are still in the Running phase.
  • To confirm, I added the following code to the master branch (a PoC only, obviously not a proper fix), and it seems to work.

     execute, err := argoexpr.EvalBool(hook.Expression, env.GetFuncMap(template.EnvMap(woc.globalParams)))
     if hook.Expression == "workflow.status == \"Running\"" && woc.globalParams["workflow.status"] == "Succeeded" {
     	execute = true
     }
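
To make the suspected logic concrete, here is a minimal standalone sketch in plain Go. It is not the controller code: the `hook` and `phase` types and both `hooksDone...` functions are hypothetical stand-ins, and `when` only imitates evaluating `hook.Expression` against `workflow.status`. It assumes a hook counts as done only when its node is Succeeded, which ignores failure phases for brevity.

```go
package main

import "fmt"

// phase is a simplified stand-in for a node phase in the workflow status.
type phase string

const (
	phaseRunning   phase = "Running"
	phaseSucceeded phase = "Succeeded"
)

// hook is a hypothetical, simplified model of a workflow-level lifecycle hook.
// when stands in for evaluating hook.Expression against workflow.status.
type hook struct {
	nodeName string
	when     func(workflowStatus string) bool
}

// hooksDoneSkippingFalse models the behaviour described above: a hook whose
// expression is currently false is skipped entirely, so a node it created
// earlier is never checked.
func hooksDoneSkippingFalse(hooks []hook, status string, nodes map[string]phase) bool {
	for _, h := range hooks {
		if !h.when(status) {
			continue
		}
		if nodes[h.nodeName] != phaseSucceeded {
			return false
		}
	}
	return true
}

// hooksDoneCheckingExisting also waits for hook nodes that already exist,
// whatever the expression evaluates to now.
func hooksDoneCheckingExisting(hooks []hook, status string, nodes map[string]phase) bool {
	for _, h := range hooks {
		p, exists := nodes[h.nodeName]
		if !exists && !h.when(status) {
			continue // the hook never fired and should not fire now
		}
		if p != phaseSucceeded {
			return false
		}
	}
	return true
}

func main() {
	hooks := []hook{{
		nodeName: "wf.hooks.running",
		when:     func(s string) bool { return s == "Running" },
	}}
	// The main template has finished, but the hook pod is still sleeping.
	nodes := map[string]phase{"wf.hooks.running": phaseRunning}

	fmt.Println(hooksDoneSkippingFalse(hooks, "Succeeded", nodes))    // true: workflow would be marked completed too early
	fmt.Println(hooksDoneCheckingExisting(hooks, "Succeeded", nodes)) // false: still waiting for the hook node
}
```

The second function shows one possible direction (check nodes that were already created instead of re-evaluating the expression only), which avoids special-casing the expression string as the PoC does. Whether the real fix should take this shape is an open question.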

Version

latest

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
 name: lifecycle-hook
 namespace: argo
spec:
 entrypoint: main
 hooks:
   running:
     expression: workflow.status == "Running"
     template: sleep-10
 templates:
 - name: main
   steps:
     - - name: step1
         template: heads

 - name: heads
   container:
     image: alpine:3.6
     command: [sh, -c]
     args: ["echo \"it was heads\""]

 - name: sleep-10
   container:
     image: alpine:3.6
     command: [sh, -c]
     args: ["sleep 10"]

Logs from the workflow controller

time="2023-05-21T16:59:10.395Z" level=info msg="Processing workflow" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:10.414Z" level=info msg="Updated phase  -> Running" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:10.414Z" level=info msg="Steps node lifecycle-hook-kmpbx initialized Running" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:10.414Z" level=info msg="StepGroup node lifecycle-hook-kmpbx-121855276 initialized Running" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:10.414Z" level=info msg="Pod node lifecycle-hook-kmpbx-892768671 initialized Pending" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:10.436Z" level=info msg="Created pod: lifecycle-hook-kmpbx[0].step1 (lifecycle-hook-kmpbx-heads-892768671)" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:10.436Z" level=info msg="Workflow step group node lifecycle-hook-kmpbx-121855276 not yet completed" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:10.436Z" level=info msg="Running workflow level hooks" lifeCycleHook=running namespace=argo node=lifecycle-hook-kmpbx.hooks.running workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:10.436Z" level=info msg="Pod node lifecycle-hook-kmpbx-1701654913 initialized Pending" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:10.457Z" level=info msg="Created pod: lifecycle-hook-kmpbx.hooks.running (lifecycle-hook-kmpbx-sleep-10-1701654913)" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:10.457Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:10.457Z" level=info msg=reconcileAgentPod namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:10.495Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=944 workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:20.435Z" level=info msg="Processing workflow" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:20.435Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:20.435Z" level=info msg="node changed" namespace=argo new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=lifecycle-hook-kmpbx-1701654913 old.message= old.phase=Pending old.progress=0/1 workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:20.436Z" level=info msg="node changed" namespace=argo new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=lifecycle-hook-kmpbx-892768671 old.message= old.phase=Pending old.progress=0/1 workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:20.436Z" level=info msg="Workflow step group node lifecycle-hook-kmpbx-121855276 not yet completed" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:20.436Z" level=info msg="Running workflow level hooks" lifeCycleHook=running namespace=argo node=lifecycle-hook-kmpbx.hooks.running workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:20.436Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:20.436Z" level=info msg=reconcileAgentPod namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:20.446Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=976 workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="Processing workflow" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="node changed" namespace=argo new.message= new.phase=Running new.progress=0/1 nodeID=lifecycle-hook-kmpbx-1701654913 old.message=PodInitializing old.phase=Pending old.progress=0/1 workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="node changed" namespace=argo new.message= new.phase=Succeeded new.progress=0/1 nodeID=lifecycle-hook-kmpbx-892768671 old.message=PodInitializing old.phase=Pending old.progress=0/1 workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="Step group node lifecycle-hook-kmpbx-121855276 successful" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="node lifecycle-hook-kmpbx-121855276 phase Running -> Succeeded" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="node lifecycle-hook-kmpbx-121855276 finished: 2023-05-21 16:59:30.447637521 +0000 UTC" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="Outbound nodes of lifecycle-hook-kmpbx-892768671 is [lifecycle-hook-kmpbx-892768671]" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="Outbound nodes of lifecycle-hook-kmpbx is [lifecycle-hook-kmpbx-892768671]" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="node lifecycle-hook-kmpbx phase Running -> Succeeded" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="node lifecycle-hook-kmpbx finished: 2023-05-21 16:59:30.447676721 +0000 UTC" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="Checking daemoned children of lifecycle-hook-kmpbx" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg=reconcileAgentPod namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="Updated phase Running -> Succeeded" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="Marking workflow completed" namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.447Z" level=info msg="Checking daemoned children of " namespace=argo workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.453Z" level=info msg="cleaning up pod" action=deletePod key=argo/lifecycle-hook-kmpbx-1340600742-agent/deletePod
time="2023-05-21T16:59:30.468Z" level=info msg="Workflow update successful" namespace=argo phase=Succeeded resourceVersion=992 workflow=lifecycle-hook-kmpbx
time="2023-05-21T16:59:30.476Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/lifecycle-hook-kmpbx-heads-892768671/labelPodCompleted

Logs from your workflow's wait container

N/A
@sarabala1979
Member

@toyamagu-2021 this is an edge case. The lifecycle hook node should update its status even after the workflow is completed. Would you like to submit a PR?

@toyamagu-2021
Member Author

@sarabala1979
Thanks. OK, I'll try to submit a PR.

sarabala1979 pushed a commit that referenced this issue Jun 4, 2023
…CycleHook (#11113, #11117) (#11176)

Signed-off-by: toyamagu2021@gmail.com <toyamagu2021@gmail.com>