Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(executor): handle podlog in deadlineExceed termination. Fixes #7092 #7081 #7093

Merged

Conversation

tczhao
Copy link
Member

@tczhao tczhao commented Oct 29, 2021

Fixes #7092

At the moment, when deadlineexceed, the wait container will be stuck at the Wait step and won't proceed further to savelog and annotate output.
This is because when deadlineexceed, the main container.status.termianted is always nil, and k8sapi wait will only proceed when terminated becomes not nil

Fixes #7081


Don't bother creating a PR until you've done this:

  • Run make pre-commit -B to fix codegen, lint, and commit message problems.

Create your PR as a draft.

  • Your PR needs to pass the required checks before it can be approved. If the check is not required (e.g. E2E tests) it
    does not need to pass.
  • Once required tests have passed, you can make it "Ready for review".
  • Say how how you tested your changes. If you changed the UI, attach screenshots.

Tips:

  • If changes were requested, and you've made them, then dismiss the review to get it looked at again.
  • Add you organization to USERS.md if you like.
  • You can ask for help!

@tczhao tczhao changed the title fix(executor)): k8sapi handle deadlineExceed termination fix(executor): k8sapi handle deadlineExceed termination. Fixes #7092 Oct 29, 2021
@tczhao tczhao force-pushed the feature/allow-archive-when-deadlineexceed branch 3 times, most recently from db270b4 to 701433a Compare October 29, 2021 03:54
@tczhao tczhao changed the title fix(executor): k8sapi handle deadlineExceed termination. Fixes #7092 fix(executor): k8sapi/pns handle deadlineExceed termination. Fixes #7092 Oct 29, 2021
@tczhao tczhao marked this pull request as ready for review October 29, 2021 04:46
@tczhao tczhao requested a review from alexec November 1, 2021 12:58
@sarabala1979 sarabala1979 enabled auto-merge (squash) November 3, 2021 16:29
@sarabala1979
Copy link
Member

@tczhao please fix the conflict

auto-merge was automatically disabled November 3, 2021 22:58

Head branch was pushed to by a user without write access

@tczhao tczhao force-pushed the feature/allow-archive-when-deadlineexceed branch from 2d6ab61 to 5910013 Compare November 3, 2021 22:58
@tczhao
Copy link
Member Author

tczhao commented Nov 4, 2021

Thanks @sarabala1979 , fixed conflict

Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs master merge again.

test/e2e/artifacts_test.go Outdated Show resolved Hide resolved
Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: Tianchu Zhao <[email protected]>
@tczhao tczhao force-pushed the feature/allow-archive-when-deadlineexceed branch 2 times, most recently from 969d381 to fe2f562 Compare November 5, 2021 23:18
@tczhao tczhao force-pushed the feature/allow-archive-when-deadlineexceed branch from fe2f562 to 191e490 Compare November 5, 2021 23:52
@tczhao tczhao requested a review from alexec November 10, 2021 02:38
@tczhao
Copy link
Member Author

tczhao commented Nov 16, 2021

any further comments?

@tczhao tczhao force-pushed the feature/allow-archive-when-deadlineexceed branch from 50c7211 to 708665c Compare December 6, 2021 15:32
@tczhao tczhao requested a review from alexec December 6, 2021 16:05
@alexec alexec assigned alexec and unassigned sarabala1979 Jan 19, 2022
@stale
Copy link

stale bot commented Feb 7, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the problem/stale This has not had a response in some time label Feb 7, 2022
@alexec alexec removed the problem/stale This has not had a response in some time label Feb 7, 2022
@tczhao
Copy link
Member Author

tczhao commented Feb 13, 2022

Hello @alexec , I've made some changes, could you have a review again?

Sorry was on holiday mode and took a while to get back.

@alexec
Copy link
Contributor

alexec commented Feb 23, 2022

Rate limiting is set by DEFAULT_REQUEUE_TIME. This defaults to 10s. When run make start it is set to 100ms to speed up tests. Try running make start DEFAULT_REQUEUE_TIME=10s?

@tczhao
Copy link
Member Author

tczhao commented Feb 24, 2022

setting DEFAULT_REQUEUE_TIME=1s works

I'm wondering what's the best strategy here or if there are any tricks available
seems with the existing change, its failing e2e because DEFAULT_REQUEUE_TIME=100ms

@alexec
Copy link
Contributor

alexec commented Feb 24, 2022

Change e2e test set-up?

I'm wondering what's the best strategy here or if there are any tricks available
seems with the existing change, its failing e2e because DEFAULT_REQUEUE_TIME=100ms

Change test set-up. If tests run much longer, then we discus. If not, happy day.

Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tczhao I plan to refactor assessNodeStatus. Perhaps you'd like to review my PR #7996 and see if that fixes your issue.

The k8sapi executer will be removed in v3.4 or v3.5. I don't know how much work we should do on it.

@tczhao
Copy link
Member Author

tczhao commented Feb 25, 2022

Change e2e test set-up?

I'm wondering what's the best strategy here or if there are any tricks available
seems with the existing change, its failing e2e because DEFAULT_REQUEUE_TIME=100ms

Change test set-up. If tests run much longer, then we discus. If not, happy day.

Change DEFAULT_REQUEUE_TIME from 100ms to 1s results in about 12% increase in test time
test time is recorded from all e2e Run make test- scripts

image

@tczhao
Copy link
Member Author

tczhao commented Feb 25, 2022

@tczhao I plan to refactor assessNodeStatus. Perhaps you'd like to review my PR #7996 and see if that fixes your issue.

The k8sapi executer will be removed in v3.4 or v3.5. I don't know how much work we should do on it.

  • That Re-factor assessNodeStatus  #7996 PR solves the issue of not annotating artifact
    but we still need DEFAULT_REQUEUE_TIME to be at 1s to pass the ActiveDeadlineSeconds test

  • I'm not concerned too much on the k8sapi part,
    maybe still a good patch on 3.3,
    I will leave it up to you to make a judgement on if we need that

@alexec
Copy link
Contributor

alexec commented Feb 25, 2022

Please resolve conflicts.

@tczhao tczhao requested a review from alexec February 26, 2022 00:25
@alexec alexec merged commit 931cbbd into argoproj:master Mar 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants