fix(controller): Handle quota issues during pod reconciliation #3175

Closed
wants to merge 3 commits into from

@terrytangyuan (Member) commented Jun 4, 2020

Related #721.

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this is a chore.
  • The title of the PR is (a) conventional, (b) states what changed, and (c) suffixes the related issues number. E.g. "fix(controller): Updates such and such. Fixes #1234".
  • I've signed the CLA.
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My builds are green. Try syncing with master if they are not.
  • My organization is added to USERS.md.

@terrytangyuan changed the title from "fix(controller): Handle and propogate quota issue during pod reconciliation" to "fix(controller): Handle quota issues during pod reconciliation" Jun 4, 2020
Signed-off-by: terrytangyuan <[email protected]>
@alexec alexec self-assigned this Jun 4, 2020

sonarcloud bot commented Jun 4, 2020

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A); Security Hotspots to review: 0
Code Smells: 3 (rating A)

Coverage: 57.1%
Duplication: 0.0%

@simster7 (Member) left a comment

This is already fixed by #2385

@@ -819,8 +837,8 @@ func (woc *wfOperationCtx) podReconciliation() error {
 	// It is now impossible to infer pod status. The only thing we can do at this point is to mark
 	// the node with Error.
 	for nodeID, node := range woc.wf.Status.Nodes {
-		if node.Type != wfv1.NodeTypePod || node.Fulfilled() || node.StartedAt.IsZero() {
-			// node is not a pod, it is already complete, or it can be re-run.
+		if node.Type != wfv1.NodeTypePod || node.Fulfilled() || node.StartedAt.IsZero() || ExceededQuota(&node) || FailedQuota(&node) {

This would dead-lock the controller, making the solution in #2385 moot.

@@ -111,6 +119,16 @@ type failedNodeStatus struct {
 	FinishedAt metav1.Time `json:"finishedAt"`
 }

+// ExceededQuota checks if the error message indicates an exceeded quota in the namespace.
+func ExceededQuota(n *wfv1.NodeStatus) bool {
+	return strings.Contains(n.Message, exceededQuotaString)

This is not a reliable way to detect this error, since node messages can be arbitrary strings.

@simster7 (Member) commented Jun 4, 2020

Closing as this is already fixed

@simster7 simster7 closed this Jun 4, 2020
@terrytangyuan (Member, Author) commented

Thanks for double checking. I was looking at an outdated version and porting some of the changes. Good to know that it’s fixed.

@simster7 (Member) commented Jun 5, 2020

No worries! Thanks for trying to fix

@terrytangyuan terrytangyuan deleted the quota-compt branch February 9, 2021 21:54