Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(controller): Tolerate malformed resources. Fixes #3677 #3680

Merged
merged 18 commits into from
Aug 11, 2020

Conversation

alexec
Copy link
Contributor

@alexec alexec commented Aug 5, 2020

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this is a chore.
  • The title of the PR is (a) conventional, (b) states what changed, and (c) suffixes the related issues number. E.g. "fix(controller): Updates such and such. Fixes #1234".
  • I've signed the CLA.
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My builds are green. Try syncing with master if they are not.
  • My organization is added to USERS.md.

Fixes #3677

Summary

This PR replaces the standard Kubernetes informer with a dynamic informer, where the code ignores malformed resources.

Changes

The approach differs slightly for workflow templates and cron workflows. Cron is simpler to directly replace the informer with our own informer, keeping the changes simpler. With workflow templates, the code wraps the informers in a "tolerant" translation layer that ignores malformed resources for list operations.

Testing

This is tested using e2e tests.

@alexec alexec marked this pull request as ready for review August 6, 2020 17:01
start := time.Now()
logCtx := log.WithFields(log.Fields{"workflow": workflowName, "condition": condition, "timeout": timeout})

fieldSelector := ""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the behaviour of this func to make it more useful, you no longer need to know the name of the workflow you want to wait for. We alternatively wait for any argo-e2e workflow.

@simster7 simster7 self-assigned this Aug 6, 2020

func (s *MalformedResourcesSuite) TestMalformedWorkflowTemplate() {
s.Given().
Exec("kubectl", []string{"apply", "-f", "testdata/malformed/malformed-workflowtemplate.yaml"}, fixtures.NoError).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to add a workflow that uses a malformed workflow template here

Copy link
Member

@simster7 simster7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand this PR very well. Asked a couple of questions

Comment on lines +178 to +180
var NoError = func(t *testing.T, output string, err error) {
assert.NoError(t, err, output)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

}

// a drop-in replacement for `extwfv1.ClusterWorkflowTemplateInformer` that ignores malformed resources
func NewTolerantClusterWorkflowTemplateInformer(dynamicInterface dynamic.Interface, defaultResync time.Duration) extwfv1.ClusterWorkflowTemplateInformer {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what makes this "tolerant". Can you explain a bit?

@@ -0,0 +1,28 @@
package informer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file name ends in .co.go. Typo?

@@ -39,9 +44,10 @@ type Controller struct {
wfInformer cache.SharedIndexInformer
wfLister util.WorkflowLister
wfQueue workqueue.RateLimitingInterface
cronWfInformer extv1alpha1.CronWorkflowInformer
cronWfInformer informers.GenericInformer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generated informers we currently use get "stuck" if a malformed resource in created. By "malformed" I mean that while they are correctly formatted as JSON or YAML, they cannot be the resource kind, e.g. because they have an array where an object is expected. This stops the inform working: the informer sits there repeatedly printing an error.

By using a dynamic informer (which is effectively the same as as an unstructured informer) we remove this problem and we control the parsing into the required type.

return &Controller{
wfClientset: wfclientset,
namespace: namespace,
managedNamespace: managedNamespace,
instanceId: instanceId,
cron: cron.New(),
restConfig: restConfig,
dynamicInterface: dynamicInterface,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for creating a generic informer

Then().
ExpectWorkflow(func(t *testing.T, metadata *metav1.ObjectMeta, status *wfv1.WorkflowStatus) {
assert.Equal(t, "wellformed", metadata.Name)
assert.Equal(t, wfv1.NodeError, status.Phase)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should also check the message makes sense to the user

Then().
ExpectWorkflow(func(t *testing.T, metadata *metav1.ObjectMeta, status *wfv1.WorkflowStatus) {
assert.Equal(t, "wellformed", metadata.Name)
assert.Equal(t, wfv1.NodeError, status.Phase)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should also check the message makes sense to the user

@alexec alexec merged commit da115f9 into argoproj:master Aug 11, 2020
@alexec alexec deleted the fix-3677 branch August 11, 2020 23:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make the controller tolerant to malformed resources
2 participants