Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Clean up shadow CRDs after API bindings are deleted #2298

Merged
merged 1 commit into from
Nov 11, 2022

Conversation

kylape
Copy link
Contributor

@kylape kylape commented Nov 3, 2022

Summary

When all API bindings that use a particular APIResourceSchema are deleted, the CRDs that back those bindings need to be deleted.

Fixes #2300

pkg/reconciler/apis/crdcleanup/crdcleanup_controller.go Outdated Show resolved Hide resolved
pkg/reconciler/apis/crdcleanup/crdcleanup_controller.go Outdated Show resolved Hide resolved
pkg/reconciler/apis/crdcleanup/crdcleanup_controller.go Outdated Show resolved Hide resolved

for _, binding := range bindingList {
if len(binding.Status.BoundResources) == 0 {
logger.V(1).Info("APIBinding has no bound resources. Need to wait until all bindings have bound resources")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry a bit that an "invalid" binding (e.g. referenced APIExport doesn't exist, whatever...) could permanently interrupt the cleanup process

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how can that happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this controller can determine if any CRD can be deleted if it cannot see the set of schemas that every API binding points to. Perhaps there is a way to identify an "invalid" binding such that this controller can safely ignore it?

pkg/reconciler/apis/crdcleanup/crdcleanup_controller.go Outdated Show resolved Hide resolved
pkg/indexers/apibinding.go Outdated Show resolved Hide resolved
for _, binding := range bindingList {
if len(binding.Status.BoundResources) == 0 {
logger.V(1).Info("APIBinding has no bound resources. Need to wait until all bindings have bound resources")
c.queue.AddAfter(key, time.Second)
Copy link
Member

@sttts sttts Nov 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why requeue? This happens automatically. You are watching bindings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out we do in fact need this explicit requeue. If the APIBinding that was discovered to have no bound resources is not actually related to this CRD (we have no reasonable way to know), then we lost perhaps our last opportunity to delete this particular CRD (in the event that it was the deletion event for the last APIBinding for this CRD).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't this is true. There is some case when BoundResources gets empty (could be in theory). The solution with requeue is not correct. It causes a potential DoS.

What we could do:

  1. depend on resync to eventually relist the CRD
  2. do something that normally a big nogo: enqueue both old and new object on update.

Copy link
Contributor Author

@kylape kylape Nov 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The APIBinding controller will create the bound CRD, wait for it to become established, and then it will add the BoundResource to the API binding's status. I've seen the potential race condition cause a problem when running e2e tests with -count 20, but not -count 5.

depend on resync to eventually relist the CRD

This seems reasonable for now at least, but how can we run the e2e test if we rely on the resync to catch edge cases? Do we need to modify or remove our new check in the e2e test? Can we change the resync period for the test?

It causes a potential DoS

Could we call queue.AddRateLimited(key) instead if we're only concerned about DoS?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The APIBinding controller will create the bound CRD, wait for it to become established, and then it will add the BoundResource to the API binding's status. I've seen the potential race condition cause a problem when running e2e tests with -count 20, but not -count 5.

Got it. Yes, this is true. Very good insight!

I see how your requeue helps. I am worried that it endlessly requeues. Thinking. Not sure how AddRateLimited really works internally. It's won't keep a giant per-key map with requeues, will it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thinking out loud: could we check for APIResourceSchema existence, and everything else before the actual CRD creation, and only do the latter AFTER setting the BoundResources ?

Wondering whether we run into any other race that way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the creationTime idea would help. Let a CRD live up to a minute or so and only delete then.

Copy link
Contributor Author

@kylape kylape Nov 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's won't keep a giant per-key map with requeues, will it?

Yes, I'm afraid if some APIBinding somehow never made it to the Bound phase yet also had a valid APIExport, then eventually every CRD key would end up in the requeue map. This should be considered a bug, and we could even alert if the requeue count increased too quickly for the CRD cleanup controller.

could we check for APIResourceSchema existence, and everything else before the actual CRD creation, and only do the latter AFTER setting the BoundResources ?

Possibly, but this seems unintuitive to me. The resource isn't actually bound until the CRD is created!

creationTime idea would help.

If an APIBinding (that has a valid APIExport) never makes it to the Bound phase, then the creationTime idea would only delay the inevitable addition of the CRD to the requeue map.

Barring that edge case (which I believe doesn't exist today and if it did would be considered a bug), I can't think of any other scenarios that aren't covered by the code as-is, but I wouldn't mind adding a creationTime check as a "belts and suspenders" kind of thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then the creationTime idea would only delay the inevitable addition of the CRD to the requeue map.

No, I mean to delete the CRD after a minute, so give a minute of time to get over the race.

@kylape
Copy link
Contributor Author

kylape commented Nov 4, 2022

kcp-dev/kubernetes#110 is required for this PR in order to stop KCP from panicking upon CRD deletion.

pkg/reconciler/apis/crdcleanup/crdcleanup_controller.go Outdated Show resolved Hide resolved

logger := logging.WithObject(logging.WithReconciler(klog.Background(), ControllerName), binding)

for _, boundResource := range binding.Status.BoundResources {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, we'd have an index on the CRD informer to track CRDs by their binding, and queue all matching CRDs. I think using the list here in status is still correct, but it would be good to ACK that we've thought through the edge cases here to ensure that we can't miss a CRD if it's still related to the binding but is not listed in the bound resources array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the index get updated before this function is called? If yes, then using an index would be no better than listing.

The only edge case I can think of is if an APIResourceSchema was removed from an APIExport. Is that allowed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the index will be updated first.

Yes, removing APIRS from an APIExport is possible.

pkg/reconciler/apis/crdcleanup/crdcleanup_controller.go Outdated Show resolved Hide resolved
return nil
}

// The index can't solely be relied on to be certain this CRD can be deleted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is doing what you think this is doing. Right now the c.listAPIBindings() call is using the lister - because of the way the machinery works, whether you access the list of bindings from the lister & filter manually using Go code, or if you are accessing the filtered list via the indexer, you are fundamentally operating against the exact same set of objects in the in-memory cache, and you will never (under normal circumstances) get a different result. I think you might want to use the client to do a live call here, instead? It may be useful then also to add labels or something to the bound CRDs for efficient server-side selection, as CRDs are large objects and having to do a full LIST here will be a heavy call.

Copy link
Contributor Author

@kylape kylape Nov 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment here is misleading. I'm not listing here due to a concern about resources in in the index being out of date due to an unprocessed event, etc. I'm instead seeing if any APIBindings are in an unready state to avoid deleting the CRD immediately after it was created. The APIBinding controller will wait for the bound CRD to become established before it adds the corresponding BoundResource to its status, so it's expected that the case where an APIBinding has no BoundResources will be common. This thread is discussing this as well.

Note that if any APIBinding is not ready, we cannot confidently delete any (bound) CRD.

If that all sounds reasonable to you, then I'll update the comment in the code to be less confusing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Let's add another index on APIBindings (by phase) and simply check the number of them that are pending - and update the comment :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I started writing the index, and then I realized that there was another edge case: the APIExport reference in an APIBinding is invalid. This makes the index much less clean than "by phase", some thing like "APIExportIsValidAndPhaseIsNotBound". So I kept the list in place but added a big comment trying to explain why we are doing the list after the index lookup.

Copy link
Contributor Author

@kylape kylape Nov 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's clear why this new edge case is related to the index.

Note that if any APIBinding is not ready, we cannot confidently delete any (bound) CRD.

The edge case makes this statement untrue since APIBindings with an invalid APIExport will never become ready, and we must exclude them from consideration to prevent every bound CRD from being requeued forever.


for _, boundResource := range newBinding.Status.BoundResources {
uidSet[boundResource.Schema.UID] = false
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is strange logic. Why loop over both? Why not just remove those UID that still show up in the new binding?

Copy link
Contributor Author

@kylape kylape Nov 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the code comment: It looks at old and new versions in case a schema gets removed from an APIExport. In that case, the last APIBinding to have the schema removed will trigger the CRD delete, but only the old version will have the reference to the schema.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What Kyle wrote makes sense to me

@ncdc
Copy link
Member

ncdc commented Nov 10, 2022

LGTM - please squash!
/approve

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 10, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ncdc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 10, 2022
When all API bindings that use a particular APIResourceSchema are
deleted, the CRDs that back those bindings need to be deleted.

Set identity annotation on CRDs during refresh to prevent a panic
@csams
Copy link
Contributor

csams commented Nov 11, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 11, 2022
@openshift-merge-robot openshift-merge-robot merged commit 4882d9e into kcp-dev:main Nov 11, 2022
@kylape kylape deleted the crd-cleanup branch November 12, 2022 02:04
@kylape
Copy link
Contributor Author

kylape commented Nov 15, 2022

/cherry-pick release-0.9

@openshift-cherrypick-robot

@kylape: #2298 failed to apply on top of branch "release-0.9":

Applying: Clean up shadow CRDs after API bindings are deleted
Using index info to reconstruct a base tree...
M	pkg/indexers/apibinding.go
M	pkg/reconciler/apis/apibindingdeletion/apibinding_deletion_controller.go
M	pkg/server/apiextensions.go
M	pkg/server/controllers.go
M	pkg/server/server.go
M	test/e2e/apibinding/apibinding_deletion_test.go
Falling back to patching base and 3-way merge...
Auto-merging test/e2e/apibinding/apibinding_deletion_test.go
CONFLICT (content): Merge conflict in test/e2e/apibinding/apibinding_deletion_test.go
Auto-merging pkg/server/server.go
Auto-merging pkg/server/controllers.go
Auto-merging pkg/server/apiextensions.go
Auto-merging pkg/reconciler/apis/apibindingdeletion/apibinding_deletion_controller.go
CONFLICT (content): Merge conflict in pkg/reconciler/apis/apibindingdeletion/apibinding_deletion_controller.go
Auto-merging pkg/indexers/apibinding.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Clean up shadow CRDs after API bindings are deleted
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-0.9

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kylape
Copy link
Contributor Author

kylape commented Nov 16, 2022

Fixes #2304

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: KCP consumes lots of memory over time as bound APIs are created and deleted
8 participants