Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Externally Managed CRD Fields #5730

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dalton-hill-0
Copy link
Contributor

@dalton-hill-0 dalton-hill-0 commented May 23, 2024

Description of your changes

Use server-side apply for reconciling CRDs for composite resources and claims. This will allow for other processes (e.g., CertManager) to control fields on the CRD that are not owned by the reconciler.

Note: The current implementation only fixes composite resource CRDs. If this approach is desired, I will implement for Claims as well.

Fixes #5723

I have:

  • Read and followed Crossplane's [contribution process].
  • Run make reviewable to ensure this PR is ready for review.
  • Added or updated unit tests.
  • Added or updated e2e tests.
  • Linked a PR or a [docs tracking issue] to [document this change].
  • Added backport release-x.y labels to auto-backport this PR.

@@ -267,8 +282,9 @@ func NewReconciler(mgr manager.Manager, opts ...ReconcilerOption) *Reconciler {

// A Reconciler reconciles CompositeResourceDefinitions.
type Reconciler struct {
client resource.ClientApplicator
mgr manager.Manager
client resource.ClientApplicator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this still need to be a ClientApplicator? We don't call the Apply method anymore, right?

Comment on lines 219 to 220
// WithClientset specifies how the Reconciler should interact with the
// Kubernetes API.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It always bothers me a bit when we have to use both a controller-runtime client and a client-go clientset. I'm guessing its unavoidable in this case right, because the controller-runtime client can't SSA typed objects (only *Unstructured)?

@@ -278,13 +294,16 @@ type Reconciler struct {
xrInformers composedResourceInformers

options apiextensionscontroller.Options

name string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used anywhere?

@@ -97,6 +101,117 @@ func ForCompositeResource(xrd *v1.CompositeResourceDefinition) (*extv1.CustomRes
return crd, nil
}

func ModifyApplyConfigurationForCompositeResource(crd *appcfgextv1.CustomResourceDefinitionApplyConfiguration, xrd *v1.CompositeResourceDefinition) error { //nolint:gocognit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needing to do all of this feels a little like it defeats the point of SSA (i.e. just form a simple intent and send it). I wonder if it would be simpler to just do a get-mutate-update of the CRD? I think we could assume exclusive ownership of the CRD spec (i.e. just overwrite that each time).

@dalton-hill-0
Copy link
Contributor Author

@negz
I took an entirely different approach (no longer using applyconfigurations libraries) to address your comments. This just removes the applicator and replaces Apply with Patch. I grabbed the patch args from syncer_ssa.go.
I tested this and it still allows CertManager to function as desired.

@negz
Copy link
Member

negz commented May 24, 2024

@dalton-hill-0 This looks great, and is definitely ideally how SSA would work.

Unfortunately I think there are bugs when you SSA a 'real' type (i.e. a CRD) using the controller-runtime client today. See kubernetes-sigs/controller-runtime#347 (comment). This approach works in other Crossplane controllers because we're using *Unstructured rather than a real/structured runtime.Object.

@dalton-hill-0
Copy link
Contributor Author

Unfortunately I think there are bugs when you SSA a 'real' type (i.e. a CRD) using the controller-runtime client today.

That is indeed unfortunate.

Needing to do all of this feels a little like it defeats the point of SSA (i.e. just form a simple intent and send it). I wonder if it would be simpler to just do a get-mutate-update of the CRD? I think we could assume exclusive ownership of the CRD spec (i.e. just overwrite that each time).

I agree, and it introduces a lot of new code that could introduce bugs. However, CertManager needs to own the field at spec.conversion.webhook.clientConfig.caBundle so overriding spec each time would prevent the desired behavior of this PR (unless if I'm thinking of this wrong?).

A couple of ideas:

  • Maybe we can just use Patch anyways? It has some bugs, but do these bugs affect how Crossplane handles CRDs? I assume we are already affected by zero-value defaults with how we handle CRDs currently. Are there other bugs I'm not aware of that would also affect CRDs?
  • We could use one of the other approaches mentioned in the issue (see below)

Hard-Coded Approach
Add hard-coded logic in Crossplane's Claim CRD and CompositeResource CRD reconcilers that is specific to CertManager. This logic would check for a CA Bundle in the existing CRD and copy it over before applying the spec derived from the XRD.

Generic Approach
Extend the XRD API to allow users to specify paths that are externally managed (e.g., by CertManger). For each path, the CRD reconciler would attempt to copy the value from existing CRD before applying the CRD derived from the XRD.

@negz
Copy link
Member

negz commented May 24, 2024

Maybe we can just use Patch anyways? It has some bugs, but do these bugs affect how Crossplane handles CRDs? I assume we are already affected by zero-value defaults with how we handle CRDs currently. Are there other bugs I'm not aware of that would also affect CRDs?

This sounds good to me - if it looks like we're not actually affected in practice I'm happy to stick with the implementation you have here and maybe add a comment mentioning that we're aware of the issue but don't seem to be affected.

@dalton-hill-0
Copy link
Contributor Author

I added comments and updated tests.
Additionally I checked some diffs between the CRDs that get generated on the main branch vs this branch to make sure nothing changed.

@dalton-hill-0 dalton-hill-0 marked this pull request as ready for review May 28, 2024 16:07
@dalton-hill-0 dalton-hill-0 requested review from phisco and a team as code owners May 28, 2024 16:07
@negz
Copy link
Member

negz commented May 29, 2024

@dalton-hill-0 I noticed you closed this - did you find that this approach wouldn't work?

@dalton-hill-0 dalton-hill-0 reopened this May 29, 2024
@dalton-hill-0
Copy link
Contributor Author

@dalton-hill-0 I noticed you closed this - did you find that this approach wouldn't work?

No that was on accident. I accidentally synced the wrong fork branch and I think it closed this as a result.

Comment on lines +434 to +437
// We are aware that using controller-runtime server-side apply will result in
// some zero-value fields, however, this should not affect the way CRDs are
// implemented in this controller. For a discussion on the runtime issue, see
// https://github.com/kubernetes-sigs/controller-runtime/issues/347.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@turkenh Could you do a quick sanity check on this. I know you went pretty deep on SSA issues recently for provider-kubernetes.

I feel okay switching to SSA here as long as we don't see any issues applying CRDs in practice (which appears to be the case in @dalton-hill-0's testing.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a friendly reminder 🙂
@turkenh

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dalton-hill-0 It looks like @turkenh is on vacation. He'll be back in a week.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds good to me.

To see whether SSA causing any immediate problem, a quick sanity test could be leaving a watch request for a CRD of an XRD open for a while, e.g. kubectl get crds <crd-name> -w, and checking whether it gets periodic updates or not (i.e. resourceVersion keeps increasing). You may restart Crossplane or add some labels to XRDs to trigger new reconciliations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I performed the sanity checks above and all behavior was expected.

  • no increase to resource version on crossplane restart
  • no increase to resource version when adding annotations to xrds (they do not get copied over from meta)
  • resource version increased when adding label to xrd (copies over to crds)
  • no infinite patch loop between crossplane and cert manager

@dalton-hill-0
Copy link
Contributor Author

Rebased from master and re-tested.

@dalton-hill-0
Copy link
Contributor Author

dalton-hill-0 commented Jun 28, 2024

From what I can tell, the e2e tests that are failing are expected with the change and don't look concerning. If you agree, I'll update the tests to pass.

Edit: There might be some requirement that unspecified fields get wiped after upgrade? If so, this could be problematic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Externally Managed CRD Fields
3 participants