Allow use of $ref in in CRD validation schema #62872

Liujingfang1 · 2018-04-19T19:06:35Z

(combined with #76965)

`$ref` scenario 1:

In go types, we can define a type and use it in several other types. When generating CRD from this go types, we want to use reference for that type in the validation so that the schema has better readability.

package v1beta1

import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
type DataType struct {
  Value int `json:"value,omitempty"`
}
type MyKindSpec struct {
  Data []DataType `json:"data,omitempty"`
}

type MyKindStatus struct {
  Data []DataType `json:"data,omitempty"`
}

// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

// MyKind
// +k8s:openapi-gen=true
// +resource:path=mykinds
type MyKind struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec   MyKindSpec   `json:"spec,omitempty"`
    Status MyKindStatus `json:"status,omitempty"`
}

What you expected to happen:
In the CRD for this type, we expect to see the validation as

     spec:
          properties:
            data:
              items:
                $ref: '#/definitions/<...>.DataType'

`$ref` scenario 2:

Original type information is lost when CRD authors expand their openapi definitions. openapi-gen-based operator code is similarly affected. CRDs forbid $ref (implicit GVK link) and x-* properties (GVK hints).

This information enables early, developer-friendly policy checks and useful manifest transforms. Our CICD automation extends the base k8s schema with custom policy and relies on the schema's $ref-erential knowledge to inspect and operate on target resource types, anywhere they appear, eg. deeply nested or embedded within custom resources. Prior to validation, a simple callback might first:

ObjectMeta

Require teams set app.kubernetes.io/* labels.
Require teams set org-specific cost/usage/alert aggregation labels.
Back and forward compat handling, eg. app, app.kubernetes.io/name, team, ...

Container

Inject dynamic ENV and volumes.
Resolve string properties with templates.
Require several resources.limits.* are set.

PodSpec ... PodSecurityContext ... Volume ...
Deployment ... CronJobSpec ... JobSpec ...
ServiceSpec ... SecurityContext ... (many more) ...

$ref expansion makes CRDs opaque to our automation. A pre-processing step is required to repair the type information and restore visibility into CRDs. Options considered are:

Import all golang-based openapi-gen'ed CRDs directly (wherever they write openapi_generated.go) and call each GetOpenAPIDefinitions function. Merge all results.
Run openapi-gen on k8s and CRD sources ourselves. Avoids multiple GetOpenAPIDefinitions calls and potential linking problems.
Use kube-apiserver. Run hack/update-openapi-spec.sh with extra CRDs "builtin" and enabled so GET /openapi/v2 simply returns what we need.

I'd love $ref support, but I understand why that could be difficult or undesirable. Alas, if CRDs permitted a known property, eg. x-kubernetes-group-version-kind (or similar), CRD libraries can set it accordingly for each unique $ref expansion (no need to nest GVK hints), allowing consumers (or even k8s itself) to restore this relationship if and when useful.

Are there options we can move on?

The text was updated successfully, but these errors were encountered:

Liujingfang1 · 2018-04-19T19:07:09Z

/sig api-machinery

Liujingfang1 · 2018-04-19T19:07:55Z

/assign mbohlool

lavalamp · 2018-04-19T20:24:48Z

/assign @mbohlool

I agree it seems good to allow references within a single CRD (as opposed to across resources, which I think we're not ready for).

liggitt · 2018-04-19T21:31:12Z

The reference-expanding code in the schema lib we're using for CRD validation is scary… if we do this, we shouldn't use an expander that has any possibility of making network or filesystem checks in the process

nikhita · 2018-04-20T08:17:59Z

The reference-expanding code in the schema lib we're using for CRD validation is scary… if we do this, we shouldn't use an expander that has any possibility of making network or filesystem checks in the process

👍

/area custom-resources
/cc @sttts

nikhita · 2018-04-20T10:41:06Z

/area custom-resources

sttts · 2018-04-24T16:21:42Z

reference-expanding code in the schema lib we're using for CRD validation is scary

Sounds like we have to abstract the reference resolver in kube-apiserver and instantiate it with something less scary.

sttts · 2018-04-24T16:21:53Z

/cc @ayushpateria

fejta-bot · 2018-07-23T17:08:52Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

nikhita · 2018-07-23T18:54:36Z

/remove-lifecycle stale

fejta-bot · 2018-10-21T19:05:08Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

nikhita · 2018-11-14T10:56:57Z

/remove-lifecycle stale

Will need to investigate if/how we want to do this.

alegacy · 2019-01-31T14:34:55Z

Has there been any progress on this investigation? We have a usecase where using references from within a single CRD would be useful.

The motivation behind this is better user experience when dealing with Kubernetes CRDs. CRDs in an openapi definition are expressed as one massive definition rather than the Kubernetes builtin types which rarely nest and instead make use of references within openapi. These references are desirable as it makes it more manageable to sparsely define resources utilizing the nested defaults as you go down the structure. Unfortunately it looks like support for references wrt CRDs is unlikely to be supported in the near future: kubernetes/kubernetes#62872 As a workaround, this adds a new option `--splitPaths` that takes a path and an optional new model name. The path roughly emulates the idea behind `kubectl explain` but I ended up using the `~` character instead of `.` such that the full parent model name could still be used without ambiguities. During type conversion any matches of path will cause a Dhall reference to be injected with the nested definition being pushed back on the stack of definitions to convert. The top level model name can either be specified on the command line via `=com.some.ModelName` or be guessed in the case the CRD author follows the best practice of using the field name as the first word in the description. Some future work that might need exploring related to this are: 1. Allow specifying `splitPaths` from a file 2. Clean up which types get accumulated into the `typesUnion`. I noticed while doing this that all? top level definitions are being dumped into this union where really it should only be top level definitions that have an `apiVersion` and `kind` 3. See if the `mergeNoConflicts` function needs to be improved such that it takes semantic equality (traversing imports) into account.

The motivation behind this is better user experience when dealing with Kubernetes CRDs. CRDs in an openapi definition are expressed as one massive definition rather than the Kubernetes builtin types which rarely nest and instead make use of references within openapi. These references are desirable as it makes it more manageable to sparsely define resources utilizing the nested defaults as you go down the structure. Unfortunately it looks like support for references wrt CRDs is unlikely to be supported in the near future: kubernetes/kubernetes#62872 As a workaround, this adds a new option `--splitPaths` that takes a path and an optional new model name. The path roughly emulates the idea behind `kubectl explain` but () must be used around some model names to disambiguate with a `.` character. During type conversion any matches of path will cause a Dhall reference to be injected with the nested definition being pushed back on the stack of definitions to convert. The top level model name can either be specified on the command line via `=com.some.ModelName` or be guessed in the case the CRD author follows the best practice of using the field name as the first word in the description. Some future work that might need exploring related to this are: 1. Allow specifying `splitPaths` from a file 2. Clean up which types get accumulated into the `typesUnion`. I noticed while doing this that all? top level definitions are being dumped into this union where really it should only be top level definitions that have an `apiVersion` and `kind` 3. See if the `mergeNoConflicts` function needs to be improved such that it takes semantic equality (traversing imports) into account.

the-voidl · 2022-09-28T16:08:40Z

Is there still any chance you let the users use $ref in CRDs?
I'd estimate a recursive CRD as relatively common - even the definitions of io.k8s.api resources highly rely on $ref.

I see you points @liggitt, still would love any feedbeek if anyone will be implementing this.

liggitt · 2022-09-28T16:18:56Z

This doesn't seem likely to make progress in an issue (no one is working on a design that addresses the significant additional complexity and failure modes introduced). I'd suggest closing this issue, and bringing up the topic in a sig-api-machinery meeting to gauge whether addition of this feature is likely to be accepted and whether there is priority / bandwidth to shepherd a design and implementation for it now

lavalamp · 2022-09-28T16:23:59Z

I think this issue has value as the place everyone is apparently watching.

I agree it's not making progress right now and to make progress needs an owner to drive it.

The first hurdle here is to gather requirements and demonstrate that some (possibly limited) version of this would be very helpful to a sufficient number of users. This could either start with a SIG meeting agenda item, or end with a SIG meeting agenda item once evidence has been collected, or both.

Schema recursion seems actually necessary to me to express some schemas? So I expect some utility to be forthcoming. However it is (obviously) extremely dangerous, too.

zenbones · 2022-11-28T18:52:40Z

I would call any schema meant to describe an object hierarchy, that can't handle self-referential recursion... impoverished. Would it really be that hard to handle a $ref that just recursively imported the schema pointed to by $ref? Inline the entire thing and then run the validation, which should be the same as having to write the whole thing out, which is currently allowed. I'm just looking for a way to avoid massive cut and paste that I need to find/replace on whenever I make a change. It can not be that hard to allow $ref in the same file as a starting point. I would do it but I find Go impoverished and hate grinding my teeth while I code. I know, that last comment is both unfair and unnecessary, and I apologize in advance, but I'm leaving it because it's true.

lavalamp · 2022-11-28T18:59:20Z

Some such references expand infinitely and can't be expanded, that's a large part of what makes it dangerous.

rjeberhard · 2022-11-28T19:28:53Z

I don't want to exclude others' use cases, but I would be satisfied with a mechanism to reference types from built-in schemas. For instance, we have a CRD where our controller will create Pods based on the contents of the custom resource. It would be very convenient to be able to reference existing types for volume and volume mounts, etc. These types (e.g. volumes) if properly expressed in a CRD's schema are very large. Note: there are other solutions such as having the custom resource reference a PodTemplate, but this isn't as convenient for our users.

liggitt · 2022-11-28T19:33:48Z

I don't want to exclude others' use cases, but I would be satisfied with a mechanism to reference types from built-in schemas

github helpfully hid my comment at #62872 (comment), but even referencing types from built-in schemas has a lot of potential problems

zenbones · 2023-01-10T08:36:37Z

Some such references expand infinitely and can't be expanded, that's a large part of what makes it dangerous.

How about a recursion limit as a tried and tested solution? Expand, look for expansions, expand... oh we've done that 10 times, throw an error. I mean, it's my machine and my compute time so I would appreciate a simple argument to set the limit, but I'd happily take a reasonable static limit.

jessefeinman · 2023-04-11T18:17:29Z

I'm facing a similar situation where $ref support would be helpful.

Specifically $ref support for accessing other locations in the same file would be nice but more importantly would be $ref support for accessing k8s types (or other loaded CRDs though thats not necessary).

e.g. we have rules to validate ObjectMeta (among other types) in our CICD pipeline, we currently have to do a bunch of tedious code generation in order to have the type references in the schemas we validate against. These references let us walk through the yaml and run custom rules against the specific types when they appear. We are unable to migrate to validating against the schemas defined by the CRDs directly because we wouldn't be able to correctly detect that a type defined in a CRD is actually a k8s type that we have a rule for. However, if there was a $ref with a path to a k8s type, then a) the crd is simpler to understand and b) we can determine if that path matches a type we have a custom rule for already and can then apply it.

lavalamp · 2023-04-12T21:26:18Z

a $ref with a path to a k8s type

This gets dicey around upgrades when a k8s type does some (legal) modification like add a field.

This is not an easy change to make and it's not currently prioritized (but we might do it eventually).

jpbetz · 2023-04-12T21:42:10Z

We've also seen some general problems with type sharing in the API. This came up recently with pod containers and initContainers sharing the same type, which caused problems when wanting to add a field needed only for initContainers (xref #115362).

That said, I can see some value to doing a $ref to things like podTemplate, which seems reasonable if the goal is to just envelope the podTemplate without needing to care about the contents.

rjeberhard · 2023-04-12T21:47:49Z

That said, I can see some value to doing a $ref to things like podTemplate, which seems reasonable if the goal is to just envelope the podTemplate without needing to care about the contents.

This use case would really help. We have a variety of operators that create pods and deployments and where our customers would like to customize the generated pods by specifying sidecar containers, volumes, etc. We've experimented with letting the customer specify a reference to a pod template, but they found this confusing. So, instead, we've reproduced the rather large schemas of these types in our CRD.

k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. kind/feature Categorizes issue or PR as related to a new feature. labels Apr 19, 2018

k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 19, 2018

k8s-ci-robot assigned mbohlool Apr 19, 2018

k8s-ci-robot added the area/custom-resources label Apr 20, 2018

nikhita mentioned this issue May 12, 2018

Umbrella issue for CRDs moving to GA #58682

Closed

54 tasks

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 23, 2018

k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 23, 2018

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 21, 2018

k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 14, 2018

mbohlool assigned roycaihw and unassigned mbohlool Apr 15, 2019

liggitt added this to Not required for GA in Custom Resource Definitions Apr 18, 2019

roycaihw mentioned this issue Apr 25, 2019

Consider $ref in CRDs or recommending authors add a GVK hint in replacements #76965

Closed

liggitt changed the title ~~Allow openapi references to CRD internal types in CRD validation schema~~ Allow use of $ref in in CRD validation schema May 3, 2019

mwylde mentioned this issue Jun 24, 2019

Adding openAPIV3Schema Validation for the Flink application custom resource lyft/flinkk8soperator#33

Merged

muff1nman mentioned this issue Jan 4, 2021

Add support for explicit splitting of definitions dhall-lang/dhall-haskell#2124

Merged

trevorloranger mentioned this issue Mar 24, 2021

KEYCLOAK-16488 Add AuthZ settings to client CRD Redux keycloak/keycloak-operator#333

Merged

3 tasks

Hedingber mentioned this issue Apr 28, 2021

[mpi-operator] Fix mpijob crd schema + cluster role name v3io/helm-charts#573

Merged

2 tasks

ilackarms mentioned this issue May 28, 2021

need to allow disabling openapi generation for proto types solo-io/skv2#244

Closed

ilackarms mentioned this issue Jun 21, 2021

fix code generation for CRDs with recursive schema fields solo-io/skv2#253

Closed

kazk mentioned this issue Jul 25, 2021

Add optional schema Arnavion/k8s-openapi#104

Closed

devguyio mentioned this issue Sep 9, 2021

Implement CE Subscriptions filters knative/eventing#5715

Merged

5 tasks

astefanutti mentioned this issue Dec 14, 2021

feat(api): Add YAML DSL structural schema to CRDs apache/camel-k#2831

Closed

andreaTP mentioned this issue Dec 15, 2021

CRD generation fails with StackOverflow in case of cyclic references fabric8io/kubernetes-client#3651

Closed

tadayosi mentioned this issue Jul 7, 2022

Support Flow in Custom Resource Definition Schema apache/camel-k#2229

Open

MisterMX mentioned this issue Jul 20, 2022

XRD spec.versions[0].schema.openAPIV3Schema is weakly typed as an object instead of full openapi schema crossplane/crossplane#3197

Closed

Haleygo mentioned this issue Apr 17, 2023

vmalertmanagerconfig: repeated child routes field VictoriaMetrics/operator#629

Closed

vishal-chdhry mentioned this issue Aug 30, 2023

[Feature] Find a solution to reduce size of the chart kyverno/kyverno#8162

Closed

2 tasks

tomleb mentioned this issue May 21, 2024

Fix schema definitions rancher/steve#215

Merged

sgaist mentioned this issue Jun 20, 2024

feat(crd): AmaltheaSession definition SwissDataScienceCenter/amalthea#560

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow use of $ref in in CRD validation schema #62872

Allow use of $ref in in CRD validation schema #62872

Liujingfang1 commented Apr 19, 2018 •

edited by liggitt

Liujingfang1 commented Apr 19, 2018

Liujingfang1 commented Apr 19, 2018

lavalamp commented Apr 19, 2018

liggitt commented Apr 19, 2018

nikhita commented Apr 20, 2018

nikhita commented Apr 20, 2018

sttts commented Apr 24, 2018

sttts commented Apr 24, 2018

fejta-bot commented Jul 23, 2018

nikhita commented Jul 23, 2018

fejta-bot commented Oct 21, 2018

nikhita commented Nov 14, 2018

alegacy commented Jan 31, 2019

the-voidl commented Sep 28, 2022 •

edited

liggitt commented Sep 28, 2022

lavalamp commented Sep 28, 2022

zenbones commented Nov 28, 2022 •

edited

lavalamp commented Nov 28, 2022

rjeberhard commented Nov 28, 2022

liggitt commented Nov 28, 2022

zenbones commented Jan 10, 2023

jessefeinman commented Apr 11, 2023

lavalamp commented Apr 12, 2023

jpbetz commented Apr 12, 2023

rjeberhard commented Apr 12, 2023

Allow use of $ref in in CRD validation schema #62872

Allow use of $ref in in CRD validation schema #62872

Comments

Liujingfang1 commented Apr 19, 2018 • edited by liggitt

$ref scenario 1:

$ref scenario 2:

Liujingfang1 commented Apr 19, 2018

Liujingfang1 commented Apr 19, 2018

lavalamp commented Apr 19, 2018

liggitt commented Apr 19, 2018

nikhita commented Apr 20, 2018

nikhita commented Apr 20, 2018

sttts commented Apr 24, 2018

sttts commented Apr 24, 2018

fejta-bot commented Jul 23, 2018

nikhita commented Jul 23, 2018

fejta-bot commented Oct 21, 2018

nikhita commented Nov 14, 2018

alegacy commented Jan 31, 2019

the-voidl commented Sep 28, 2022 • edited

liggitt commented Sep 28, 2022

lavalamp commented Sep 28, 2022

zenbones commented Nov 28, 2022 • edited

lavalamp commented Nov 28, 2022

rjeberhard commented Nov 28, 2022

liggitt commented Nov 28, 2022

zenbones commented Jan 10, 2023

jessefeinman commented Apr 11, 2023

lavalamp commented Apr 12, 2023

jpbetz commented Apr 12, 2023

rjeberhard commented Apr 12, 2023

Liujingfang1 commented Apr 19, 2018 •

edited by liggitt

`$ref` scenario 1:

`$ref` scenario 2:

the-voidl commented Sep 28, 2022 •

edited

zenbones commented Nov 28, 2022 •

edited