Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify reference and bundling behavior with JSON Schema #649

Closed
smoya opened this issue Nov 10, 2021 · 18 comments
Closed

Unify reference and bundling behavior with JSON Schema #649

smoya opened this issue Nov 10, 2021 · 18 comments
Labels
👻 Needs Champion RFC Needs a champion to progress (See CONTRIBUTING.md) 💭 Strawman (RFC 0) RFC Stage 0 (See CONTRIBUTING.md)

Comments

@smoya
Copy link
Member

smoya commented Nov 10, 2021

Context

Current AsyncAPI Reference Object documentation mentions:

For this specification, reference resolution is done as defined by the JSON Reference specification and not by the JSON Schema specification.

On the other hand, JSON Schema Draft-07, the version current AsyncAPI spec relies on, does not follow JSON Reference but rather its own implementation.

JSON Schema Draft-07 includes the $id keyword which modifies the behavior of $ref. It allows defining a base URI for the $ref keyword, where referenced URIs will be taken as relative to the one in $id.

On the contrary, $id property is not allowed on any AsyncAPI object.

Why is this a problem?

We do not have a unified reference object behaviour between AsyncAPI Reference Object and JSON Schema Draft-07. That impacts directly the user experience.

Tooling must support two different dereferencing and bundling behaviors:

  1. One for AsyncAPI Reference Object.
  2. Another one for JSON Schema Draft-07.

The following issues have been created:

Tooling availability is a problem as well. As there is no standard spec we all (AsyncAPI, OpenAPI, etc) can follow, tooling will need to be customized.

Possible next steps

1. Porting the same dereference + bundling resolution behavior as JSON Schema does

For the case of JSON Schema Draft-07, will be adding support for $id keyword.

In the case we decide to eventally update to JSON Schema Draft 2020-12, some other keywords and behaviors will be supported, such as allowing extra properties when using $ref.

2. Work with JSON Schema community to unify efforts by moving reference behavior to its own document

Meaning anyone would be able to reference that document from their own spec schemas.
Some questions to be answered:

  • Shall this be maintained and hosted by JSON Schema? Perhaps we should create a separate organization?
    • Governance is complex. There can be additions that make sense from JSON Schema perspective but not from any other.
    • Should ownership be shared across several initiatives?
  • How versioning would be handled?.
    • Ideally, the document should have its own versioning, not directly related to JSON Schema releases.
  • Are there more projects interested in this?
  • What happens if a keyword is introduced but from AsyncAPI perspective meaning is different?

3. Only porting bundling behavior but not referencing one.

An alternative to the first point could be to only port the bundling behavior from JSON Schema and skip referencing one.
Meaning, things like allowing extra properties when using $ref which got introduced in JSON Schema Draft 2019-09 could be adopted, however support for $id keyword will be discarded.

4. Accepting that there is a difference

We might want to clarify it even more in our docs.

Suggestion

I would suggest we follow this combo:

  1. To first annotate the current status in our docs, as detailed in the 4rth bullet.
  2. To port functionality from JSON Schema into AsyncAPI spec and tooling, as detailed in the 1st bullet.
  3. To work in parallel with JSON Schema community as detailed in the 2nd bullet.
  4. If we end up having reference behavior in its own document, then start using it (should not be a breaking change).

What do you all think? Looking for comments!

cc @Relequestual

@smoya smoya added the 💭 Strawman (RFC 0) RFC Stage 0 (See CONTRIBUTING.md) label Nov 10, 2021
@github-actions
Copy link

Welcome to AsyncAPI. Thanks a lot for reporting your first issue. Please check out our contributors guide and the instructions about a basic recommended setup useful for opening a pull request.
Keep in mind there are also other channels you can use to interact with AsyncAPI community. For more details check out this issue.

@smoya
Copy link
Member Author

smoya commented Nov 12, 2021

A PR with a clarification of the bundling and (de)reference behavior of Schema Object has been opened: #653

@smoya
Copy link
Member Author

smoya commented Nov 18, 2021

For the record, there is a recent effort on building JSON Schema foundational tooling with kind of (TBD) official support, lead by @jviotti and @Relequestual here json-schema-org/community#113.

@magicmatatjahu
Copy link
Member

@smoya You have to remember that referencing in a JSON Schema (compared to a regular JSON Pointer) differs not only because of additional keywords like $id, $anchor, $dynamicRef, $dynamicAnchor but also because that only JSON Schema instances can be referenced (JSON object and boolean value) and not other types like string, null, array, number - referencing e.g. strings is very important when using and referencing models/schemas in other formats than JSON Schema, like XSD, Protobuf, GraphQL. Additionally, if we are going to support $id fields in the specification we have to ask ourselves is it worth it? Does it make sense to define $id for an Info object? Will anyone want to make references to this field/object in other parts of the application? I don't think so.

It seems to me that the most sensible solution would be to support referencing in the AsyncAPI document according to the JSON Pointer guidelines, and in the case of schemas that have been defined as JSON Schema referencing according to the JSON Schema Pointer - similarly to OpenAPI, you can use $id and refer to it, but only in the schema area, not all the objects in the specification.

@smoya
Copy link
Member Author

smoya commented Nov 25, 2021

Ok, this is quite embarrassing. I wrote a comment in this issue a few days ago but didn't end being posted. I guess due to a failure in the request to Github and me closing the tab without noticing. Life is wonderful ✨ Let's try again!

Both @jonaslagoni and I have been investigating a bit about $ref usages both in JSON Schema and in AsyncAPI.
The conclusions can be now read in @jonaslagoni blog post (not published yet) here.

Just to summarize, the important part is giving an answer to "What features of JSON Schema reference and bundling system we consider suitable for AsyncAPI?"

$id is something we could not find a real use case for it. Probably the reason nobody asked for it ever. Same for the rest of keywords that modify the behavior of JSON Schema $ref, such as $anchor. They are too specific of JSON Schema ecosystem.
As far as we could see, only one feature seems potentially useful for us: overriding properties when bundling referenced schemas. An example of a use case can be found in #618. If adopted, the following would be valid:

asyncapi: '2.2.0'
info:
  title: Test $ref
  version: '1.0.0'
channels:
  test:
    publish:
      message:
        $ref: '#/components/messages/myMessage'
        name: "Awesome Message" # this value should take precedence and override the original `name` field.
components:
  messages:
    myMessage: 
     name: original name

Note that our parser is actually allowing this to happen unexpectedly, but because of the following issue asyncapi/parser-js#404

As per my comment in #596 (comment), overriding properties is not a thing in JSON Schema Draft-07 but starting from Draft-2019-09. However, the behavior is not what we would expect from it: properties to override will do a AND with the referenced schema. I.e.:

{"type": "object", "$ref": "foo"} is equivalent to {"type": "object", "allOf": [{... contents of foo schema ...}]}

We (@jonaslagoni and I) couldn't find an official doc place where they mention this, however you can find some mentions on this library issue APIDevTools/json-schema-ref-parser#145 (comment) (from where I got the example above) and also this Slack thread where @jonaslagoni is involved https://json-schema.slack.com/archives/CT8QRGTK5/p1637146426080000?thread_ts=1636976501.051400&cid=CT8QRGTK5

Having said that, I'm suggesting we close this issue as a won't do. Instead, to work on:

  • Separate bundling system (to be independent) between AsyncAPI and JSON Schema. This could mean using different tooling or build our own, etc.
  • To deprecated $ref usage in AsyncAPI objects. We could add a new keyword like $include or whatever but still keep support for $ref for some time. JSON Schema schemas won't be affected by this as they will keep the $ref keyword as per JSON Schema spec. This will clearly remove any confusion between AsyncAPI bundling system and JSON Schema one.

I hope this also answers @magicmatatjahu comment.

WDYT?

@magicmatatjahu
Copy link
Member

@smoya Thanks! I know about problems (I read all issues that you and Jonas created related to pointers), but maybe for other this comment will be very helpfull.

Separate bundling system (to be independent) between AsyncAPI and JSON Schema. This could mean using different tooling or build our own, etc.

It is a way out, but we should reuse existing tools as much as possible.

To deprecated $ref usage in AsyncAPI objects. We could add a new keyword like $include or whatever but still keep support for $ref for some time. JSON Schema schemas won't be affected by this as they will keep the $ref keyword as per JSON Schema spec. This will clearly remove any confusion between AsyncAPI bundling system and JSON Schema one.

I can't find information if $ref is used outside of JSON Schema (I mean, in normal JSON), but I'd rather we stay with the word $ref and not add a new word like $include :) But here we should think about what we want to achieve, because as I see in other related issues at all costs we try to match the JSON Schema pointers (especially with newer versions) and maybe a good way would be as I described in an earlier comment:

It seems to me that the most sensible solution would be to support referencing in the AsyncAPI document according to the JSON Pointer guidelines, and in the case of schemas that have been defined as JSON Schema referencing according to the JSON Schema Pointer - similarly to OpenAPI, you can use $id and refer to it, but only in the schema area, not all the objects in the specification.

As far as $id is concerned, as I said, it probably doesn't make sense to add such a value to every object in the spec and thus $anchor won't add an extra value either, not to mention $dynamicAnchor and $dynamicRef, but that's just my opinion.

As far as we could see, only one feature seems potentially useful for us: overriding properties when bundling referenced schemas. An example of a use case can be found in #618. If adopted, the following would be valid:

I agree with you that we should support overrides, but as normal overrides and not with logic from JSON Schema with allOf - maybe then we wouldn't need traits, because everything could be done with overrides :)

And I'll say it again. Pointers in JSON Schema should only point to JSON Schema instances, i.e. JSON object that is a valid JSON Schema or boolean value, everything else is forbidden. This is a serious problem because we have planned to support also other schema formats like XSD, Protobuf, GraphQL and without string referencing we will not be able to achieve this, so my suggestion:

  • support referencing throughout the AsyncAPI document as recommended by JSON Pointer + add support with overrides (we can also think about adding $id, $anchor, $dynamicAnchor, but only if really needed)
  • support JSON Schem referencing (with $id, $anchor etc) only in JSON Schem scope.

There is also an option to make a deal with JSON Schema team to have their specification for JSON Pointers extracted from spec, but with the ability to reference primitives like string, number etc :)

@smoya
Copy link
Member Author

smoya commented Nov 26, 2021

@magicmatatjahu we are talking about the same. Maybe I didn't clarify it properly in my previous comment but my suggestion is exactly that, to have two reference/bundling systems:

  1. to support all features of JSON Schema reference just only on JSON Schema schemas (including $id and whatever the supported draft specifies).
  2. for AsyncAPI objects, to only support JSON Reference (not sure why you are mentioning JSON pointer as it only defines the way you declare the string value for $ref) + adding support for overriding properties in a way we consider proper for us (not following that allOf strategy JSON Schema does, etc).

There is also an option to make a deal with JSON Schema team to have their specification for JSON Pointers extracted from spec, but with the ability to reference primitives like string, number etc :)

Considered as stated in the second bullet in "Possible next steps" in the body of this issue (except for suggesting support for referencing primitives).

@jonaslagoni
Copy link
Sponsor Member

It is a way out, but we should reuse existing tools as much as possible.

Agreed, however as I see it the fact is there are no tools that actually fully support either, so building the tools becomes inevitable, especially if we want to support other languages than JS. So we might as well find an optimal solution.

I don't see any reason to "merge" reference behavior with JSON Schema, because of this:

As per my comment in #596 (comment), overriding properties is not a thing in JSON Schema Draft-07 but starting from Draft-2019-09. However, the behavior is not what we would expect from it: properties to override will do a AND with the referenced schema. I.e.:

That leaves the question, would it be confusing for people if $ref have different behavior depending on where it is used such as (dummy references, just as an example):

asyncapi: '2.2.0'
...
channels:
  test/channel/1:
    publish:
      message:
        $ref: '#/components/messages/test1'
  test/channel/2:
    publish:
      message:
        schemaFormat: application/schema+json;version=draft-07
        payload: 
          $id: 'http:https://asyncapi.com/schemas/test2'
          type: object
          properties: 
            test:
              $ref: 'test3'

And with 2019-09 and beyond:

        schemaFormat: application/schema+json;version=draft-2019-09
        payload: 
          $id: 'http:https://asyncapi.com/schemas/test2'
          type: string
          $ref: 'test3'

If yes, then I think we should introduce a different keyword.
If no, then I suggest we stay with $ref to be consistant.

@smoya
Copy link
Member Author

smoya commented Nov 29, 2021

would it be confusing for people if $ref have different behavior depending on where it is used

That was totally my point. If I think of myself a few weeks ago, I would say it is confusing (a lot). However today, I'm totally biased because I went down the rabbit hole 😅 .

We might need more input from other users.

@magicmatatjahu
Copy link
Member

While writing my comment I knew that it might be confusing for users that there are different referencing logics in different places, but I stayed with $ref. Another keyword like $include would even be ok, but.... then we have to go with our own implementation of dereferencing - the question is whether it will be worth it.

Looking at the referencing problem in general and how it has impact on spec, I wonder if we will be able to solve these problems in 3.0.0 or rather in 4.0.0 spec?

@jonaslagoni
Copy link
Sponsor Member

While writing my comment I knew that it might be confusing for users that there are different referencing logics in different places, but I stayed with $ref. Another keyword like $include would even be ok, but.... then we have to go with our own implementation of dereferencing - the question is whether it will be worth it.

Will it be worth it is of course the question 😄 But with the current state of tooling, it might not be that much different.

Looking at the referencing problem in general and how it has impact on spec, I wonder if we will be able to solve these problems in 3.0.0 or rather in 4.0.0 spec?

Don't think we should target one or the other, lets see if we can figure out what to do about it first 😄

Cause if we do decide on switching, we need to create a new tool before we can merge in the spec 🙂

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity 😴

It will be closed in 120 days if no further activity occurs. To unstale this issue, add a comment with a detailed explanation.

There can be many reasons why some specific issue has no activity. The most probable cause is lack of time, not lack of interest. AsyncAPI Initiative is a Linux Foundation project not owned by a single for-profit company. It is a community-driven initiative ruled under open governance model.

Let us figure out together how to push this issue forward. Connect with us through one of many communication channels we established here.

Thank you for your patience ❤️

@github-actions
Copy link

github-actions bot commented Aug 3, 2022

This issue has been automatically marked as stale because it has not had recent activity 😴

It will be closed in 120 days if no further activity occurs. To unstale this issue, add a comment with a detailed explanation.

There can be many reasons why some specific issue has no activity. The most probable cause is lack of time, not lack of interest. AsyncAPI Initiative is a Linux Foundation project not owned by a single for-profit company. It is a community-driven initiative ruled under open governance model.

Let us figure out together how to push this issue forward. Connect with us through one of many communication channels we established here.

Thank you for your patience ❤️

@github-actions github-actions bot added stale and removed stale labels Aug 3, 2022
@handrews
Copy link

It seems to me that the most sensible solution would be to support referencing in the AsyncAPI document according to the JSON Pointer guidelines, and in the case of schemas that have been defined as JSON Schema referencing according to the JSON Schema Pointer - similarly to OpenAPI, you can use $id and refer to it, but only in the schema area, not all the objects in the specification.

This would make the most sense to me. I see some confusion/concern here over the use case for $id — when we (JSON Schema) worked with OpenAPI on them adopting 2020-12 for OAS 3.1, $id and related keywords were one of the major topics of concern. We were able to work through all of that (across many video calls) and I'm happy to discuss that experience if it helps.

As per [someone's] comment in #596 (comment), overriding properties is not a thing in JSON Schema Draft-07 but starting from Draft-2019-09. However, the behavior is not what we would expect from it: properties to override will do a AND with the referenced schema.

This is an important consequence of JSON Schema's architecture that is worth understanding: It is a constraint system, not a data definition system. In a constraint system {} allows everything, and each additional constraint restricts the allowed set. Nothing masks, overrides, or un-constrains anything else. In a data definition system, {} has no functionality, and every added property adds functionality, with a notion of precedence causing certain duplicates to override each other.

This means that a "base" schema cannot provide default constraints, unless those constraints are minimal (all "child" schemas will be more constrained, not less or differently constrained). So you end up with a different pattern of where and how you mix in default behavior in JSON Schema than you would in an OO programming langauge.


Also, (although this is a personal opinion that is being debated but not yet adopted by the JSON Schema org), I would strongly recommend not bothering with 2019-09. It was not out long enough to get broadly adopted before 2020-12 came out, and 2020-12 being adopted by OAS 3.1 means that there will be a much greater demand for 2020-12 tooling.

@dalelane
Copy link
Collaborator

It might be helpful to consider how we use $ref to refer to external, non-AsyncAPI resources, as part of this?

There was a question about how we refer to schemas within an Avro avsc file (where the avsc file contains multiple record definitions) in #414. I think we are perhaps a little ambiguous about how that should be done, as the raiser of that issue had a different interpretation to me.

@smoya
Copy link
Member Author

smoya commented Sep 20, 2022

I think this issue can be closed in favor of #825, where all those requirements are put in common. WDYT @dalelane @handrews @jonaslagoni ?

@dalelane
Copy link
Collaborator

dalelane commented Sep 20, 2022

I think this issue can be closed in favor of #825, where all those requirements are put in common

I hadn't seen #825 - catching up on it now. But at first glance, that makes sense

@smoya
Copy link
Member Author

smoya commented Sep 21, 2022

Closing in favor of #825.

@smoya smoya closed this as completed Sep 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
👻 Needs Champion RFC Needs a champion to progress (See CONTRIBUTING.md) 💭 Strawman (RFC 0) RFC Stage 0 (See CONTRIBUTING.md)
Projects
None yet
Development

No branches or pull requests

6 participants