Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/ref is delegation 2 #585

Closed
wants to merge 5 commits into from
Closed
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
61 changes: 52 additions & 9 deletions jsonschema-core.xml
Original file line number Diff line number Diff line change
Expand Up @@ -791,20 +791,22 @@

<section title='Schema References With "$ref"'>
<t>
The "$ref" keyword is used to reference a schema, and provides the ability to
validate recursive structures through self-reference.
The "$ref" keyword can be used to reference a schema which is to be applied to the
current instance location. "$ref" is an assertion key word, which MUST produce a boolean
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$ref is an applicator rather than an assertion. It does not produce its own result, it merely conveys the result of the subschema that it applies (in this case the subschema is by reference rather than a literal subschema).

$ref should return all of the results of the subschema (assertion and annotations).

Copy link
Member Author

@Relequestual Relequestual May 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were your words buddy...

Delegation means that the target of the $ref is processed against the current instance location, and the "results" (boolean assertion outcome and optionally the collected annotations) of $ref are simply the results of the target schema.

#514 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The part I'm objecting to is '"$ref" is an assertion keyword', which is not something that I say in that text that you quote. I didn't explicitly classify it, and I'd be fine with not classifying it, but assertion keywords evaluate a condition based on a non-schema value (e.g. "maxLength": 5) which is not what $ref does.

Applicators apply a subschema and return its results (assertion and annotations). $ref applies a subschema by reference rather than directly, so I wasn't particularly trying to fit it into the existing classifications, but if we are doing so then applicator is the correct one (and we might need to tweak the wording defining applicators if it talks specifically about an immediate subschema).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this change sort of obscures the purpose of $ref. We should probably still say it "provides the ability to validate recursive structures through self-reference".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll chang this and mention recursive structures again.

assertion result when the resolved schema is applied.
</t>
<cref anchor="REF1" source="BH">
The use of "$ref" MUST NOT effect adjacent keywords.
Given previously the use of "$ref" would negate the use of other keywords in that object,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that most of this can be made clear by thoroughly describing the applicator behavior. Otherwise I would put this in the appendix for people who want to understand the changes, as noted in another comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should discuss this as it looks to me like your opinion on applicator or assertation for $ref has changed since the issue was raised.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not aware of my opinion changing although I certainly accept that I may have been unclear.

it seems like a good thing to mention.
Should be removed before leaving Internet Draft
</cref>
<t>
An object schema with a "$ref" property MUST be interpreted as a "$ref" reference.
The value of the "$ref" property MUST be a URI Reference.
Resolved against the current URI base, it identifies the URI of a schema to use.
All other properties in a "$ref" object MUST be ignored.
</t>
<t>
The URI is not a network locator, only an identifier. A schema need not be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still true. It relates to section 8.3.1 about loading schema references. You may have something like "$ref": "https://example.com/schemas/foo", but that host may be down, or you may be running behind a firewall. In which case you probably have the schema available locally and SHOULD NOT automatically fetch it.

We do need some improved wording about when it's OK to automatically fetch $ref'd schemas, as some environments will depend on such dynamic fetching. But that was true before this change so I'd suggest we work on that separately.

downloadable from the address if it is a network-addressable URL, and
implementations SHOULD NOT assume they should perform a network operation when they
encounter a network-addressable URI.
A URI may be a locator, a name, or both, per <xref target="RFC3986">RFC 3986</xref>.
</t>
<t>
A schema MUST NOT be run into an infinite loop against a schema. For example, if two
Expand All @@ -814,7 +816,30 @@
Schemas SHOULD NOT make use of infinite recursive nesting like this; the behavior is
undefined.
</t>
<section title="Loading a referenced schema">
<t>
A URI reference without a protocol MUST be considered a plain name fragment,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this part. There is nothing at all special about URI references in "$ref" values. Everything that needs to be said about resolving $ref and $id is already said in this document, and none of it changes as the result of the whole inclusion/delegation change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have both hit the nail on the head, and missed the point.

The reason for this section change, is trying to highlight specific aspects of how URI work, as the above removed lines (ending 807) clearly wern't suffecient to explain. There have been so many Stack Overflow questions on how resolution works that I've lost count.

These changes don't CHANGE any of the behaviour, but seek to clairfy it with a little structure. Everything that NEEDS to be said technically yes, but not everything that needs to be said to make the technical aspect clear enough in my experinece.

Even when I've pointed people to the spec section, it wasn't enough. So I tried to write stuff which was a simple form of discussions which resulted in people understanding the intent.

and the URI reference location resolved according to <xref target="id-keyword">"$id" keyword</xref> section.
</t>
A URI reference with a network addressable locator defined MAY be provided with an interface to retrieve
the network accessible resource.
<t>
<cref anchor="REF2" source="BH">
A schema author wants to define a means to retrieve a specific URI protocol. An implementation may allow
the user to specify a method to perform a network operation (or other operation) to retrieve the reference
content. For example, the behaviour of a URI with a protocol of "file" is not defined. The implementation
provides an interface to define a function which is called when it encounters a URI which uses the specified
URI protocol (which is "file" in this case). The user defines "file" as the protocol, and includes a function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole concept (the regular paragraph on lines 822-825 and this CREF) are implementation notes of the sort that we have not had before. Whether we need them or not, I don't think that delegation vs inclusion makes them more or less necessary.

If we had already gotten agreement in an issue that we need this sort of change, and if the main point of this PR (delegation vs inclusion) was otherwise clear, I'd have no problem doing this in one PR. But I'm not sold on the need for this section, so I'd like to get the agreed-upon changes clear and committed first before opening a new topic on what sort of implementation guidance does or does not need to be added.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought a better explanation of how things work was in scope under #514 ? "What is "$ref" and how does it work?" is pretty broad. I may want to write additional comments to this part before awaiting a reply =]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue title was pretty broad as I was trying not to prejudice the outcome. It's not that I'm against clarifying, it's that this level of implementation detail doesn't seem quite right to me. Perhaps I'm just not understanding it the way you intended.

(or some form of instruction) using the interface, which will be triggered to retrieve any reference defined
with a URI that uses a protocol of "file". This is similar to "adding a schema" where implementations track
added schema's $id, allowing them to be later used in reference.
</cref>
</t>
Any URI may be resolvable by use of externally defined references provided to the implementation as per the
<xref target="loading-references">Loading a referenced schema</xref> section.
<t>

</t>
<section title="Loading a referenced schema" anchor="loading-references">
<t>
The use of URIs to identify remote schemas does not necessarily mean anything is downloaded,
but instead JSON Schema implementations SHOULD understand ahead of time which schemas they will be using,
Expand Down Expand Up @@ -903,6 +928,24 @@
</t>
</section>
</section>
<section title="Dereferencing By Inclusion">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in the previous PR (I think, or maybe on Slack), I think that if this is just to explain the difference from prior behavior, it belongs in an appendix (which would be xref'd from a brief note in the main text). This is what I've been doing for all confusing changes, like the dependencies split, $defs vs definitions, and keywords that moved from validation to core.

If we do want to provide guidance on an inclusion mechanism for things like json-schema-ref-parser that want to do so to the extent it is possible, then it should describe the use with allOf per #523.

But I don't think there should be a dereference inclusion process as part of the standard. Just a clear description of the behavior, and tools that want to make schema transformations that preserve that behavior can do so (for instance, I know of at least three tools for collapsing allOf branches into a single schema, which is not part of the spec but follows directly from the described behavior).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I was trying to address the issues raised by making this change (ref is delegation) as per #523.

I make no reference to prior behaviour, so it's not explaining a difference.

If we do want to provide guidance on an inclusion mechanism for things like json-schema-ref-parser that want to do so to the extent it is possible, then it should describe the use with allOf per #523.

I thought we wanted to avoid specific implementation directives where possible (like saying "you should do this by using allOf)? More, this section puts the nessecary constraints on any process, which may need to resolve by inclusion, as to avoid any interoperability issues.

I think this is REQUIRED. If we don't define constraints around how to implement inclusion, I feel it will be done any number of invalid ways and cause problems.

<t>
It MUST NOT be expected that any schema can be dereferenced by the means of replacing any object that
uses the "$ref" keyword with the resolved referenced schema (inclusion). An interface MAY be provided
to dereference a schema by means of inclusion, however it MUST NOT be the default behaviour.
</t>
<t>
The use of "$id" and "$ref" from external schemas MUST be evaluate correctly, and not evaluated after
any inclusion process.
</t>
<t>
The result of any inclusion process MUST NOT effect previously adjacent keywords to the original "$ref" keyword
</t>
<t>
A behaviour when a resolved schema which defines a schema version which is different to that of the base JSON Schema document
is not defined.
</t>
</section>

<section title='Schema Re-Use With "$defs"'>
<t>
Expand Down