docs: initial documents #1

Merged
merged 10 commits into from
Jul 23, 2021

Conversation

jonaslagoni
Member

@jonaslagoni jonaslagoni commented Jul 6, 2021

This PR introduces the following

  • A Code Of Conduct, which dictates the expected behavior.
    • In theory, this should be a global Code of Conduct, but this copy will remain here until that is solved in the main project.
  • Minified charter
    • This gives an introduction to the project, the SIG, and the community, and how they all relate.
  • Git workflow
    • To minimize the work done directly on this repository, I suggest everything is moved to forks.
    • If accepted it will need to be enabled in the repository.
  • Introduce all contributors bot

I don't consider this "done", but I could not create the PR as a draft and would like to get your initial thoughts on these documents.

  • Does the setup make sense?
  • Is something not as expected?
  • Anything I am missing?

@jonaslagoni jonaslagoni requested review from Relequestual and jdesrosiers and removed request for Relequestual July 6, 2021 12:29
CODE_OF_CONDUCT.md (outdated, resolved)
charter.md (outdated, resolved)
charter.md Outdated
Definitions that should be clarified to align meaning.

- **Validation rules**, i.e. a JSON Schema such as `{"type": "string"}` defines that data should validate as a string; it does not define that the data is a string. For small validation rules there is almost no difference, but with more complex ones the difference becomes apparent.
- **Data definition**, i.e. it defines the exact structure of the data.
Member Author

Should we add any more definitions?

Member

@jdesrosiers jdesrosiers left a comment

Looks like a good start to me.

charter.md (outdated, resolved)
git_workflow.md (outdated, resolved)
@jonaslagoni jonaslagoni requested a review from Relequestual July 9, 2021 11:14
Member

@Relequestual Relequestual left a comment

Suggested a change and requested a change, but broadly this looks great.

Additional requests:

  • I'd like to see very clearly in the readme.md a project status section, making it clear this project is just forming, and your immediate next steps (I guess that would be finding interested members). (We don't have to define a project status workflow right now.)
  • I'd like to see in the readme.md how you recommend people can get involved. Currently you've provided guidance on how to open a PR, but nothing about how to make proposals. This can be as light as opening an issue with a specific tag.
  • Add a mention of the dedicated #vocab-idl Slack channel, with join link (Just made the channel now!)

If you have any thoughts on how you'd like the project to progress workflow wise, it could be beneficial to add that to the readme.md file too.

charter.md (outdated, resolved)
charter.md (resolved)
charter.md (resolved)
README.md (outdated, resolved)
jonaslagoni and others added 2 commits July 18, 2021 20:51
Co-authored-by: Ben Hutton <[email protected]>
Co-authored-by: Ben Hutton <[email protected]>
@jonaslagoni
Member Author

Suggested a change and requested a change, but broadly this looks great.

Additional requests:

  • I'd like to see very clearly in the readme.md a project status section, making it clear this project is just forming, and your immediate next steps (I guess that would be finding interested members). (We don't have to define a project status workflow right now.)
  • I'd like to see in the readme.md how you recommend people can get involved. Currently you've provided guidance on how to open a PR, but nothing about how to make proposals. This can be as light as opening an issue with a specific tag.
  • Add a mention of the dedicated #vocab-idl Slack channel, with join link (Just made the channel now!)

If you have any thoughts on how you'd like the project to progress workflow wise, it could be beneficial to add that to the readme.md file too.

Great additions 👍 Gonna make the change as I mentioned in a comment, in a week or so 🙂

@Relequestual
Member

I'm going to add this here so it doesn't vanish when a review comment is resolved...

The purpose of a JSON Schema Vocabulary is to add new keywords with specific expectations and/or provide meaning to existing keywords in specific, different contexts. (This is explained more in the spec appendix.)

My point is, you can add new keywords, and make them required, in order to use the vocabulary.

A vocabulary is not just adding semantic meaning to existing keywords in another vocabulary, although it looks like you will mostly want to do that here.

Schema authors will need to specify the dialect they are using via $schema, and the defined meta-schema should also define the vocabulary that implementations are required to understand in order to process the provided schema.

The result of this work cannot be to define rules for processing existing schemas which use currently existing dialects (such as 2020-12).
While it may seem like doing so would be beneficial, it would provide no interoperable guarantee to schema authors, which is the objective.
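To make the `$schema` and vocabulary relationship concrete, here is a minimal sketch using Python dictionaries in place of JSON documents. The `https://example.com/...` URIs are hypothetical placeholders (no vocabulary or meta-schema URI had been chosen for this project), and `missing_required_vocabularies` is an illustrative helper, not part of any existing implementation. Only the `$vocabulary` keyword, its boolean "required" flag, and the 2020-12 URIs come from the specification.

```python
# Hypothetical identifiers: the real vocabulary and meta-schema URIs were not yet defined.
IDL_VOCAB = "https://example.com/vocab/idl"

# A meta-schema declares, via "$vocabulary", which vocabularies a dialect uses.
META_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/meta/idl",  # hypothetical dialect identifier
    "$vocabulary": {
        "https://json-schema.org/draft/2020-12/vocab/core": True,
        IDL_VOCAB: True,  # True means implementations must understand this vocabulary
    },
}

# A schema author opts in to the dialect through "$schema".
USER_SCHEMA = {"$schema": "https://example.com/meta/idl", "type": "string"}


def missing_required_vocabularies(meta_schema, understood):
    """Return the required vocabulary URIs that an implementation does not understand."""
    declared = meta_schema.get("$vocabulary", {})
    return sorted(
        uri for uri, required in declared.items() if required and uri not in understood
    )
```

An implementation that only understands the core vocabulary would report the hypothetical IDL vocabulary as missing and should refuse to process `USER_SCHEMA`, which is the interoperability guarantee described above.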

@jdesrosiers
Member

I think the development of this vocabulary will help evolve and refine our concept of what a vocabulary is.

The result of this work cannot be to define rules for processing existing schemas which use currently existing dialects (such as 2020-12).

Keep in mind that it's not enough to just introduce keywords and define semantics. We're going to need to define a whole new processing model. JSON Schema defines a processing model for validation that loosely speaking is something like schema + instance => boolean. For type generation, we are going to need something like schema => code. In validation, some sub-schemas apply or not based on whether they validate against the instance. Not having an instance changes everything.
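The difference between the two processing models can be sketched as two function signatures. This is only an illustration under stated assumptions: both functions handle a tiny keyword subset (`type` and `properties`), the names `validate` and `generate` are made up for this example, and the generated output format (a Python class body) is one arbitrary choice among many.

```python
# Mapping from JSON Schema types to Python annotations (an assumption of this sketch).
PY_TYPES = {"string": "str", "number": "float"}


def validate(schema, instance):
    """Validation model: schema + instance => boolean."""
    expected = schema.get("type")
    if expected == "string":
        return isinstance(instance, str)
    if expected == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(instance, (int, float)) and not isinstance(instance, bool)
    if expected == "object":
        if not isinstance(instance, dict):
            return False
        props = schema.get("properties", {})
        return all(validate(sub, instance[key]) for key, sub in props.items() if key in instance)
    return True  # keywords outside the subset are ignored


def generate(schema, name="Root"):
    """Generation model: schema => code. No instance is available here."""
    lines = [f"class {name}:"]
    for prop, sub in schema.get("properties", {}).items():
        lines.append(f"    {prop}: {PY_TYPES.get(sub.get('type'), 'object')}")
    if len(lines) == 1:
        lines.append("    pass")
    return "\n".join(lines)
```

Note that `validate` can decide everything from the instance in hand, while `generate` has to commit to a structure from the schema alone.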

it would provide no interoperable guarantee to schema authors

Hypothetically, I don't see any interoperability issues with using the 2020-12 dialect. When using the validation processing model, the validation semantics apply. When using the type generation processing model, the type-gen semantics apply. We need to take care to ensure that the semantics in both cases are compatible as a design goal, but I don't think that if we used the 2020-12 dialect (we won't) it would create an interoperability problem.

@karenetheridge
Member

Keep in mind that it's not enough to just introduce keywords and define semantics. We're going to need to define a whole new processing model.... Not having an instance changes everything.

FWIW, I believe this processing model should already exist in implementations. It's necessary for extracting identifiers ("$id", "$anchor" and "$dynamicAnchor" keywords) to enable the $ref-ability of schemas during evaluation * -- so one can hook into that process, or do the same steps separately, to walk a schema in order to identify items for code generation (or docs, or linting, etc).

* at least, I believe it's required to do it that way -- if someone has achieved this in another way, I'd be interested in hearing how
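As a rough illustration of that instance-free walk, the sketch below traverses a schema and collects its identifier keywords. It is an assumption-laden simplification: the keyword sets cover only some 2020-12 applicators, JSON Pointer escaping of `~` and `/` is skipped, and `walk` and `collect_identifiers` are hypothetical names rather than any implementation's API.

```python
# Keywords whose values contain subschemas (a partial list, for illustration).
SCHEMA_VALUED = {"not", "if", "then", "else", "items", "additionalProperties", "contains"}
SCHEMA_LISTS = {"allOf", "anyOf", "oneOf", "prefixItems"}
SCHEMA_MAPS = {"$defs", "properties", "patternProperties", "dependentSchemas"}
IDENTIFIERS = ("$id", "$anchor", "$dynamicAnchor")


def walk(schema, pointer=""):
    """Yield (json_pointer, subschema) pairs without needing an instance."""
    yield pointer, schema
    if not isinstance(schema, dict):
        return  # boolean schemas have no subschemas
    for key, value in schema.items():
        if key in SCHEMA_VALUED:
            yield from walk(value, f"{pointer}/{key}")
        elif key in SCHEMA_LISTS:
            for index, sub in enumerate(value):
                yield from walk(sub, f"{pointer}/{key}/{index}")
        elif key in SCHEMA_MAPS:
            for name, sub in value.items():
                yield from walk(sub, f"{pointer}/{key}/{name}")


def collect_identifiers(schema):
    """Return (pointer, keyword, value) for every identifier keyword found."""
    found = []
    for pointer, sub in walk(schema):
        if isinstance(sub, dict):
            for keyword in IDENTIFIERS:
                if keyword in sub:
                    found.append((pointer, keyword, sub[keyword]))
    return found


SCHEMA = {
    "$id": "https://example.com/root",
    "$defs": {"name": {"$anchor": "name", "type": "string"}},
    # A property that happens to be called "$id" must not be mistaken for the keyword.
    "properties": {"name": {"$ref": "#name"}, "$id": {"type": "string"}},
}
```

The same `walk` generator could feed a code generator, a documentation tool, or a linter, since it only depends on knowing which keywords hold subschemas.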

@handrews

FWIW, I believe this processing model should already exist in implementations. It's necessary for extracting identifiers ("$id", "$anchor" and "$dynamicAnchor" keywords) to enable the $ref-ability of schemas during evaluation * -- so one can hook into that process, or do the same steps separately, to walk a schema in order to identify items for code generation (or docs, or linting, etc).

Yeah I agree with @karenetheridge here. There are no doubt some additional subtleties, but when I started playing around with generative cases at one point it was based on the schema loading step that found identifiers and resolved references against base URIs.

@Relequestual
Member

Keep in mind that it's not enough to just introduce keywords and define semantics. We're going to need to define a whole new processing model.

The way in which elements of the processing model work makes sense, as you both say, such as $ref, but it doesn't work when you get to conditional path evaluation, which requires an instance.

Take if, for example, or the unevaluated* keywords, which depend on the result of applying the schema to an instance (aka "the validation process").

@jdesrosiers is right here in that this vocabulary will need to define a processing model, such as how to handle conditional applicators. The "applicator" phrasing may not even make sense in the context of code generation.
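To illustrate why conditional applicators depend on an instance, here is a minimal sketch. The `PAYMENT` schema is a made-up example, and `is_valid` supports only `const`, `required`, `properties`, and `if`/`then`/`else`; it is not a conforming validator.

```python
def is_valid(schema, instance):
    """Validate against a tiny keyword subset (illustration only)."""
    if "const" in schema and instance != schema["const"]:
        return False
    if isinstance(instance, dict):
        if any(key not in instance for key in schema.get("required", [])):
            return False
        for key, sub in schema.get("properties", {}).items():
            if key in instance and not is_valid(sub, instance[key]):
                return False
    if "if" in schema:
        # Which branch applies is only known once "if" has been run on an instance.
        branch = "then" if is_valid(schema["if"], instance) else "else"
        if branch in schema and not is_valid(schema[branch], instance):
            return False
    return True


def selected_branch(schema, instance):
    """Report which branch of if/then/else applies to this particular instance."""
    return "then" if is_valid(schema["if"], instance) else "else"


# Hypothetical schema: card payments need a number, everything else needs an IBAN.
PAYMENT = {
    "if": {"properties": {"kind": {"const": "card"}}, "required": ["kind"]},
    "then": {"required": ["number"]},
    "else": {"required": ["iban"]},
}
```

The same schema selects `then` for one instance and `else` for another, so a code generator that has no instance cannot simply reuse this evaluation step and needs its own rule for such keywords.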


Hypothetically, I don't see any interoperability issues with using the 2020-12. - @jdesrosiers

I can't think of them now, but I'm pretty sure I had a few things in mind which would make it very problematic. I should have written them down at the time.

...but I don't think that if we used the 2020-12 dialect (we won't) it would create an interoperability problem.

Good good.
I'm not convinced the constraining approach suggested on an OAI issue is a viable approach. Ultimately, we want people to use this with minimal modification (at least, @jonaslagoni, I assume that should be one of the primary goals).

@jonaslagoni
Member Author

jonaslagoni commented Jul 22, 2021

@Relequestual I have updated the documents.

I'm not convinced the constraining approach suggested on an OAI issue is a viable approach. Ultimately, we want people to use this with minimal modification (at least, @jonaslagoni, I assume that should be one of the primary goals).

I agree that the suggestion in OAI is not sufficient to be considered a full "solution". However, without the process being specified (as @jdesrosiers said, schema => code), it becomes complicated, and I fully understand why one would limit the keywords. The plan is that no JSON Schema keywords will be restricted and that all of them should be possible to interpret.

The reason I have not talked much about the vocab side of things is that, to me, the vocabulary is a supplement to the processing model, at least here in the beginning 🙂 The entire problem (from what I see) is that we have no way of processing validation rules into a data definition that can be interpreted as code.

One thing I have noticed is that explaining and discussing this topic isn't easy, as we start to blend different explanations and expectations together. So if you feel a specific word is misused or needs further clarification, it is important that we document it so we are aligned. Wording is not my strong suit, so writing a processing model in formal language is going to be especially tricky. Please correct me anytime something does not make sense 🙂

Also in terms of getting started, how would you all suggest that this beast should be tackled?

@jonaslagoni jonaslagoni requested a review from Relequestual July 22, 2021 21:19
@handrews

The plan is that no JSON Schema keywords are gonna be restricted and all of it should be possible to interpret.

A mandatory constraining approach is not viable. However, neither is it likely viable to require all keywords to have meaningful, consistent interpretations in all languages.

The way in which elements of the processing model work makes sense, as you both say, such as $ref, but it doesn't work when you get to conditional path evaluation, which requires an instance.

Arguably a primary purpose of generative annotations is to disambiguate such constructs, such as if/then/else. This does not, however, mean that all schemas using if/then/else can be meaningfully interpreted, as they will not all have the disambiguators. This is fine. Some JSON Schemas fail validation against all instances, and we do not restrict those. Likewise, some JSON Schemas will not be resolvable to generated output, and we should not restrict those either.

With generative work, particularly for code generation, some constructs may be usable by some languages but unusable by others. This should also be fine.

In terms of the processing model, each construct has a finite number of outcomes depending on the validation result of the construct. This can turn into a very large number of cases if you combine a lot of conditional-ish things, but they all break down into simple steps. The question then is "which of these cases are valid to generate" and "how must the cases be combined into a single code base."
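One way to picture this enumeration, as a sketch rather than a proposed algorithm: treat each `if`/`then`/`else` as having two outcomes and take the Cartesian product when several are combined. The function names are hypothetical, and an absent `then` or `else` is modeled as the empty schema `{}`, which imposes no constraints.

```python
from itertools import product


def conditional_outcomes(schema):
    """List the outcomes of one if/then/else as (if_result, applied_subschema) pairs."""
    if "if" not in schema:
        return [(None, {})]  # no conditional, so a single trivial outcome
    return [(True, schema.get("then", {})), (False, schema.get("else", {}))]


def combined_cases(schemas):
    """Combine the outcomes of several conditionals into every possible case."""
    return list(product(*(conditional_outcomes(s) for s in schemas)))


# Two made-up conditionals, for example the members of an allOf.
A = {"if": {"required": ["x"]}, "then": {"required": ["y"]}}
B = {"if": {"required": ["p"]}, "then": {"required": ["q"]}, "else": {"required": ["r"]}}
```

With n independent conditionals this yields 2^n cases, which is the "very large number" mentioned above. Deciding which of those cases are valid to generate, and how to merge them into one code base, is left open here.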

An if/then/else might represent a dispatch function instantiating one subclass or another. In that case, you need the dispatch function, you need the then class, and you need the else class. In languages where you can make what's essentially a virtual constructor, you wouldn't have a separate dispatch function from the class hierarchy. In others, you may need a factory method or factory class. It's best not to get too deep into the details of this sort of thing, though, or else you will specify the languages you know but inadvertently make it impossible to use a somewhat different language you didn't think of. And in some languages, the whole concept of a "class" wouldn't make sense.
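As one possible shape of such output, here is a hand-written sketch of what a generator might emit for a single language. It assumes a hypothetical schema whose `if` tests that `kind` equals `"card"`, whose `then` requires `number`, and whose `else` requires `iban`. The class and function names are invented for illustration, and, as noted above, other languages may need a quite different structure.

```python
from dataclasses import dataclass


@dataclass
class CardPayment:
    """Class generated from the `then` subschema (requires "number")."""
    number: str


@dataclass
class BankPayment:
    """Class generated from the `else` subschema (requires "iban")."""
    iban: str


def payment_from_dict(data):
    """Dispatch function generated from `if`: tests whether "kind" equals "card"."""
    if data.get("kind") == "card":
        return CardPayment(number=data["number"])
    return BankPayment(iban=data["iban"])
```

Here the dispatch lives in a standalone factory function. In a language with virtual constructors the same decision could be folded into the class hierarchy instead.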

Member

@Relequestual Relequestual left a comment

Small change but otherwise looks good!

I'll make a PR on the readme =D

README.md (outdated, resolved)
README.md (outdated, resolved)
@jonaslagoni
Member Author

I'll make a PR on the readme =D

Please do 😆 Iterative improvements 💯

@Relequestual Relequestual merged commit b7ba94c into main Jul 23, 2021
@Relequestual Relequestual deleted the feature/initial_docs branch July 23, 2021 08:57
@jonaslagoni
Member Author

@all-contributors please add @Relequestual for review

@allcontributors
Contributor

@jonaslagoni

I've put up a pull request to add @Relequestual! 🎉

@jonaslagoni
Member Author

@all-contributors please add @jdesrosiers for review

@allcontributors
Contributor

@jonaslagoni

I've put up a pull request to add @jdesrosiers! 🎉

@jonaslagoni
Member Author

@all-contributors please add @jonaslagoni for doc

@allcontributors
Contributor

@jonaslagoni

I've put up a pull request to add @jonaslagoni! 🎉

@jonaslagoni
Member Author

@all-contributors please add @karenetheridge for ideas

@allcontributors
Contributor

@jonaslagoni

I've put up a pull request to add @karenetheridge! 🎉

@jonaslagoni
Member Author

@all-contributors please add @handrews for ideas

@allcontributors
Contributor

@jonaslagoni

I've put up a pull request to add @handrews! 🎉

5 participants