Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plugin: Cannot add required fields for StepData #2766

Open
LecrisUT opened this issue Mar 15, 2024 · 4 comments
Open

Plugin: Cannot add required fields for StepData #2766

LecrisUT opened this issue Mar 15, 2024 · 4 comments

Comments

@LecrisUT
Copy link
Contributor

LecrisUT commented Mar 15, 2024

I was trying to do

@dataclasses.dataclass
class PrepareCMakeData(PrepareStepData):
    preset: str = field(
        option=("-p", "--preset"),
        metavar="PRESET",
        help="Configure preset",
    )

I would like this to be a single-value required field, but I cannot do that because it follows non-required fields of PrepareStepData

Approach 1

Add kw_only=True

@dataclasses.dataclass(kw_only=True)
class PrepareCMakeData(PrepareStepData):

This kinda works, but the error handling is all messed up:

$ tmt plans show

    warn: /plan:prepare - {'how': 'cmake'} is not valid under any of the given schemas
/plan

Failed to load step data for PrepareCMakeData: PrepareCMakeData.__init__() missing 1 required keyword-only argument: 'preset'

Potential solution to this is to try ... catch this source of error and re-format it appropriately

Approach 2

Add attrs style validator. The most intuitive for the user using it, but requires quite some implementation on the backend. This would still have such reporting errors, but this is easier to try ... catch and report the errors

Approach 3

Wrap the dataclass decorator with json schema validation. I think this is the preferred approach that is planned to be implemented?

One missing part here, is how can the plugin add paths for the schema to check? Ideally this is something to specify when subclassing a plugin or plugin data class

@happz
Copy link
Collaborator

happz commented Mar 15, 2024

Yeah, this is not supported by dataclasses at all, the order is important. I'm not sure whether any of the solutions would work because none addresses the order of fields or (the default) value of the field. The only advice I have now would be to make the field Optional and use None as the default value - I know, it's ugly, more assert calls to make type-checking happy...

The solution might be switching to attrs, where this should not be a problem, and that's my long-term plan. However, I did not start working on it, because I'm mostly happy with what we have now. Not fully, but it's Good Enough(TM). It works, brings enough safety and formalism, no more dictionaries, which means I was able to focus on other areas and put attrs to backburner. Your case is valid, unfortunate, yet a corner one & affected by the core limitation of dataclasses package :(

@LecrisUT
Copy link
Contributor Author

I'm not sure whether any of the solutions would work because none addresses the order of fields or (the default) value of the field.

With either the attrs validator or json schema validations, it would work, but you would still have to specify it as Optional.

more assert calls to make type-checking happy...

Yeah, but this one would still break the error reporting. I am actually stuck at the next issue with the JSON schema validation:

One missing part here, is how can the plugin add paths for the schema to check? Ideally this is something to specify when subclassing a plugin or plugin data class

But the JSON schema validation is not performed for tmt run right? I am wondering how much I am blocked by this right now.


The solution might be switching to attrs

After you first introduced me to this, I can confirm attrs is the way to go. And actually I don't think it would be that difficult of a transition, since most of the API is compatible. We can at least open a PR to investigate gradual transition. Do you have a suggestion on the simplest dataclass to start with?

@happz
Copy link
Collaborator

happz commented Mar 15, 2024

I'm not sure whether any of the solutions would work because none addresses the order of fields or (the default) value of the field.

With either the attrs validator or json schema validations, it would work, but you would still have to specify it as Optional.

Yeah, but that's the difference: "with attrs..." - and the rest does not matter. It's not possible with stdlib dataclasses, i.e. kw_only, "attrs style validator" or wrapping dataclass with JSON validation won't let you not define a default value. Might be dummy one, None, and matching type, but as long as dataclasses control the class, field shall have a value :)

more assert calls to make type-checking happy...

Yeah, but this one would still break the error reporting.

Hmm, that would be unexpected. Once the class is successfully defined, existing code should be able to work with it easily, Optional or not.

I am actually stuck at the next issue with the JSON schema validation:

One missing part here, is how can the plugin add paths for the schema to check? Ideally this is something to specify when subclassing a plugin or plugin data class

Right, I wanted to address this one but missed it eventually. No, tmt does not have a way to include custom schema, but it should have, e.g. an environment variable similar to TMT_PLUGINS. Definitely worth a patch.

But the JSON schema validation is not performed for tmt run right? I am wondering how much I am blocked by this right now.

It is performed every time tmt loads fmf data. tmt asks fmf library to load data from files, then asks fmf library to run schema validation, then tmt turns the tree of fmf structures into tests and plans. The report from JSON validation is just a warning, I plan to improve its reporting - until then, it can be way too cryptic to be of any use as a mandatory test.

The solution might be switching to attrs

After you first introduced me to this, I can confirm attrs is the way to go. And actually I don't think it would be that difficult of a transition, since most of the API is compatible. We can at least open a PR to investigate gradual transition. Do you have a suggestion on the simplest dataclass to start with?

Hard to tell, maybe check tmt.hardware, those are pretty isolated, or tmt.queue, tmt.steps.provision.mrack. Basically, nothing that's derived from tmt.utils.DataContainer or uses field(), because there's a lot of code attached to that. Which is also something to think about, those containers need to be written to and read from files, plus we save plenty of field metadata (tmt.utils.FieldMetadata) because fields define keys, keys can be set by CLI options, have help strings, metavars, all the stuff that both CLI and Sphinx need to access, so some structured way is needed for storing this kind of info. Something will not be needed, maybe normalization would be possible to be performed by attrs, but something will remain with us. On the other hand, if switching to attrs, tackling field() & co. would be inevitable. I'd expect we would end up end up with two sets of code, the legacy one and the new one, and convert plugin by plugin, to make things reviewable.

@LecrisUT
Copy link
Contributor Author

Hmm, that would be unexpected. Once the class is successfully defined, existing code should be able to work with it easily, Optional or not.

I think I am thinking of the schema validation error that I currently have. I will come back to this if the issue persists with using assert.

The report from JSON validation is just a warning

Ok, I can than move on and see what I can hack up further.

I'd expect we would end up end up with two sets of code, the legacy one and the new one, and convert plugin by plugin, to make things reviewable.

Yep, that's the plan I had in mind. First figure out if we will have any surprises with EPEL + Fedora rawhide compatibility :D

@LecrisUT LecrisUT mentioned this issue Apr 5, 2024
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants