Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/sg 815 fix override dataset params #1092

Merged
merged 19 commits into from
Jun 1, 2023

Conversation

BloodAxe
Copy link
Contributor

@BloodAxe BloodAxe commented May 26, 2023

This PR fixes interpolation issue in our dataset configs.
There are two distinct use cases which are almost mutually exclusive, but looks like I managed to get them both working.

Scenario 1: Colab users

In colab, users override params via dataloader factory methods as follows:

dataset = coco2017_train_yolo_nas(dataset_params=dict(data_dir="e:/coco2017", input_dim=[320,320]))

Under the hood, we load dataset config from YAML file and apply user-provided overrides.

Scenario 2: Pycharm / Command line users

python train_from_recipe.py --config-name=coco2017_yolo_nas_s \
 dataset_params.train_dataset_params.input_dim.0=320
 dataset_params.train_dataset_params.input_dim.1=320

In command-line users are launching training with train_from_recipe and instantiation of the datasets is somewhat crooked. Since we still calling coco2017_train_yolo_nas, but we do a double work of merging a configuration for dataset.

In the call to coco2017_train_yolo_nas(dataset_params) the dataset_params variable comes from experiment configuration file with all the interpolations/instantiations done by hydra for us.
And inside the function coco2017_train_yolo_nas still loads default dataset params from YAML file.
And them we do the "merging".

The interpolation did not work as intended because of wrong order of instantiation/merging/overrides.
Looks like this PR addresses both scenarios. There are additional unit test cases which tests new logic of merging configs.

✅ Added test case to verify override works
✅ Ran all unit tests to ensure they are good (Some were actually failing so I fixed them, that's why there are additional files in the PR)
✅ Ran all integration tests to ensure they are good as well (Appart from failing tests caused by lack of DEKR pretrained weights the rest are good and we did not break anything)

Not fixed yet

(And I'm not sure how)
The COCO YOLO-NAS dataset config has this strange image preprocessing of resizing image to longest size of 636 and then padding to 640. Why this is the problem:
We cannot use interpolation in this case for obvious reasons

val_dataset_params:
  ...
  input_dim: [636, 636]
  transforms:
    - DetectionPadToSize:
        output_size: [640, 640]
    ...
    - DetectionTargetsFormatTransform:
        input_dim: [640, 640]

Solution 1: Change input_dim: [636, 636] to input_dim: [640, 640]
Pros: Can use interpolation
Cons: Lose a few decimals of mAP

Solution 2: Leave as is
Pros: None?
Cons: This recipe would not support changing image resolution via command line or colab

@BloodAxe BloodAxe marked this pull request as ready for review May 26, 2023 14:58
@shaydeci
Copy link
Contributor

Nice!
Regarding the COCO config issue you brought up - did we check how significant is the mAP decrease? I mean its such a tiny difference...

Also, from your command example:

dataset_params.train_dataset_params.input_dim.0=320
dataset_params.train_dataset_params.input_dim.1=320

I think the transforms that Interpolate input_dim should handle single int values and turning them into [input_dim, input_dim]. I don't think we have too much of them in use tbh, and some already have this option implemetnted.
But...I do vaguely remember there might have been issues with overriding different data types inside hydra..

Should we also add the interpolations for our other detection recipes while at it?

Makefile Show resolved Hide resolved
src/super_gradients/training/dataloaders/dataloaders.py Outdated Show resolved Hide resolved
@BloodAxe
Copy link
Contributor Author

BloodAxe commented May 29, 2023

Regarding the COCO config issue you brought up - did we check how significant is the mAP > decrease? I mean its such a tiny difference...

If we do just resize + pad to 640x640 without 636x636 in between we get this:

(First number is actual value, second is expected)

YOLO-NAS L
AssertionError: tensor(0.5184) != 0.5222 within 0.001 delta (tensor(0.0038) difference)

YOLO-NAS M
AssertionError: tensor(0.5113) != 0.5155 within 0.001 delta (tensor(0.0042) difference)

YOLO-NAS S
AssertionError: tensor(0.4705) != 0.475 within 0.001 delta (tensor(0.0045) difference)

@BloodAxe
Copy link
Contributor Author

Also, from your command example:
dataset_params.train_dataset_params.input_dim.0=320
dataset_params.train_dataset_params.input_dim.1=320
I think the transforms that Interpolate input_dim should handle single int values and turning them into [input_dim, input_dim]. I don't think we have too much of them in use tbh, and some already have this option implemetnted.
But...I do vaguely remember there might have been issues with overriding different data types inside hydra..

Should we also add the interpolations for our other detection recipes while at it?

I'm not aware of hydra's command-line override grammar to express override of list. None of:

  • dataset_params.train_dataset_params.input_dim=320,320
  • dataset_params.train_dataset_params.input_dim=(320,320)
  • dataset_params.train_dataset_params.input_dim=[320,320]
    Worked for me.

I agree that:

dataset_params.train_dataset_params.input_dim.0=320
dataset_params.train_dataset_params.input_dim.1=320

Looks ugly.

I think what we can do with single-scalar input_dim is the following:

We should allow input_dim: Union[int, Tuple[int,int]] and in our transforms we will call ensure_is_tuple_of_two to ensure our input dim is broadcasted to (rows, cols) if a single scalar is given:

def __init__(self, input_dim: Union[int, Tuple[int,int]]):
   self.input_dim = ensure_is_tuple_of_two(input_dim)

Copy link
Contributor

@Louis-Dupont Louis-Dupont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerning input_dim, there is a third option which is to have 2 default params _default_resizing_dim and _default_padding_dim but only one public param input_dim

  • By default input_dim=None and we use _default_resizing_dim=633, _default_padding_dim=640.
  • When setting input_dim, it overrides both _default_resizing_dim and _default_padding_dim accordingly.

Similar logic to resizing_dim = input_dim if input_dim else _default_resizing_dim

Pro

  • Keep our results in the default case
  • Single param to override when someone wants to set input_dim

Con

  • Increase the complexity of the recipe (and/or resolver if we need to use them to do this if ... else ... logic)

Not sure we want it to go for that, it might not be worth it if we're fine with losing this few digit in MaP. But if we do care about these few digits I think it might be worth considering this approach

src/super_gradients/training/dataloaders/dataloaders.py Outdated Show resolved Hide resolved
@BloodAxe BloodAxe requested review from shaydeci and Louis-Dupont May 31, 2023 10:47
Louis-Dupont
Louis-Dupont previously approved these changes May 31, 2023
Copy link
Contributor

@Louis-Dupont Louis-Dupont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@Louis-Dupont Louis-Dupont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@shaydeci shaydeci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feeling a bit unease about the fact that users wont know in case they mixed non-primitve instantiated objects and some values they wish to interpolate, but tbh I really dont have a better solution :

LGTM

@BloodAxe
Copy link
Contributor Author

BloodAxe commented Jun 1, 2023

I can think only of one case where this may go wrong:
When user combines config-based transform declaration and already instantiated instances of transform object in transforms:

dataset_params={
"input_dim": 512,
"transforms": [
 DetectionRandomAffine(...),
 {"DetectionPaddedRescale": {"input_dim": "${dataset_params.train_dataset_params.input_dim}" }}
]

Frankly speaking I'm not sure whether we should ever allow this mixing.

@BloodAxe BloodAxe merged commit 7907c48 into master Jun 1, 2023
@BloodAxe BloodAxe deleted the feature/SG-815-fix-override-dataset-params branch June 1, 2023 12:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants