Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warnings for deprecated datapipes are insufficient #322

Closed
pmeier opened this issue Mar 24, 2022 · 3 comments
Closed

Warnings for deprecated datapipes are insufficient #322

pmeier opened this issue Mar 24, 2022 · 3 comments
Assignees

Comments

@pmeier
Copy link
Contributor

pmeier commented Mar 24, 2022

Consider this snippet:

from torchdata.datapipes.iter import IterableWrapper

dp = IterableWrapper([])
dp = dp.read_from_tar()

There are three problems here:

  1. Running this with python $SCRIPT will not give you any warning, because DeprecationWarning's are filtered by default. They are only visible if a user activates them manually, i.e. running python -Wd $SCRIPT:

    DeprecationWarning: _DataPipeMeta and its functional API are deprecated and will be removed from the package `torch`. Please use TarArchiveLoader instead.
    

    Since most users won't do that, we should using a FutureWarning, which is the user-facing deprecation warning.

  2. Since the warning is emitted from __new__, we should use cls.__name__. Using type(cls).__name__ will give us the name of the meta class as seen as _DataPipeMeta above.

  3. If I used the functional API, it can be quite confusing what caused the warning. For example with the above fixed, the warning is

    FutureWarning: TarArchiveReaderIterDataPipe and its functional API are deprecated and will be removed from the package `torch`. Please use TarArchiveLoader instead.
    

    I have no TarArchiveReaderIterDataPipe in my code. Even if I had used the class interface, i.e. from torchdata.datapipes.iter import TarArchiveReader, there would still be a mismatch between my code and the warning.

I propose to switch deprecation_warning to something like

def deprecation_warning(
    old_class_name: str,
    *,
    old_functional_name: str = "",
    new_class_name: str = "",
    new_functional_name: str = "",
) -> None:
    msg = f"`{old_class_name}()`"
    if old_functional_name:
        msg = f"{msg} and its functional API `.{old_functional_name}()` are"
    else:
        msg = f"{msg} is"
    # TODO: Make the deprecation and removal version concrete.
    #  See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#minimizing-the-disruption-of-bc-breaking-changes
    msg = f"{msg} deprecated and will be removed in the future."

    if new_class_name or new_functional_name:
        msg = f"{msg} Please use"
        if new_class_name:
            msg = f"{msg} `{new_class_name}()`"
        if new_class_name and new_functional_name:
            msg = f"{msg} or"
        if new_functional_name:
            msg = f"{msg} `.{new_functional_name}()`"
        msg = f"{msg} instead."

    warnings.warn(msg, FutureWarning)

With this, the warning for the code above reads:

FutureWarning: `TarArchiveReaderIterDataPipe()` and its functional API `.read_from_tar()` are deprecated and will be removed in the future. Please use `TarArchiveLoader()` or `.load_from_tar()` instead.

LMK if I should send a PR to PyTorch core.

@NivekT
Copy link
Contributor

NivekT commented Mar 24, 2022

@pmeier Thanks for the suggestion. I think your version is better and you should send a PR.

For your convenience, here is the list of DataPipes being deprecated:

Two small complications:

  1. RoutedDecoder is deprecated and will be removed without replacement
  2. These DataPipes live in different libraries, making it hard to merge the changes without breaking any CI.

My suggestion to minimize CI breakage:

  1. Name the new function deprecation_warning_with_replacement, switch to use that for all deprecated DataPipes, except RoutedDecoder
  2. Update the deprecation_warning now that RoutedDecoder is the only user of it (unless the existing message is sufficient for that case)
    If everyone is comfortable with CI failing on TorchData for a little bit then this isn't an issue.

I can definitely make these PRs and merge them if you prefer that.

@pmeier
Copy link
Contributor Author

pmeier commented Mar 24, 2022

  1. RoutedDecoder is deprecated and will be removed without replacement

Not sure why this an issue. If you look into my suggestion above, only old_class_name needs to be passed. If neither new_class_name or new_functional_name is passed, only the first part of the message is generated. For example:

deprecation_warning("RoutedDecoder")
FutureWarning: `RoutedDecoder()` is deprecated and will be removed in the future.
  1. These DataPipes live in different libraries, making it hard to merge the changes without breaking any CI.

This is internal functionality, right? In that case I wouldn't worry to much. You are looking at one day of failure, which is kind of ok for code that lives in separated libraries IMO.

@NivekT
Copy link
Contributor

NivekT commented Mar 24, 2022

Sounds reasonable, feel free to put in a PR

facebook-github-bot pushed a commit to pytorch/pytorch that referenced this issue Apr 8, 2022
Summary:
Fixes pytorch/data#322.

Pull Request resolved: #74685

Reviewed By: mrshenli

Differential Revision: D35143494

Pulled By: NivekT

fbshipit-source-id: b87c312a93ea6a8ce2090da961312cb7e2cbc2eb
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this issue Apr 8, 2022
Summary:
Fixes pytorch/data#322.

Pull Request resolved: #74685

Reviewed By: mrshenli

Differential Revision: D35143494

Pulled By: NivekT

fbshipit-source-id: b87c312a93ea6a8ce2090da961312cb7e2cbc2eb
(cherry picked from commit cb74b81)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants