Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DataPipe] Properly cleanup unclosed files within generator function #6997

Merged
merged 2 commits into from
Dec 2, 2022

Conversation

ejguan
Copy link
Contributor

@ejguan ejguan commented Nov 30, 2022

There is a case that file.close is never called because when generator function has never reached to the end. A simple example would be zip two datepipes with different length. The longer DataPipe would never reach the end of generator and then it will be cleaned up by gc. So, the line of file.close is not executed. (This is the reason that Vitaly has to create this hack to retrieve all remaining data to make sure generator function is fully executed)

However, this hack introduces another problem where an infinite datapipe would make zip never end as it would try to deplete the infinite iterator. See: pytorch/data#865

So, in this PR, I am adding a try-finally clause to make sure the file.close is always executed during the destruction of generator object. Then, we don't need the hack within zip any more.

cc @pmeier @bjuncek

@ejguan ejguan requested a review from pmeier November 30, 2022 22:44
@ejguan ejguan requested a review from NivekT November 30, 2022 22:44
@ejguan ejguan marked this pull request as draft November 30, 2022 22:49
@ejguan ejguan changed the title [DataPipe] Properly cleanup unclosed files within generator function [3/4][DataPipe] Properly cleanup unclosed files within generator function Nov 30, 2022
Copy link
Collaborator

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, thanks @ejguan! One question out of curiosity:

I found a corner case that file.close is never called because when generator function has never reached to the end. A simple example would be zip two datepipes with different length. The longer DataPipe would never reach the end of generator and then it will be cleaned up by gc. So, the line of file.close is not executed.

Is this a problem for our datasets? Because that would indicate that something is off in our setup. If we zip, the datapipes should always have the same length. If not, did you figure out the issue for the datasets you fixed?

@ejguan
Copy link
Contributor Author

ejguan commented Nov 30, 2022

Is this a problem for our datasets? Because that would indicate that something is off in our setup. If we zip, the datapipes should always have the same length. If not, did you figure out the issue for the datasets you fixed?

@pmeier
The datasets are fine except the modified generator functions in this PR. And, torchvision would be broken is caused by demux that I will change in pytorch/pytorch#89973
The buffer within demux should be clean up when generator function exits.

@ejguan ejguan marked this pull request as ready for review November 30, 2022 23:20
@pmeier
Copy link
Collaborator

pmeier commented Dec 1, 2022

The datasets are fine except the modified generator functions in this PR

How are they broken? Are we zipping two pipes of different length like in the example you gave? Because that likely is not intended and thus we should not just mitigate, but investigate why this happens.

And, torchvision would be broken is caused by demux that I will change in pytorch/pytorch#89973
The buffer within demux should be clean up when generator function exits.

Not sure I understand. Are you saying that with this PR (3/4) here, our datasets will be broken after the core PR (2/4) is merged?

@ejguan ejguan changed the title [3/4][DataPipe] Properly cleanup unclosed files within generator function [2/4][DataPipe] Properly cleanup unclosed files within generator function Dec 1, 2022
@ejguan ejguan changed the title [2/4][DataPipe] Properly cleanup unclosed files within generator function [DataPipe] Properly cleanup unclosed files within generator function Dec 1, 2022
@ejguan
Copy link
Contributor Author

ejguan commented Dec 1, 2022

How are they broken? Are we zipping two pipes of different length like in the example you gave? Because that likely is not intended and thus we should not just mitigate, but investigate why this happens.

That makes sense. I have validated those datapipes have the same lengths. Changing zip would cause Error for the dataset in TorchVision due to internal mechanism of zip. Let's say we have two datapipes with the same length and we want to zip them together. zip would stop when the first iterator reaches StopIteration even though the second datapipe is about to reach Stopiteration. Then, the second datapipe's iterator would never execute file.close() as the iterator has not been iterated to the end.

Not sure I understand. Are you saying that with this PR (3/4) here, our datasets will be broken after the core PR (2/4) is merged?

I am sorry that I should be clear that this PR is a blocker for pytorch/pytorch#89974. It's totally fine as long this PR is landed before pytorch/pytorch#89974..

Copy link
Contributor

@NivekT NivekT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@pmeier
Copy link
Collaborator

pmeier commented Dec 2, 2022

Makes sense, thanks Erjia!

@ejguan
Copy link
Contributor Author

ejguan commented Dec 2, 2022

@pmeier Please feel free to merge this PR.

@pmeier pmeier merged commit 01c11a0 into pytorch:main Dec 2, 2022
@github-actions
Copy link

github-actions bot commented Dec 2, 2022

Hey @pmeier!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

facebook-github-bot pushed a commit to pytorch/data that referenced this pull request Dec 5, 2022
Summary:
There is a case that `file.close` is never called because when generator function has never reached to the end. A simple example would be `zip` two datepipes with different length. The longer DataPipe would never reach the end of generator and then it will be cleaned up by `gc`. So, the line of `file.close` is not executed. (This is the reason that Vitaly has to create this [hack](https://github.com/pytorch/pytorch/blob/4451eb24e6287dff62ff8a7ec0eda6a6998807b0/torch/utils/data/datapipes/iter/combining.py#L573-L583) to retrieve all remaining data to make sure generator function is fully executed)

However, this hack introduces another problem where an infinite datapipe would make `zip` never end as it would try to deplete the infinite iterator. See: #865

So, in this PR, I am adding a `try-finally` clause to make sure the `file.close` is always executed during the destruction of `generator` object. Then, we don't need the hack within `zip` any more.

- pytorch/pytorch#89973
- pytorch/vision#6997
- pytorch/pytorch#89974

Pull Request resolved: #910

Reviewed By: wenleix, NivekT

Differential Revision: D41633909

Pulled By: ejguan

fbshipit-source-id: 5613e131dc8b2962c22d2bc7e3a4bb3039440c48
facebook-github-bot pushed a commit that referenced this pull request Dec 12, 2022
…function (#6997)

Summary: Co-authored-by: Philip Meier <[email protected]>

Reviewed By: datumbox

Differential Revision: D41836894

fbshipit-source-id: b3a8c76245298f85b6ff19107bf2ef872bfb033a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants