-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added file_lock
in IoPathSaverIterDataPipe
#395
Conversation
As far as the test goes it seems like the So will extend this test to multiple processes and see how CI works on Windows and Linux @parmeet do you have some existing test/script I can leverage? Just wanna make sure your usecase is solved |
Unfortunately, no. we observed this issue when running DDP while training doc classification workflow using TorchRecipes internally. cc: @hudeven |
Side comment: I'm ok to have |
Completely correct. We are using this iopath file to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one comment on test.
@msaroufim has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please fix the followings before landing? Thank you!
@msaroufim has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please fix lint as well in this file? Really appreciate.
Note to self - there's now a genuine failure since I refactored def __iter__(self) -> Iterator[str]:
for filepath, data in self.source_datapipe:
if self.fn is not None:
> filepath = self.fn(filepath)
E TypeError: filepath_fn() missing 1 required positional argument: 'name'
E This exception is thrown by __iter__ of SaverIterDataPipe() |
@msaroufim has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
1 similar comment
@msaroufim has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
I think you might need to rebase onto |
Summary: Please read through our [contribution guide](https://github.com/pytorch/data/blob/main/CONTRIBUTING.md) prior to creating your pull request. - Note that there is a section on requirements related to adding a new DataPipe. Fixes #397 This is an identical PR to #395 which I accidentally deleted during a bad force push trying to resolve a rebase Pull Request resolved: #413 Reviewed By: ejguan Differential Revision: D36433231 Pulled By: msaroufim fbshipit-source-id: 0f71790eaded445f8e42911db4bea4c468175520
Untested, made this PR to make communication easier
Partially fixes #144
The main change is to lock a file when multiple processes are writing to the same file
The fix was upstreamed from an internal fix @hudeven made
And the test was extended from a gist @ejguan sent me
It does seem to be working as intended right now, @parmeet please let me know if this works for you
Logs with fix
Logs without fix