You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
The HybridBlock.export() function currently saves two files: [path]-symbol.json describing structure and [path]-####.params for parameters.
The parameters filename contains the epoch number, zero-padded to minimum length 4 (but, given the implementation, will be longer if the number exceeds 9999).
Because the calling code doesn't have control over this naming, any program operating in potentially non-empty directories would need to re-implement the logic of the API in order to keep track of what files the program had output: Making user code fragile to API changes.
It would be better if the API either supported override of the filenames, included them in the return value (maybe just a tuple?), or both. Seems like none of this would be a particularly big code change.
References
mxnet.recordio.MXIndexedRecordIO allows specification of both the record and index file names, even though complementary tools like gluoncv.data.RecordFileDetection require them to be named specifically.
In the latter's case, the two files *.idx and *.rec differ only by extension which is trivial. Introducing string number formatting logic is less so.
Use Case Example
As a training script author / MLOps engineer I want to track the exact locations of best and interval-checkpoint model exports made by my script So that I can tidy up superseded "best" parameters (keeping only current best and interval-driven checkpoints), without deleting any other jobs' outputs, no matter how disorganized my data scientist users are
The text was updated successfully, but these errors were encountered:
Happy to try @leezu, but what other language bindings besides Python would the PR need to update at the same time? As far as I can tell only Perl seems to have an equivalent implementation right?
In general, it should be fine to target Python first, as this change won't lead to an incompatibility between language bindings (at least "include filenames in the return value"). If you're familiar with Perl, you can include the corresponding changes.
Description
The HybridBlock.export() function currently saves two files:
[path]-symbol.json
describing structure and[path]-####.params
for parameters.The parameters filename contains the epoch number, zero-padded to minimum length 4 (but, given the implementation, will be longer if the number exceeds 9999).
Because the calling code doesn't have control over this naming, any program operating in potentially non-empty directories would need to re-implement the logic of the API in order to keep track of what files the program had output: Making user code fragile to API changes.
It would be better if the API either supported override of the filenames, included them in the return value (maybe just a tuple?), or both. Seems like none of this would be a particularly big code change.
References
mxnet.recordio.MXIndexedRecordIO
allows specification of both the record and index file names, even though complementary tools likegluoncv.data.RecordFileDetection
require them to be named specifically.In the latter's case, the two files
*.idx
and*.rec
differ only by extension which is trivial. Introducing string number formatting logic is less so.Use Case Example
As a training script author / MLOps engineer
I want to track the exact locations of best and interval-checkpoint model exports made by my script
So that I can tidy up superseded "best" parameters (keeping only current best and interval-driven checkpoints), without deleting any other jobs' outputs, no matter how disorganized my data scientist users are
The text was updated successfully, but these errors were encountered: