One other (likely unnecessary) discrepancy is that PartitionedDataSet doesn't currently support versioning, either of the overarching or the underlying dataset, whereas MatplotlibWriter does for the overarching dataset. Possibly relevant discussion, although more focused on the underlying dataset: kedro-org/kedro#521.
@deepyaman thanks, that is a very pertinent point given that experiment tracking is one of the main motivations here, and that directly relies on versioned datasets to work... So if we were to move to PartitionedDataSet for MatplotlibWriter then we should try and get kedro-org/kedro#521 done.
`MatplotlibWriter` currently supports 3 different save modes:

- `plt.figure` to a png file
- `List[plt.figure]` to multiple png files (labelled 0.png, 1.png, etc.)
- `Dict[str, plt.figure]` to multiple png files (labelled by dictionary keys)

There's a recently-added `overwrite` option associated with the latter two modes (kedro-org/kedro#868). This also exists for `PartitionedDataSet`.

The current behaviour has some problems:

On the other hand, the ability to save multiple plots rather than define one dataset per plot is essential. I have used it myself many times and seen it used a lot.

So, my question is: should we replace the matplotlib save modes that write multiple plots by wrapping `MatplotlibWriter` in `PartitionedDataSet` instead? Leaving aside how we do this technically for the moment, would this be a good change to make? i.e. will this be a user-friendly solution here? Will it allow everything we need to allow in terms of functionality?

My suspicion is that the only reason we don't already use `PartitionedDataSet` for this is historical (`MatplotlibWriter` was added to contrib at the same time `PartitionedDataSet` was added to core).

Tagging @Galileo-Galilei who I suspect will just have the answers here 😀
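To make the comparison concrete, here is a rough sketch of how the three save modes map inputs to file names, next to the "one file per dictionary key" layout that a partitioned approach would give. This is not Kedro's actual code: `writer_targets` and `partitioned_targets` are hypothetical helpers, the use of dictionary keys as file names and the `filename_suffix` handling are my assumptions, and plain strings stand in for figures so the snippet runs without matplotlib or kedro installed.

```python
from typing import Any, Dict, List


def writer_targets(data: Any, filepath: str) -> List[str]:
    """Hypothetical helper: where each of the three save modes would write."""
    if isinstance(data, dict):
        # Dict mode: one file per key (assumes the key is used as the file name).
        return [f"{filepath}/{key}" for key in data]
    if isinstance(data, list):
        # List mode: files are labelled 0.png, 1.png, etc.
        return [f"{filepath}/{index}.png" for index in range(len(data))]
    # Single-figure mode: exactly one file at the given path.
    return [filepath]


def partitioned_targets(
    data: Dict[str, Any], path: str, filename_suffix: str = ""
) -> List[str]:
    """Hypothetical helper: a partitioned layout only has the dict-style mode."""
    return [f"{path}/{partition_id}{filename_suffix}" for partition_id in data]


if __name__ == "__main__":
    figures = ["figure_a", "figure_b"]  # stand-ins for plt.figure objects
    print(writer_targets("figure_a", "data/08_reporting/plot.png"))
    print(writer_targets(figures, "data/08_reporting/plots"))
    print(writer_targets({"a.png": "figure_a"}, "data/08_reporting/plots"))
    # A list would first need converting to a dict for the partitioned layout.
    as_partitions = {str(index): fig for index, fig in enumerate(figures)}
    print(partitioned_targets(as_partitions, "data/08_reporting/plots", ".png"))
```

If the sketch is roughly right, the list mode is the only one without a direct partitioned equivalent, and it reduces to the dict mode once the figures are keyed by their index.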