A followup issue to #3269. We likely need to rationalize the relationship between an ADIOS engine's Close() and the engine destructor. Currently, you can get by without calling engine.Close() for most engines and everything still gets cleaned up neatly when the engine is destroyed. That is because file handles get closed on deallocation, and generally the writer-side file data is stable after EndStep. Specifically, even file writers don't require a "finalize" step on close. But some streaming engines (at least SST) are different. If you don't do an explicit Close() on both the writer and reader side on SST and instead rely upon the destructor, bad things can happen. At a minimum, we get "unexpected connection close" errors thrown, which may be unclear to the user (who may well expect that the destructor does Close()).
That said, I don't think we can make the engine destructor (or even just the SST destructor) do Close(). The SST connection shutdown protocol requires MPI collective operations and may actually block for a considerable time while handshaking with its peer. Since destructors might be called during exception handling (which might not happen symmetrically across all ranks), doing collectives inside the destructor might result in the program hanging without explanation or error.
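To make the exception-handling concern concrete, here is a minimal user-side sketch, not something proposed for ADIOS itself. `FakeEngine`, `CloseOnNormalExit`, and `CloseCallsFor` are hypothetical names, and no MPI or ADIOS calls are involved. The guard calls Close() only when its scope exits normally and skips it while an exception is unwinding the stack, which is exactly the case where a collective Close() on one rank could hang. It assumes C++17 for `std::uncaught_exceptions()`.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Hypothetical stand-in for an engine; it only counts Close() calls.
struct FakeEngine
{
    int closeCalls = 0;
    void Close()
    {
        assert(closeCalls == 0 && "Close() should run at most once");
        ++closeCalls;
    }
};

// Scope guard: calls Close() on normal scope exit, skips it during unwinding.
template <class Engine>
class CloseOnNormalExit
{
public:
    explicit CloseOnNormalExit(Engine &engine)
    : m_Engine(engine), m_ExceptionsAtEntry(std::uncaught_exceptions())
    {
    }
    CloseOnNormalExit(const CloseOnNormalExit &) = delete;
    CloseOnNormalExit &operator=(const CloseOnNormalExit &) = delete;

    ~CloseOnNormalExit()
    {
        // More in-flight exceptions than at construction means this scope is
        // being unwound, so the (potentially collective) Close() is skipped.
        if (std::uncaught_exceptions() == m_ExceptionsAtEntry)
        {
            m_Engine.Close();
        }
    }

private:
    Engine &m_Engine;
    int m_ExceptionsAtEntry;
};

// Returns how many times Close() ran for a scope that exits normally
// (throwInside == false) or by throwing (throwInside == true).
inline int CloseCallsFor(bool throwInside)
{
    FakeEngine engine;
    try
    {
        CloseOnNormalExit<FakeEngine> guard(engine);
        if (throwInside)
        {
            throw std::runtime_error("simulated failure on one rank");
        }
    }
    catch (const std::runtime_error &)
    {
    }
    return engine.closeCalls;
}
```

On the throwing path the engine is left open, so the peer would still see an unexpected connection close; the sketch only avoids entering a collective that other ranks may never reach.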
I'm thinking that the best we can do here is to set up the destructor for some engines to print a warning if it gets called while the stream is still open. There may be code that relies upon the "destructor acts like Close()" behavior for engines where it actually does, so I would not enable this warning message for all engines, but enabling it for engines where it is important would reduce confusion. I'll be poking at this soon, but wanted to open an issue in case there were other thoughts.
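A rough sketch of the warn-in-destructor idea, assuming the engine tracks an open/closed flag. `StreamEngineSketch`, `WarningSink`, and `PrintToStderr` are hypothetical names made up for illustration, not ADIOS classes, and the warning text is invented. The destructor performs no shutdown protocol at all; it only reports that Close() was skipped.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <string>
#include <utility>

// Hypothetical stand-in for a streaming engine such as SST. It is not part of
// the ADIOS2 class hierarchy and only models the open/closed state.
class StreamEngineSketch
{
public:
    using WarningSink = std::function<void(const std::string &)>;

    // Default sink: write the warning to stderr.
    static void PrintToStderr(const std::string &message)
    {
        std::cerr << message << '\n';
    }

    explicit StreamEngineSketch(std::string name,
                                WarningSink sink = PrintToStderr)
    : m_Name(std::move(name)), m_Sink(std::move(sink))
    {
    }

    // Explicit shutdown. In a real streaming engine this is where the
    // collective handshake with the peer would take place.
    void Close()
    {
        assert(m_IsOpen && "Close() called on an already closed engine");
        m_IsOpen = false;
    }

    // The destructor never runs the shutdown protocol; it only reports that
    // the caller skipped Close().
    ~StreamEngineSketch()
    {
        if (m_IsOpen)
        {
            m_Sink("Warning: engine \"" + m_Name +
                   "\" destroyed without Close(); the peer may report an "
                   "unexpected connection close");
        }
    }

private:
    std::string m_Name;
    WarningSink m_Sink;
    bool m_IsOpen = true;
};
```

Because the check is a local flag test, it is safe during exception unwinding and on any subset of ranks. Engines where the destructor already behaves like Close() would simply not install the check.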
Some of this analysis was subtly wrong. In particular, the active flag in our file engines with streaming support gets its status changed in file Close(). So while reader file engines can safely exit without Close(), writer engines that don't call Close() leave the file in a different state than the ones that do. For those file engines, I'm of the opinion that ADIOS should add a destructor to the writer engine that at least modifies the status of that flag on disk. That would allow files from terminated runs to perhaps be treated more normally. However, I'm not sure we can actually call Close() in the destructor because some of the things it does in BP4 (like Flush, WriteCollectiveMetadata, etc.) do perform MPI collective operations. Now in normal operation, those things probably happened in EndStep, but our API allows you to skip that sometimes too... @pnorbert, thoughts here? Fixing the active flag seems to be the most minimal approach that improves on the current scenario. Not a complete solution by far, but it doesn't seem like a huge thing to do in a destructor and would at least make some files more directly readable.
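Roughly what I have in mind for the destructor, as a sketch only. The file layout here (a single status byte at offset 0 of an index file) and the names `FileWriterSketch`, `MarkInactive`, `ReadActiveFlag`, and `kActiveFlagOffset` are assumptions for illustration and do not reflect the actual BP4 format or code. The point is that the destructor does local file I/O only and never enters a collective.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <string>
#include <utility>

// Assumed layout for this sketch: one status byte at a fixed offset.
constexpr std::streamoff kActiveFlagOffset = 0;
constexpr char kActive = 1;
constexpr char kInactive = 0;

// Hypothetical writer that marks its index file active while open.
class FileWriterSketch
{
public:
    explicit FileWriterSketch(std::string indexPath)
    : m_IndexPath(std::move(indexPath))
    {
        std::ofstream out(m_IndexPath, std::ios::binary | std::ios::trunc);
        out.put(kActive);
    }

    // A real Close() would also flush data and write collective metadata;
    // this sketch models only the flag update.
    void Close()
    {
        assert(m_IsOpen && "Close() called on an already closed writer");
        MarkInactive();
        m_IsOpen = false;
    }

    // Destructor fallback: flip the flag, nothing else. No collectives.
    ~FileWriterSketch()
    {
        if (m_IsOpen)
        {
            MarkInactive();
        }
    }

private:
    // Local, best-effort update of the status byte.
    void MarkInactive()
    {
        std::fstream file(m_IndexPath,
                          std::ios::binary | std::ios::in | std::ios::out);
        if (file)
        {
            file.seekp(kActiveFlagOffset);
            file.put(kInactive);
        }
    }

    std::string m_IndexPath;
    bool m_IsOpen = true;
};

// Reads the status byte back; returns a negative value if it cannot be read.
inline int ReadActiveFlag(const std::string &indexPath)
{
    std::ifstream in(indexPath, std::ios::binary);
    return in.get();
}
```

In an MPI run, presumably only the rank that owns the index file would do this update, which keeps it a purely local operation.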