Skip to content
This repository has been archived by the owner on Oct 17, 2023. It is now read-only.

Fix a memory leak in LCIO -> EDM4hep conversion #28

Merged
merged 1 commit into from
Feb 27, 2023

Conversation

jmcarcell
Copy link
Member

@jmcarcell jmcarcell commented Feb 7, 2023

BEGINRELEASENOTES

  • Fix a memory leak due to collections not being deleted

ENDRELEASENOTES

Fixes key4hep/k4MarlinWrapper#96. The issue is that there are some collections that are created but never deleted so the fix is to make sure to delete them. When running with a file with events with tracks and clusters I don't get any memory leaks with this fix (and I do get them without the fix). To reproduce, build k4LCIOReader and then build k4MarlinWrapper (needed because the signature of one function changed).

Edit: After some discussion in the key4hep call, it has been pointed out that these collections may not be registered correctly

@jmcarcell jmcarcell requested a review from tmadlener February 7, 2023 08:40
@tmadlener
Copy link
Contributor

From what I understand currently, the collections that are converted but never put into the Gaudi TES are leaked. I am not sure there is an easy fix for this, because ownership will probably be a proper mess. The solution here would mean that there are collections that we convert multiple times, because we no longer cache them. However, at the converter level there is no easy way of keeping track of which collections have been added to the TES and which were not, I think.

Having said that, we are working on a "standalone" conversion from LCIO to EDM4hep that should hopefully address these concerns, and also have an API that is more similar to the one found in k4EDM4hep2LCIOConv. We should have this ready shortly and are aiming to upstream that to k4EDM4hep2LCIOConv.

Could you check if those potential double conversions actually have a measurable impact on the runtime in the end?

@zoujh
Copy link
Collaborator

zoujh commented Feb 10, 2023

In my original thought, the LCIOInput algorithm is always loaded to register the collections to TES in the context of Gaudi. And then the LCIOInput.collections property must be set properly. The memory leak happens when some collections are set in LCIOInput.collections without their dependencies. In my opinion, a better solution is to register the dependencies to TES in LCIOInput automatically.

@jmcarcell
Copy link
Member Author

Some comments first to @tmadlener:

From what I understand currently, the collections that are converted but never put into the Gaudi TES are leaked.
Yes.

The solution here would mean that there are collections that we convert multiple times, because we no longer cache them.

While I haven't looked at if this is happening or not I think in most cases this is not happening. What is happening now is that we have a list of collections to go through, get them and register them in Lcio2EDMhep.cpp. During this process the collections that depend on those are also transformed but in the end only the 'main' one is returned and registered, see

for_each(m_name2src.begin(), m_name2src.end(), [this](auto &v) {
if (v.second->getTypeName() == "MCParticle")
getCollection(v.first);
});
for an example of how in a sort of recursive way the transformation of other collections is done. So I think this only happens once for each dependent collection most of the time unless there are lots of inter-dependencies between collections. Timing results are not conclusive but I think they show that there isn't a huge number of conversions (and I vaguely remember putting some prints inside the transformation functions and the ones I checked were called only once per event).

Some timing results, master of k4LCIOReader:

k4run main.py > /dev/null 2>&1  28.20s user 1.16s system 99% cpu 29.449 total

Branch in this PR:

k4run main.py > /dev/null 2>&1  26.81s user 0.78s system 99% cpu 27.693 total

(of course the timing tests are done in the same machine, with the same python script and the same file with 10000 events).

Comments for @zoujh:
I'm not sure I fully understand. Are you referring to the LCIOInput.h file? I assume it will have the same problem since it's also using the k4LCIOReader but the problem is in the k4LCIOReader right? But I don't think we are using it from k4MarlinWrapper since the k4LCIOReader is being called directly. Are you saying that we should change the list of collections, that starts here https://github.com/key4hep/k4MarlinWrapper/blob/9ef24c4d570899cb0b566aebbbb92fc35442685a/k4MarlinWrapper/src/components/Lcio2EDM4hep.cpp#L163 ? (similarly to how it is done in LCIOInput

@tmadlener
Copy link
Contributor

Thanks for checking again. I don't see a really easy solution here, unfortunately. The one thing that could be a sort of stop gap to this is to introduce a vector<string> into the k4LCIOReader to keep track of the collections that have actually been handed off to the TES, and then pass that list back to the converter e.g. in k4LCIOReader::endOfEvent to delete all the podio::CollectionBases the m_name2dest map that are not in this list.

@andresailer andresailer requested a review from zoujh February 21, 2023 15:00
Copy link
Contributor

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As just discussed during the meeting: this is not the ideal solution, but it works for now, and it seems that converting some collections multiple times is an acceptable thing to do performance wise.

Unless @zoujh has any concerns, I think we should merge this to at least fix the memory leak and then see how we handle this in the longer term.

@zoujh zoujh merged commit 8a52b3c into key4hep:master Feb 27, 2023
@tmadlener tmadlener mentioned this pull request Apr 19, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory leak in LCIO to EDM4hep conversion
3 participants