Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug fix in cache management for nested lists #1956

Closed

Conversation

MaximilianSchreff
Copy link
Contributor

SystemDS was previously not supporting nested lists correctly since the data of CacheableData objects within nested loops were always deleted after a function call.
Normally, there are rmvar statements after function calls to remove alll variables used within the function. To protect CacheableData objects (e.g. matrices) from having their data removed by the rmvar statements we use a cleanup-enabled flag. This flag was not correctly set for variables that were within a nested list. These commits fix this problem by flagging all elements, also within nested lists.

Automated tests have been added to test the changes.

@phaniarnab

@MaximilianSchreff
Copy link
Contributor Author

The bug can be triggered through a simple function call with a nested list as an argument.

Here is a dml script which does this.
list_bug.txt

@j143 j143 added this to the systemds-3.2.0 milestone Dec 17, 2023
@Baunsgaard
Copy link
Contributor

LGTM,

I think the Queue make things slower instead of the boolean list,
but if we need to support the dynamic lengths of the Lists or nested Lists, then this fix makes sense.
And the live variables management is a low overhead to begin with so, all good from my side.

We need to run the tests again once the main branch is clear of failing tests.

@MaximilianSchreff
Copy link
Contributor Author

@Baunsgaard
About the Queue - previously, we iterated through all variables and counted the number of cacheable objects. This was done to calculate the size of the boolean array but in a shallow manner. With the changes, we would now need to iterate through every single variable, i.e. laso recursively for lists, in order to get the size. That is why I chose a dynamic data structure. I'm not sure whether this is faster though. Since the number of live variables per function call is usually pretty low, I wasn't able to see any performance differences.

@Baunsgaard
Copy link
Contributor

thanks for the PR, it is now merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants