-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Memory fixes. Resolves #10867, and resolves #14080 #14372
Conversation
@mxnet-label-bot add [Memory, Scala, pr-awaiting-review] |
NeuralStyle on gpu seems to be failing. I'll have to dig into why. I also think I've thought of a better way to deal with the interdependency of symbol/executor than what is being done right now so I'll update that as well. |
If I understood correct(from the description), the problem is Executor creates some NDArrays that gets disposed before the executor? |
@nswamy This is because there is no guarantee that they are created at the same ResourceScope level. For example, the Executor gets created and returned to the user. The user then opens up a ResourceScope to do inference. In this scope, the executor might end up creating new resources for itself. Those resources are then in the scope level of the user's ResourceScope and get disposed when that closes. This leaves the Executor without the dependencies it created. The solution is to have objects like executor put any dependencies that they create into the same ResourceScope as themselves. This way they are all coupled together. I thought about using moveToOuterScope but there's no way to know how many levels of scope could be between the two. |
@andrewfayres, thanks for taking care of this, Its a really nice find :) I find the design of executor less than desirable(where it creates states in a method dynamically) after the
My suggestion is to create a couple methods in ResourceScope.
I am open to hear any other interesting ideas that is less intrusive. |
I came up with a different solution which we (@nswamy and I) discussed offline. Essentially, ResourceScope.using already has an optional parameter for an existing scope. A few changes minor changes should let us reuse that. This approach should be more consistent and intuitive. I'll update the PR. |
Any updates? |
I was on call last week and didn't have time to pick this back up unfortunately. I started working on it again yesterday and finished the code changes. I'm going to do testing today and if all goes well I'll have an update this afternoon. |
scala-package/core/src/main/scala/org/apache/mxnet/ResourceScope.scala
Outdated
Show resolved
Hide resolved
@@ -179,6 +175,16 @@ object ResourceScope { | |||
} | |||
} | |||
|
|||
private[mxnet] def usingIfScopeExists[A](scope: Option[ResourceScope])(body: => A): A = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you need this new method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah so originally I wanted to use the .using method directly but ran into some problems. There are times when we need to add new native resources to the same scope of the parent but there's no guarantee that the parent is actually in an existing scope.
When this happens it leads to a few issues. First, the parent is in None scope which we cannot pass to ResourceScope.using. Second, we still want to execute the body and allocate all the new resources but don't want to want to make a new ResourceScope because all those new resources will disappear with the new scope.
Alternatives that I thought of were: 1.) to have the caller check whether or not it was in a scope and handle it appropriately. This is ugly and puts the onus on the callers in multiple places. 2.) Changing the using method to work with a None scope. I opted not to do this because it complicates that method and I believe would require changing the method parameters which I didn't want to do. 3.) Changing the default scope to be something other than None. This is probably a reasonable solution. Maybe we have some kind of base scope or something similar. That's likely to be a fairly significant change in both the design and behavior of this class.
Found a new bug last night where many of the optimizers cause a seg fault when used with FeedForward in 1.4.0. Surprisingly, no one has filed an issue for this so I went ahead and did so. It's fixed in this PR. Resolves #14498 |
CI seems to be experiencing unrelated failures. I'll restart this later once they appear to be resolved. |
I am surprised that the tests did not catch the seg-faults, I ran the code through several examples for many days and did not see the seg-faults. |
Feedforward is to be deprecated and Module should be used instead, we never deprecated that module, may be we should consider deprecating that. |
7ae3c3c
to
33c113c
Compare
scala-package/core/src/test/scala/org/apache/mxnet/ResourceScopeSuite.scala
Show resolved
Hide resolved
summary of offline discussion:
|
Thanks for nailing down this issue, great Job 💯 Please link the JIRA we discussed for fixing the scope closing if passed |
Link to Jira for looking into & fixing the ResourceScope.using closing scopes that are passed in. |
…e#14372) * Fixes for memory leak when reshaping executor * Fixed Adam Optimizer memory leak * Cleanup for PR * Added unit test for new ResourceScope method * Removing import that was added by overzealous ide * Add back in an import * Added flags for executor to know whether or not it owns NDArrays for disposal * Moving to ResourceScope.using implementation * Changes to make ResourceScope.using work with existing scope * Updating ResourceScope to work with existing scopes via usingIfScopeExists method * Fix clojure unit tests * Fixes to be compatibile with how clojure is using ResourceScope * Removing some unnecessary changes * Adding scope assertion in unit test
…e#14372) * Fixes for memory leak when reshaping executor * Fixed Adam Optimizer memory leak * Cleanup for PR * Added unit test for new ResourceScope method * Removing import that was added by overzealous ide * Add back in an import * Added flags for executor to know whether or not it owns NDArrays for disposal * Moving to ResourceScope.using implementation * Changes to make ResourceScope.using work with existing scope * Updating ResourceScope to work with existing scopes via usingIfScopeExists method * Fix clojure unit tests * Fixes to be compatibile with how clojure is using ResourceScope * Removing some unnecessary changes * Adding scope assertion in unit test
…e#14372) * Fixes for memory leak when reshaping executor * Fixed Adam Optimizer memory leak * Cleanup for PR * Added unit test for new ResourceScope method * Removing import that was added by overzealous ide * Add back in an import * Added flags for executor to know whether or not it owns NDArrays for disposal * Moving to ResourceScope.using implementation * Changes to make ResourceScope.using work with existing scope * Updating ResourceScope to work with existing scopes via usingIfScopeExists method * Fix clojure unit tests * Fixes to be compatibile with how clojure is using ResourceScope * Removing some unnecessary changes * Adding scope assertion in unit test
* Fixes for memory leak when reshaping executor * Fixed Adam Optimizer memory leak * Cleanup for PR * Added unit test for new ResourceScope method * Removing import that was added by overzealous ide * Add back in an import * Added flags for executor to know whether or not it owns NDArrays for disposal * Moving to ResourceScope.using implementation * Changes to make ResourceScope.using work with existing scope * Updating ResourceScope to work with existing scopes via usingIfScopeExists method * Fix clojure unit tests * Fixes to be compatibile with how clojure is using ResourceScope * Removing some unnecessary changes * Adding scope assertion in unit test
* Fixes for memory leak when reshaping executor * Fixed Adam Optimizer memory leak * Cleanup for PR * Added unit test for new ResourceScope method * Removing import that was added by overzealous ide * Add back in an import * Added flags for executor to know whether or not it owns NDArrays for disposal * Moving to ResourceScope.using implementation * Changes to make ResourceScope.using work with existing scope * Updating ResourceScope to work with existing scopes via usingIfScopeExists method * Fix clojure unit tests * Fixes to be compatibile with how clojure is using ResourceScope * Removing some unnecessary changes * Adding scope assertion in unit test
…e#14372) * Fixes for memory leak when reshaping executor * Fixed Adam Optimizer memory leak * Cleanup for PR * Added unit test for new ResourceScope method * Removing import that was added by overzealous ide * Add back in an import * Added flags for executor to know whether or not it owns NDArrays for disposal * Moving to ResourceScope.using implementation * Changes to make ResourceScope.using work with existing scope * Updating ResourceScope to work with existing scopes via usingIfScopeExists method * Fix clojure unit tests * Fixes to be compatibile with how clojure is using ResourceScope * Removing some unnecessary changes * Adding scope assertion in unit test
Description
There were a few memory leaks in executor and some instances of where ResourceScope was causing problems. This should fix those known issues.
To elaborate on the ResourceScope issue: there are some times when classes such as Executor creates other NativeResources (usually NDArrays) after the Executor has been instantiated. The executor then has a dependency on that new NativeResource. Since these happen at different times, they could potentially be created at different ResourceScope levels. The result is an unstable state where exiting the scope would cause a crash. Fixed this by adding a method to NativeResource which allows moving a resource to the same scope as another. This allows the parent class to control the scope of their dependencies.
Resolves #10867, and resolves #14080.
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments