simple_bind elemwise_add with group2ctx fails #7080
Comments
This seems tricky :-(... I don't think we have device placement information during graph construction, and that's when we choose to use the CloneGradient method. @piiswrong Any ideas how to work around that? Is there a device placement pass in NNVM that we could use for making a copy node (or any pass that knows about devices)?
@eric-haibin-lin you are right, I just verified the Perl test is failing on the master branch as well and it's likely related to this issue. I'll be removing this Perl test from the master branch via another pull request until the issue is fixed. You can go ahead with merging my pull request for your sparse branch.
@sergeykolychev thanks!
@ptrendx are you referring to the place_device pass?
@eric-haibin-lin requesting an update on this bug. Has this been resolved?
@vrakesh No. The example is already posted above and you should be able to reproduce it. |
For bugs or installation issues, please provide the following information.
The more information you provide, the more likely people will be able to help you.
Environment info
Operating System: AWS Deep Learning AMI
Package used (Python/R/Scala/Julia): python
Or if installed from source:
MXNet commit hash (git rev-parse HEAD): 8c81ee4
If you are using python package, please provide
Python version and distribution: python 2.7
Error Message:
Please paste the full error message, including stack trace.
Minimum reproducible example
if you are using your own code, please provide a short script that reproduces the error.
Steps to reproduce
or if you are running standard examples, please provide the commands you have run that lead to the error.
What have you tried to solve it?
CloneGradient (src/operator/elemwise_op_common.h) is used for the elemwise_add backward pass, which reduces the number of copies. This, however, failed to copy the gradient across devices. It could probably be solved by registering an inplace_identity attribute for inplace updates.
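The failure mode described above can be illustrated with a toy model in plain Python. This is not MXNet code and does not reproduce the actual bug. `Tensor`, `copy_to`, `add_backward_clone`, `add_backward_device_aware`, and the device names `dev1`/`dev2` are all hypothetical names made up for this sketch. It only shows why handing the same gradient to both inputs of an add saves copies but leaves a gradient on the wrong device when the inputs live on different devices, and how a device-aware variant would copy only where needed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tensor:
    """Toy stand-in for an array: a list of values tagged with a device name."""
    data: List[float]
    device: str


def copy_to(t: Tensor, device: str) -> Tensor:
    """Explicit cross-device copy (roughly what a copy node would do)."""
    return Tensor(list(t.data), device)


def add_backward_clone(out_grad: Tensor, input_devices: List[str]) -> List[Tensor]:
    # Clone-style backward: every input receives the very same gradient object.
    # No copies are made, and the input devices are ignored.
    return [out_grad for _ in input_devices]


def add_backward_device_aware(out_grad: Tensor, input_devices: List[str]) -> List[Tensor]:
    # Reuse out_grad when the input is on the same device, copy otherwise.
    return [
        out_grad if dev == out_grad.device else copy_to(out_grad, dev)
        for dev in input_devices
    ]


def misplaced(grads: List[Tensor], input_devices: List[str]) -> List[int]:
    """Indices of inputs whose gradient is not on the input's device."""
    return [i for i, (g, d) in enumerate(zip(grads, input_devices)) if g.device != d]


# Two inputs of an add placed on different devices; the output gradient
# arrives on the second device.
input_devices = ["dev1", "dev2"]
out_grad = Tensor([1.0, 1.0], "dev2")

cloned = add_backward_clone(out_grad, input_devices)
aware = add_backward_device_aware(out_grad, input_devices)

print(misplaced(cloned, input_devices))  # [0]: gradient for input 0 is on dev2
print(misplaced(aware, input_devices))   # []: one copy was made for input 0
```

In this sketch the device-aware variant still avoids a copy for the input that shares a device with the output gradient, so the saving from cloning is kept wherever it is safe.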