Unlike other layers, the embedding layer uses a specific form of Tensor called IndexedSlices, whose data contains only the specific indices of the Tensor that we are interested in.
Thus, during the backward pass, we do not have to fill the whole value-tensor-shaped gradient in VarGrad; we can optimize it by building the gradient Tensor only for the referenced part.
In the current NNTrainer code there is no such consideration: it uses a same-shaped but zero-filled Tensor for the portions of the Tensor whose indices are not referenced (a redundantly sized Tensor declaration). A sketch of the idea is given below.
As far as I am concerned, we should work on this part in the near future for memory optimization.
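A minimal C++ sketch of the idea, not NNTrainer's actual Tensor/VarGrad API: the hypothetical `IndexedSlices` struct and `embedding_backward` function below only illustrate how the embedding backward pass could store gradient rows solely for the indices that appear in the batch, instead of allocating a weight-shaped, zero-filled Tensor.

```cpp
#include <cstddef>
#include <iostream>
#include <unordered_map>
#include <vector>

// Hypothetical sparse gradient container, analogous to TensorFlow's
// IndexedSlices: only the rows of the embedding table that were actually
// looked up in the forward pass carry a gradient.
struct IndexedSlices {
  std::vector<size_t> indices;          // row indices touched in the batch
  std::vector<std::vector<float>> rows; // one gradient row per touched index
};

// Backward pass of an embedding lookup: instead of allocating a
// weight-shaped, zero-filled gradient tensor, accumulate gradients only
// for the rows referenced by input_ids.
IndexedSlices embedding_backward(const std::vector<size_t> &input_ids,
                                 const std::vector<std::vector<float>> &grad_out) {
  std::unordered_map<size_t, size_t> slot; // row index -> position in output
  IndexedSlices grad;
  for (size_t i = 0; i < input_ids.size(); ++i) {
    size_t row = input_ids[i];
    auto it = slot.find(row);
    if (it == slot.end()) {
      slot[row] = grad.indices.size();
      grad.indices.push_back(row);
      grad.rows.push_back(grad_out[i]); // first occurrence: copy the row
    } else {
      auto &acc = grad.rows[it->second]; // repeated index: accumulate
      for (size_t d = 0; d < acc.size(); ++d)
        acc[d] += grad_out[i][d];
    }
  }
  return grad;
}

int main() {
  // Batch of 3 token ids drawn from a large vocabulary; only rows 7 and 42
  // need gradient storage, not the whole embedding table.
  std::vector<size_t> input_ids = {42, 7, 42};
  std::vector<std::vector<float>> grad_out = {
      {0.1f, 0.2f}, {0.3f, 0.4f}, {0.5f, 0.6f}};

  IndexedSlices grad = embedding_backward(input_ids, grad_out);
  for (size_t i = 0; i < grad.indices.size(); ++i) {
    std::cout << "row " << grad.indices[i] << ":";
    for (float v : grad.rows[i])
      std::cout << " " << v;
    std::cout << "\n";
  }
  return 0;
}
```

With this representation, the optimizer update for the embedding weight would also only need to touch the listed rows, so memory and compute scale with the number of distinct indices in the batch rather than the vocabulary size.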