The new dynamic dimension support for Tensors is great! Are there any plans to support it for `Module`s as well? This would allow for, e.g., a `Linear` layer where the input size is a const generic but the output size is dynamic.
I tried hacking this together myself, but ran into problems with allocation. The existing `Device` APIs seem strongly oriented toward allocating a `Module` based purely on its type. I can probably implement allocation and `Module` on my type manually, but it would be great to have a more ergonomic approach.
Glad you like it! Yeah, aligning the existing nn layers to support both compile-time and runtime dimensions would complicate things quite a bit. It's definitely possible; as one example, we could have linear defined as:
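A minimal, self-contained sketch of what that could look like (this is not dfdx's actual API; the `Dim` trait, `Const`, and `Linear` here are simplified stand-ins for illustration, with plain `Vec<f32>` storage in place of real tensors):

```rust
// Hypothetical sketch: a `Linear` generic over dimension types, so the
// input size can be a compile-time constant while the output size is a
// runtime `usize` (or any mix of the two).
trait Dim: Copy {
    fn size(&self) -> usize;
}

// Compile-time dimension, analogous to a `Const<N>` marker type.
#[derive(Clone, Copy)]
struct Const<const N: usize>;
impl<const N: usize> Dim for Const<N> {
    fn size(&self) -> usize {
        N
    }
}

// Runtime dimension.
impl Dim for usize {
    fn size(&self) -> usize {
        *self
    }
}

// Linear layer whose weight shape is (input, output); either dimension
// may be compile-time or runtime.
struct Linear<I: Dim, O: Dim> {
    weight: Vec<f32>, // flattened (input.size() x output.size()) matrix
    input: I,
    output: O,
}

impl<I: Dim, O: Dim> Linear<I, O> {
    fn new(input: I, output: O) -> Self {
        Linear {
            weight: vec![0.0; input.size() * output.size()],
            input,
            output,
        }
    }

    // Naive matrix-vector product: y = x * W.
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        assert_eq!(x.len(), self.input.size());
        let (i, o) = (self.input.size(), self.output.size());
        let mut y = vec![0.0; o];
        for col in 0..o {
            for row in 0..i {
                y[col] += x[row] * self.weight[row * o + col];
            }
        }
        y
    }
}

fn main() {
    // Const input size, runtime output size.
    let layer = Linear::new(Const::<4>, 7usize);
    let y = layer.forward(&[1.0; 4]);
    assert_eq!(y.len(), 7);
}
```

The key idea is that the layer is generic over *which kind* of dimension it holds, so `Linear<Const<4>, Const<7>>` and `Linear<Const<4>, usize>` share one implementation.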
As far as instantiation goes, you're right that the existing method assumes sizes known at compile time. We'd likely have to do something similar to what tensors do, where there's one version of creation for compile-time shapes (`dev.zeros()`) and a separate version for runtime sizes (`dev.zeros_like(&(3, 5))`).
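To make the two construction paths concrete, here's a toy sketch of the pattern (a stand-in `Cpu` device with `Vec<f32>` storage, not dfdx's real `zeros`/`zeros_like` implementations):

```rust
// Hypothetical device with two allocation paths, mirroring the
// `dev.zeros()` vs `dev.zeros_like(&(3, 5))` split described above.
struct Cpu;

impl Cpu {
    // Shape known at compile time: encoded entirely in the type,
    // so no runtime arguments are needed.
    fn zeros<const R: usize, const C: usize>(&self) -> Vec<f32> {
        vec![0.0; R * C]
    }

    // Shape supplied as a runtime value.
    fn zeros_like(&self, shape: &(usize, usize)) -> Vec<f32> {
        vec![0.0; shape.0 * shape.1]
    }
}

fn main() {
    let dev = Cpu;
    let a = dev.zeros::<3, 5>(); // compile-time shape
    let b = dev.zeros_like(&(3, 5)); // runtime shape
    assert_eq!(a.len(), b.len());
}
```

A module constructor for runtime dimensions would presumably take a similar "shape-like" argument describing the sizes, instead of deriving everything from the module's type.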
I think my main concern would be the increased complexity in adding these, both from internals perspective and also external usability perspective.
The other part of this is that in the current deep learning ecosystem, most training/inference libraries redefine neural network types in their own code; huggingface does this all over the place. As far as dfdx goes, hopefully people can define their own nn layers in a similar way. So if someone wanted to implement linear/transformer/etc. with runtime shapes outside of dfdx, that should be possible.