-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Various fixes to bring notebook up to usable state. #105
base: master
Are you sure you want to change the base?
Conversation
Wow, so cool! Did you find that the current named tensors proposal matches the blog post? There are some new aspects that are pretty cool and different. I'm also not sure how stable it is, so we might want to see it calm down. Either way I can help render this, and maybe move it to a more modern template. People might find it useful. |
Oh, actually, there is one issue. As far as I know the namedtensor currently in pytorch is deprecated, and they have been experimenting with a new version here: https://github.com/facebookresearch/torchdim . I'm a little hesitant to update to the version they don't seem to be supporting. I wonder if we should move to the new one? |
Hmmn, it's not clear to me whether TorchDim or Named Tensors is more likely to be the future of the concept within the PyTorch ecosystem. Why do you say that the Named Tensors currently in PyTorch are deprecated? Their documentation page says
but equally the TorchDim ReadMe says
I didn't know about TorchDim until you linked it (thanks!). TorchDim is much closer to functionality-parity with your original NamedTensor library; PyTorch's current setup only supports very basic operations with Named Tensors (I had to write several hacky functions to work around this while porting over the code). However, TorchDim is barely a month old, so I'd assume that the current Named Tensor implementation is more widely used. From an educational-usefulness perspective, I don't know which is better for a rewrite - thoughts? |
Good question. I got the sense from talking with the torch team that the current implementation was kind of a dead end. The new one is pretty interesting and I'd like to support it, but you are right that it is experimental at the moment. Given that you did the work, I'm happy with whatever you feel like. I guess the question is, what is the purpose: Is it to argue that people should use this, should develop it further, or just want to understand the original source material. |
Another random idea would be to write this in Jax. Their implementation is really cool and very close to the original. https://jax.readthedocs.io/en/latest/notebooks/xmap_tutorial.html |
Ah, that's good to know! I agree, I prefer the new approach more, so if that is the future direction then it makes more sense to use than the current implementation. The initial point of the rewrite was that I thought the post was a fascinating idea, and it irked me that the code didn't actually run. The post is already, imo, a compelling argument for why people should use named tensors in some form and get behind their development. I just felt the argument would be even more compelling if their first interaction with named tensors (which I imagine for many people is playing with your notebook) worked correctly. This is both from a confidence standpoint (if they don't work in the demo, why trust them in your real projects?) and an experimentation standpoint (if the base code in the notebook works immediately, users can immediately build familiarity by tweaking the code to their own context, extending the demos, etc). With that in mind I'm leaning more towards TorchDim than the Jax version, because I imagine people will be less familiar with Jax than PyTorch. I'm fearful they might see quirky experiment rather than an actual improvement to their toolchain and mindset if the post is in Jax. What do you reckon? Happy to re-re-write in either framework. After mulling on this for a few days I'm just keen to spread the Named Tensor gospel in whatever way is most effective. |
So I think it would be fun to do TorchDim then. Not sure if everything translates over but it feels like it might be a bit more future oriented. But I think you should let me know if it feels like it makes sense. It is possible that some of the ideas in the blog post just don't transfer. In that case, might be worth just leaving as a historical document. If you do complete it, let's do something like I did with the annotated transformer and udpate to a more modern formatting (http://nlp.seas.harvard.edu/annotated-transformer/) |
Per discussion on Twitter (https://twitter.com/srush_nlp/status/1184458244363870209), this is my attempt to bring the original NamedTensor proposal / post up to date with the current PyTorch functionality. Where possible, I update the code to use PyTorch Named Tensors. PyTorch's Named Tensors do not support much of the functionality proposed in this note: in those cases, I updated the code to work with the latest version of the custom
namedtensor
library.The updated version of the notebook runs without error on Colab & my local machine, and reproduces the demonstrations from the original. I don't know how to render it as an HTML file to replace the current blog post, but hopefully somebody else does.