Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various fixes to bring notebook up to usable state. #105

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

zoeqevans
Copy link

Per discussion on Twitter (https://twitter.com/srush_nlp/status/1184458244363870209), this is my attempt to bring the original NamedTensor proposal / post up to date with the current PyTorch functionality. Where possible, I update the code to use PyTorch Named Tensors. PyTorch's Named Tensors do not support much of the functionality proposed in this note: in those cases, I updated the code to work with the latest version of the custom namedtensor library.

The updated version of the notebook runs without error on Colab & my local machine, and reproduces the demonstrations from the original. I don't know how to render it as an HTML file to replace the current blog post, but hopefully somebody else does.

@srush
Copy link
Contributor

srush commented Jul 29, 2022

Wow, so cool!

Did you find that the current named tensors proposal matches the blog post? There are some new aspects that are pretty cool and different. I'm also not sure how stable it is, so we might want to see it calm down.

Either way I can help render this, and maybe move it to a more modern template. People might find it useful.

@srush
Copy link
Contributor

srush commented Jul 29, 2022

Oh, actually, there is one issue. As far as I know the namedtensor currently in pytorch is deprecated, and they have been experimenting with a new version here: https://github.com/facebookresearch/torchdim . I'm a little hesitant to update to the version they don't seem to be supporting. I wonder if we should move to the new one?

@zoeqevans
Copy link
Author

zoeqevans commented Jul 29, 2022

Hmmn, it's not clear to me whether TorchDim or Named Tensors is more likely to be the future of the concept within the PyTorch ecosystem. Why do you say that the Named Tensors currently in PyTorch are deprecated? Their documentation page says

The named tensor API is a prototype feature and subject to change.

but equally the TorchDim ReadMe says

torchdim is a preview release so that we can collect feedback on the API. It may have bugs, and there are known places where performance can be improved.

I didn't know about TorchDim until you linked it (thanks!). TorchDim is much closer to functionality-parity with your original NamedTensor library; PyTorch's current setup only supports very basic operations with Named Tensors (I had to write several hacky functions to work around this while porting over the code). However, TorchDim is barely a month old, so I'd assume that the current Named Tensor implementation is more widely used. From an educational-usefulness perspective, I don't know which is better for a rewrite - thoughts?

@srush
Copy link
Contributor

srush commented Jul 29, 2022

Good question. I got the sense from talking with the torch team that the current implementation was kind of a dead end. The new one is pretty interesting and I'd like to support it, but you are right that it is experimental at the moment.

Given that you did the work, I'm happy with whatever you feel like. I guess the question is, what is the purpose: Is it to argue that people should use this, should develop it further, or just want to understand the original source material.

@srush
Copy link
Contributor

srush commented Jul 29, 2022

Another random idea would be to write this in Jax. Their implementation is really cool and very close to the original. https://jax.readthedocs.io/en/latest/notebooks/xmap_tutorial.html

@zoeqevans
Copy link
Author

Good question. I got the sense from talking with the torch team that the current implementation was kind of a dead end. The new one is pretty interesting and I'd like to support it, but you are right that it is experimental at the moment.

Given that you did the work, I'm happy with whatever you feel like. I guess the question is, what is the purpose: Is it to argue that people should use this, should develop it further, or just want to understand the original source material.

Ah, that's good to know! I agree, I prefer the new approach more, so if that is the future direction then it makes more sense to use than the current implementation.

The initial point of the rewrite was that I thought the post was a fascinating idea, and it irked me that the code didn't actually run. The post is already, imo, a compelling argument for why people should use named tensors in some form and get behind their development.

I just felt the argument would be even more compelling if their first interaction with named tensors (which I imagine for many people is playing with your notebook) worked correctly. This is both from a confidence standpoint (if they don't work in the demo, why trust them in your real projects?) and an experimentation standpoint (if the base code in the notebook works immediately, users can immediately build familiarity by tweaking the code to their own context, extending the demos, etc).

With that in mind I'm leaning more towards TorchDim than the Jax version, because I imagine people will be less familiar with Jax than PyTorch. I'm fearful they might see quirky experiment rather than an actual improvement to their toolchain and mindset if the post is in Jax.

What do you reckon? Happy to re-re-write in either framework. After mulling on this for a few days I'm just keen to spread the Named Tensor gospel in whatever way is most effective.

@srush
Copy link
Contributor

srush commented Aug 7, 2022

So I think it would be fun to do TorchDim then. Not sure if everything translates over but it feels like it might be a bit more future oriented.

But I think you should let me know if it feels like it makes sense. It is possible that some of the ideas in the blog post just don't transfer. In that case, might be worth just leaving as a historical document.

If you do complete it, let's do something like I did with the annotated transformer and udpate to a more modern formatting (http://nlp.seas.harvard.edu/annotated-transformer/)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants