Close #21810 #22288
Conversation
Thanks for contributing to Ivy! 😊👏
Hi, thanks for looking into this @DevBhuyan 😄
You'll have to rewrite the syntax a bit so that it is in line with the general convention. I'd recommend checking out the other classes in that file.
Not super sure if the implementation is correct - ideally it should be similar to PyTorch's Transformer module - https://pytorch.org/docs/stable/generated/torch.nn.Transformer.html#torch.nn.Transformer https://github.com/pytorch/pytorch/blob/main/torch/nn/modules/transformer.py
and not a model with positional encodings pre-coded, like this official PyTorch example that makes use of the Transformer module - https://github.com/pytorch/examples/blob/13009eff7a80ebcf6ae89ed217d5d176bd3e019d/word_language_model/model.py#L107
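For reference, the interface being pointed to looks roughly like this (adapted from the linked PyTorch docs; the key point is that torch.nn.Transformer is a bare encoder-decoder block, and positional encodings are added by the caller, outside the module):

```python
import torch
import torch.nn as nn

# torch.nn.Transformer is a plain encoder-decoder block: no embeddings,
# no positional encodings. Default input shapes are (seq, batch, d_model).
model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6)

src = torch.rand(10, 32, 512)  # (source length, batch, d_model)
tgt = torch.rand(20, 32, 512)  # (target length, batch, d_model)
out = model(src, tgt)          # -> shape (20, 32, 512)
```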
@vedpatwardhan - do you think the above makes sense, or is @DevBhuyan's implementation fine as it is? 🤔
Hi @rishabgit, thanks for taking the time to review my PR. Yes, you're right, I need to change the syntax to fit with the other classes. My previous code was also just an attempt to write the layer as simply as possible (autoregressive, without a decoder). I guess I'll need to modify it to an encoder-decoder architecture (like PyTorch's). I'll work on it and make a commit as soon as I have it ready. Thanks again for the suggestions :)
Hey @rishabgit, your suggestion makes perfect sense, we should definitely try and align with the PyTorch implementation.
Hi @vedpatwardhan, I guess I was assuming a totally different direction, a decoder-only transformer. I agree with @vedpatwardhan and @rishabgit, and I'm rewriting it entirely to be in line with PyTorch's implementation and the other classes from layers.py. Please excuse the unforeseen closing and reopening of this Pull Request :) I'm still new to contributing on GitHub; I accidentally tried to remove another branch and this happened.
Hi @rishabgit, I have updated the Transformer class to fit in line with PyTorch's implementation as well as with the other classes in the layers.py file. Since I had used PyTorch's implementation as a starting point for the class(es), there were portions of PyTorch's implementation that relied on lower-level backend implementations to speed things up ('FlashAttention'). I have not totally removed those lines; instead, I commented them out. Kindly let me know if there are any changes you'd suggest. Thank you for your patience :)
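As a rough illustration of what aligning with the other stateful classes might look like, here is a minimal sketch of a single encoder layer, assuming Ivy's ivy.Module convention (sub-layers built in __init__, computation in _forward); the sub-layer choices, argument names, and the self-attention call signature are assumptions, not the actual PR code:

```python
import ivy


class TransformerEncoderLayer(ivy.Module):
    # Hypothetical sketch of one encoder layer, following the
    # ivy.Module pattern used by the other classes in layers.py.
    def __init__(self, embed_dim, num_heads, hidden_dim):
        # Sub-layers are created before Module.__init__ so that
        # their variables are tracked by the parent module.
        self._attn = ivy.MultiHeadAttention(embed_dim, num_heads=num_heads)
        self._ff1 = ivy.Linear(embed_dim, hidden_dim)
        self._ff2 = ivy.Linear(hidden_dim, embed_dim)
        ivy.Module.__init__(self)

    def _forward(self, x):
        # Self-attention with a residual connection (normalization and
        # dropout omitted for brevity; self-attention call is assumed).
        x = x + self._attn(x)
        # Position-wise feed-forward with a residual connection.
        return x + self._ff2(ivy.relu(self._ff1(x)))
```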
I guess I messed up this PR. I'll create a new fork of the repo and open a fresh PR.
Closes #21810
From ToDo list: #14945
Update:
- Added Transformer layer in ~/ivy/ivy/stateful/layers.py
- Added test test_transformer_layer in ~/ivy/ivy_tests/test_ivy/test_stateful/test_layers.py
- Added a custom composite strategy to generate test data (ref. transformer_data(); see the sketch below)
This is my first PR. I tried to make it as appropriate as possible. Please let me know if there are any modifications that you'd suggest. Thank you!😊
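For anyone unfamiliar with composite strategies, a transformer_data()-style generator could look something like the following hedged sketch (the shapes, bounds, and return values here are assumptions, not the strategy actually added in the PR):

```python
import numpy as np
from hypothesis import strategies as st
from hypothesis.extra.numpy import arrays


@st.composite
def transformer_data(draw):
    # Draw small, bounded dimensions to keep test cases fast.
    batch_size = draw(st.integers(min_value=1, max_value=4))
    src_len = draw(st.integers(min_value=1, max_value=8))
    tgt_len = draw(st.integers(min_value=1, max_value=8))
    embed_dim = draw(st.sampled_from([8, 16]))

    # Bounded float32 elements avoid NaN/inf and overflow issues.
    elems = st.floats(min_value=-1.0, max_value=1.0, width=32)
    src = draw(arrays(np.float32, (src_len, batch_size, embed_dim), elements=elems))
    tgt = draw(arrays(np.float32, (tgt_len, batch_size, embed_dim), elements=elems))
    return src, tgt, embed_dim
```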