Usage with x-transformers #10

Open
sonovice opened this issue Jul 12, 2023 · 4 comments
Comments

@sonovice

Is it possible to easily use axial rotary embeddings with your x-transformers without having to dissect the Attention module? At first glance, there seems to be no simple way to just pass an instance of RotaryEmbedding to an x-transformers encoder.

Any help would be appreciated.

@lucidrains
Owner

@sonovice hey Simon 👋

you are seeing success with axial rotary embeddings, i'm guessing on mel spec?

that's a bit of a personal invention that i haven't broadcasted that much

i can think about integrating it if you share what your experimental results look like

@sonovice
Author

sonovice commented Jul 12, 2023

@lucidrains Hey Phil and thanks for the fast response.

Actually, I didn't have any kind of spectral features in mind (though you just triggered an entire world of new ideas 😉 )

What I would like to try is to recreate something like LayoutLM for musical scores with meaningful 2d relative positional embeddings to capture the relations between musical glyphs in a score page. Your axial rotary embeddings seem like a perfect fit.

EDIT: LayoutLM in a nutshell: take detected (and classified) objects from a text document image, add learned embeddings for x, y, w and h, and use these embeddings for tasks such as paragraph classification.
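
To make the "2d relative" property concrete, here is a minimal pure-Python sketch of the general idea behind axial rotary embeddings: feature pairs are split evenly between the axes, and each pair is rotated by an angle proportional to the coordinate on its axis. This is an illustration only, not the implementation in rotary-embedding-torch or x-transformers. The helper names `axial_rotary` and `dot`, the base of 10000 for the frequency schedule, and the toy vectors are all assumptions made for the example.

```python
import math

def axial_rotary(vec, coords, base=10000.0):
    """Rotate consecutive feature pairs of `vec` (hypothetical helper).

    Each axis in `coords` owns an equal share of the feature pairs, and a
    pair is rotated by (coordinate on its axis) * (frequency of the pair).
    """
    n_axes = len(coords)
    n_pairs = len(vec) // 2
    if len(vec) % 2 or n_pairs % n_axes:
        raise ValueError("len(vec) must be divisible by 2 * len(coords)")
    per_axis = n_pairs // n_axes
    out = []
    for p in range(n_pairs):
        axis, k = divmod(p, per_axis)
        # Assumed frequency schedule; a real library may use a different one.
        theta = coords[axis] * base ** (-k / per_axis)
        c, s = math.cos(theta), math.sin(theta)
        a, b = vec[2 * p], vec[2 * p + 1]
        out += [a * c - b * s, a * s + b * c]
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Toy query/key features for two glyphs (made-up numbers).
q = [0.3, -1.2, 0.7, 0.5, -0.4, 0.9, 1.1, -0.6]
k = [1.0, 0.2, -0.8, 0.4, 0.6, -0.3, 0.5, 0.7]

# Same (dx, dy) = (3, -2) offset at two different absolute page positions.
s1 = dot(axial_rotary(q, (5, 7)), axial_rotary(k, (2, 9)))
s2 = dot(axial_rotary(q, (40, 11)), axial_rotary(k, (37, 13)))

# A different offset, (3, -1), generally gives a different score.
s3 = dot(axial_rotary(q, (5, 7)), axial_rotary(k, (2, 8)))

print(abs(s1 - s2), abs(s1 - s3))
```

The attention score between two glyphs then depends only on their offset in x and y, not on where they sit on the page. The same helper accepts more axes, for example `axial_rotary(q, (x, y, w, h))` for a box described by four values, as long as the feature dimension is divisible by twice the number of axes.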

@sonovice
Author

sonovice commented Nov 6, 2023

@lucidrains I finally found some time to look at this again. Would you be open to a pull request against x-transformers if I manage to introduce this?

@alvitawa

@sonovice I'm looking into doing something similar (but in a different domain). Can I ask whether you succeeded in trying 4d rotary embeddings?
