
Confusion about the different architectures between the code of VSSBlock and the figure in the paper #56

Open
Allenem opened this issue May 20, 2024 · 4 comments

Comments

@Allenem

Allenem commented May 20, 2024

x = input + self.drop_path(self.self_attention(self.ln_1(input)))

Hi, thank you very much for your excellent work!

I want to know why the forward function of class VSSBlock only includes

1 LayerNorm, SS2D, DropPath & addition,

rather than what is illustrated in Fig. 1(b) of the paper:

2 LayerNorm, 3 Linear layers, 1 DW-Conv, 2 activations, SS2D, addition & 1 element-wise product.

@YunhengWu-IB

I also would like to ask the same question!!
Do you know the reason now?

@Allenem

Allenem commented Aug 14, 2024

I also would like to ask the same question!! Do you know the reason now?

I understand the code now, after reading class VSSBlock and class SS2D together.

The code and Fig. 1(b) are actually the same (only DropPath is not drawn in the figure). I have added comments to Fig. 1(b) here; the image below shows the details. I hope my annotations make the code easier to understand 😄:

[annotated screenshot of Fig. 1(b) with code comments]
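To make the mapping concrete, here is a minimal, runnable sketch (not the repository's exact code): the class and attribute names loosely mirror VSSBlock/SS2D, the selective-scan core is replaced by an identity placeholder, and in_proj is a single Linear that is split into the main and gate branches, so the exact layer counts differ slightly from the figure. The point is that everything in Fig. 1(b) beyond the first LayerNorm (Linear projections, DW-Conv, activations, the second LayerNorm, and the element-wise gating product) lives inside SS2D, which VSSBlock exposes as self.self_attention.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SS2DSketch(nn.Module):
    """Hedged sketch of SS2D: holds the Linear/DW-Conv/activation/LayerNorm/gating
    pieces of Fig. 1(b); the real selective-scan core is replaced by an identity."""
    def __init__(self, d_model, d_inner=None):
        super().__init__()
        d_inner = d_inner or 2 * d_model
        self.in_proj = nn.Linear(d_model, d_inner * 2)   # Linear, split into x and gate z
        self.conv2d = nn.Conv2d(d_inner, d_inner, kernel_size=3,
                                padding=1, groups=d_inner)  # DW-Conv
        self.act = nn.SiLU()                              # activation after DW-Conv
        self.out_norm = nn.LayerNorm(d_inner)             # the 2nd LayerNorm of Fig. 1(b)
        self.out_proj = nn.Linear(d_inner, d_model)       # Linear back to d_model

    def selective_scan(self, x):
        # placeholder for the four-directional selective-scan core of SS2D
        return x

    def forward(self, x):                                 # x: (B, H, W, C)
        x, z = self.in_proj(x).chunk(2, dim=-1)           # two branches from one projection
        x = x.permute(0, 3, 1, 2)                         # (B, C, H, W) for the conv
        x = self.act(self.conv2d(x))                      # DW-Conv + activation
        y = self.selective_scan(x).permute(0, 2, 3, 1)    # back to (B, H, W, C)
        y = self.out_norm(y)
        y = y * F.silu(z)                                  # element-wise product with the gate
        return self.out_proj(y)

class VSSBlockSketch(nn.Module):
    """The thin wrapper the issue quotes: LayerNorm -> SS2D -> DropPath -> residual add."""
    def __init__(self, d_model):
        super().__init__()
        self.ln_1 = nn.LayerNorm(d_model)                 # the 1st LayerNorm
        self.self_attention = SS2DSketch(d_model)         # everything else is in here
        self.drop_path = nn.Identity()                    # DropPath, not drawn in the figure

    def forward(self, inp):
        # the single line quoted in this issue
        return inp + self.drop_path(self.self_attention(self.ln_1(inp)))

# quick shape check
x = torch.randn(1, 8, 8, 96)
print(VSSBlockSketch(96)(x).shape)  # torch.Size([1, 8, 8, 96])
```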

@YunhengWu-IB

Thank you very much for your very nice explanation!!!

@ZhouCong223

Your explanation and annotations are excellent. Thank you for your response.

JCruan519 pinned this issue Oct 22, 2024