Potential Early Computation Issue in compute_q_matmul_k Function #1
Comments
Hello,
Hi @jimmy-adams, we used Vitis HLS 2021.1 to synthesize the HLS design and used the resulting IP in Vivado 2022.1 for implementation. Let me know if you have any further questions.
Hi @qhy991, thanks for your question and apologies for the late reply. Essentially, the algorithm in attention.cpp implements the algorithm described in Section IV.A and depicted in Figure 5 of the paper. Each block streamed out in I hope this helps. Let me know if you have any further questions.
INFO: [HLS 200-10] Analyzing design file 'src/ViT_compute.cpp' ...
@jimmy-adams Please try commenting out or removing the following lines from include/kernel.hpp (lines 4 to 6 in 9d6dd16).
These lines were a workaround for an issue described in this support.xilinx.com thread. They were necessary on our system but may not be necessary for yours. Removing them should not cause any issues. Let me know if you have any further questions.
I learned from the paper that you use PYNQ to control the IP core. How did you implement this? Is there any relevant code?
Thank you for your excellent work on HLS.
I've noticed a potential issue in the compute_q_matmul_k function in the attention.cpp file. During the initial stages of computation, many elements of q_blocks appear to be used in calculations before they have been fully read in, which could lead to inaccurate results. Could you please explain the rationale behind this approach?
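To make the concern concrete, here is a minimal C++ sketch of the general idea behind overlapping reads with computation. It is not the code from attention.cpp, and every name, size, and schedule in it (`N`, `D`, `make_matrix`, `matmul_streaming`, the one-row-per-step arrival order) is a hypothetical assumption for illustration. It only shows that a streaming Q·Kᵀ can start computing early and still match a read-everything-first reference, as long as each element is consumed only after it has arrived.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical sizes for illustration only; not taken from the repository.
constexpr std::size_t N = 4;  // number of rows (tokens)
constexpr std::size_t D = 3;  // feature dimension

using Row = std::array<int, D>;
using Matrix = std::array<Row, N>;
using Scores = std::array<std::array<int, N>, N>;

// Deterministic test data: element (i, d) = seed + i * D + d.
Matrix make_matrix(int seed) {
    Matrix m{};
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t d = 0; d < D; ++d)
            m[i][d] = seed + static_cast<int>(i * D + d);
    return m;
}

int dot(const Row& a, const Row& b) {
    int acc = 0;
    for (std::size_t d = 0; d < D; ++d) acc += a[d] * b[d];
    return acc;
}

// Reference: compute Q * K^T only after both inputs are fully available.
Scores matmul_reference(const Matrix& q, const Matrix& k) {
    Scores s{};
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            s[i][j] = dot(q[i], k[j]);
    return s;
}

// Streaming variant (assumed schedule): one row of Q and one row of K
// arrive per step, and each step computes only the scores whose operands
// have already been buffered.
Scores matmul_streaming(const Matrix& q_in, const Matrix& k_in) {
    Matrix q_buf{}, k_buf{};
    std::array<bool, N> arrived{};
    Scores s{};
    for (std::size_t t = 0; t < N; ++t) {
        q_buf[t] = q_in[t];  // "read" row t of Q
        k_buf[t] = k_in[t];  // "read" row t of K
        arrived[t] = true;
        for (std::size_t i = 0; i <= t; ++i) {
            assert(arrived[i]);  // never touch a row that has not been read
            s[t][i] = dot(q_buf[t], k_buf[i]);  // new entries in row t
            s[i][t] = dot(q_buf[i], k_buf[t]);  // new entries in column t
        }
    }
    return s;
}

// True when the early-start schedule reproduces the reference result.
bool streaming_matches_reference() {
    const Matrix q = make_matrix(1);
    const Matrix k = make_matrix(2);
    return matmul_streaming(q, k) == matmul_reference(q, k);
}
```

Under these assumptions, the score at position (i, j) is produced at step max(i, j), the first step at which both operands are buffered. Whether compute_q_matmul_k follows a comparable ordering is the question being asked here.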