Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential Early Computation Issue in compute_q_matmul_k Function #1

Open
qhy991 opened this issue Nov 29, 2023 · 6 comments
Open

Potential Early Computation Issue in compute_q_matmul_k Function #1

qhy991 opened this issue Nov 29, 2023 · 6 comments

Comments

@qhy991
Copy link

qhy991 commented Nov 29, 2023

Thank you for your excellent job on HLS.
I've noticed a potential issue in the compute_q_matmul_k function in attention.cpp file. It appears that during the initial stages of computation, many elements within q_blocks are involved in calculations before they have been fully read in. This could potentially lead to inaccuracies in the computed results. Could you please explain the rationale behind this approach?

@jimmy-adams
Copy link

Thank you for your excellent job on HLS. I've noticed a potential issue in the compute_q_matmul_k function in attention.cpp file. It appears that during the initial stages of computation, many elements within q_blocks are involved in calculations before they have been fully read in. This could potentially lead to inaccuracies in the computed results. Could you please explain the rationale behind this approach?

Hello,
In which version of Vitis and Vivado did you implement this design?

@ArkaneMoose
Copy link
Collaborator

Hi @jimmy-adams, we used Vitis HLS 2021.1 to synthesize the HLS design and used the resulting IP in Vivado 2022.1 for implementation. Let me know if you have any further questions.

@ArkaneMoose
Copy link
Collaborator

Thank you for your excellent job on HLS. I've noticed a potential issue in the compute_q_matmul_k function in attention.cpp file. It appears that during the initial stages of computation, many elements within q_blocks are involved in calculations before they have been fully read in. This could potentially lead to inaccuracies in the computed results. Could you please explain the rationale behind this approach?

Hi @qhy991, thanks for your question and apologies for the late reply. Essentially, the algorithm in attention.cpp implements the algorithm described in Section IV.A and depicted in Figure 5 of the paper. Each block streamed out in compute_q_matmul_k corresponds to the outputs of a row in Figure 5, which indeed may contain uninitialized or garbage data for the first and last iterations (represented by the blank spaces in the figure). However, since we know the exact indices at which these garbage data exist, they can and are ignored by the downstream tasks (e.g., here, where we are explicitly doing this check, and here, where we are implicitly discarding the garbage data from the start when v_patch < attn_patch_offset).

I hope this helps. Let me know if you have any further questions.

@jimmy-adams
Copy link

Hi @jimmy-adams, we used Vitis HLS 2021.1 to synthesize the HLS design and used the resulting IP in Vivado 2022.1 for implementation. Let me know if you have any further questions.
Hello !
Thank you for your kind reply. I follow your given Vitis configuration to try run tcl script. But the processing runs into the error of gmp.h missing. Is that related to my OS configuration?

INFO: [HLS 200-10] Analyzing design file 'src/ViT_compute.cpp' ...
ERROR: [HLS 207-812] 'gmp.h' file not found: src/../include/kernel.hpp:5:10
INFO: [HLS 200-111] Finished Command csynth_design CPU user time: 0.35 seconds. CPU system time: 0.09 seconds. Elapsed time: 0.21 seconds; current allocated memory: 141.787 MB.
command 'ap_source' returned error code
while executing
"source vitis_hls.tcl"
("uplevel" body line 1)
invoked from within
"uplevel #0 [list source $arg] "

@ArkaneMoose
Copy link
Collaborator

@jimmy-adams Please try commenting out or removing the following lines from include/kernel.hpp:

// https://support.xilinx.com/s/question/0D52E00006iHkfp/vivado-20153-hls-bug-gmph?language=en_US
#include <gmp.h>
#define __gmp_const const

These lines were a workaround for an issue described in this support.xilinx.com thread. They were necessary on our system but may not be necessary for yours. Removing them should not cause any issues.

Let me know if you have any further questions.

@qqxx0011
Copy link

qqxx0011 commented Nov 7, 2024

I learned from the paper that you use pynq to control the IP core. How do you implement it? Is there any relevant code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants