
"Flash attention" error in a GPU enabled laptop with CUDA 12.1, Windows 11, Pytorch 2.4.0 #219

Closed
sleeplessTLV opened this issue Aug 14, 2024 · 4 comments

Comments

@sleeplessTLV

Hi,
I installed and downloaded everything; Python 3.12 and PyTorch 2.4.0+cu121 (CUDA 12.1) are installed.
While running the basic Jupyter notebook, at the prediction step I get:
"c:\Python312\segment-anything-2\sam2\modeling\backbones\hieradet.py:68: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.)
x = F.scaled_dot_product_attention("
Earlier, at the
"from sam2.sam2_image_predictor import SAM2ImagePredictor"
step, I only got a warning:
"c:\Python312\segment-anything-2\sam2\modeling\sam\transformer.py:23: UserWarning: Flash Attention is disabled as it requires a GPU with Ampere (8.0) CUDA capability.
OLD_GPU, USE_FLASH_ATTN, MATH_KERNEL_ON = get_sdpa_settings()"
So can I bypass this with a setting that tells it NOT to use Flash Attention?
thanks
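
A minimal sketch of one way to keep PyTorch from ever attempting the Flash Attention backend of scaled_dot_product_attention, assuming PyTorch 2.3 or newer on a CUDA device (the tensor shapes below are purely illustrative and not taken from SAM 2):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Purely illustrative shapes: (batch, heads, sequence length, head dim).
q = k = v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

# Limit scaled_dot_product_attention to the memory-efficient and math
# kernels, so the Flash Attention backend is never attempted.
with sdpa_kernel([SDPBackend.EFFICIENT_ATTENTION, SDPBackend.MATH]):
    out = F.scaled_dot_product_attention(q, k, v)

# A global alternative is to turn the flash backend off entirely:
# torch.backends.cuda.enable_flash_sdp(False)
```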

@sleeplessTLV
Author

my bad, I don't know how to delete this issue, but apparently my mistake

@Dashenboy

how did you fix this issue?

@ronghanghu
Contributor

@Dashenboy This is mainly a warning indicating that the GPU does not support Flash Attention, so PyTorch will fall back to other scaled dot-product attention kernels. It doesn't need fixing, and you can still use SAM 2 in this case.

More details: Flash Attention is generally faster but is only fully supported on GPUs with CUDA compute capability >= 8.0. If your GPU has a lower compute capability (which can be checked at https://developer.nvidia.com/cuda-gpus), this warning will be printed to indicate that Flash Attention is not available for you (but you can safely ignore it).
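
A minimal sketch of checking the compute capability directly from Python, using standard PyTorch calls (device index 0 is assumed):

```python
import torch

# Flash Attention in PyTorch's SDPA path needs compute capability >= 8.0 (Ampere or newer).
major, minor = torch.cuda.get_device_capability(0)
print(f"GPU: {torch.cuda.get_device_name(0)}, compute capability {major}.{minor}")
print("Flash Attention eligible:", (major, minor) >= (8, 0))
```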

@skylning

My GPU is an RTX 4090 with CUDA compute capability 8.9, and I also get this error.

4 participants