
[Bug]: module 'torch_npu' has no attribute '_npu_flash_attention' #211

Closed
pjgao opened this issue Mar 1, 2025 · 4 comments
Labels
documentation Improvements or additions to documentation

Comments

pjgao commented Mar 1, 2025

Your current environment

The output of `python collect_env.py`
[pip3] torch==2.5.1
[pip3] torch-npu==2.5.1.dev20250218

🐛 Describe the bug

Running the following command fails:

vllm serve ./Qwen2-VL-7B-Instruct --trust-remote-code

Log:

 File "/home/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/model_executor/models/qwen2.py", line 243, in forward
    hidden_states = self.self_attn(
  File "/home/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/model_executor/models/qwen2.py", line 177, in forward
    attn_output = self.attn(q, k, v)
  File "/home/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/attention/layer.py", line 220, in forward
    return self.impl.forward(self, query, key, value,
  File "/home/data//rl/vllm-ascend/vllm_ascend/attention.py", line 597, in forward
    torch_npu._npu_flash_attention(
AttributeError: module 'torch_npu' has no attribute '_npu_flash_attention'

Root cause:
After PR #187 was merged, the torch_npu version requirement was bumped from 2.5.1.dev20250218 to 2.5.1.dev20250226, but the README and the installation docs were not updated accordingly. Installing 2.5.1.dev20250218 as instructed by the README and https://vllm-ascend.readthedocs.io/en/latest/installation.html therefore triggers this error.
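
For anyone hitting the same AttributeError, a quick sanity check before starting the server is sketched below. This is a minimal sketch, not part of vllm-ascend itself; it only relies on the torch-npu distribution name shown in the `collect_env.py` output above and the dev20250226 requirement described in the root cause.

```python
# Minimal sketch: confirm the installed torch-npu build ships the fused
# flash-attention kernel that vllm-ascend's attention backend calls.
import importlib.metadata

import torch_npu  # importing registers the NPU backend with PyTorch

version = importlib.metadata.version("torch-npu")
print(f"torch-npu version: {version}")

if hasattr(torch_npu, "_npu_flash_attention"):
    print("OK: torch_npu._npu_flash_attention is available")
else:
    # Per this issue, builds older than 2.5.1.dev20250226 (e.g. dev20250218)
    # do not expose this kernel, so vllm-ascend main fails at runtime.
    print("Missing kernel: upgrade torch-npu to 2.5.1.dev20250226 or newer")
```

If the check reports the kernel as missing, upgrading torch-npu to the version pinned after #187 (2.5.1.dev20250226) and restarting `vllm serve` should resolve the AttributeError.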

pjgao added the bug (Something isn't working) label Mar 1, 2025
Yikun (Collaborator) commented Mar 1, 2025

@pjgao Many thanks for the report.

Currently, the latest doc is for v0.7.1rc1, which requires torch_npu dev20250218.

The main doc tracks the main branch and has already been updated to dev20250226.

So I think installation.md is correct?

But yes, the README should be updated.

pjgao (Author) commented Mar 1, 2025

Thanks for the reply. I followed the docs link in the README, which defaults to the latest branch.
[screenshot]
After manually switching to the main branch, it does show the updated installation.md from main.

Yikun (Collaborator) commented Mar 1, 2025

@pjgao Sorry for the confusion. Currently, the doc link points to the latest release version.

  1. Do you want to submit a PR to fix the PTA version to dev20250226 in both the English and Chinese README?
    https://github.com/vllm-project/vllm-ascend?tab=readme-ov-file#prerequisites

    ==> Addressed in Recover vllm-ascend dev image #209

  2. We can add a stable version of the doc after the first final release. Does that make sense?
    [Doc]: Add latest / stable version after first final release #214

Yikun added the documentation label and removed the bug label Mar 1, 2025
Yikun (Collaborator) commented Mar 3, 2025

TLDR:

I will close this issue today.
