
ValueError: Architecture deepseek2 not supported #34335

Closed
2 of 4 tasks
czq99972 opened this issue Oct 23, 2024 · 5 comments
Comments

@czq99972
System Info

The current Transformers framework doesn't support GGUF quantized model files for the deepseek2 architecture (DeepSeek-V2). Can you please advise when this support might be added? @SunMarc @MekkCyber

Who can help?

@SunMarc @MekkCyber

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  File "/home/work/miniforge3/envs/vllm/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1006, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/work/miniforge3/envs/vllm/lib/python3.11/site-packages/transformers/configuration_utils.py", line 570, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/work/miniforge3/envs/vllm/lib/python3.11/site-packages/transformers/configuration_utils.py", line 661, in _get_config_dict
    config_dict = load_gguf_checkpoint(resolved_config_file, return_tensors=False)["config"]
  File "/home/work/miniforge3/envs/vllm/lib/python3.11/site-packages/transformers/modeling_gguf_pytorch_utils.py", line 103, in load_gguf_checkpoint
    raise ValueError(f"Architecture {architecture} not supported")
ValueError: Architecture deepseek2 not supported
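For context, the error is raised by a simple allow-list check in `load_gguf_checkpoint`: the architecture string read from the GGUF file's metadata must be one that transformers knows how to convert. A minimal sketch of that guard, assuming a plain set lookup (the contents of the supported set below are illustrative, not the library's real list):

```python
# Illustrative sketch of the architecture guard in
# transformers' load_gguf_checkpoint; the set contents are
# placeholders, not the actual supported-architecture list.
GGUF_SUPPORTED_ARCHITECTURES = {"llama", "mistral", "qwen2"}

def check_architecture(architecture: str) -> str:
    # Reject any GGUF architecture transformers cannot convert,
    # which is what produces the error in the traceback above.
    if architecture not in GGUF_SUPPORTED_ARCHITECTURES:
        raise ValueError(f"Architecture {architecture} not supported")
    return architecture
```

Until "deepseek2" is added to that list (and the corresponding tensor-name mapping is implemented), any attempt to load a DeepSeek-V2 GGUF file fails at config-loading time, before any weights are read.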

Expected behavior

1

@czq99972 czq99972 added the bug label Oct 23, 2024
Contributor

VladOS95-cyber commented Oct 23, 2024

Hey @czq99972, @SunMarc, @MekkCyber! I can take this as soon as I finish the current implementation for the Mamba architecture, which shouldn't take long. I think I will be able to start working on deepseek2 this week. Link to main issue thread: #33260

Member

SunMarc commented Oct 23, 2024

Hey! DeepSeek-V2 GGUF files can be supported once the architecture is integrated in transformers: #31976!

Contributor

wavy-jung commented Nov 13, 2024

Any update on deepseek v2 support? @VladOS95-cyber

@VladOS95-cyber
Contributor

Hey @wavy-jung, DeepSeek-V2 architecture is not supported yet; this PR #31976 is still in progress.

github-actions bot commented Dec 7, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
