
ValueError: Architecture deepseek2 not supported #34335

Closed
2 of 4 tasks
czq99972 opened this issue Oct 23, 2024 · 5 comments
Comments

@czq99972
System Info

The current Transformers framework doesn't support GGUF quantized model files for the deepseek2 architecture (DeepSeek-V2). Can you please advise when this support might be added? @SunMarc @MekkCyber

Who can help?

@SunMarc @MekkCyber

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  File "/home/work/miniforge3/envs/vllm/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1006, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/work/miniforge3/envs/vllm/lib/python3.11/site-packages/transformers/configuration_utils.py", line 570, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/work/miniforge3/envs/vllm/lib/python3.11/site-packages/transformers/configuration_utils.py", line 661, in _get_config_dict
    config_dict = load_gguf_checkpoint(resolved_config_file, return_tensors=False)["config"]
  File "/home/work/miniforge3/envs/vllm/lib/python3.11/site-packages/transformers/modeling_gguf_pytorch_utils.py", line 103, in load_gguf_checkpoint
    raise ValueError(f"Architecture {architecture} not supported")
ValueError: Architecture deepseek2 not supported
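For context, the error is raised by a simple allow-list check in `load_gguf_checkpoint`: the architecture string read from the GGUF file's metadata must be one that transformers knows how to convert. A minimal sketch of that guard, assuming a plain set lookup (the contents of the supported set below are illustrative, not the library's real list):

```python
# Illustrative sketch of the architecture guard in
# transformers' load_gguf_checkpoint; the set contents are
# placeholders, not the actual supported-architecture list.
GGUF_SUPPORTED_ARCHITECTURES = {"llama", "mistral", "qwen2"}

def check_architecture(architecture: str) -> str:
    # Reject any GGUF architecture transformers cannot convert,
    # which is what produces the error in the traceback above.
    if architecture not in GGUF_SUPPORTED_ARCHITECTURES:
        raise ValueError(f"Architecture {architecture} not supported")
    return architecture
```

Until "deepseek2" is added to that list (and the corresponding tensor-name mapping is implemented), any attempt to load a DeepSeek-V2 GGUF file fails at config-loading time, before any weights are read.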

Expected behavior

1

@czq99972 czq99972 added the bug label Oct 23, 2024
Contributor

VladOS95-cyber commented Oct 23, 2024

Hey @czq99972, @SunMarc, @MekkCyber! I can take this as soon as I finish the current implementation for the Mamba architecture, which shouldn't take long. I think I will be able to start working on deepseek2 this week. Link to main issue thread: #33260

Member

SunMarc commented Oct 23, 2024

Hey! DeepSeek-V2 GGUF files can be supported once the architecture is integrated in transformers: #31976!

Contributor

wavy-jung commented Nov 13, 2024

Any update on deepseek v2 support? @VladOS95-cyber

@VladOS95-cyber
Contributor

Hey @wavy-jung, DeepSeek-V2 architecture is not supported yet; this PR #31976 is still in progress.

github-actions bot commented Dec 7, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
