Add GLM4 model #33729

Closed · wants to merge 39 commits

Conversation

Cyrilvallez (Member)

What does this PR do?

Adds the GLM model.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker (Collaborator) left a comment:

Super nice! You are still missing the test files, integration tests, etc. (and the README, etc.)

src/transformers/models/glm/configuration_glm.py
initializer_range=0.02,
rms_norm_eps=0.00000015625,
use_rms_norm=True,
apply_residual_connection_post_layernorm=False,
ArthurZucker (Collaborator):

Is this false for all released models? If so, delete it!
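
For reference, the trimmed-down config could look roughly like this. This is only a sketch under the assumption that `apply_residual_connection_post_layernorm` is always False for released checkpoints and can be dropped; the surrounding arguments are illustrative, not the final `GlmConfig` API:

```python
from transformers import PretrainedConfig


class GlmConfig(PretrainedConfig):
    model_type = "glm"

    def __init__(
        self,
        hidden_size=4096,
        initializer_range=0.02,
        rms_norm_eps=1.5625e-07,
        **kwargs,
    ):
        # No `apply_residual_connection_post_layernorm` / `use_rms_norm` flags:
        # the modeling code keeps a single code path instead.
        self.hidden_size = hidden_size
        self.initializer_range = initializer_range
        self.rms_norm_eps = rms_norm_eps
        super().__init__(**kwargs)
```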

self.mlp = GlmMLP(config)
self.input_layernorm = (
GlmRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
if config.use_rms_norm
ArthurZucker (Collaborator):

Check which branch the released config actually uses; in general we also avoid config-dependent code paths like this.
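
A minimal sketch of the single-code-path version, assuming every released GLM checkpoint uses RMSNorm so the `use_rms_norm` switch can go away (`GlmMLP` / `GlmRMSNorm` are the PR's own classes, assumed to be in scope):

```python
# Inside GlmDecoderLayer.__init__: hard-code RMSNorm, no `if config.use_rms_norm` branch.
self.mlp = GlmMLP(config)
self.input_layernorm = GlmRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
self.post_attention_layernorm = GlmRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
```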

"""

hidden_states_after_norm = self.input_layernorm(hidden_states)
residual = hidden_states_after_norm if self.apply_residual_connection_post_layernorm else hidden_states
ArthurZucker (Collaborator):

Same here! Check whether any released models actually need both code paths.
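
If no released checkpoint sets `apply_residual_connection_post_layernorm`, the standard pre-norm pattern is enough and the ternary disappears. A hypothetical sketch of that sub-block:

```python
# Residual is always taken from the un-normalized hidden states (pre-norm).
residual = hidden_states
hidden_states = self.input_layernorm(hidden_states)
# ... attention (or MLP) sub-block runs on the normalized states ...
hidden_states = residual + hidden_states
```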

self.layers = nn.ModuleList(
[GlmDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
)
if config.post_layer_norm:
ArthurZucker (Collaborator):

Same here: another config-gated code path to remove if no released checkpoint needs the alternative.
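
A sketch of the unconditional version, assuming every released GLM checkpoint applies the final norm (so `post_layer_norm` is always True and can be dropped); the attribute name `self.norm` is illustrative:

```python
# Inside GlmModel.__init__: final norm is always created, no `if config.post_layer_norm`.
self.layers = nn.ModuleList(
    [GlmDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
)
self.norm = GlmRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
```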


ArthurZucker and others added 26 commits September 30, 2024 16:03
* HQQ model serialization attempt

* fix hqq dispatch and unexpected keys

* style

* remove check_old_param

* revert to check HQQLinear in quantizer_hqq.py

* revert to check HQQLinear in quantizer_hqq.py

* update HqqConfig default params

* make ci happy

* make ci happy

* revert to HQQLinear check in quantizer_hqq.py

* check hqq_min version 0.2.0

* set axis=1 as default in quantization_config.py

* validate_env with hqq>=0.2.0 version message

* deprecated hqq kwargs message

* make ci happy

* remove run_expected_keys_check hack + bump to 0.2.1 min hqq version

* fix unexpected_keys hqq update

* add pre_quantized check

* add update_expected_keys to base quantizerr

* ci base.py fix?

* ci base.py fix?

* fix "quantization typo" src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <[email protected]>

* fix post merge

---------

Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Arthur <[email protected]>
@ArthurZucker (Collaborator) left a comment:

Something went wrong with the rebasing / merging as you have unrelated changes!

}


class GlmDecoderLayer(nn.Module):
ArthurZucker (Collaborator):

This one looks fairly classic; I would have assumed you don't need to redefine the forward (unless the issue is with the names of the layers?)
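
If the layer really is the classic pre-norm pattern, a modular definition could simply inherit it and only swap the attribute types. This is a hypothetical sketch assuming the Llama decoder layer is a suitable base (the inheritance target and file name are assumptions, not necessarily what the PR ends up doing):

```python
# modular_glm.py -- sketch only. GlmMLP / GlmRMSNorm are the PR's own classes,
# assumed defined elsewhere in this file; no forward override is needed because
# the structure matches Llama's pre-norm decoder layer.
from transformers.models.llama.modeling_llama import LlamaDecoderLayer


class GlmDecoderLayer(LlamaDecoderLayer):
    def __init__(self, config, layer_idx: int):
        super().__init__(config, layer_idx)
        self.mlp = GlmMLP(config)
        self.input_layernorm = GlmRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
        self.post_attention_layernorm = GlmRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
```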

@Cyrilvallez (Member, Author):

> Something went wrong with the rebasing / merging as you have unrelated changes!

Yes, I'm currently looking at it.

@ArthurZucker (Collaborator):

Superseded by #33823.
