From 20bd63c0e66627aeb0eca253e5e3123bf0c4a25d Mon Sep 17 00:00:00 2001
From: Cyrus Leung
Date: Tue, 7 Jan 2025 21:50:58 +0800
Subject: [PATCH] [Doc] Add note to `gte-Qwen2` models (#11808)

Signed-off-by: DarkLight1337
---
 docs/source/models/supported_models.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/source/models/supported_models.md b/docs/source/models/supported_models.md
index 8c5f6836d6aa8..3ba34c77205e5 100644
--- a/docs/source/models/supported_models.md
+++ b/docs/source/models/supported_models.md
@@ -430,6 +430,9 @@ You can set `--hf-overrides '{"is_causal": false}'` to change the attention mask
 
 On the other hand, its 1.5B variant (`Alibaba-NLP/gte-Qwen2-1.5B-instruct`) uses causal attention
 despite being described otherwise on its model card.
+
+Regardless of the variant, you need to enable `--trust-remote-code` for the correct tokenizer to be
+loaded. See [relevant issue on HF Transformers](https://github.com/huggingface/transformers/issues/34882).
 ```
 
 If your model is not in the above list, we will try to automatically convert the model using
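
Below the patch: a minimal sketch (not part of the commit) of how the two flags described in the added note map onto vLLM's offline `LLM` API. The model name and the `is_causal` override come straight from the doc text; the keyword arguments `task`, `trust_remote_code`, and `hf_overrides`, and the `encode()` call, are assumed to match the vLLM release current at the time of this commit.

```python
from vllm import LLM

# gte-Qwen2-7B-instruct uses bi-directional attention, so the attention mask
# is overridden via hf_overrides; trust_remote_code is what makes HF load the
# custom tokenizer that the added note says both variants require.
llm = LLM(
    model="Alibaba-NLP/gte-Qwen2-7B-instruct",
    task="embed",
    trust_remote_code=True,
    hf_overrides={"is_causal": False},
)

# encode() returns one output per prompt; each carries the pooled embedding.
(output,) = llm.encode(["Hello, my name is"])
print(len(output.outputs.embedding))  # embedding dimensionality
```

The equivalent server-side invocation would pass the same flags on the command line, e.g. `vllm serve Alibaba-NLP/gte-Qwen2-7B-instruct --task embed --trust-remote-code --hf-overrides '{"is_causal": false}'`.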