-
Use
-
@swordow positional embedding shift?
-
I found the problem. "Hello test" is split into two tokens, which is consistent with the tokenizer code in https://github.com/ggml-org/llama.cpp/blob/master/src/llama-vocab.cpp.
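You can check the split directly: llama.cpp also ships a llama-tokenize tool that prints each token id and its text piece. A minimal sketch, assuming the same model file as in the tests below and a build that includes the tool:
llama-tokenize.exe -m gte-qwen2-7b-instruct-f16.gguf -p "Hello test" --no-bos
The output shows exactly where "Hello test" is cut in two.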
-
I tried the following 3 test cases, but the results are confusing.
Test 1: create embeddings for "Hello" (don't add special/BOS tokens):
llama-embedding.exe -m gte-qwen2-7b-instruct-f16.gguf -e -p "Hello" --verbose-prompt -ngl 0 --batch-size 4096
and got the output:
embedding 0: [-0.010509 -0.007925 -0.006991 ... -0.010548 -0.014585 0.018345 ]
Test 2: create embeddings for "test" (don't add special/BOS tokens):
llama-embedding.exe -m gte-qwen2-7b-instruct-f16.gguf -e -p "test" --verbose-prompt -ngl 0 --batch-size 4096
and got the output:
embedding 0: [-0.000707 0.007401 0.001886 ... -0.014110 -0.003793 0.016024 ]
Test 3: create embeddings for "Hello test" (don't add special/BOS tokens):
llama-embedding.exe -m gte-qwen2-7b-instruct-f16.gguf -e -p "Hello test" --verbose-prompt -ngl 0 --batch-size 4096
and got two output embeddings, one per token (values omitted here).
Test 3's embedding 0 is consistent with Test 1's embedding 0, but Test 3's embedding 1 is not consistent with Test 2's embedding 0.
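One likely explanation, as an assumption rather than something confirmed in this thread: in "Hello test" the second token is " test" with a leading space, which in a BPE vocabulary is a different token than bare "test", and its hidden state also depends on its position and on the preceding "Hello". You can compare the token ids with the tokenizer tool mentioned above (a sketch, assuming your build's llama-tokenize supports the --ids and --no-bos options):
llama-tokenize.exe -m gte-qwen2-7b-instruct-f16.gguf -p "test" --no-bos --ids
llama-tokenize.exe -m gte-qwen2-7b-instruct-f16.gguf -p "Hello test" --no-bos --ids
If the second id of "Hello test" differs from the single id of "test", the two runs were never encoding the same token, so their per-token embeddings have no reason to match.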