[TRANSFORMATIONS] SDPAToPagedAttention transformation: support decompression case in the Qwen-7b-Chat pattern #51881
| Job | Run time |
|---|---|
| | 21s |
| | 1m 13s |
| | 19m 53s |
| | 3m 16s |
| | 21m 58s |
| | 1s |
| Total | 46m 42s |