[TRANSFORMATIONS] SDPAToPagedAttention transformation: support decompression case in the Qwen-7b-Chat pattern #42661
| Job | Run time |
|---|---|
| | 18s |
| | 54s |
| | 6m 0s |
| | 0s |
| | 0s |
| | 11m 57s |
| | 1m 49s |
| | 3m 49s |
| | 1m 51s |
| | 5m 19s |
| | 1m 23s |
| | 2m 29s |
| | 2m 38s |
| | 2m 36s |
| | 6m 2s |
| | 0s |
| | 3m 52s |
| | 1s |
| Total | 50m 58s |