- Added support for Intel GPUs via SYCL.
- SYCL at present does not support runtime compilation required for `bm3dcuda_rtc`'s counterpart.
- Pre-compiled binary for pre-gen12lp devices will be dropped starting from the next release.
- TODO: runtime sub-group size selection. (Xe2 is 16 wide)
preliminary benchmark
- Intel Arc A770 Graphics, ACM-G10, Xe-HPG, driver 1.3.26690, Linux kernel 6.2.8, PCIe 4.0 x16, sub-group size 8
- Intel Data Center GPU Max 1100, Xe-HPC, driver 1.3.26516, Linux kernel 5.15.0, PCIe 5.0 x16, large GRF mode, sub-group size 16
input: 1920x1080, `chroma=False`: GrayS, `chroma=True`: YUV444PS
backend: Level Zero
data format: fps (higher is better)
radius | chroma | final | Arc A770 | Max 1100 |
---|---|---|---|---|
0 | False | False | 252.46 | 323.51 |
0 | False | True | 205.89 | 264.46 |
0 | True | False | 103.46 | 103.51 |
0 | True | True | 78.51 | 80.76 |
1 | False | False | 83.37 | 46.41 |
1 | False | True | 67.31 | 42.15 |
1 | True | False | 27.15 | 15.75 |
1 | True | True | 22.09 | 13.90 |
2 | False | False | 51.40 | 29.11 |
2 | False | True | 41.54 | 24.51 |
2 | True | False | 16.35 | 8.17 |
2 | True | True | 13.40 | 7.40 |
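
To make the two devices easier to compare, the fps figures can be converted to per-frame time and to a relative speed. The following is only a small sketch over the numbers in the table above; the names `RESULTS`, `frame_time_ms` and `relative_speed` are made up for this illustration and are not part of the plugin.

```python
# Throughput in fps, copied from the benchmark table above.
# Key: (radius, chroma, final) -> (Arc A770 fps, Max 1100 fps)
RESULTS = {
    (0, False, False): (252.46, 323.51),
    (0, False, True): (205.89, 264.46),
    (0, True, False): (103.46, 103.51),
    (0, True, True): (78.51, 80.76),
    (1, False, False): (83.37, 46.41),
    (1, False, True): (67.31, 42.15),
    (1, True, False): (27.15, 15.75),
    (1, True, True): (22.09, 13.90),
    (2, False, False): (51.40, 29.11),
    (2, False, True): (41.54, 24.51),
    (2, True, False): (16.35, 8.17),
    (2, True, True): (13.40, 7.40),
}


def frame_time_ms(fps: float) -> float:
    """Convert throughput in frames per second to milliseconds per frame."""
    return 1000.0 / fps


def relative_speed(a770_fps: float, max1100_fps: float) -> float:
    """Max 1100 throughput divided by Arc A770 throughput.

    A value above 1.0 means the Max 1100 is faster for that configuration.
    """
    return max1100_fps / a770_fps


# Print one line per benchmark configuration.
print("radius chroma final | A770 ms | Max 1100 ms | Max 1100 / A770")
for (radius, chroma, final), (a770, max1100) in RESULTS.items():
    print(
        f"{radius:>6} {chroma!s:>6} {final!s:>5} | "
        f"{frame_time_ms(a770):7.2f} | {frame_time_ms(max1100):11.2f} | "
        f"{relative_speed(a770, max1100):15.2f}"
    )
```

For these numbers the ratio is above 1.0 only in the radius 0 rows; with radius 1 or 2 the Arc A770 is ahead in every configuration.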