R2.13
- Added support for AMD GPUs via HIP (see the list of supported GPUs).
- Output frames may be broken. (#23)
- The code is designed for discrete RDNA GPUs (with wavefront size 32 and a separate address space), and may not work on GCN and CDNA GPUs.
- Because AMD's current HIP implementation does not provide a backward-compatible virtual ISA like `PTX` in CUDA, the `bm3dhip` binary will not run on future AMD GPUs or on GPUs that are not among the current compilation targets. This could be modified here.
- Only `bm3dhip` is available at present. `bm3dhip_rtc`, the hiprtc-based counterpart to `bm3dcuda_rtc`, has to wait at least until ROCm 6.1.0 because support for some required features is missing.
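
The hardware constraints above can be read as a small pre-flight check: the GPU has to be an RDNA part, and its architecture has to be one the binary was compiled for. The sketch below is illustrative only. `COMPILED_TARGETS` is a hypothetical target list (the real one lives in the build configuration), `base_arch`, `is_rdna` and `can_run_bm3dhip` are made-up helper names, and the family mapping assumes the usual LLVM naming in which `gfx10xx` and `gfx11xx` denote RDNA GPUs while `gfx9xx` covers GCN and CDNA GPUs.

```python
import re

# Hypothetical list of architectures the binary was compiled for.
# The actual list is defined by the project's build configuration.
COMPILED_TARGETS = {"gfx1010", "gfx1030", "gfx1100"}


def base_arch(arch_name: str) -> str:
    """Drop feature suffixes such as ':xnack-' from an architecture string."""
    return arch_name.split(":", 1)[0].strip().lower()


def is_rdna(arch_name: str) -> bool:
    """Assumption: gfx10xx and gfx11xx architecture names denote RDNA GPUs."""
    return re.fullmatch(r"gfx1[01][0-9a-f]{2}", base_arch(arch_name)) is not None


def can_run_bm3dhip(arch_name: str, targets=COMPILED_TARGETS) -> bool:
    """True only if the GPU is RDNA and was an explicit compilation target.

    There is no virtual ISA to fall back on, so an architecture that is
    missing from the target list is treated as unsupported.
    """
    arch = base_arch(arch_name)
    return is_rdna(arch) and arch in targets


if __name__ == "__main__":
    for name in ("gfx1030", "gfx1011", "gfx90a:sramecc+:xnack-"):
        print(name, can_run_bm3dhip(name))
```

The architecture string would typically come from a tool such as `rocminfo` or from the HIP device properties; how it is obtained is left out of this sketch.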
Benchmark
- NVIDIA T4 (`bm3dcuda`)
  - AWS g4dn.2xlarge, Linux 6.2.0-1014-aws, driver version: 545.23.06, CUDA Toolkit 12.3
- AMD Radeon™ Pro V520 (`bm3dhip`)
  - AWS g4ad.2xlarge, Linux 6.2.0-1014-aws, driver version: 6.2.4, ROCm 5.7.1
- Hygon C86 7390 (`bm3dcpu`)
  - 32C @ 2.70GHz, L1i: 32 x 64 KB, L1d: 32 x 32 KB, L2: 32 x 512 KB, L3: 8 x 8 MB
  - Windows Server 2022
- Intel Sapphire Rapids (`bm3dcpu`)
  - 32C @ 3.4GHz
  - Windows Server 2022
- AMD EPYC Zen4 (`bm3dcpu`)
  - 32C @ 3.4GHz
  - Windows Server 2022
- VapourSynth `R65-RC1-6-g3dcc6a35`

input: 1920x1080

- `chroma=False`: `GrayS`
- `chroma=True`: `YUV444PS`

data format: fps
radius | chroma | final | NVIDIA T4 | AMD Radeon™ Pro V520 | Hygon 7390 | Intel Sapphire Rapids | AMD EPYC Zen4 |
---|---|---|---|---|---|---|---|
0 | False | False | 342.23 | 152.20 | 207.90 | 598.43 | 674.37 |
0 | False | True | 262.98 | 134.08 | 180.39 | 514.53 | 577.75 |
0 | True | False | 121.35 | 66.44 | 122.40 | 311.64 | 375.23 |
0 | True | True | 96.79 | 53.84 | 100.85 | 134.40 | 142.46 |
1 | False | False | 60.80 | 63.70 | 110.63 | 162.40 | 180.68 |
1 | False | True | 52.59 | 53.36 | 53.55 | 136.40 | 152.13 |
1 | True | False | 21.35 | 24.41 | 25.60 | 58.01 | 70.08 |
1 | True | True | 18.15 | 19.91 | 21.50 | 49.50 | 59.87 |
2 | False | False | 37.22 | 41.24 | 39.32 | 103.15 | 111.87 |
2 | False | True | 31.68 | 33.25 | 34.01 | 89.75 | 99.14 |
2 | True | False | 12.64 | 14.70 | 17.05 | 37.54 | 45.69 |
2 | True | True | 10.88 | 12.55 | 14.35 | 33.01 | 39.35 |
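
To compare devices, it can help to normalize the fps figures against one column. The following is a minimal sketch rather than part of the benchmark itself: it uses the first table row copied verbatim, and `relative_throughput` is a hypothetical helper name introduced here for illustration.

```python
# First benchmark row (radius=0, chroma=False, final=False), values in fps.
FPS_ROW = {
    "NVIDIA T4": 342.23,
    "AMD Radeon Pro V520": 152.20,
    "Hygon 7390": 207.90,
    "Intel Sapphire Rapids": 598.43,
    "AMD EPYC Zen4": 674.37,
}


def relative_throughput(row, baseline):
    """Return each device's fps divided by the baseline device's fps."""
    base = row[baseline]
    return {device: fps / base for device, fps in row.items()}


if __name__ == "__main__":
    # Print every device's throughput relative to the NVIDIA T4 column.
    for device, ratio in relative_throughput(FPS_ROW, "NVIDIA T4").items():
        print(f"{device}: {ratio:.2f}x")
```

Other rows can be substituted in the same way. Note that the GPU and CPU columns come from different machines and operating systems, so the ratios describe these specific setups rather than the hardware in general.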