Commit
Summary: switch from xxd to bin2c when generating the .ptx.c files so that the PTX data can be null-terminated.

With newer drivers or CUDA versions, vmaf now segfaults when trying to do anything on the GPU. The coredumps indicate that the crash happens somewhere inside the cuModuleLoadData calls in init_fex_cuda.

Documentation for cuModuleLoadData states that its `image` argument can be "obtained by mapping a cubin or PTX or fatbin file, [or] passing a cubin or PTX or fatbin file as a NULL-terminated text string...". It looks like VMAF is trying to do the latter, encoding PTX text files as an ASCII string using xxd, but there's no null terminator in the data because nothing asked for one.

I'm a CUDA noob and don't know how this ever worked on older driver versions, but I tried editing the .ptx.c files by hand to add 0x00 bytes at the end and it worked!

Switch from xxd to bin2c (which is distributed with the cuda-nvcc package). It supports a `--padd` option that adds a null byte to the end of the PTX data, which eliminates the segfaults. The arrays were renamed slightly to drop the src_ prefix, since bin2c doesn't do any automatic naming of the output array.
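To illustrate the difference described above, here is a minimal C sketch. The two arrays are hypothetical stand-ins for generated output, not the real contents of any .ptx.c file, and `has_null_terminator` is a helper invented for this illustration. The first array mimics an embedded PTX file with no terminator; the second mimics the same bytes with a trailing 0x00, as `--padd 0x00` is meant to produce.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an xxd-style array: the raw file bytes only,
 * with no trailing 0x00. */
static const unsigned char example_xxd_ptx[] = {
    '.', 'v', 'e', 'r', 's', 'i', 'o', 'n', ' ', '7', '.', '0', '\n'
};

/* Hypothetical stand-in for a bin2c array generated with --padd 0x00:
 * the same bytes followed by one padding byte. */
static const unsigned char example_bin2c_ptx[] = {
    '.', 'v', 'e', 'r', 's', 'i', 'o', 'n', ' ', '7', '.', '0', '\n', 0x00
};

/* Returns 1 if a 0x00 byte occurs within the first len bytes, meaning the
 * buffer can be read as a C string without running past the array. */
int has_null_terminator(const unsigned char *data, size_t len)
{
    return memchr(data, 0x00, len) != NULL;
}
```

A consumer that treats the first array as a C string would keep reading past its end until it happened to hit a zero byte. That is undefined behavior, which may be why the unterminated data appeared to work with some driver versions and crashed with others.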
Showing 7 changed files with 22 additions and 17 deletions.
    @@ -333,14 +333,19 @@ if is_cuda_enabled

         message('ptx_files = @0@'.format(ptx_files))

    -    xxd_exe = find_program('xxd')
    +    # bin2c is distributed along with cuda tools. Use '--padd 0x00' to add a NULL-terminator byte
    +    # to the end of the generated array.
    +    bin2c_exe = find_program('bin2c')
         ptx_arrays = []
         foreach name, _ptx : ptx_files
    -      t = custom_target('ptx_xxd_@0@'.format(name),
    +      t = custom_target('ptx_bin2c_@0@'.format(name),
             build_by_default: true,
             output : ['@[email protected]'],
             input : _ptx,
    -        command : [xxd_exe, '--include','@INPUT@', '@OUTPUT@'],
    +        capture : true,
    +        command : [bin2c_exe, '--const', '--padd', '0x00',
    +                   '--name', '@BASENAME@_ptx', '@INPUT@',
    +        ]
           )
           ptx_arrays += t
         endforeach