Refactor CSV reader benchmarks with nvbench (#11678)
Closes #10941

This PR refactors the CSV reader benchmarks with nvbench and reduces the number of test cases by isolating data type, IO type, column selection, and row selection.

Example output of the new benchmarks:

<details>
<summary>Benchmark results</summary>

## csv_read_data_type

### [0] Quadro RTX 8000

| data_type | Samples | CPU Time | Noise | GPU Time | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|-----------|---------|------------|-------|------------|-------|------------------|-------------------|-------------------|
| INTEGRAL | 5x | 1.140 s | 0.09% | 1.140 s | 0.09% | 235553841 | 1.202 GiB | 668.564 MiB |
| FLOAT | 5x | 1.262 s | 0.04% | 1.262 s | 0.04% | 212718321 | 1.041 GiB | 713.885 MiB |
| DECIMAL | 5x | 272.787 ms | 0.03% | 272.784 ms | 0.03% | 984060406 | 396.279 MiB | 167.951 MiB |
| TIMESTAMP | 7x | 1.681 s | 0.47% | 1.681 s | 0.47% | 159723724 | 2.281 GiB | 814.268 MiB |
| DURATION | 7x | 2.121 s | 0.50% | 2.121 s | 0.50% | 126587514 | 2.588 GiB | 971.320 MiB |
| STRING | 19x | 496.713 ms | 0.50% | 496.710 ms | 0.50% | 540426462 | 859.526 MiB | 277.082 MiB |

## csv_read_io

### [0] Quadro RTX 8000

| io | Samples | CPU Time | Noise | GPU Time | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|-------------|---------|----------|-------|----------|-------|------------------|-------------------|-------------------|
| FILEPATH | 9x | 1.185 s | 0.49% | 1.185 s | 0.49% | 226466264 | 1.445 GiB | 618.876 MiB |
| HOST_BUFFER | 5x | 1.170 s | 0.14% | 1.170 s | 0.14% | 229459856 | 1.445 GiB | 618.876 MiB |

## csv_read_column_selection

### [0] Quadro RTX 8000

| column_selection | row_selection | Samples | CPU Time | Noise | GPU Time | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|------------------|---------------|---------|----------|-------|----------|-------|------------------|-------------------|-------------------|
| ALL | ALL | 5x | 1.246 s | 0.18% | 1.246 s | 0.18% | 215514992 | 1.582 GiB | 653.520 MiB |
| ALTERNATE | ALL | 5x | 1.128 s | 0.08% | 1.128 s | 0.08% | 119009844 | 1.116 GiB | 648.908 MiB |
| FIRST_HALF | ALL | 5x | 1.143 s | 0.07% | 1.143 s | 0.07% | 117443933 | 1.121 GiB | 653.520 MiB |
| SECOND_HALF | ALL | 5x | 1.152 s | 0.16% | 1.152 s | 0.16% | 116478469 | 1.121 GiB | 653.520 MiB |

## csv_read_row_selection

### [0] Quadro RTX 8000

| column_selection | row_selection | num_chunks | Samples | CPU Time | Noise | GPU Time | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|------------------|---------------|------------|---------|----------|-------|----------|-------|------------------|-------------------|-------------------|
| ALL | BYTE_RANGE | 1 | 5x | 1.244 s | 0.16% | 1.244 s | 0.16% | 215763257 | 1.582 GiB | 653.520 MiB |
| ALL | BYTE_RANGE | 8 | 5x | 1.170 s | 0.04% | 1.170 s | 0.04% | 229339594 | 202.596 MiB | 653.520 MiB |
| ALL | NROWS | 1 | 5x | 1.244 s | 0.12% | 1.244 s | 0.12% | 215808401 | 1.582 GiB | 653.520 MiB |
| ALL | NROWS | 8 | 4x | 4.560 s | inf% | 4.560 s | inf% | 58870122 | 320.771 MiB | 653.520 MiB |
| ALL | SKIPFOOTER | 1 | 5x | 1.245 s | 0.10% | 1.245 s | 0.10% | 215660012 | 1.582 GiB | 653.520 MiB |
| ALL | SKIPFOOTER | 8 | 3x | 7.443 s | inf% | 7.443 s | inf% | 36065528 | 1.269 GiB | 653.520 MiB |

</details>

Authors:
- Yunsong Wang (https://github.com/PointKernel)

Approvers:
- Nghia Truong (https://github.com/ttnghia)
- Vukasin Milovanovic (https://github.com/vuule)

URL: #11678
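
For readers comparing the `bytes_per_second` column across tables, a small sanity check may help: multiplying the reported throughput by the GPU time gives the number of bytes each run was credited with. The sketch below uses three rows copied from the results above; the helper name `implied_mib` is hypothetical, and the reading that throughput is measured against a fixed data size is an inference from the numbers, not something this commit states.

```python
MIB = 1024 ** 2

# (label, GPU time in seconds, bytes_per_second), copied from the results above
rows = [
    ("data_type=INTEGRAL", 1.140, 235553841),
    ("data_type=DECIMAL", 0.272784, 984060406),
    ("column_selection=ALTERNATE", 1.128, 119009844),
]


def implied_mib(gpu_seconds, bytes_per_second):
    """Bytes credited to one run (throughput x time), expressed in MiB."""
    return gpu_seconds * bytes_per_second / MIB


for label, gpu_seconds, bps in rows:
    print(f"{label}: {implied_mib(gpu_seconds, bps):.1f} MiB")
```

The two full-table rows come out at roughly 256 MiB and the `ALTERNATE` row at roughly 128 MiB, which would explain why `bytes_per_second` nearly halves for the partial column selections even though the run times barely change.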