Commit
[warps] last cleanup for now of divergence metrics - disable divergence test

Note that the throughput degradation from divergence is real and reproducible: throughput is now around 6.4E8 without divergence, against 5.7E8 with divergence. It is very difficult, however, to correlate the percent degradation in throughput with the metrics. In summary, one should just aim at 100% uniform execution.

On itscrd70.cern.ch (V100S-PCIE-32GB):
=========================================================================
Process                     = EPOCH1_EEMUMU_CUDA [nvcc 11.0.221]
FP precision                = DOUBLE (NaN/abnormal=0, zero=0)
EvtsPerSec[MatrixElems] (3) = ( 6.425099e+08 ) sec^-1
MeanMatrixElemValue         = ( 1.371706e-02 +- 3.270315e-06 ) GeV^0
TOTAL : 0.741551 sec
2,589,547,187 cycles # 2.655 GHz
3,537,039,425 instructions # 1.37 insn per cycle
1.044156654 seconds time elapsed
==PROF== Profiling "sigmaKin": launch__registers_per_thread 120
==PROF== Profiling "sigmaKin": sm__sass_average_branch_targets_threads_uniform.pct 100%
: smsp__sass_branch_targets.sum 53 2.89/usecond
: smsp__sass_branch_targets_threads_uniform.sum 53 2.89/usecond
: smsp__sass_branch_targets_threads_divergent.sum 0 0/second
: smsp__warps_launched.sum 1
-------------------------------------------------------------------------
FP precision = DOUBLE (nan=0)
EvtsPerSec[MatrixElems] (3)= ( 4.454874e+05 ) sec^-1
MeanMatrixElemValue = ( 5.532387e+01 +- 5.501866e+01 ) GeV^-4
TOTAL : 0.602111 sec
2,193,960,041 cycles # 2.654 GHz
2,948,877,241 instructions # 1.34 insn per cycle
0.885704400 seconds time elapsed
==PROF== Profiling "sigmaKin": launch__registers_per_thread 255
==PROF== Profiling "sigmaKin": sm__sass_average_branch_targets_threads_uniform.pct 100%
: smsp__sass_branch_targets.sum 17,683 1.52/usecond
: smsp__sass_branch_targets_threads_uniform.sum 17,683 1.52/usecond
: smsp__sass_branch_targets_threads_divergent.sum 0 0/second
: smsp__warps_launched.sum 1
=========================================================================
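
For reference, the arithmetic behind the figures quoted above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the assumption that the uniform percentage is the ratio of the uniform counter to the total branch-target counter is mine, although it is consistent with the counters in the log (53 of 53 and 17,683 of 17,683, both reported as 100%).

```python
def throughput_degradation_pct(uniform_rate, divergent_rate):
    """Relative throughput loss, in percent, of the divergent run
    with respect to the uniform (non-divergent) run."""
    return 100.0 * (uniform_rate - divergent_rate) / uniform_rate


def uniform_branch_pct(uniform_targets, total_targets):
    """Share of branch targets taken uniformly by all threads, in percent.

    Assumption: this mirrors how the profiler's uniform percentage relates
    to the two raw counters; it is not taken from profiler documentation.
    """
    if total_targets == 0:
        # No branches at all: treated here as fully uniform (a convention
        # chosen for this sketch).
        return 100.0
    return 100.0 * uniform_targets / total_targets


# Rounded throughputs quoted in the commit message (events per second).
print(throughput_degradation_pct(6.4e8, 5.7e8))

# Branch-target counters from the two profiled runs in the log.
print(uniform_branch_pct(53, 53))
print(uniform_branch_pct(17683, 17683))
```

With the rounded throughputs from the message, the degradation comes out at roughly 11%, while both profiled runs shown here report fully uniform branching.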