Skip to content

Ray Tracing / BVH Building benchmark and tuning utility for https://github.com/DGriffin91/obvhs

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT
Notifications You must be signed in to change notification settings

DGriffin91/tray_racing

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

22 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Tray Racing

Ray Tracing / BVH Building benchmark and tuning utility for OBVHS.

TLDR:

  • For best on CPU: Embree managed (where embree fully manages the scene, BVH building/traversal, etc...).
  • For best on GPU: Vulkan/DX12 hardware ray tracing.

OBVHS exists for cases where neither of those are an option: GPU traversal on older GPUs, finer control needed over BVH building/traversal, don't want to depend on Embree, etc...

GPU timings are via GPU timestamp queries with locked clocks.

Notes:

  • tfb/vfb/med/vsb are presets in OBVHS. tfb: fastest_build, vfb: very_fast_build, med: medium_build, vsb: very_slow_build.
  • Embree CWBVH uses a BVH8 builder with RTCBuildQuality::HIGH.
  • Embree managed is limited to SSE2 as OBVHS does not yet have AVX support. (Embree managed is a bit faster with AVX but not dramatically. OBVHS will eventually also add AVX support)

All times are in (milli)seconds. Less is better. cpu_traversal_bench gpu_traversal_bench cpu_single_threaded_building_bench cpu_multi_threaded_building_bench Last updated on 6/30/24

Test Scenes:

For GPU benchmarking, please lock GPU/VRAM clocks: NVIDIA Instructions. GPU benchmarking requires dxc to be in path.

Example: cargo run --release -- -i "assets/scenes/kitchen.ron" --benchmark --build ploc_cwbvh

USAGE:
    tray_racing [FLAGS] [OPTIONS] -i <input>

FLAGS:
        --animate                          Animate noise seed, etc...
        --auto-tune                        Find best settings for the given scenes.
        --benchmark                        Runs timestamp queries and extra dispatches to try to normalize timings.
        --cpu                              Render on the CPU
        --disable-auto-tune-model-cache    Bypass model cache (eg. if not all models will fit in memory at once)
        --flatten-blas                     Use tlas building/traversal path but flatten model into 1 blas.
        --hardware                         Use Vulkan hardware RT (requires --hardware feature and alternate wgpu, see
                                           cargo.toml)
    -h, --help                             Prints help information
        --png                              Save a png of the rendered frame. (Currently only cpu mode)
        --split                            Split large tris into multiple AABBs
        --tlas                             Use tlas (top level acceleration structure)
    -V, --version                          Prints version information
        --verbose                          Prints misc info about BVH (depth, node count, etc..)

OPTIONS:
        --build <build>
            Specify BVH builder [default: ploc_cwbvh]  [possible values: ploc_cwbvh, ploc_bvh2, embree_cwbvh,
            embree_bvh2_cwbvh, embree_managed, svenstaro_bvh2, parry_qbvh]
    -i <input>
            Input file path, also supports multiple comma separated paths (use with benchmark & render-time). Use
            `demoscene` for included procedurally generated scene.
        --max-prims-per-leaf <max-prims-per-leaf>
            Maximum primitives per leaf. For CWBVH the limit is 3 [default: 3]

        --post-collapse-reinsertion-batch-ratio-multiplier <post-collapse-reinsertion-batch-ratio-multiplier>
            For BVH2 only, a second pass of reinsertion after collapse. Since collapse reduces the node count, this
            reinsertion pass will be faster. 0 to disable. Relative to the initial reinsertion_batch_ratio. [default:
            0.0]
        --preset <preset>
            Overrides BVH build options. [default: ]  [possible values: , fastest_build, very_fast_build, fast_build,
            medium_build, slow_build, very_slow_build]
    -r <reinsertion-batch-ratio>
            Typically 0..1: ratio of nodes considered as candidates for reinsertion. Above 1 to evaluate the whole set
            multiple times. A little goes a long way. Try 0.01 or even 0.001 before disabling for build performance.
            [default: 0.15]
        --render-time <render-time>
            Stop rendering the current scene after n seconds. [default: 0]

        --search-depth-threshold <search-depth-threshold>
            Below this depth a search distance of 1 will be used for ploc. [default: 2]

        --search-distance <search-distance>
            In ploc, the number of nodes before and after the current one that are evaluated for pairing. 1 has a fast
            path in building and still results in decent quality BVHs esp. when paired with a bit of reinsertion.
            [default: 14]  [possible values: 1, 2, 6, 14, 24, 32]
        --sort-precision <sort-precision>
            Bits used for ploc radix sort. [default: 64]  [possible values: 64, 128]
        --width <width>                                      Render resolution width. [default: 1920]
        --height <height>                                    Render resolution height. [default: 1080]

MIT & Apache 2.0 Licenses don't apply to assets.

About

Ray Tracing / BVH Building benchmark and tuning utility for https://github.com/DGriffin91/obvhs

Resources

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published