This issue was moved to a discussion.
You can continue the conversation there.
Improve execution policy heuristics #380
Related: openscad/openscad#391
This makes sense, but do we have a sense of how much this can gain us? I wonder if effort wouldn't be better spent parallelizing more of the single-threaded code. What fraction of total time are we spending on triangulation, decimation, and such?
Probably a lot, at least for the CUDA case. On my laptop with a 3050 Ti mobile GPU and a 12900HK CPU, small models are 10 times slower with CUDA enabled, and large models are less than 10% faster.
Oh wow, fair enough! My benchmarking has tended to focus on problems with large numbers of triangles (spheres, sponge). What do you think would be a good benchmark for small models?
Not sure; I am testing those Python examples. I think we can port some more simple OpenSCAD benchmarks, which are usually not too large.
@ochafik One possible reason is that
If we are concerned about that performance, maybe we can also make collider update lazy (actually seems to be a good idea if users are going to do many transforms) |
Isn't it already lazy since it's part of the lazy application of transforms in general? |
Well, we can be more lazy: don't compute the collider if we don't use the mesh for further boolean operations. |
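The "more lazy" collider idea above could be sketched as a dirty flag: transforms only invalidate a cached collider, and the rebuild cost is paid at most once, on the first boolean operation that actually needs it. This is a minimal illustration, not manifold's real code; `Collider`, `MeshState`, and `applyTransform` are placeholder names.

```cpp
#include <cassert>
#include <optional>

// Placeholder for manifold's real collider type.
struct Collider { int builtAtVersion; };

class MeshState {
 public:
  // A transform never rebuilds the collider; it only marks it stale.
  void applyTransform() {
    colliderDirty_ = true;
    ++version_;
  }

  // Only boolean ops call this; the collider is rebuilt at most once
  // per run of transforms, no matter how many there were.
  const Collider& getCollider() {
    if (colliderDirty_ || !collider_) {
      collider_ = Collider{version_};
      colliderDirty_ = false;
      ++rebuilds_;
    }
    return *collider_;
  }

  int rebuilds() const { return rebuilds_; }

 private:
  std::optional<Collider> collider_;
  bool colliderDirty_ = true;
  int version_ = 0;
  int rebuilds_ = 0;
};
```

With this shape, a user chaining many transforms before one boolean pays for a single rebuild instead of one per transform.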
The current heuristics in https://github.com/elalish/manifold/blob/master/src/utilities/include/par.h are essentially arbitrary thresholds that give OK-ish performance but are definitely not optimal. There are two problems here:
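The kind of heuristic being criticized can be sketched as a single size cutoff: below some element count, kernel-launch and transfer overhead dominates, so work runs sequentially on the host. This is an illustrative sketch, not the actual contents of par.h; `kSeqThreshold` and its value are made up.

```cpp
#include <cstddef>

enum class ExecutionPolicy { Seq, Par };

// A single fixed cutoff, typically tuned by ad-hoc benchmarking.
// The problem described above: one number cannot fit every
// algorithm, GPU, and input shape at once.
constexpr std::size_t kSeqThreshold = std::size_t{1} << 14;

inline ExecutionPolicy autoPolicy(std::size_t n) {
  return n < kSeqThreshold ? ExecutionPolicy::Seq : ExecutionPolicy::Par;
}
```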
I think we need a more complete wrapper around thrust to do this, and VecDH should have a boolean indicating whether it was passed to the GPU or used on the host. Ideally we could also try something like a Vulkan compute shader as an alternative backend for this API, selectively implementing it for functions that can get a good speedup.
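The suggested residency flag on VecDH could look something like the following: a wrapper that records whether the host or the device last touched the data, so a transfer is counted only on an actual residency switch. `TrackedVec`, `Residency`, and the transfer counter are all hypothetical names for illustration; real device transfers are elided.

```cpp
#include <cstddef>
#include <vector>

enum class Residency { Host, Device };

template <typename T>
class TrackedVec {
 public:
  explicit TrackedVec(std::size_t n) : host_(n) {}

  // Host access: a copy back would be needed only if the device
  // currently owns the latest data.
  std::vector<T>& onHost() {
    if (where_ == Residency::Device) {
      ++transfers_;  // stand-in for a device-to-host copy
      where_ = Residency::Host;
    }
    return host_;
  }

  // Device access: an upload would be needed only if the host
  // currently owns the latest data.
  void onDevice() {
    if (where_ == Residency::Host) {
      ++transfers_;  // stand-in for a host-to-device copy
      where_ = Residency::Device;
    }
  }

  int transfers() const { return transfers_; }

 private:
  std::vector<T> host_;
  Residency where_ = Residency::Host;
  int transfers_ = 0;
};
```

Repeated accesses on the same side then cost nothing extra, which is exactly the overhead the small-model CUDA numbers above suggest is being paid today.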