We recently had a discussion and came to the conclusion that it might be a good idea to not use TBB by default.
We found that in many situations on the "engineering" side of the accuracy spectrum, compute times are slower by a factor of up to 2 when building with TBB, even if NTHREADS=1. Most users would also rather parallelize "embarrassingly" by running multiple simulations in parallel than parallelize individual simulations.
So making a build without TBB the default would make sense in my view.
We still want to explore whether it's possible to upload different versions to conda-forge so that users can set a flag to install a version with TBB. Maybe build variants are what we're looking for.
Also, we should benchmark this. Maybe that's a job for @jbreue16 for when he comes back.
Edit: See also here for a discussion on build variants.
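As a starting point for the benchmark mentioned above, here is a minimal sketch of a timing harness. It only measures the wall-clock time of arbitrary commands and compares medians. The function names `median_wall_time` and `slowdown` are made up for illustration, and the commands in the example are placeholders, not real CADET invocations.

```python
import statistics
import subprocess
import sys
import time


def median_wall_time(cmd, repeats=5):
    """Run `cmd` `repeats` times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the command fails, so failed runs are not timed silently.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def slowdown(baseline_cmd, candidate_cmd, repeats=5):
    """Return median time of `candidate_cmd` divided by median time of `baseline_cmd`."""
    baseline = median_wall_time(baseline_cmd, repeats)
    candidate = median_wall_time(candidate_cmd, repeats)
    return candidate / baseline


if __name__ == "__main__":
    # Placeholder command: substitute the non-TBB build and the TBB build
    # (with NTHREADS=1), both run on the same input file.
    noop = [sys.executable, "-c", "pass"]
    print(f"slowdown factor: {slowdown(noop, noop, repeats=3):.2f}")
```

A value well above 1 for the TBB build at NTHREADS=1 would support the observation above. The median is used so that a single outlier run does not dominate the comparison.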
After some thought, I believe that having two separate packages would actually be advantageous. This way, we could catch n_threads on a higher level (i.e. CADET-Process) and delegate to the corresponding library / CLI / DLL.
Alternatively, we could add a CMake option to compile CADET-Core twice, once with TBB and once without; then we wouldn't have to provide two packages. Not sure if this is a good idea, though.
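To make the "catch n_threads on a higher level" idea concrete, here is a rough sketch of what the dispatch could look like. Everything in it is an assumption for illustration: `CadetBuilds`, `select_cli`, and especially the executable name `cadet-cli-tbb` are hypothetical and not part of CADET-Process or CADET-Core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CadetBuilds:
    """Names of the two executables (hypothetical; adjust to the real install)."""

    serial: str = "cadet-cli"    # build without TBB
    tbb: str = "cadet-cli-tbb"   # assumed name for a TBB-enabled build


def select_cli(n_threads: int, builds: CadetBuilds = CadetBuilds()) -> str:
    """Pick the executable for the requested thread count.

    A single-threaded run is delegated to the build without TBB so it does
    not pay the TBB overhead; anything else goes to the TBB build.
    """
    if n_threads < 1:
        raise ValueError(f"n_threads must be >= 1, got {n_threads}")
    return builds.serial if n_threads == 1 else builds.tbb


if __name__ == "__main__":
    for n in (1, 4):
        print(n, "->", select_cli(n))
```

The same decision would apply to picking a shared library instead of a CLI. The point is only that the caller never has to know which build is in use.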
I would still prefer to identify the root cause of the TBB issues and see if we can fix them. This had been postponed because @jbreue16 first wanted to get the 2D GRM with DG running before entering more detailed profiling.
I understand. However, building CADET with TBB will always have a substantial overhead. So here we're talking about the situation where a simulation with n_cores=1 run with CADET built with TBB will be significantly slower than the variant without TBB (which obviously will also be n_cores=1).