Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of surface integral #997

Merged
merged 8 commits into from
Nov 25, 2021
Merged

Improve performance of surface integral #997

merged 8 commits into from
Nov 25, 2021

Conversation

ranocha
Copy link
Member

@ranocha ranocha commented Nov 24, 2021

When SIMD optimizations land in Trixi.jl, the other parts become relatively more expensive. This is a simple way to increase the performance of the surface integral by reducing the number of memory accesses and total floating point operations.

@ranocha ranocha added the performance We are greedy label Nov 24, 2021
@ranocha ranocha requested a review from sloede November 24, 2021 09:05
@ranocha
Copy link
Member Author

ranocha commented Nov 24, 2021

Some tests failed previously. These were either sensitive to the CFL number or previously known to introduce issues in CI. For the former, I reduced the CFL number, ran the setup on main, and used the reported errors for CI. For the latter, I increased the tolerance a bit 🙈

@codecov
Copy link

codecov bot commented Nov 25, 2021

Codecov Report

Merging #997 (38e974c) into main (d9a8666) will decrease coverage by 0.01%.
The diff coverage is 86.49%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #997      +/-   ##
==========================================
- Coverage   93.66%   93.65%   -0.01%     
==========================================
  Files         287      287              
  Lines       20972    20982      +10     
==========================================
+ Hits        19643    19650       +7     
- Misses       1329     1332       +3     
Flag Coverage Δ
unittests 93.65% <86.49%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
examples/p4est_3d_dgsem/elixir_euler_ec.jl 100.00% <ø> (ø)
src/equations/compressible_euler_2d.jl 93.13% <0.00%> (-0.21%) ⬇️
src/equations/compressible_euler_3d.jl 93.22% <0.00%> (-0.17%) ⬇️
src/equations/shallow_water_2d.jl 89.08% <0.00%> (-0.34%) ⬇️
src/solvers/dgsem_p4est/dg_2d.jl 97.15% <100.00%> (+0.02%) ⬆️
src/solvers/dgsem_p4est/dg_3d.jl 97.78% <100.00%> (+0.01%) ⬆️
src/solvers/dgsem_tree/dg_1d.jl 95.19% <100.00%> (+0.05%) ⬆️
src/solvers/dgsem_tree/dg_2d.jl 97.04% <100.00%> (+0.01%) ⬆️
src/solvers/dgsem_tree/dg_3d.jl 97.76% <100.00%> (+0.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d9a8666...38e974c. Read the comment docs.

@ranocha
Copy link
Member Author

ranocha commented Nov 25, 2021

@sloede All tests pass - coverage is slightly reduced due to additional inlining and #841

Copy link
Member

@sloede sloede left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. As a general note: Should we adopt the policy to @inline all functions that might be used in a performance critical hot kernel and that are applied pointwise? I see few downsides, and it would make it easier to decide when to use @inline and when not.

Comment on lines +599 to +601
# Access the factors only once before beginning the loop to increase performance.
# We also use explicit assignments instead of `+=` to let `@muladd` turn these
# into FMAs (see comment at the top of the file).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, these comments are really helpful to understand why something was done the way it is, and also ensure that this won't get revised by an eager developer in the future.

@ranocha
Copy link
Member Author

ranocha commented Nov 25, 2021

LGTM. As a general note: Should we adopt the policy to @inline all functions that might be used in a performance critical hot kernel and that are applied pointwise? I see few downsides, and it would make it easier to decide when to use @inline and when not.

Maybe, that's an option.

@ranocha ranocha merged commit 34ed1e8 into main Nov 25, 2021
@ranocha ranocha deleted the hr/surface_integral branch November 25, 2021 15:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance We are greedy
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants