Improve performance of surface integral #997
Some tests failed previously. These were either sensitive to the CFL number or previously known to introduce issues in CI. For the former, I reduced the CFL number, ran the setup on
Codecov Report
@@ Coverage Diff @@
## main #997 +/- ##
==========================================
- Coverage 93.66% 93.65% -0.01%
==========================================
Files 287 287
Lines 20972 20982 +10
==========================================
+ Hits 19643 19650 +7
- Misses 1329 1332 +3
LGTM. As a general note: Should we adopt the policy to `@inline` all functions that might be used in a performance-critical hot kernel and that are applied pointwise? I see few downsides, and it would make it easier to decide when to use `@inline` and when not.
# Access the factors only once before beginning the loop to increase performance.
# We also use explicit assignments instead of `+=` to let `@muladd` turn these
# into FMAs (see comment at the top of the file).
Thanks, these comments are really helpful to understand why something was done the way it is, and also ensure that this won't get revised by an eager developer in the future.
Maybe, that's an option.
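For reference, a minimal sketch of what the `@inline` policy discussed above could look like for a pointwise function. The names `flux_advection` and `apply_flux!` are hypothetical and chosen only for illustration; they are not taken from Trixi.jl.

```julia
# Hypothetical pointwise flux of a linear advection equation (illustration only).
# `@inline` is a hint asking the compiler to inline this function into its callers.
@inline flux_advection(u, velocity) = velocity * u

# Hypothetical hot kernel that applies the pointwise function to every node.
function apply_flux!(out, u, velocity)
    # `eachindex(out, u)` checks that both arrays share the same indices,
    # so skipping bounds checks inside the loop is safe.
    @inbounds for i in eachindex(out, u)
        out[i] = flux_advection(u[i], velocity)
    end
    return out
end

out = apply_flux!(zeros(3), [1.0, 2.0, 3.0], 2.0)
```

Since `@inline` is only a hint and the compiler often inlines very small functions on its own, a blanket policy would matter most for larger pointwise functions.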
When SIMD optimizations land in Trixi.jl, the other parts become relatively more expensive. This is a simple way to increase the performance of the surface integral by reducing the number of memory accesses and the total number of floating-point operations.
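To illustrate the two ideas mentioned in the code comments (reading the factors once before the loop and writing updates as explicit assignments), here is a simplified sketch. The function names, argument names, and array layout are assumptions made for this example and do not reproduce the actual kernel in Trixi.jl. The sketch also calls `Base.muladd` directly instead of relying on the `@muladd` macro, so that it runs without extra packages.

```julia
# Baseline (hypothetical): the factors are read from the array in every iteration.
function surface_integral_naive!(du, flux_left, flux_right, boundary_interpolation)
    n_nodes = size(du, 1)
    for element in axes(du, 2)
        du[1, element] = du[1, element] -
                         flux_left[element] * boundary_interpolation[1, 1]
        du[n_nodes, element] = du[n_nodes, element] +
                               flux_right[element] * boundary_interpolation[n_nodes, 2]
    end
    return du
end

# Variant (hypothetical): the factors are accessed only once before the loop,
# and each update is an explicit assignment of the form `x = a * b + x`,
# written here with `muladd` so the compiler may emit a fused multiply-add.
function surface_integral_hoisted!(du, flux_left, flux_right, boundary_interpolation)
    n_nodes = size(du, 1)
    factor_1 = boundary_interpolation[1, 1]
    factor_2 = boundary_interpolation[n_nodes, 2]
    for element in axes(du, 2)
        du[1, element] = muladd(-flux_left[element], factor_1, du[1, element])
        du[n_nodes, element] = muladd(flux_right[element], factor_2, du[n_nodes, element])
    end
    return du
end

# Small usage example with made-up data.
boundary_interpolation = [2.0 0.0; 0.0 0.0; 0.0 0.0; 0.0 3.0]
flux_left = [1.0, 2.0, 3.0]
flux_right = [4.0, 5.0, 6.0]
du_naive = surface_integral_naive!(ones(4, 3), flux_left, flux_right, boundary_interpolation)
du_hoisted = surface_integral_hoisted!(ones(4, 3), flux_left, flux_right, boundary_interpolation)
```

Both variants should agree up to floating-point rounding, since a fused multiply-add rounds once instead of twice; comparing them with `≈` rather than `==` is therefore the safer check.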