Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LensFlarePass (ogre2 engine) #752

Merged
merged 36 commits into from
Dec 2, 2022

Conversation

darksylinc
Copy link
Contributor

@darksylinc darksylinc commented Oct 30, 2022

🎉 New feature

Dependency

Depends on gazebosim/gz-common#474

Ticket

Affects #730

But it doesn't close it because the ticket says ogre backend as well.

Summary

This PR does 3 things:

  • Adds Ogre2LensFlare
  • Adds tests to test this new pass to INTEGRATION_render_pass
  • Works for both normal Camera and WideAngleCamera.
  • Significantly optimizes Ogre2RayQuery
    • The original CPU implementation is ridiculously slow
    • PR Use selection buffer in ray queries (ogre2) #378 used a GPU implementation to address these issues. However the GPU version can't be easily used (without bigger refactors to how gz-rendering operates) because we need to render (for ray queries) while we are about to render (for lens flares). We cannot use the GPU version.
    • This PR adds an option to force CPU method.
    • LensFlare heavily relies on a lot of raycasts for occlusion (depends on the value passed to SetOcclusionSteps).

The speed up of Ogre2RayQuery comes from:

  1. Eliminate unnecessary calls
  2. Eliminate pointer aliasing
  3. Removed unnecessary bound checks
  4. Work in local space instead of world space when possible (i.e. as long as the world transform matrix is affine, which is the case most of the time) to avoid expensive 4x4 matrix multiplications vs 3D vectors
  5. Use multithreading to evenly spread the calculations over all cores

This results in a massive performance increase.

Results of iterating 1000 rays over Ogre2_demo scene intentionally hitting the broadphase stage of duck mesh but missing every single triangle (to avoid an early out) on a Ryzen 5950X 12C/24T 32GB:

Test Time (Debug, Lower is better) Time (Release, Lower is better)
Original 1282ms 228ms
Optimized 421ms 100ms
Optimized + Threaded (24 threads) 83ms 33ms

Therefore this implementation achieves a 15x performance improvement on Debug builds and 6.91x improvement on Release builds.

YMMV not only because of the number of threads your system may have, but also because the threaded version is more susceptible to power state changes (Original & Optimize don't change their readings much when switching the governor from 'On Demand' to 'Performance' however Optimized + Threaded can take 1.6x longer as the CPU is not able ramp the frequency in time for all CPUs)

Test it

You can run INTEGRATION_render_pass or add this code to any sample (e.g. ogre2_demo):

RenderPassPtr pass = rpSystem->Create<LensFlarePass>();
LensFlarePassPtr lensFlarePass =
  std::dynamic_pointer_cast<LensFlarePass>(pass);
lensFlarePass->Init(scene);
lensFlarePass->SetEnabled(true);
lensFlarePass->SetLight(myLight);
camera->AddRenderPass(lensFlarePass);
out.mp4

Checklist

  • Signed all commits for DCO
  • Added tests
  • Added example and/or tutorial
  • Updated documentation (as needed)
  • Updated migration guide (as needed)
  • Consider updating Python bindings (if the library has them)
  • codecheck passed (See contributing)
  • All tests passed (See test coverage)
  • While waiting for a review on your PR, please help review another open pull request to support the maintainers

Note to maintainers: Remember to use Squash-Merge and edit the commit message to match the pull request summary while retaining Signed-off-by messages.

@darksylinc darksylinc requested a review from iche033 as a code owner October 30, 2022 21:59
@darksylinc darksylinc force-pushed the matias-lensFlareOnMain branch 2 times, most recently from 949844f to 335da24 Compare November 6, 2022 23:15
@darksylinc
Copy link
Contributor Author

darksylinc commented Nov 7, 2022

I just pushed more commits. WideAngleCamera support was added.

It took me longer than I thought because my previous PR (PR #733) added Ogre2WideAngleCamera but left out the ability to add RenderPasses to it. And I only realized when the time came.

I had to slightly refactor how Ogre2WideAngleCamera works in order to support all kinds of RenderPasses (like LensFlare) in it.

Fortunately I was VERY thorough in the new integration tests which let me caught various bugs with Render passes in Ogre2WideAngleCamera.

It is now complete. Only ogre engine support is missing.

@scpeters scpeters requested a review from sanjuksha November 9, 2022 21:29
ogre2/src/Ogre2WideAngleCamera.cc Outdated Show resolved Hide resolved
ogre2/src/Ogre2WideAngleCamera.cc Outdated Show resolved Hide resolved
include/gz/rendering/LensFlarePass.hh Outdated Show resolved Hide resolved
test/common_test/LensFlarePass_TEST.cc Outdated Show resolved Hide resolved
test/common_test/LensFlarePass_TEST.cc Outdated Show resolved Hide resolved
ogre2/src/Ogre2RayQuery.cc Outdated Show resolved Hide resolved

if (!this->dataPtr->preferGpu || //
!this->dataPtr->camera || //
!this->dataPtr->camera->Parent() || //
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the extra // at the end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That comment prevents clang format from merging all statements into fewer lines.

test/integration/render_pass.cc Outdated Show resolved Hide resolved
const bool bIsAffine = transform.isAffine();

Ogre::Ray mouseRay;
#ifndef SLOW_METHOD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a new macro or something defined in ogre? Maybe add a brief comment / doc about this macro?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question goes for the SINGLE_THREADED macro

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added explanations on the top.

These macros are basically "just in case" but (knocks on wood) should never be needed other than for benchmarking

Signed-off-by: Matias N. Goldberg <[email protected]>
Add RayQuery::SetPreferGpu
Add RayQuery::UsesGpu
Add RenderPass::PreRender(const CameraPtr &) overload
Add RenderTarget::PreRender(const CameraPtr &) overload

Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
It would only work it the Pass was added before the Camera initializes
the Compositor workspace

Signed-off-by: Matias N. Goldberg <[email protected]>
SetOcclusionSteps(0) disables occlusion instead of causing divisions by
0

Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
Fix warning

Signed-off-by: Matias N. Goldberg <[email protected]>
Add SINGLE_THREADED macro

Signed-off-by: Matias N. Goldberg <[email protected]>
Changed how we render to it, because the old way was not friendly to
adding RenderPass(es) to it
Fix Ogre2WideAngleCamera::Copy, which would return garbage

Signed-off-by: Matias N. Goldberg <[email protected]>
It's failing, but for very little. Needs more research.

Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
These accidentally slipped through the commits.

Signed-off-by: Matias N. Goldberg <[email protected]>
Enable GPU RayQuery version on Apple
Document SINGLE_THREADED & SLOW_METHOD macros
Use kWideAngleNumCubemapFaces instead of literal 6
Fix typos

Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
@darksylinc
Copy link
Contributor Author

OK I pushed various fixes for the problems CI found.

The only problem I see is that Metal version is complaining that VisualAt test is failing on macOS. However:

  • GPU RayQuery was recently enabled for macOS (is it enabled in the CI version? AFAIK it should be)
    • That may be the reason of the failure
  • If CPU RayQuery is (for some odd reason?) then it could be this PR's fault.

It would be great to know if this failure happens with on CI with regular main to discard whether this is a fault from this PR.

Perhaps I should submit a PR with an innocuous change to see if macOS CI fails as well?

Signed-off-by: Matias N. Goldberg <[email protected]>
@darksylinc
Copy link
Contributor Author

darksylinc commented Nov 15, 2022

OK macOS test has finished again, and this time VisualAt succeeded. Which is even more anxiety-inducing.

Now macOS CI failed again; but this time it's something I recognize:

33: /Users/jenkins/workspace/ignition_rendering-ci-pr_any-homebrew-amd64/ign-rendering/test/integration/render_pass.cc:830: Failure
33: Expected: (brightness[0][0]) > (brightness[0][1]), actual: 1169621 vs 1392161
33: /Users/jenkins/workspace/ignition_rendering-ci-pr_any-homebrew-amd64/ign-rendering/test/integration/render_pass.cc:831: Failure
33: Expected: (brightness[0][0]) > (brightness[1][0]), actual: 1169621 vs 1183689
33: /Users/jenkins/workspace/ignition_rendering-ci-pr_any-homebrew-amd64/ign-rendering/test/integration/render_pass.cc:832: Failure
33: Expected: (brightness[0][0]) > (brightness[1][1]), actual: 1169621 vs 1169631
33: /Users/jenkins/workspace/ignition_rendering-ci-pr_any-homebrew-amd64/ign-rendering/test/integration/render_pass.cc:837: Failure
33: Expected: (brightness[1][1]) > (brightness[0][1]), actual: 1169631 vs 1392161
33: /Users/jenkins/workspace/ignition_rendering-ci-pr_any-homebrew-amd64/ign-rendering/test/integration/render_pass.cc:838: Failure
33: Expected: (brightness[1][1]) > (brightness[1][0]), actual: 1169631 vs 1183689

This particular test checks handedness is correct (i.e. the texture isn't upside down, mirrored, rotated). The sun should be coming from top left corner [0][0]; which means it should be the brightest part.

But only the LensFlareWideAnglePass test fails; which means the actual problem is likely in the previous PR #733 and went undetected.

I'll add another test for WideAngleCamera to detect handedness.

Update

OK I just charted the values from Metal's test:

1169621 1183689
1392161 1169631

The values are clearly Y-flipped, a common problem when dealing with shaders and different RenderSystems.

For reference this is what my computer produces:

1392635 1169639
1169629 1183573

Now all that remains is to rule out whether this problem is caused by WideAngleCamera's last pass shader, or by LensFlare's shader during WideAngleCamera rendering.

Copy link
Contributor

@sanjuksha sanjuksha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried building and running as per the instructions, I am getting this error while building. Am I doing something wrong or missing anything ?
image

@iche033
Copy link
Contributor

iche033 commented Nov 15, 2022

I tried building and running as per the instructions, I am getting this error while building. Am I doing something wrong or missing anything ?

you need the changes from gazebosim/gz-common#474

@iche033
Copy link
Contributor

iche033 commented Nov 15, 2022

OK macOS test has finished again, and this time VisualAt succeeded. Which is even more anxiety-inducing.

ahh weird. I thought it's my fault. I just went back to check CI for #758 and noticed that VisualAt failed there after enabling GPU ray queries. Not sure why it's no longer failing - could be flaky

@darksylinc
Copy link
Contributor Author

Btw there is no codecov comment. I thought it would appear once the CIs successfully ran the tests... but I guess it's not?

@iche033
Copy link
Contributor

iche033 commented Nov 15, 2022

Btw there is no codecov comment. I thought it would appear once the CIs successfully ran the tests... but I guess it's not?

I think it codecov may not work on PRs created from forks. I see that they are only run if it's a branch in this repo https://app.codecov.io/gh/gazebosim/gz-rendering/pulls

@darksylinc
Copy link
Contributor Author

But... it's there?

image

AFAIK there's nothing different from how I normally submit PRs EXCEPT that this PR originally could not compile at all due to gz-common5 dependency.

@codecov
Copy link

codecov bot commented Nov 16, 2022

Codecov Report

Merging #752 (7b175e8) into main (55fdafe) will increase coverage by 0.76%.
The diff coverage is 92.00%.

@@            Coverage Diff             @@
##             main     #752      +/-   ##
==========================================
+ Coverage   74.38%   75.15%   +0.76%     
==========================================
  Files         160      162       +2     
  Lines       14673    15084     +411     
==========================================
+ Hits        10915    11336     +421     
+ Misses       3758     3748      -10     
Impacted Files Coverage Δ
include/gz/rendering/RayQuery.hh 100.00% <ø> (ø)
include/gz/rendering/RenderPass.hh 100.00% <ø> (ø)
include/gz/rendering/RenderTarget.hh 100.00% <ø> (ø)
include/gz/rendering/base/BaseRayQuery.hh 33.33% <0.00%> (-3.51%) ⬇️
ogre2/src/Ogre2RenderEngine.cc 76.67% <ø> (ø)
ogre2/src/Ogre2RenderPass.cc 86.66% <50.00%> (+4.84%) ⬆️
ogre2/src/Ogre2RenderTarget.cc 82.42% <66.66%> (+0.26%) ⬆️
ogre2/src/Ogre2RayQuery.cc 88.94% <90.99%> (+27.74%) ⬆️
ogre2/src/Ogre2WideAngleCamera.cc 90.95% <94.23%> (+0.95%) ⬆️
ogre2/src/Ogre2LensFlarePass.cc 95.20% <95.20%> (ø)
... and 14 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@iche033
Copy link
Contributor

iche033 commented Nov 16, 2022

ah there we go, we now have codecov report :) I retriggered a few more github actions

SetBackgroundColor would work before creating a Camera.
However once a Camera was initialized, subsequent calls to
SetBackgroundColor would be ignored

Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
@darksylinc
Copy link
Contributor Author

OK I've added a coordinate convention test to WideAngleCamera and as I suspected, the Metal version was upside down.

In the process of writing this test I discovered a bug in Ogre2Scene::SetBackgroundColor as further calls to it would be ignored after a Camera has been created. I applied the same fix that Ogre1 engine uses.

This PR includes the fix for Ogre2Scene::SetBackgroundColor as it is necessary for the WideAngleCamera coordinate convention test.

Now all tests pass on Metal, except for VisualAt, but appears to be unrelated to this PR.

All tests related to this PR are passing so I see it as 100% ready for merge.

BTW: Just to be sure, I ran our own (OgreNext) Readback Test(*) for Metal and everything appears to be OK from OgreNext side.

(*)The readback test is an OgreNext test that tries to force race conditions to come out when reading from CPU what the GPU just wrote (if this test fails, then gazebo tests like VisualAt would definitely fail at random). I originally wrote it to stress-test our Vulkan RenderSystem.

@iche033
Copy link
Contributor

iche033 commented Dec 1, 2022

@sanjuksha here's a patch to ogre2_demo to see the effect:

patch
diff --git a/examples/ogre2_demo/Main.cc b/examples/ogre2_demo/Main.cc
index 337cf1d1..1f9bdef0 100644
--- a/examples/ogre2_demo/Main.cc
+++ b/examples/ogre2_demo/Main.cc
@@ -303,6 +303,15 @@ CameraPtr createCamera(const std::string &_engineName,
     noisePass->SetStdDev(0.08);
     noisePass->SetEnabled(false);
     camera->AddRenderPass(noisePass);
+
+    auto light = scene->LightByIndex(0u);
+    RenderPassPtr p = rpSystem->Create<LensFlarePass>();
+    LensFlarePassPtr lensFlarePass =
+      std::dynamic_pointer_cast<LensFlarePass>(p);
+    lensFlarePass->Init(scene);
+    lensFlarePass->SetEnabled(true);
+    lensFlarePass->SetLight(light);
+    camera->AddRenderPass(lensFlarePass);
   }
 
   return camera;

Once you launch the demo, you can press s to disable sky to better see the lens flares:

Screenshot from 2022-12-01 15-16-06

Can you see if that works for you?

@sanjuksha
Copy link
Contributor

Thanks @iche033 for the patch. Both of the mentioned tests in the description worked for me now.
ogre2_demo test
Screenshot from 2022-12-02 11-53-36

INTEGRATION_render_pass test

Screenshot from 2022-11-23 11-39-29

@iche033 iche033 removed the needs upstream release Blocked by a release of an upstream library label Dec 2, 2022
@iche033 iche033 merged commit 29bcc28 into gazebosim:main Dec 2, 2022
@darksylinc darksylinc mentioned this pull request Dec 3, 2022
9 tasks
iche033 pushed a commit that referenced this pull request Dec 19, 2022
Previous PR #752 added LensFlarePass to ogre2.
This PR adds LensFlarePass to ogre1.

This includes LensFlarePass support for ogre1's WideAngleCamera.
WideAngleCamera had to be modified a little to support RenderPass at all.

In theory the gaussian filter pass should also work with WideAngleCamera but I did not test it.

It also implements OgreWideAngleCamera::Copy so that integration tests can function (Ogre2WideAngleCamera::Copy had already been implemented).

OgreRayQuery also had to be modified to support OgreWideAngleCamera; since LensFlares need ray queries for occlusion support.

Additionally I add RemoveAllRenderPasses since I noticed a lot of RenderPasses didn't get to properly deinitialize. Though it didn't seem like much harm was being done because currently all RenderPasses are very simple.

Signed-off-by: Matias N. Goldberg <[email protected]>
@iche033 iche033 added the 🎵 harmonic Gazebo Harmonic label Feb 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🎵 harmonic Gazebo Harmonic
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants