Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert shader compilation and occlusion buffer to use parallel for_range() #79481

Closed
wants to merge 1 commit into from
Closed

Convert shader compilation and occlusion buffer to use parallel for_range() #79481

wants to merge 1 commit into from

Conversation

myaaaaaaaaa
Copy link
Contributor

@myaaaaaaaaa myaaaaaaaaa commented Jul 14, 2023

Simplifies threading compared to direct usage of WorkerThreadPool, since there is no longer a need for intermediate functions or structs.

Some instances of WorkerThreadPool have been omitted from this PR to avoid conflicting with #72687

RaycastOcclusionCull::RaycastHZBuffer::update_camera_rays() may experience performance regressions until #72716 is merged.

This PR passes the regression test here: godotengine/godot-benchmarks#34
This PR has not been tested on https://github.com/Calinou/godot-rendering-tests

Copy link
Member

@RandomShaper RandomShaper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. I'm only a bit concerned with the capture-by-value ([=]) thing. In this case it's fine, but there's a risk of over-capturing as the code evolves. We can just let this go and try to come up with rules (like capturing this explicitly and using interim locally-scoped structs with all the values needed inside the lamba when the list would be long) for this and revise all usages at some point.

@myaaaaaaaaa
Copy link
Contributor Author

I'm only a bit concerned with the capture-by-value ([=]) thing. In this case it's fine, but there's a risk of over-capturing as the code evolves.

I think I initially went with [=] where possible for safety reasons, but for consistency sake perhaps it's better to just standardize on [&](const ...) {} so that it can be a drop-in replacement for any for loops that are embarrassingly parallel. Over-capturing shouldn't be a problem here, since parallel for loop bodies should generally be kept as small as possible anyway to make the multithreading easier to reason about.

Of course, other lambda use cases might need different rules.

We can just let this go and try to come up with rules (like capturing this explicitly and using interim locally-scoped structs with all the values needed inside the lamba when the list would be long) for this and revise all usages at some point.

If for_range() becomes more widely used across the Godot codebase, it might be a good idea to make specialized macros that hide the lambda entirely.

The current for_range() signature is a bit ridiculous anyway since I was prioritizing feature parity with WorkerThreadPool::add_*_group_task() and for(), so replacing it at some point is basically mandatory.

@RandomShaper
Copy link
Member

Well, in #79490 you are using [&]. I'd suggest this one being consistent with it indeed.

@myaaaaaaaaa myaaaaaaaaa deleted the forrange-shader branch August 16, 2023 16:22
@AThousandShips AThousandShips removed this from the 4.x milestone Aug 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants