-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallelize State call in ECM #451
Conversation
Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: John Shepherd <[email protected]>
…b.com/ignitionrobotics/ign-gazebo into jshep1/scenebroadcaster_optimizations Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: John Shepherd <[email protected]>
Here's what I got comparing performance of the This is from running the shapes_population.sdf world. It does look there it shaved a few ms off the Note careful about using remotery for profiling. I noticed that since new threads are spawned every iteration, it made remotery a bit laggy and unresponsive sometimes. |
Good to see that you're seeing improvements... I guess Remotery was a bit wonky for me given all the threads spawning. I'll clean up the logic and make it more generalizable and get the PR in for review soon. |
by the way, I wonder if it's possible to reuse threads, e.g. store in member variable?, and have them wait on a condition_variable until there's work to do. |
That's also what I was thinking and was going to attempt something like that. It would certainly reduce noise in Remotery as well as cut down any overhead from spawning so many threads every iteration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested by adding and deleting objects in a shapes.sdf, and spawning objects from fuel and it seems to be work fine for the case of single thread.
Interestingly, I tried shapes_population.sdf and noticed that a sphere is missing (sphere_534):
I also tried generating a shapes_population.sdf world with n
set to 100
and I see a couple of models missing. Do you see the same issue?
Signed-off-by: John Shepherd <[email protected]>
Good catch, took me a little bit to find the bug, but I resolved it here: 0ba4684 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The number of objects in shapes_populations.sdf are now correct for me.
I'll leave it to you and @chapulina to decide on the number of threads to use. Otherwise the PR looks good.
Signed-off-by: John Shepherd <[email protected]>
Signed-off-by: Louise Poubel <[email protected]>
Signed-off-by: Louise Poubel <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the improvements!
The ABI checker is having the same old false positive. All other CI looks good. Merging! |
Ready for review. Below is a plot of 25 randomly sampled
PostUpdate
profiled points for both before and after this PR.The before average timestep of
PostUpdate
was 33.095 ms and after average time forPostUpdate
is 22.479 ms in the case of theshapes_population.sdf
. This PR is primarily targeted for larger example and will not significantly affect smaller examples (if there are under 50 entity components, no speedup will occur at all due to the nature of the thread spawning in this PR).My RTF before this PR after 2 minutes of running
shapes_population.sdf
was 0.602% and after the PR I am getting about 0.669% RTF.I'm going to experiment with the the number of threads and the number of components each thread handles to see if I can tune these two parameters to get a more optimized RTF.I don't think changing the max thread count to anything higher thanstd::thread::hardware_concurrency()
(the max suggested threads that the hardware supports) would speed up anything, this would likely just bottleneck any additional threads and schedule them in when it puts other threads to sleep. Perhaps we could see some speedup with the amount of entity components each thread will run at minimum, but I'm not so sure this would make a huge difference either, it would only affect scenes in which there are betweencomponentsPerThread
andcomponentsPerThread
*maxThreads
entity components.