MultiplayerSynchronizer: Improve performance of watched variables #81593
TL;DR: The reduction in workload is larger than the percentage figures (relative execution times) suggest. In our specific project, the overall improvement is more than 15%, and the reduction exceeds 20% when looking only at `Main::iteration`. This is backed by the measured cycles for `Main::iteration`, which dropped from 1.674E+11 to 1.294E+11. Next time it would be better to present the measured cycle data directly instead of the relative execution times. For now, I have added the measured cycle data to the summary for better transparency.

Long version: Assuming that the required cycle count for the code outside of `Main::iteration` is the same before and after the change, the relative execution times can be converted into a total workload. Before the change, the relative execution time of `Main::iteration` was 74.8% (0.748); after the change, it dropped to 69.1% (0.691). For simplicity, let's assume the old total workload was 1, which leads to the following equation:

new total workload = (1 - 0.748) / (1 - 0.691) = 0.252 / 0.309 ≈ 0.815533981

This shows that the overall workload in our application has, in fact, decreased by over 18%. However, this performance gain was achieved entirely within `Main::iteration`. In other words, the new total workload of 0.815533981 equals the unchanged workload for the code outside of `Main::iteration` (0.252) plus the reduced workload of `Main::iteration` itself.

This explains why we can expect a significantly greater improvement than the initial impression of just 5%. The conclusion is further supported by the "aggregated sample costs", which were 1.674E+11 (before) and 1.294E+11 (after), as shown in the provided screenshots. However, it's important to acknowledge that the assumption that the code outside of `Main::iteration` costs the same in both runs may not hold exactly.

Next time I will just go with the measured cycles instead of the percentage values. Sorry for the confusion :D

[Screenshots: Before | After]
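The normalization described in this comment can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `normalized_workload` is hypothetical, and it relies on the same assumption as the comment, namely that the code outside the profiled function costs the same number of cycles before and after the change.

```python
def normalized_workload(share_before: float, share_after: float):
    """Convert relative execution times into workloads, with the old total set to 1.

    Assumption (from the comment, not a measured fact): the code outside the
    profiled function costs the same before and after the change.
    """
    outside = 1.0 - share_before                # unchanged absolute cost outside the function
    new_total = outside / (1.0 - share_after)   # outside is (1 - share_after) of the new total
    hot_after = new_total * share_after         # new absolute cost of the profiled function
    hot_reduction = 1.0 - hot_after / share_before
    return new_total, hot_reduction


# Relative execution times of Main::iteration reported in the comment.
new_total, hot_reduction = normalized_workload(0.748, 0.691)

# Reduction implied by the measured cycles for Main::iteration.
measured_reduction = 1.0 - 1.294e11 / 1.674e11

print(f"estimated new total workload: {new_total:.4f}")
print(f"estimated reduction in Main::iteration: {hot_reduction:.1%}")
print(f"measured reduction in Main::iteration: {measured_reduction:.1%}")
```

The estimate from the percentages (roughly 25%) and the value from the measured cycles (roughly 23%) do not match exactly, which is consistent with the caveat that the constant-cost assumption is only approximate.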
@DennisManaa thank you very much for the comprehensive report. As I mentioned in chat, this PR does indeed introduce the side effect of skipping the first (few) delta synchronizations, resulting in an out-of-sync state until those variables change. I've opened #82777, which should address this. It would be great if you could give that a try and confirm that it works for you and that you get the expected performance improvements.
Superseded by #82777
This PR partially resolves a specific issue we encountered while working with the High-Level-Multiplayer-System on our Multiplayer Top-Down Shooter.
I've thoroughly tested these modifications on both version 4.1.1-stable and the master branch. The accompanying screenshots were captured from the master branch for reference.
Background
We recently encountered performance issues in our game, prompting us to investigate their root causes. During this process, we upgraded our project from version 4.0.3 to 4.1.1 and transitioned to using variable "watch"ing as opposed to the "sync" option for data synchronization. We applied this approach wherever it made sense, significantly reducing server-client data traffic. This was particularly effective because many variables were not updated every frame. We greatly appreciated the efficiency gains from this feature that was introduced to the MultiplayerSynchronizer.
Unfortunately, this adjustment did not completely resolve our performance problems. After some time, we pinpointed the issue to our server's CPU struggling to handle the workload required. Regrettably, the integrated profiler did not provide sufficient insights in this case. Therefore, we turned to profiling our game using the Hotspot Profiler in conjunction with a debug build of the Godot Engine, as outlined in the Godot-documentation here: link.
We hope that this pull request, if merged, will contribute to partially alleviating our performance issues as soon as version 4.2 is released. If not, we remain optimistic that it will help to discover alternative ways to enhance the High-Level-Multiplayer-System's performance for all users of the engine. :)
Measurements
In our SceneTree, we typically have between 70 and 120 MultiplayerSynchronizers, depending on the game state. To optimize network synchronization, we've configured the `replication_interval` and `delta_interval` for most variables to approximately 0.016 seconds, which corresponds to a frame rate of 60 frames per second (i.e., approximately once per frame).

We conducted a gameplay session that lasted precisely 5 minutes, with around 30 seconds of setup time before starting a game round and an additional 20 seconds after finishing it. These timings were controlled by in-game timers. During our profiling analysis, we observed that the calls to `SceneMultiplayer::poll` consumed a significant amount of time in our specific scenario, accounting for 35.9% of the processing load. This was even more than the calls to `SceneTree::_process`, which made up 23.7% of the workload.

The root cause of this performance issue appears to be the expensive calls to `_verify_synchronizer`, which are made by both `_send_delta` and `_send_sync`.

[Profiler screenshots before the change: Everything, `SceneMultiplayer::poll`, `SceneReplicationInterface::_send_delta`, `SceneReplicationInterface::_send_sync`]
How to Improve
The majority of the performance cost of the `SceneMultiplayer::poll` method can be attributed to two specific methods: `_send_delta` (responsible for synchronizing variables set to "watch") and `_send_sync` (responsible for synchronizing variables set to "sync"). At least, that's my understanding from the code. Both of these methods call `_verify_synchronizer`, which appears to be the bottleneck in our performance analysis. Any improvement to this function could potentially benefit all games using the High-Level-Multiplayer-System.

Given my currently limited familiarity with the codebase, I'm unable to propose a specific improvement for `_verify_synchronizer`. Instead, I suggest an alternative approach for this pull request. It appears that `_send_delta` invokes `_verify_synchronizer` before checking whether there is any data that requires updating. I propose modifying the code to perform these checks before calling `_verify_synchronizer`. This ensures that `_verify_synchronizer` is only invoked when an update is actually necessary (similar to what `_send_sync` already does). Currently, it is called even when no update is needed. To illustrate the impact of this modification, I've provided screenshots of the results after implementing this change.

[Profiler screenshots after the change: Everything, `SceneMultiplayer::poll`, `SceneReplicationInterface::_send_delta`, `SceneReplicationInterface::_send_sync`]
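The idea of running the cheap "is there anything to send?" check before the expensive verification can be illustrated with a toy model. This is a minimal sketch in Python, not Godot's C++ code: `ToySynchronizer`, `ToyReplication`, `has_pending_delta`, `verify`, and `make_syncs` are hypothetical names, and `verify` merely counts calls as a stand-in for an expensive step such as `_verify_synchronizer`.

```python
from dataclasses import dataclass, field


@dataclass
class ToySynchronizer:
    """Hypothetical stand-in for a synchronizer with watched variables."""
    watched: dict = field(default_factory=dict)    # current values
    last_sent: dict = field(default_factory=dict)  # values at the last delta send

    def has_pending_delta(self) -> bool:
        # Cheap check: did any watched value change since the last send?
        return self.watched != self.last_sent


class ToyReplication:
    """Hypothetical model that counts how often the expensive step runs."""

    def __init__(self):
        self.verify_calls = 0

    def verify(self, sync: ToySynchronizer) -> bool:
        # Stand-in for an expensive verification step; here it only counts calls.
        self.verify_calls += 1
        return True

    def send_delta_verify_first(self, syncs) -> int:
        """Ordering described as the current behavior: verify, then check for changes."""
        sent = 0
        for s in syncs:
            if not self.verify(s):
                continue
            if not s.has_pending_delta():
                continue
            s.last_sent = dict(s.watched)
            sent += 1
        return sent

    def send_delta_check_first(self, syncs) -> int:
        """Proposed ordering: check for changes, verify only when needed."""
        sent = 0
        for s in syncs:
            if not s.has_pending_delta():
                continue
            if not self.verify(s):
                continue
            s.last_sent = dict(s.watched)
            sent += 1
        return sent


def make_syncs(total: int, changed: int):
    """Create `total` toy synchronizers, of which the first `changed` have a pending delta."""
    syncs = []
    for i in range(total):
        s = ToySynchronizer(watched={"hp": 100}, last_sent={"hp": 100})
        if i < changed:
            s.watched["hp"] = 90
        syncs.append(s)
    return syncs


old = ToyReplication()
old_sent = old.send_delta_verify_first(make_syncs(100, 5))

new = ToyReplication()
new_sent = new.send_delta_check_first(make_syncs(100, 5))

print(old.verify_calls, new.verify_calls)  # prints: 100 5
```

In this toy run both orderings send the same 5 deltas, but the expensive step runs 100 times in the first ordering and only 5 times in the second. How large the saving is in a real project depends on how often watched variables actually change, and the model does not capture any side effects the reordering may have in the engine.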
Summary
The call to `_send_delta` has undergone a significant reduction in its execution time, which in turn has shifted the relative execution times of various methods within the call stack. Here's an overview of these changes:

It's important to note that the data provided is specific to our project, and the extent of improvements or variations in performance could differ significantly for other projects. Regrettably, I am unable to share the specifics of this particular project. However, if necessary, I'd be willing to explore the possibility of creating a sample project to conduct more comprehensive testing and evaluation.
BUT (Important):
I want to emphasize that there could be potential side-effects that I cannot confidently evaluate due to my limited familiarity with Godot's code. Assistance and insights from those more experienced in the code would be greatly appreciated. In our specific project, we encountered no problems, and everything functioned as expected. However, it's crucial to recognize that this may not necessarily apply to all use cases. Your assistance in assessing and addressing any unforeseen issues would be highly valued. (This is the first time I contribute anything to the Godot-Project)