Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delay and time jump in object position update when using the transform tool #1358

Closed
azeey opened this issue Feb 24, 2022 · 10 comments · Fixed by #1392
Closed

Delay and time jump in object position update when using the transform tool #1358

azeey opened this issue Feb 24, 2022 · 10 comments · Fixed by #1392
Assignees
Labels
bug Something isn't working

Comments

@azeey
Copy link
Contributor

azeey commented Feb 24, 2022

Environment

  • OS Version: Ubuntu 20.04
  • Source or binary build?
    source: ign-gazebo6 (91a79da) (and binary 3.12.0 for comparison)

Description

  • Expected behavior: After moving an object using the transform tool, I expect the object to assume its new position and simulation to resume immediately. This is the case in citadel (3.12.0) as shown in the video:
citadel.mp4
  • Actual behavior: After using the transform tool, simulation seems to hang for a second and then jump in time so that the intermediate state of the object is not displayed on the GUI. In the video below, the box snaps to the ground after it hangs in the air for a second or two.
fortress.mp4

Interestingly, the behavior is different when I try it on ign-gazebo 6.5.0 binaries. The small delay is still there, but it doesn't seem to jump in time.

fortress_binaries.mp4
@azeey azeey added the bug Something isn't working label Feb 24, 2022
@adlarkin
Copy link
Contributor

adlarkin commented Mar 2, 2022

I have tested various commits/releases of ign-gazebo6, but I can't replicate the "snapping" behavior that was posted in the second video. When using ign-gazebo6, I typically see the delay behavior described in the third video (I'm testing on ubuntu bionic, I'm not sure if that would make a difference). I also tested the ign-gazebo5 binaries and didn't experience the delay behavior there, so this must be related to something that was introduced into ign-gazebo6.

I went ahead and built ign-gazebo6 at commit 2b141ac, which was the last commit before the minimal scene was introduced in commit 2c2605f. The delay behavior still appears to exist at this point (see the video below), so I don't think that the minimal scene changes are the cause of this issue (cc @ahcorde):

transform_delay

@adlarkin
Copy link
Contributor

adlarkin commented Mar 2, 2022

After some more debugging, I believe this issue was introduced in #851. The serialization/deserialization for models is done via converting a SDF element to/from a string, which seems to cause a bottleneck in the GUI <-> server communication. If don't serialize sdf::Model with the following patch, the delay behavior goes away:

diff --git a/include/ignition/gazebo/components/Model.hh b/include/ignition/gazebo/components/Model.hh
index 5ba0420d..6085e9a4 100644
--- a/include/ignition/gazebo/components/Model.hh
+++ b/include/ignition/gazebo/components/Model.hh
@@ -87,9 +87,7 @@ namespace components
   IGN_GAZEBO_REGISTER_COMPONENT("ign_gazebo_components.Model", Model)
 
   /// \brief A component that holds the model's SDF DOM
-  using ModelSdf = Component<sdf::Model,
-                   class ModelTag,
-                   serializers::SdfModelSerializer>;
+  using ModelSdf = Component<sdf::Model, class ModelTag>;
   IGN_GAZEBO_REGISTER_COMPONENT("ign_gazebo_components.ModelSdf", ModelSdf)
 }
 }

Here's what I now see with the patch above applied to the ign-gazdebo6 branch at commit 6ac12c6:

no_delay

If I don't apply the patch above, I still see the delay/jumpy behavior when testing ign-gazebo6 at the same commit:

transform_delay

@azeey and @ahcorde, would you mind testing the patch I gave above to see if it resolves the issue for you? Obviously, this doesn't solve the real issue of how we're handling sdf::Model serialization, but I'd like to make sure that this is the actual source of the problem.

@iche033
Copy link
Contributor

iche033 commented Mar 3, 2022

I tested the above patch and the lag is gone.

@adlarkin
Copy link
Contributor

adlarkin commented Mar 3, 2022

Great, thanks for verifying! We'll need to figure out a better way to serialize/deserialize sdf::Model. It also may be worth re-visiting how we serialize/deserialize other sdf types, because perhaps these other sdf types are hindering simulation runtime performance in other ways. Should we create another issue about efficient serialization/de-serialization for sdf types? I did a quick search and didn't see any related issues that already exist, but perhaps there are related issues open that I didn't notice.

@iche033
Copy link
Contributor

iche033 commented Mar 3, 2022

yeah serialization performance issue came up when we were doing the performance improvements work. I think @mjcarroll has some ideas on how to improve serialization.

@azeey
Copy link
Contributor Author

azeey commented Mar 3, 2022

I can also confirm the patch fixes the issue for me. Great job investigating @adlarkin! I'm not sure though if the problem is the effectiveness of the serialization or the fact that sdf::Model and other components are unnecessarily being serialized when the transform tool is used. Echoing /world/shapes/state, I can see that as soon as I translate the object, the entire contents of the ECM get published.

@azeey
Copy link
Contributor Author

azeey commented Mar 3, 2022

This patch also fixes the issue for me without making any changes to the Model component. Of course this might break other things, so I'm not suggesting we make this change, but it's clear that we're serializing everything when a component is removed or if it has a one time change. And both are true for the WorldPoseCmd component which is used when using the translation tool.

diff --git a/src/systems/scene_broadcaster/SceneBroadcaster.cc b/src/systems/scene_broadcaster/SceneBroadcaster.cc
index 800d338c3..a53719075 100644
--- a/src/systems/scene_broadcaster/SceneBroadcaster.cc
+++ b/src/systems/scene_broadcaster/SceneBroadcaster.cc
@@ -314,8 +314,8 @@ void SceneBroadcaster::PostUpdate(const UpdateInfo &_info,
   // we can skip the ECM serialization
   bool jumpBackInTime = _info.dt < std::chrono::steady_clock::duration::zero();
   bool changeEvent = _manager.HasEntitiesMarkedForRemoval() ||
-    _manager.HasNewEntities() || _manager.HasOneTimeComponentChanges() ||
-    jumpBackInTime || _manager.HasRemovedComponents();
+    _manager.HasNewEntities() ||
+    jumpBackInTime;
   auto now = std::chrono::system_clock::now();
   bool itsPubTime = (now - this->dataPtr->lastStatePubTime >
        this->dataPtr->statePublishPeriod[_info.paused]);

@jasonbyun8
Copy link

jasonbyun8 commented Mar 7, 2022

This works for me but I am receiving an error message now as below;

[Err] [EntityComponentManager.cc:1010] Failed to create component of type [0] for entity [11092]. Type has not been properly registered.

@chapulina
Copy link
Contributor

chapulina commented Mar 9, 2022

Here's another possible solution, it seemed to fix the delay, but it may have other unintended consequences:

diff --git a/src/systems/scene_broadcaster/SceneBroadcaster.cc b/src/systems/scene_broadcaster/SceneBroadcaster.cc
index 800d338c..e1faa9ab 100644
--- a/src/systems/scene_broadcaster/SceneBroadcaster.cc
+++ b/src/systems/scene_broadcaster/SceneBroadcaster.cc
@@ -330,10 +330,14 @@ void SceneBroadcaster::PostUpdate(const UpdateInfo &_info,
     set(this->dataPtr->stepMsg.mutable_stats(), _info);
 
     // Publish full state if there are change events
-    if (changeEvent || this->dataPtr->stateServiceRequest)
+    if (this->dataPtr->stateServiceRequest)
     {
       _manager.State(*this->dataPtr->stepMsg.mutable_state(), {}, {}, true);
     }
+    else if (changeEvent)
+    {
+      _manager.ChangedState(*this->dataPtr->stepMsg.mutable_state());
+    }
     // Otherwise publish just periodic change components when running
     else if (!_info.paused)
     {

@adlarkin
Copy link
Contributor

Here's another possible solution, it seemed to fix the delay, but it may have other unintended consequences:

diff --git a/src/systems/scene_broadcaster/SceneBroadcaster.cc b/src/systems/scene_broadcaster/SceneBroadcaster.cc
index 800d338c..e1faa9ab 100644
--- a/src/systems/scene_broadcaster/SceneBroadcaster.cc
+++ b/src/systems/scene_broadcaster/SceneBroadcaster.cc
@@ -330,10 +330,14 @@ void SceneBroadcaster::PostUpdate(const UpdateInfo &_info,
     set(this->dataPtr->stepMsg.mutable_stats(), _info);
 
     // Publish full state if there are change events
-    if (changeEvent || this->dataPtr->stateServiceRequest)
+    if (this->dataPtr->stateServiceRequest)
     {
       _manager.State(*this->dataPtr->stepMsg.mutable_state(), {}, {}, true);
     }
+    else if (changeEvent)
+    {
+      _manager.ChangedState(*this->dataPtr->stepMsg.mutable_state());
+    }
     // Otherwise publish just periodic change components when running
     else if (!_info.paused)
     {

I spent some time looking into taking this approach, and it looks like this fix should work - I opened a PR with this change and added test coverage if anyone would like to take a look and test it out: #1392

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants