-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suggested fix for issue #9131 #9163
base: development
Are you sure you want to change the base?
Suggested fix for issue #9131 #9163
Conversation
…iseconds, and reduce the retry delay for subsequent retries; time_diff_keeper: Discard delayed responses, as they may have been received earlier than they were returned, which means that system_time_finish is very likely inaccurate
const int POLLING_DEVICES_INTERVAL_MS = 5000; | ||
const uint16_t MAX_RETRIES = 500; | ||
const uint8_t DEFAULT_V4L2_FRAME_BUFFERS = 4; | ||
const uint16_t DELAY_FOR_RETRIES = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about the other changes but please note that "DELAY_FOR_RETRIES" is used in other places in LRS SDK. (for example on l500-device.cpp).
Changing the value will effect other processes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your comment. I overlooked this because I initially made the commit in this PR based on the master branch and then cherry-picked it on top of the development branch for submitting the PR. Based on grep-ing the source, it seems that "DELAY_FOR_RETRIES" was not used anywhere else in master, but in development, it has been introduced into three pieces of identically-looking code which start on these lines:
librealsense/src/l500/l500-device.cpp
Line 357 in fc007ce
const int MAX_ITERATIONS_FOR_DEVICE_DISCONNECTED_LOOP = (POLLING_DEVICES_INTERVAL_MS + 1000) / DELAY_FOR_RETRIES; |
librealsense/src/ivcam/sr300.cpp
Line 350 in fc007ce
const int MAX_ITERATIONS_FOR_DEVICE_DISCONNECTED_LOOP = (POLLING_DEVICES_INTERVAL_MS + 1000) / DELAY_FOR_RETRIES; |
librealsense/src/ds5/ds5-device.cpp
Line 147 in fc007ce
const int MAX_ITERATIONS_FOR_DEVICE_DISCONNECTED_LOOP = (POLLING_DEVICES_INTERVAL_MS + 1000) / DELAY_FOR_RETRIES; |
Based on a quick look at these pieces of code, my impression is that the exact delay there probably does not matter much, since the code only seems to query a bool
behind a mutex
based on this delay and keeps the overall query duration constant. But I don't know which delay would be most appropriate here; I could easily revert it to 50 milliseconds if desired.
My reasoning for reducing the delay in the other place (the retry interval for USB commands) was that with a D415 device with slightly non-current firmware, and the command to read out the device clock, there seemed to be three distinct outcomes in the code that I tested:
- The response to the command could be read immediately
- A single retry (without any delay) was needed to get the response after getting EBUSY in the first try
- It took about 82 +- 1 milliseconds to get the response, with EBUSY being returned in the meantime
Waiting 50 milliseconds as the original code does seemed to cause a lot of unnecessary waiting in cases 2 and 3. The changed retry strategy in this PR should be able to get the response as soon as possible in cases 1 and 2, and reduce the waiting in case 3.
This is just based on observations though; I do not have any previous experience with low-level USB handling.
This is a suggestion for a fix to issue #9131. Changes:
retry_controls_work_around
: Instead of sleeping for 50 milliseconds already after the first time a USB command fails (due to the device being busy), the strategy for retrying is changed to behave as follows:DELAY_FOR_RETRIES
, is reduced from 50 to 10 milliseconds, since 50 milliseconds seems rather long to me.MAX_RETRIES
is correspondingly increased to keep the time period for retries the same (in addition to the added short period of immediate retries).system_time_finish
is very likely inaccurate, as explained in the added comment.)