libusb crash for multiple idling cameras #12400

Closed
fk-bbraun opened this issue Nov 13, 2023 · 10 comments

@fk-bbraun

Required Info
Camera Model: D401 / D455
Firmware Version: 5.15.1
Operating System & Version: Ubuntu 20.04
Kernel Version (Linux Only): 5.15.0-88
Platform: PC
SDK Version: 2.54.1-1
Language: C++

Hello,
We have a problem in a system that uses two RealSense cameras. In one rather uncommon usage mode we never interact with the cameras, but we still construct some basic rs2 objects in our constructor for later use. After 10-15 min we usually get a crash with a stack trace pointing to libusb:

#0  0x00007ffff717e153 in libusb_exit ()
   from /lib/x86_64-linux-gnu/libusb-1.0.so.0
#1  0x00007ffff7c7cb86 in librealsense::platform::usb_context::~usb_context (
    this=0x7ffd900037b0, __in_chrg=<optimized out>)
    at /librealsense/src/libusb/context-libusb.cpp:27
#2  0x00007ffff7c8851e in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7ffd900037a0) at /usr/include/c++/9/bits/shared_ptr_base.h:148
#3  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (
    this=0x7ffd900037a0) at /usr/include/c++/9/bits/shared_ptr_base.h:148
#4  std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (
    this=<synthetic pointer>, __in_chrg=<optimized out>)
    at /usr/include/c++/9/bits/shared_ptr_base.h:730
#5  std::__shared_ptr<librealsense::platform::usb_context, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>)
    at /usr/include/c++/9/bits/shared_ptr_base.h:1169
#6  std::shared_ptr<librealsense::platform::usb_context>::~shared_ptr (
    this=<synthetic pointer>, __in_chrg=<optimized out>)
    at /usr/include/c++/9/bits/shared_ptr.h:103
#7  librealsense::platform::usb_enumerator::query_devices_info ()
    at /librealsense/src/libusb/enumerator-libusb.cpp:83
#8  0x00007ffff7c9577c in librealsense::platform::query_uvc_devices_info ()
    at /librealsense/src/uvc/uvc-device.cpp:45
#9  0x00007ffff7ca5bd2 in librealsense::platform::rs_backend::query_uvc_devices

or

__GI___pthread_mutex_lock (mutex=0x28) at ../nptl/pthread_mutex_lock.c:67
67	../nptl/pthread_mutex_lock.c: No such file or directory.
(gdb) bt
#0  __GI___pthread_mutex_lock (mutex=0x28) at ../nptl/pthread_mutex_lock.c:67
#1  0x00007ffff717cc1a in libusb_get_device_list ()
   from /lib/x86_64-linux-gnu/libusb-1.0.so.0
#2  0x00007ffff7c7caba in librealsense::platform::usb_context::usb_context (
    this=0x7fffdc01cb80) at /librealsense/src/libusb/context-libusb.cpp:18
#3  0x00007ffff7c88155 in __gnu_cxx::new_allocator<librealsense::platform::usb_context>::construct<librealsense::platform::usb_context> (this=<optimized out>, 
    __p=0x7fffdc01cb80) at /usr/include/c++/9/new:174
#4  std::allocator_traits<std::allocator<librealsense::platform::usb_context> >::construct<librealsense::platform::usb_context> (__a=..., __p=0x7fffdc01cb80)
    at /usr/include/c++/9/bits/alloc_traits.h:483
#5  std::_Sp_counted_ptr_inplace<librealsense::platform::usb_context, std::allocator<librealsense::platform::usb_context>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<>(std::allocator<librealsense::platform::usb_context>) (
    __a=..., this=0x7fffdc01cb70)
    at /usr/include/c++/9/bits/shared_ptr_base.h:548
#6  std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librealsense::platform::usb_context, std::allocator<librealsense::platform::usb_context>>(librealsense::platform::usb_context*&, std::_Sp_alloc_shared_tag<std::allocator<librealsense::platform::usb_context> >) (__a=..., 
    __p=<synthetic pointer>: <optimized out>, this=<synthetic pointer>)
    at /usr/include/c++/9/bits/shared_ptr_base.h:679
#7  std::__shared_ptr<librealsense::platform::usb_context, (__gnu_cxx::_Lock_pol

While the obvious fix seems to be not constructing the objects when they are not used, I still wanted to report this flaw. RealSense's use of libusb calls does not seem to be thread-safe. A minimal snippet that usually reproduces this error within 10 s is:

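#include <librealsense2/rs.hpp>
#include <chrono>
#include <thread>
#include <vector>

int main() {
    std::vector<std::thread> threads;
    // Each thread creates its own rs2::context and then idles without ever using it.
    for (int i = 0; i < 1000; ++i) {
        threads.emplace_back([] {
            rs2::context c;
            while (true) {
                std::this_thread::sleep_for(std::chrono::milliseconds(100));
            }
        });
    }

    // Block the main thread without busy-waiting; the worker threads never return.
    for (auto& t : threads) {
        t.join();
    }
    return 0;
}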

PS: This might be somewhat similar to the abandoned problem in #12280.

@fk-bbraun
Author

To clarify: we don't get the error once the rs2::context has been used, e.g. by calling query_devices().

@MartyG-RealSense
Collaborator

Hi @fk-bbraun, there was a past case of this type of __GI___pthread_mutex_lock error at #10112, where the program would fail after 1 hour. In that case, I suggested monitoring the Ubuntu system resources for evidence of a memory leak that could cause an application to crash after a period of time had passed.

@fk-bbraun
Author

Memory consumption does not increase once started. Plenty left.

@MartyG-RealSense
Collaborator

A solution may be to create the rs2::context first and build the list of attached camera devices with query_devices() to avoid the error (as you suggest above). You can then choose how long to wait before actually doing something with those cameras via a pipe.start() instruction that enables the camera streams (for example, by putting a sleep period before the pipe.start() line). You do not need to activate the cameras immediately after query_devices() has constructed the list of cameras.

@fk-bbraun
Author

I tried the proposed workaround and queried the devices before idling. With the two uninitialized cameras, the RealSense/libusb code path still crashes the application after 2-10 min.

@MartyG-RealSense
Collaborator

Which method did you use to install the librealsense SDK, please?

If the SDK is built from source code with CMake with the flag -DFORCE_RSUSB_BACKEND=TRUE then performance with multiple cameras will not be as good as a package installation or a source code build where RSUSB = false. This is because an RSUSB build of the SDK is best suited to single-camera applications.

@fk-bbraun
Author

Thanks, I think we are going in the right direction. We are using the RSUSB backend.
As I read here, a disadvantage is: "Single Consumer - most kernel drivers (Linux/Windows) allow to connect and communicate with device from multiple processes (except for streaming). With Libuvc only one application can get device handle."

So yes, I might experiment with using the kernel patches instead. That seems a long way to go, though, since this is the first time we have experienced a problem with RSUSB in a multi-camera setup. It would be interesting if the RealSense team could resolve this in librealsense itself, e.g. by ensuring there is no concurrent access (at least from within the same process).
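
To illustrate what "no concurrent access from within the same process" could look like, here is a minimal sketch that funnels every guarded call through one process-wide mutex. The names usb_access_mutex, with_usb_lock and max_overlap_under_lock are hypothetical; nothing here is taken from librealsense, and it is not confirmed that a lock of this kind would resolve the crash above.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical process-wide lock (an assumption for illustration only;
// no such function is known to exist in librealsense).
inline std::mutex& usb_access_mutex() {
    static std::mutex m;
    return m;
}

// Runs fn while holding the process-wide lock, so guarded calls never overlap.
template <typename Fn>
auto with_usb_lock(Fn&& fn) -> decltype(fn()) {
    std::lock_guard<std::mutex> lock(usb_access_mutex());
    return fn();
}

// Demo: several threads enter the guarded section. Returns the highest number
// of threads that were observed inside the section at the same time.
inline int max_overlap_under_lock(int thread_count) {
    std::atomic<int> inside{0};
    std::atomic<int> max_seen{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&] {
            with_usb_lock([&] {
                int now = ++inside;
                int prev = max_seen.load();
                // Record a new maximum if this thread saw more overlap than before.
                while (now > prev && !max_seen.compare_exchange_weak(prev, now)) {
                }
                --inside;
            });
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return max_seen.load();
}
```

In this sketch the stand-in work inside the lock is just a counter update; in a real fix the guarded section would be whatever code touches the shared USB state, which only the library maintainers can identify.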

For now, I seem to have worked around the issue by delaying the rs2::context creation until we really need it. The initialization of our cameras is not done in parallel, so there should be no more concurrent access.
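
A minimal sketch of that kind of deferred creation, assuming the object is default-constructible. LazyResource and FakeContext are hypothetical names used only for illustration; FakeContext stands in for rs2::context so the sketch compiles without the SDK, and this is not the actual code from our application.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical helper (not part of librealsense): defers construction of T
// until the first call to get(), and guards that construction with a mutex.
template <typename T>
class LazyResource {
public:
    // Returns the single instance, constructing it on first use.
    T& get() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!instance_) {
            instance_ = std::make_unique<T>();
        }
        return *instance_;
    }

    // True once get() has constructed the instance.
    bool created() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return instance_ != nullptr;
    }

private:
    mutable std::mutex mutex_;
    std::unique_ptr<T> instance_;
};

// Stand-in for rs2::context that counts how often it is constructed.
struct FakeContext {
    static int constructions;
    FakeContext() { ++constructions; }
};
int FakeContext::constructions = 0;
```

A class that previously held the context as a plain member would hold a LazyResource instead and call get() only at the point where the cameras are actually initialized, so an idle process never constructs the context at all.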

@MartyG-RealSense
Collaborator

Thanks very much for the update. Please do let me know if you experience any further problems. Good luck!

@MartyG-RealSense
Collaborator

Hi @fk-bbraun Do you require further assistance with this case, please? Thanks!

@fk-bbraun
Author

No, thank you, the workaround from above works. I still think the underlying problem in librealsense should be attended to one way or the other.
