D435 on RaspberryPi crashing after few minutes until one hour: failed to set power state or RS2_USB_Status_IO #8274

Closed
mbobinger opened this issue Jan 29, 2021 · 23 comments

Comments

@mbobinger

Required Info
Camera Model { D435 }
Firmware Version (Uploaded latest 5.12.10.0)
Operating System & Version RaspbianOS latest
Kernel Version (Linux Only) (idk)
Platform RaspberryPi
SDK Version { legacy / 2.. }
Language {Python}

Issue Description

Dear Intel team, we have around 11 D435 cameras running at our installation sites for around 12 hours per day. We use the following setup: Raspberry Pi 4 (4GB), Raspbian OS, a 32GB USB boot drive, the camera plugged into a USB 3.0 port via a 5m USB 3.0 repeater cable, and Python 3.6.

We use Python3.6 to control the camera and the postprocessing is also based on python.

Occasionally a camera crashes unacceptably often: the crash occurs anywhere from a few minutes to an hour after the depth stream has started successfully. We have implemented a couple of checks in our Python script that can reset the device using try/except blocks, but recently two of the devices have frozen completely.

So what we initially get is a timeout error:
frames = pipeline.wait_for_frames()
RuntimeError: Frame didn't arrive within 5000

When trying to see what the cam is doing using realsense-viewer, the common failed to set power state error is reported:
Warning [default] Could not open device failed to set power state

So I close and reopen realsense-viewer again (because there is no other way to retrieve any cam information) and after doing so, the camera is not recognized at all anymore and I need to reboot the Raspberry.

We've installed those cameras at a customer site and are under real time pressure now. Could you assist us in troubleshooting and making the code more robust?

The problem persists across multiple camera firmware versions: I've tried the latest major release recommended for production, the latest release, and an arbitrary version in between:
https://downloadcenter.intel.com/download/30119/Firmware-for-Intel-RealSense-D400-Product-Family

But the problem that the camera freezes after some time (and in some cases does not work right from the start) is still there.

Thank you!
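
A minimal sketch of the try/except handling described above, assuming the camera object behaves like `pyrealsense2.pipeline` (whose `wait_for_frames()` raises `RuntimeError` on timeout). The function name `stream_with_restart` and its parameters are hypothetical and only illustrate counting consecutive timeouts before restarting the stream; it is not the poster's actual code.

```python
import time


def stream_with_restart(pipeline, handle_frames, max_timeouts=3,
                        restart_delay_s=2.0, should_stop=lambda: False):
    """Read frames until should_stop() is true; restart after repeated timeouts.

    `pipeline` is assumed to behave like pyrealsense2.pipeline, which raises
    RuntimeError when wait_for_frames() times out.
    Returns the number of restarts performed.
    """
    restarts = 0
    timeouts = 0
    pipeline.start()
    while not should_stop():
        try:
            frames = pipeline.wait_for_frames()
        except RuntimeError:
            timeouts += 1
            if timeouts >= max_timeouts:
                # Several timeouts in a row: stop, pause, and start again.
                try:
                    pipeline.stop()
                except RuntimeError:
                    pass  # stop() may itself fail on a wedged device
                time.sleep(restart_delay_s)
                pipeline.start()
                restarts += 1
                timeouts = 0
            continue
        timeouts = 0  # a good frame clears the failure counter
        handle_frames(frames)
    pipeline.stop()
    return restarts
```

A restart of the pipeline alone may not recover a camera that has dropped off the USB bus; in that case a device or port reset would be needed as well.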

@mbobinger
Author

I'm just doing some review of previous related topics and their solutions to get a better idea:
Error: failed to set power state
#6395
solution: not really relevant for my case because of multiple cameras

#6555
solution: multiple config objects, added a delay between them.

https://forums.developer.nvidia.com/t/failed-to-set-power-state-error-for-multiple-realsense/121105
solution: multicam again

Error: Frame didn't arrive within 5000 (milliseconds):
#2168
solution: reset USB device but the topic starter wasn't very happy with that
Related issues: #1889, #927, #840 and #901

#6766
only relevant for reading bag files

https://support.intelrealsense.com/hc/en-us/community/posts/360037544613-Frames-didn-t-arrive-within-5000-30-m-distance-with-2-active-cables

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 29, 2021

Hi @mbobinger I am still performing initial analysis of the details of your case. As you have published a list of cases though, I will add #6740

In that case, which also used Pi 4 and Python 3, the RealSense user added a scheduled reset to their application and did not experience the Frame didn't arrive within 5000 error again after that.

In that case, they were also advised that the timeouts may be caused by insufficient power being provided to the camera under peak power draw conditions.

A Power over Ethernet (PoE) HAT can be used with Pi 4 Model B to provide power to the board via ethernet as an alternative to an external power source.

https://www.raspberrypi.org/products/poe-hat/

I will continue to analyse your case to develop further advice.
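
A minimal sketch of the scheduled reset mentioned above, under stated assumptions: `ResetScheduler` and `hardware_reset_all` are hypothetical helper names, and `context` is assumed to be a `pyrealsense2.context()` object, whose `query_devices()` and per-device `hardware_reset()` calls are used here. It shows one possible way to decide when a periodic reset is due, not the approach used in #6740.

```python
import time


class ResetScheduler:
    """Tracks when the next periodic camera reset is due."""

    def __init__(self, interval_s, clock=time.monotonic):
        self.interval_s = interval_s
        self.clock = clock
        self.last_reset = clock()

    def due(self):
        # True once at least interval_s seconds have passed since the last reset.
        return self.clock() - self.last_reset >= self.interval_s

    def mark_done(self):
        self.last_reset = self.clock()


def hardware_reset_all(context):
    """Call hardware_reset() on every device the context reports.

    `context` is assumed to be a pyrealsense2.context().
    Returns the number of devices that were reset.
    """
    count = 0
    for dev in context.query_devices():
        dev.hardware_reset()
        count += 1
    return count
```

In a streaming loop one would stop the pipeline when `due()` returns true, call `hardware_reset_all(...)`, wait a few seconds for the camera to re-enumerate, restart the pipeline, and then call `mark_done()`.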

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 29, 2021

The stated use of repeater cables catches my attention. There was a past case where 12 D435 cameras were being used with repeater cable and they had camera detection issues where the camera that would not be detected changed randomly at boot-up. The cameras that stopped being detected could only be reset with an unplug and replug of the USB cable.

#6805

In that case, I referenced advice about repeater cables in Intel's multiple camera white-paper document.

https://dev.intelrealsense.com/docs/multiple-depth-cameras-configuration#section-d-cabling-and-enumeration


If it is necessary to use cables longer than 2m, we recommend using USB3 repeaters, and not just extension cables.
Unfortunately the quality of these vary tremendously. We have found that the following appears to work quite well. We have successfully strung 3 of these together:

https://www.centralcomputer.com/

@mbobinger
Author

Thank you for your timely reply as always @MartyG-RealSense .
I have used the following USB3.0 repeater cables of different brands (at the installation site it is a 5m USB 3.0 repeater cable):

  1. USB 3.0 Repeater Cable 1
  2. (Installation site): USB Repeater Cable 2

About the power issue: we have had problems with under voltage previously when not using a RPI official PSU like this one here:
RPI Official PSU

Of course, using an official PSU doesn't guarantee that the camera also gets enough voltage during the peak draw conditions you mentioned. We are already using a big fan to cool the RPI, which makes it hard to fit a PoE HAT, but we could use a PoE adapter instead. In any case, the official RPI power supply delivers stable voltage and current when multiple components are plugged in: I've checked with one of those USB power meter sticks, and the voltage doesn't drop at all when the Intel RealSense is connected and the current goes up. There could still be a problem with using the USB3.0 port, though. Maybe it's better to switch to USB2.0, since we don't need the extra resolution and data rate of USB3.0 for our application.

Thanks for your resources. I've reviewed the white paper already and will take a look at whether #6740 can help us.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 29, 2021

Thanks very much @mbobinger

In the 12-camera system with repeater cabling mentioned earlier, it worked fine when tested in the user's US-based office and the problems manifested when the equipment was installed in an overseas location with different mains power supply.

USB 2.0 cabling may be a valid backup strategy, and was used as a 'good enough' solution in the case below:

IntelRealSense/realsense-ros#1624 (comment)

I look forward to your next update. Good luck!

@mbobinger
Author

@MartyG-RealSense : we are checking tomorrow at the installation site because the customer is already very pushy, but we don't have a spare Intel camera for replacement if that turns out to be the problem.

Would it be too much to ask you to put together a short checklist with me? We would also like to scale up our product, but it hasn't yet reached a maturity that makes us confident.

  1. Try USB 2.0 Cabling according to auto_exposure problem #1624
  2. Implement code-wise a reset of the cam
  3. Already running out of ideas. Maybe use a USB2.0 rated cable? Idk if that makes sense.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 29, 2021

Hardware failures are highly rare because of the long-term reliability of RealSense cameras, so it is unlikely to be a situation that requires a replacement camera.

RealSense 400 Series cameras are capable of running indefinitely so long as their operating temperature remains within recommended tolerances. In long-run sessions of multiple hours / days, if a problem occurs after a prolonged period of time then it is likely to be related to a glitch in the USB equipment (port / cable / hub), the computer hardware or its operating system.

Here are some further points to consider:

  • If your software application is a multicam application that is accessing multiple cameras from the same program (instead of having a separate copy of the program for each individual camera) then it is recommended that poll_for_frames() is used instead of wait_for_frames(). The reasons for this are discussed in the link below.

#2422 (comment)

  • It is recommended to have a swap file set up on Pi. This provides "virtual memory" taken from storage capacity (such as an SD card) that the Pi can draw upon once the real onboard memory is used up. Virtual memory is slower than physical memory.

https://www.howtoforge.com/ubuntu-swap-file

  • Are each of the 12 D435 cameras attached to individual Pi boards or are all 12 attached to one Pi via USB hub? Failed to set power state errors do tend to occur more often on setups that have more than 1 camera attached to the same computer. If you are attaching multiple cameras to the same board by hub, Nvidia Jetson is a more proven development board solution for this kind of setup with librealsense than Pi.

An example of a multicam networked setup where each camera has its own Pi 4 networked to a central computer is provided by Intel's open-source ethernet networking white-paper.

https://dev.intelrealsense.com/docs/open-source-ethernet-networking-for-intel-realsense-depth-cameras

  • It may be worth monitoring the temperature of the Pis and the cameras to check whether excessive temperature is causing problems. You can check the real-time temperature of the camera with the RealSense Viewer or with code, but a quick test of camera temperature is to simply put a hand on the D435 camera casing. If it is red hot to the touch just a few minutes after the start of the stream, this would suggest that there is a temperature problem. This could be caused by a bad USB cable or a USB port glitch. There was also once a case where a RealSense user found that the Pi was overheating and transferring heat to the camera.

If cameras are experiencing a problem in some cases as soon as the stream activates then this would cast doubt on temperature build-up being a cause. It may be still worth checking though in order to eliminate it as a cause.
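
For the multicam case in the first point above, a minimal sketch of polling several cameras from one program. The helper name `poll_all` is hypothetical. Each pipeline is assumed to behave like `pyrealsense2.pipeline`, whose `poll_for_frames()` returns immediately; the sketch further assumes that an empty result evaluates as false in Python.

```python
def poll_all(pipelines):
    """Poll each running pipeline once without blocking.

    `pipelines` maps a camera name to a pipeline-like object.
    Returns {name: frameset} for the cameras that had a new frameset ready.
    Assumption: poll_for_frames() returns a falsy object when nothing new
    has arrived since the last call.
    """
    ready = {}
    for name, pipe in pipelines.items():
        frames = pipe.poll_for_frames()
        if frames:
            ready[name] = frames
    return ready
```

Because the call never blocks, one slow or stalled camera does not hold up the others. A short `time.sleep()` between polling rounds keeps the loop from spinning at full CPU load.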

@sam598

sam598 commented Jan 29, 2021

Are you using one Raspberry Pi for each RealSense camera?

In my anecdotal experience USB repeaters for multicam are unreliable to the point of being useless. The Raspberry Pi should be as close as possible to the RealSense device, and you should rely on ethernet to get data and power over a large distance.

@mbobinger
Author

Thank you @MartyG-RealSense

  • I'll check the temperatures of our cameras after closing time at the installation sites, i.e. 8pm. Excessive heat has not been a problem so far to my knowledge, but the 3D-printed case that I designed (very quickly) is possibly not ideal:
    [image]
    Anyways, I'll be happy to share the temperature results and discuss them.

  • The RPIs have a low temperature of around 40-45°C due to the GeekPi cooler that I use.

  • The installation just consists of one Raspberry and one Realsense Cam, I'll stay with wait_for_frames() as recommended for single cams.

  • I will change the swap_size from the default 100 to 1024 again: https://pimylifeup.com/raspberry-pi-swap-file/
    I've set it back from initially 2048 to the default of 100MB because there were 'rumours' that it could increase the 'wear' on the SD-card but to my knowledge 99.5% of the time, the swap isn't used anyways unless you really need it and then it is good to have it.

  • I would love to switch from the RPI 4GB to the NVidia Jetson Nano 2GB. I'm staying with the RPI4 for now because it has stable Teamviewer support, which we need for troubleshooting; to my knowledge (which is 1-2 months old), Teamviewer hasn't released any updates. VNCviewer would work on the NVidia Jetson, but its pricing model would be much more expensive.
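
The Pi-side temperature mentioned above can also be read from code. A minimal sketch, assuming the usual Linux thermal sysfs file `/sys/class/thermal/thermal_zone0/temp`, which reports the SoC temperature in thousandths of a degree Celsius; the function names are hypothetical.

```python
from pathlib import Path

# Assumption: the first thermal zone is the SoC sensor, as on Raspberry Pi OS.
THERMAL_ZONE = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millidegrees(text):
    """Convert a sysfs reading such as '42350\n' to degrees Celsius."""
    return int(text.strip()) / 1000.0


def read_cpu_temp_c(path=THERMAL_ZONE):
    """Read the current SoC temperature in degrees Celsius."""
    return parse_millidegrees(Path(path).read_text())
```

Logging this value next to each camera error would show whether failures correlate with temperature.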

@MartyG-RealSense
Collaborator

Thanks again @mbobinger and thanks very much @sam598 for contributing your own experience to this case.

@mbobinger
Author

mbobinger commented Jan 29, 2021

> Are you using one Raspberry Pi for each RealSense camera?
>
> In my anecdotal experience USB repeaters for multicam are unreliable to the point of being useless. The Raspberry Pi should be as close as possible to the RealSense device, and you should rely on ethernet to get data and power over a large distance.

@sam598 thanks a lot! The RPI is connected over a 5m USB 3.0 repeater cable to the Intel RS cam. I haven't really considered the option of putting the RPI next to the RS cam at the installation site and getting the data over Ethernet, as @MartyG-RealSense suggested here: https://dev.intelrealsense.com/docs/open-source-ethernet-networking-for-intel-realsense-depth-cameras

It is a nice approach but I am also using the RPI for HDMI output to a monitor (and possibly a loudspeaker and an LTE router are also connected there :D.....). Anyways, good to know for a future application.

@mbobinger
Author

@MartyG-RealSense The temperatures of the D435 in the 3D-printed casing at the installation sites are just 34-35°C. I have plugged a cam into my WIN10 PC and the temperature is 36°C here (22°C room temperature). So that's fine; I will keep you posted about the other tries.

@MartyG-RealSense
Collaborator

Yes, that temperature is within safe tolerances @mbobinger - thanks for the update and the forthcoming results from subsequent tests.

@mbobinger
Author

mbobinger commented Jan 30, 2021

@MartyG-RealSense Hey Marty:
I've tested the pyusb package for resetting: https://iotbytes.wordpress.com/python-script-to-reset-usb-modem-com-port-on-raspberry-pi/

Please see my minimum code example below:

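import usb.core

# Intel vendor ID and the D435 product ID used in this thread
dev = usb.core.find(idVendor=0x8086, idProduct=0x0B07)
if dev is None:
    print("camera not found on the USB bus")
else:
    try:
        dev.reset()  # ask the USB stack to reset the device
    except usb.core.USBError as err:
        print(f"could not reset: {err}")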

The reset line works fine one single time:
[image]
So I reset the camera and tried to restart my Python code with the RS stream, but the camera could not be initialized.

And after trying to reset the camera a second time (using the code above), I get an error and the code enters the except statement, so the cam is not recognized at all anymore.

This behavior is similar (or identical) to the one I observed when opening realsense-viewer after my Python program crashed. The camera is recognized there but I cannot access the live stream; when I close and open realsense-viewer again, the camera is not recognized anymore.

So I think I have to either simulate a real unmount/mount scenario or sudo reboot the RaspberryPi. The reboot is not so nice because it takes long and the customer will probably not accept it.

EDIT 12 o`clock, 30th Jan 2021: next time, I'll try the release function of pyusb as well (see here: https://stackoverflow.com/questions/25617039/usb-device-release):

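import usb.core
import usb.util

dev = usb.core.find(idVendor=0x8086, idProduct=0x0B07)
if dev is None:
    print("camera not found on the USB bus")
else:
    try:
        dev.reset()
        # Free the interfaces and handles that pyusb still holds
        usb.util.dispose_resources(dev)
    except usb.core.USBError as err:
        print(f"could not reset: {err}")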

EDIT 12.15pm, 30 Jan 2021: I can reset the camera as often as I want using pyusb (also including the release command) as long as the code didn't crash. In the next step, I'll check the reset/release function of pyusb again after a code crash: I'll first make sure that no Python program accesses the camera by running sudo killall python3 and then try my luck with pyusb.

@MartyG-RealSense
Collaborator

If a camera reset does not work, an alternative approach that would not require rebooting the Pi is to reset the USB port. On Raspbian, people seem to do that using Bash. Here are a couple of examples:

https://raspberrypi.stackexchange.com/a/92968
https://www.bishoph.org/raspberry-pi-usb-device-reset/
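
One common way to reset a USB port on Linux without rebooting is to unbind and rebind it through sysfs. The sketch below is a Python illustration of that general technique and is not taken from the linked pages; the helper name `rebind_usb_port` and the example port ID are hypothetical, and writing to these files normally requires root.

```python
import time
from pathlib import Path

# Kernel sysfs directory of the generic USB device driver.
USB_DRIVER_DIR = Path("/sys/bus/usb/drivers/usb")


def rebind_usb_port(port_id, driver_dir=USB_DRIVER_DIR, pause_s=1.0):
    """Unbind and rebind one USB device by its bus-port ID (for example "1-1").

    The port ID here is a placeholder; the real one must be looked up on the
    target system, for instance under /sys/bus/usb/devices.
    """
    driver_dir = Path(driver_dir)
    (driver_dir / "unbind").write_text(port_id)  # detach the device
    time.sleep(pause_s)                          # give the kernel time to settle
    (driver_dir / "bind").write_text(port_id)    # attach it again
```

This only helps while the kernel still lists the device; it does not cut power to the port.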

@mbobinger
Author

mbobinger commented Jan 30, 2021

Thank you @MartyG-RealSense , I'll try your solution as well as this 'more' python based reset approach:
https://iotbytes.wordpress.com/python-script-to-reset-usb-modem-com-port-on-raspberry-pi/

But calling the bash script mentioned in your first link from python seems also like a good try:
https://www.kite.com/python/answers/how-to-execute-a-bash-script-in-python

1st EDIT: checked with pyusb and the lsusb-based code examples. The issue is that after the crash, the camera completely disappears and cannot be seen by lsusb either, but the bash scripts rely on reading the USB device from there.

Possibly I can power cycle the USB ports of the RPI4:
https://stackoverflow.com/questions/59772765/how-to-turn-usb-port-power-on-and-off-in-raspberry-pi-4

There is a package called uhubctl, the developer replied to the issue above and confirmed that he added RPi4 support with some downsides like that you have to reset all USB ports:
mvp/uhubctl@4aae44ced039

In this issue, the developer states that power can only be turned off for USB 3.0 (which would be fine for me):
raspberrypi/linux#3079 (but the author may have released an update in the meantime)
Note that to turn power off, both USB3 and USB2 ports must be turned off. Since uhubctl filters out USB2 hub, power is only turned off for USB3, and USB2 remains powered.

2nd EDIT:
https://github.com/mvp/uhubctl#raspberry-pi-4b : using uhubctl -l 1-1 -a 0 should turn off the power to all devices. I could use one of the supported USB hubs to control the switching, though. Here is a German tutorial on using uhubctl with RPI3B+: https://www.bitblokes.de/usb-ports-raspberry-pi-anschalten-ausschalten-deaktivieren/
other relevant threads: https://stackoverflow.com/questions/64936753/raspberry-pi-uhubctl-permissions , minimum bash code example for power cycling USB ports for RPI4 (very relevant): https://raspberrypi.stackexchange.com/questions/118656/raspberry-pi4-uhubctl-bash-script-wont-run
here the author suggests how to do power switching on RPI4: mvp/uhubctl#309
mvp/uhubctl#301
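
A minimal sketch of driving the `uhubctl -l 1-1 -a 0` command quoted above from Python, assuming uhubctl is installed and that the script runs with sufficient permissions. The helper names are hypothetical; `-a 1` is uhubctl's matching power-on action. The `run` parameter exists only so the commands can be inspected without touching hardware.

```python
import subprocess
import time


def uhubctl_command(location, action):
    """Build the uhubctl argument list for one hub location and action."""
    return ["uhubctl", "-l", location, "-a", str(action)]


def power_cycle(location="1-1", off_time_s=5.0, run=subprocess.run):
    """Switch hub power off, wait, and switch it back on."""
    run(uhubctl_command(location, 0), check=True)  # power off
    time.sleep(off_time_s)                         # let the camera fully power down
    run(uhubctl_command(location, 1), check=True)  # power on
```

As noted in the thread, on the Pi 4 this affects all USB ports at once, so any other USB device (including a USB boot drive) would lose power too.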

@SirDifferential

SirDifferential commented Feb 1, 2021

We've been using Raspberry 4 with Realsense D415/D435/D455 for a couple of years and have had mixed results. We haven't seen any power issues with any 5.1V/3A PSUs or the PoE HAT, but the USB type C cables are an entirely different story. To this day, the only cable we've successfully used is the bundled cable that comes with the Realsense itself. Almost every other cable developed issues some hours or days into running the camera, while the official cable worked indefinitely without a reboot. I recall some of the RPIs having uptimes of 1-2 months until the OS was rebooted for some unrelated reason. We use the C and C++ API though.

So, I'd like to understand what makes the Realsense official cable such a unicorn in the world of non-functioning cables. It's awfully long for a small device housing, and a shorter cable would be nice to have.

On the port toggling, there is actually an EEPROM flag called USB_MSD_PWR_OFF_TIME that disables the USB ports on boot for n milliseconds. I'm not 100% sure if this works with SD card boot, but based on my quick tests, it seems to work. This should help if the Realsense needs a hard reboot.

@MartyG-RealSense
Collaborator

Hi @SirDifferential The official 1 meter cables supplied with the camera are validated by Intel for use with RealSense. It has been said though that in general, the quality of USB-C cables from different suppliers can vary. In the data sheet document for the 400 Series cameras, Intel officially cite the company Newnex as a recommended supplier for 400 Series USB cables.

https://www.newnex.com/realsense-3d-camera-connectivity.php

I am not aware of a confirmed RealSense-compatible cable that is less than 1 meter in length though.

image

@mbobinger
Author

It seems our installations run more stably with the Intel RS cam connected to a USB2.0 port of the RPI instead of USB3.0. We really don't need the extra performance that USB3.0 delivers. I've also observed that the RPI seems to have some problems with USB3.0, as I can't manage to get booting from USB3.0 working. I'll now log the errors more systematically to determine the sources of the failures during long runs.

I also think the USB cables and in my case in particular the USB3.0 repeater cable can be an error source but there are recommendations for this error as well:
https://dev.intelrealsense.com/docs/open-source-ethernet-networking-for-intel-realsense-depth-cameras

I think that's more or less it from my side for the next time since I need to do more testing first.
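
A minimal sketch of the more systematic error logging mentioned above, using only Python's standard `logging` module. The logger name, log format, and helper names are hypothetical choices; a rotating file is used so the log cannot fill the storage during long runs.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_camera_logger(path, max_bytes=1_000_000, backups=3):
    """Return a logger that writes timestamped records to a rotating file."""
    logger = logging.getLogger("camera")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        handler = RotatingFileHandler(path, maxBytes=max_bytes,
                                      backupCount=backups)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def log_failure(logger, stage, error):
    """Record where a failure happened and what kind of exception it was."""
    logger.error("stage=%s type=%s msg=%s", stage, type(error).__name__, error)
```

Calling `log_failure(logger, "wait_for_frames", err)` inside each except block gives one line per failure, which makes it easier to tell timeouts apart from power-state errors afterwards.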

@MartyG-RealSense
Collaborator

Thanks very much @mbobinger for the update of your progress. I look forward to your next set of test results :)

@MartyG-RealSense
Collaborator

Hi @mbobinger Do you have an update that you can provide, please? Thanks!

@mbobinger
Author

Hi Marty - I can say with some confidence that using USB2.0 on the Raspberry Pi is more stable. I don't have any additional comments on the topic so far.

@MartyG-RealSense
Collaborator

Thanks very much @mbobinger for the update. It is excellent to hear that you have a stable setup. I will close the case - please feel free to return to this issue on the forum if you have future problems. Good luck!

4 participants