[Windows] [Multi-Camera] D415 unresponsive after multiple sensor open and closes. #6397

Closed
MojamojaK opened this issue May 15, 2020 · 11 comments

Comments

@MojamojaK
Contributor

MojamojaK commented May 15, 2020


Required Info
Camera Model D415
Firmware Version 5.12.3
Operating System & Version Windows 10
Platform PC
SDK Version 2.33.1
Language C++14
Segment

Issue Description

I am currently developing a device consisting of 20 D415 devices.
They are connected to 5 PCIe USB hubs, each with 4 USB 3.0 connectors.

After roughly 200 to 1000 iterations of code like the sample below, at least one device becomes unresponsive, meaning it is no longer enumerated by rs2::context().query_devices().
Currently, the only ways to make it responsive again are to unplug and replug the device from its USB hub, or to shut down the PC and boot it manually. A restart does not work; power to the device has to be cut.

I have tried uninstalling all drivers and disabling USB selective suspend, but neither worked.

Since the device cannot be enumerated, device.hardware_reset() cannot be called on it.
I think there should be an alternative way to restart the camera from software, or the camera should at least recover on a system restart (not only on a shutdown and power-up).

Whenever this occurs, the USB Tree View displays the following. (SET_ADDRESS_FAILURE)
[Image: Device_Unresponsive]

Here is a quick sample for reproduction.
It may take a long time before a device becomes unresponsive.

#include <cstdint>
#include <cstdlib>
#include <tuple>
#include <vector>
#include <iostream>
#include <future>
#include <librealsense2/rs.hpp>

constexpr size_t LOOP_COUNT = 500;
constexpr size_t PLUGGED_IN_COUNT = 20;

typedef struct stream_profile_config
{
	int fps;
	rs2_stream stream_type;
	int stream_index;
	rs2_format format;
	int width;
	int height;

} stream_profile_config;

std::tuple<bool, rs2::stream_profile> get_stream_profile(const rs2::sensor& sensor, const stream_profile_config& config)
{
	const std::vector<rs2::stream_profile> stream_profiles = sensor.get_stream_profiles();
	for (const rs2::stream_profile& stream_profile : stream_profiles)
	{
		if (stream_profile.is<rs2::video_stream_profile>())
		{
			const rs2::video_stream_profile video_stream_profile = stream_profile.as<rs2::video_stream_profile>();
			if (video_stream_profile.fps() == config.fps
				&& video_stream_profile.stream_type() == config.stream_type
				&& video_stream_profile.stream_index() == config.stream_index
				&& video_stream_profile.format() == config.format
				&& video_stream_profile.width() == config.width
				&& video_stream_profile.height() == config.height)
			{
				return std::make_tuple(true, stream_profile);
			}
		}
	}
	// No match: return an empty profile instead of indexing a possibly empty list.
	return std::make_tuple(false, rs2::stream_profile());
}

int main(void)
{
	std::cout << "[INFO] Starting Realsense Sandbox Program. " << std::endl;
	std::cout << "[INFO] Librealsense Library Version: " << RS2_API_VERSION_STR << std::endl;


	for (size_t loop = 0; loop < LOOP_COUNT; loop++) {
		std::cout << "[INFO] LOOP " << loop << std::endl;
		rs2::context context;
		const rs2::device_list device_list = context.query_devices();
		const uint32_t device_count = device_list.size();
		if (device_count < 1)
		{
			std::cout << "[ERROR] NO REALSENSE DEVICES FOUND!" << std::endl;
			return EXIT_FAILURE;
		}
		else if (device_count < PLUGGED_IN_COUNT)
		{
			std::cout << "[ERROR] REALSENSE DEVICE UNRESPONSIVE!" << std::endl;
			return EXIT_FAILURE;
		}
		std::vector<std::future<int>> threads;
		threads.reserve(device_count);
		for (uint32_t i = 0; i < device_count; i++) {
			threads.emplace_back(std::async(std::launch::async, [i, &device_list]() {
				const rs2::device device = device_list[i];
				if (!device)
				{
					std::cout << "[ERROR] REALSENSE DEVICE INVALID" << std::endl;
					return EXIT_FAILURE;
				}

				const rs2::color_sensor color_sensor = device.first<rs2::color_sensor>();
				const rs2::depth_sensor depth_sensor = device.first<rs2::depth_sensor>();
				if (!color_sensor || !depth_sensor)
				{
					std::cout << "[ERROR] REALSENSE DEVICE SENSOR(S) INVALID" << std::endl;
					return EXIT_FAILURE;
				}

				const std::tuple<bool, rs2::stream_profile> color_stream_profile_result = get_stream_profile(color_sensor, { 6, rs2_stream::RS2_STREAM_COLOR, 0, rs2_format::RS2_FORMAT_BGR8, 1280, 720 });
				const std::tuple<bool, rs2::stream_profile> depth_stream_profile_result = get_stream_profile(depth_sensor, { 6, rs2_stream::RS2_STREAM_DEPTH, 0, rs2_format::RS2_FORMAT_Z16, 1280, 720 });
				if (!std::get<0>(color_stream_profile_result) || !std::get<0>(depth_stream_profile_result))
				{
					std::cout << "[ERROR] REALSENSE SENSOR STREAM PROFILE NOT FOUND" << std::endl;
					return EXIT_FAILURE;
				}

				const rs2::stream_profile& color_stream_profile = std::get<1>(color_stream_profile_result);
				const rs2::stream_profile& depth_stream_profile = std::get<1>(depth_stream_profile_result);

				rs2::syncer syncer;

				std::cout << "[INFO] Opening / Starting Sensors" << std::endl;
				color_sensor.open(color_stream_profile);
				try
				{
					depth_sensor.open(depth_stream_profile);
					try
					{
						color_sensor.start(syncer);
						try
						{
							depth_sensor.start(syncer);
							try
							{
								std::cout << "[INFO] Opening / Starting Sensors Complete" << std::endl;

								rs2::frameset fs;
								syncer.try_wait_for_frames(&fs, 8000);
							}
							catch (...)
							{
								depth_sensor.stop();
								throw;
							}
						} catch (...)
						{
							color_sensor.stop();
							throw;
						}
					} catch (...)
					{
						depth_sensor.close();
						throw;
					}
				} catch (...)
				{
					color_sensor.close();
					throw;
				}
				
				std::cout << "[INFO] Stopping / Closing Sensors" << std::endl;
				try { color_sensor.stop(); }
				catch (const rs2::wrong_api_call_sequence_error&) {}
				try { depth_sensor.stop(); }
				catch (const rs2::wrong_api_call_sequence_error&) {}
				try { color_sensor.close(); }
				catch (const rs2::wrong_api_call_sequence_error&) {}
				try { depth_sensor.close(); }
				catch (const rs2::wrong_api_call_sequence_error&) {}
				std::cout << "[INFO] Stopping / Closing Sensors Complete" << std::endl;
				
				return EXIT_SUCCESS;
			}));
		}
		int status = EXIT_SUCCESS;
		for (std::future<int>& thread : threads) {
			try {
				status |= thread.get();
			}
			catch (const std::exception& e) {
				std::cout << "[ERROR] Thread threw error: " << e.what() << std::endl;
				status = EXIT_FAILURE;
			}
		}
		if (status == EXIT_FAILURE) {
			return EXIT_FAILURE;
		}
	}
	std::cout << "[INFO] Closing Realsense Sandbox Program. " << std::endl;
	return EXIT_SUCCESS;
}
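
The nested try/catch cleanup in the sample above could also be written with scope guards. The following is a minimal sketch, not part of the original report: `scope_exit`, `make_scope_exit`, `fake_sensor`, and `demo_capture` are hypothetical names, and `fake_sensor` is only a stand-in so the sketch compiles without librealsense. With real sensors, the lambdas would presumably call `color_sensor.close()` and so on.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Runs a cleanup callable when the guard leaves scope, including during
// stack unwinding after an exception.
template <typename F>
class scope_exit
{
public:
    explicit scope_exit(F f) : f_(std::move(f)), active_(true) {}
    scope_exit(scope_exit&& other) : f_(std::move(other.f_)), active_(other.active_)
    {
        other.active_ = false; // the moved-from guard must not fire
    }
    scope_exit(const scope_exit&) = delete;
    scope_exit& operator=(const scope_exit&) = delete;
    ~scope_exit()
    {
        if (!active_) return;
        try { f_(); } catch (...) {} // cleanup must not throw while unwinding
    }
private:
    F f_;
    bool active_;
};

template <typename F>
scope_exit<F> make_scope_exit(F f) { return scope_exit<F>(std::move(f)); }

// Hypothetical stand-in for a sensor; it only records the calls it receives.
struct fake_sensor
{
    std::string name;
    std::vector<std::string>& log;
    bool fail_on_start;
    void open() { log.push_back(name + ".open"); }
    void start()
    {
        if (fail_on_start) throw std::runtime_error(name + ".start failed");
        log.push_back(name + ".start");
    }
    void stop() { log.push_back(name + ".stop"); }
    void close() { log.push_back(name + ".close"); }
};

// Opens and starts two sensors; each guard undoes exactly one step, and the
// guards fire in reverse order of construction.
std::vector<std::string> demo_capture(bool depth_start_fails)
{
    std::vector<std::string> log;
    fake_sensor color{"color", log, false};
    fake_sensor depth{"depth", log, depth_start_fails};
    try
    {
        color.open();
        auto close_color = make_scope_exit([&] { color.close(); });
        depth.open();
        auto close_depth = make_scope_exit([&] { depth.close(); });
        color.start();
        auto stop_color = make_scope_exit([&] { color.stop(); });
        depth.start();
        auto stop_depth = make_scope_exit([&] { depth.stop(); });
        log.push_back("capture");
    }
    catch (const std::exception&)
    {
        log.push_back("error");
    }
    return log;
}
```

If `depth.start()` throws here, the guards still call `color.stop()`, `depth.close()`, and `color.close()`, which is the same unwinding order the nested catch blocks produce. On the success path the stop order is depth first, then color, the reverse of the sample.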
@MartyG-RealSense
Collaborator

MartyG-RealSense commented May 15, 2020

If you cannot reset a specific device because it cannot be enumerated, an alternative solution may be to reset the entire USB port. It should then not be dependent on detection of whatever camera is currently plugged into it. Googling for 'windows c++ reset usb port' can provide some research leads about how to accomplish this.

A popular suggestion is a Microsoft tool called DevCon.

https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/devcon

@MojamojaK
Contributor Author

MojamojaK commented May 15, 2020

Thank you for the fast response.

I forgot to mention that I have already tried resetting the hub from both Device Manager and DevCon :(
The results were the same.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented May 15, 2020

Are you using multiple pipelines, please? If so, poll_for_frames should be more suitable to a multicam setup than try_wait_for_frames so you are not missing frames from one queue whilst waiting on another.

The link below has an explanation of the differences between try_wait_for_frames, poll_for_frames, etc.

#2422 (comment)
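
As a rough illustration of the polling approach, the sketch below visits several frame sources in turn without blocking on any single one. It is a sketch under assumptions: `poll_all` and `fake_source` are hypothetical names, and the only requirement on a source is a member of the shape `bool poll_for_frames(Item*)`, which is assumed to match `rs2::syncer::poll_for_frames(rs2::frameset*)`. The fake source exists only so the example compiles without librealsense.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Polls every source round-robin until each has delivered one item or the
// timeout expires. Returns a flag per source telling whether it delivered.
template <typename Item, typename Source>
std::vector<bool> poll_all(std::vector<Source>& sources, std::vector<Item>& out,
                           std::chrono::milliseconds timeout,
                           std::chrono::milliseconds idle = std::chrono::milliseconds(1))
{
    out.resize(sources.size());
    std::vector<bool> got(sources.size(), false);
    std::size_t remaining = sources.size();
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (remaining > 0 && std::chrono::steady_clock::now() < deadline)
    {
        bool progressed = false;
        for (std::size_t i = 0; i < sources.size(); ++i)
        {
            // Non-blocking: a slow source never delays the others.
            if (!got[i] && sources[i].poll_for_frames(&out[i]))
            {
                got[i] = true;
                --remaining;
                progressed = true;
            }
        }
        // Sleep briefly when nothing arrived so the loop does not spin.
        if (!progressed) std::this_thread::sleep_for(idle);
    }
    return got;
}

// Hypothetical stand-in for a frame queue.
struct fake_source
{
    int ready_after; // empty polls before an item appears; negative means never
    int value;
    bool poll_for_frames(int* item)
    {
        if (ready_after < 0) return false;
        if (ready_after > 0) { --ready_after; return false; }
        *item = value;
        return true;
    }
};
```

The per-source flags let the caller see which cameras missed the deadline and handle only those, instead of failing the whole capture.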

If all 20 cameras are attached to the same device with 5 hubs, how many cameras at a time are actively streaming, please? The most RealSense cameras that I have seen in simultaneous use on the same computing device is 6. Twenty cameras streaming simultaneously would require a very powerful computer system to provide sufficient processing resources (an i7 processor is recommendable for just 4 simultaneously streaming cameras).

The USB hub system would also have some inefficiencies if the hub does not have a dedicated USB controller for each of its ports and is instead handling more than 1 port on the hub.

@MojamojaK
Contributor Author

MojamojaK commented May 15, 2020

Ok, I will try to use poll_for_frames and see the effects later.

I understand what you are saying. The application I am developing requires rapid, almost simultaneous capture over a short period, so streams for all twenty cameras are opened and then closed quickly. Staggering captures across devices simply won’t work because stream.open(), stream.start(), stream.stop(), and stream.close() take too much time. The frame rate is lowered to 6 fps because anything above that crashes the program.

I used to use the typical rs2::pipeline for streaming, but it proved too inefficient for the application, so I switched to rs2::syncer. It seems to be working 99% fine (with lots of automatic retry routines), receiving 4x2 depth and color frames from all cameras in just 800 ms.
The remaining 1% problem is that it does not survive more than 200+ stream open/close cycles in one process.
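
The retry routines themselves are not shown in this thread. As a generic illustration only, a bounded retry helper could look like the sketch below; `retry` is a hypothetical name, and the policy (fixed delay, exceptions counted as failed attempts) is an assumption, not the policy actually used.

```cpp
#include <cassert>
#include <chrono>
#include <exception>
#include <stdexcept>
#include <thread>

// Calls `attempt` until it returns true, at most max_attempts times, sleeping
// `delay` between tries. An exception derived from std::exception counts as
// a failed try. Returns the 1-based number of the successful attempt, or 0
// if every attempt failed.
template <typename F>
int retry(F attempt, int max_attempts, std::chrono::milliseconds delay)
{
    for (int n = 1; n <= max_attempts; ++n)
    {
        try
        {
            if (attempt()) return n;
        }
        catch (const std::exception&)
        {
            // Treated like a false result: fall through and try again.
        }
        if (n < max_attempts) std::this_thread::sleep_for(delay);
    }
    return 0;
}
```

A bounded attempt count matters here: once a camera has dropped off the bus, further retries cannot succeed, so the caller needs a clear failure result to escalate on.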

There is currently 1 USB controller for every 2 devices.

I was wondering whether there is any way to brute-force a hardware reset by using cfgmgr32 or some Windows USB library to enumerate the unresponsive devices and send them the HWRST command (if possible).

@MartyG-RealSense
Collaborator

MartyG-RealSense commented May 15, 2020

Rather than rapidly open and close the streams, an alternative may be to put the whole CPU to sleep until you need to wake it up for a capture. poll_for_frames is very suited to this, since you have to control the sleep state for it carefully anyway.

#6281

It also seems like an application that would be suited to multicam hardware sync, where you link the 20 cameras together with sync cabling (or transmit a wireless pulse from a signal generator) and 19 of those cameras are set as "slave" cameras that follow the capture timing of 1 "master" camera, or 20 cameras sync as slaves to the wireless pulse.

https://www.intel.co.uk/content/www/uk/en/support/articles/000028140/emerging-technologies/intel-realsense-technology.html

In regard to a reset alternative, the link below leads to a multicam version of hardware_reset()

#3829 (comment)

@MojamojaK
Contributor Author

Thank you for your advice. It was very useful.

First, I tried putting some of the threads created by librealsense to sleep. I implemented a custom syncer that extends rs2::syncer and intercepts the operator()(rs2::frame) callbacks with condition_variables. This seems to work well: CPU usage dropped from a continuous 10-30% to 0-5% with 20 devices. I will need to do more testing to see whether any devices still become unresponsive with this.

#6281 really gave me hints in implementing this. Thank you.
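
The custom syncer itself is not shown in this thread. The sketch below is only a guess at the general shape of such a gate, written without librealsense: `gated_sink` and `demo_gate` are hypothetical names, and blocking the calling thread inside the callback is an assumption about how the threads were put to sleep.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Forwards frames to `downstream` only while the gate is open. While it is
// closed, the calling (producer) thread sleeps on a condition variable.
template <typename Frame>
class gated_sink
{
public:
    explicit gated_sink(std::function<void(Frame)> downstream)
        : downstream_(std::move(downstream)) {}

    void operator()(Frame frame)
    {
        {
            std::unique_lock<std::mutex> lock(mutex_);
            cv_.wait(lock, [this] { return open_; }); // sleeps until open() is called
        }
        downstream_(std::move(frame));
    }

    void open()
    {
        { std::lock_guard<std::mutex> lock(mutex_); open_ = true; }
        cv_.notify_all();
    }

    void close()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        open_ = false;
    }

private:
    std::function<void(Frame)> downstream_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool open_ = false;
};

struct gate_demo_result { int before_open; int after_open; };

// A producer thread pushes three frames; none gets through until open().
gate_demo_result demo_gate()
{
    std::atomic<int> delivered(0);
    gated_sink<int> sink([&](int) { ++delivered; });
    std::thread producer([&] { for (int i = 0; i < 3; ++i) sink(i); });
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    const int before = delivered.load();
    sink.open();
    producer.join();
    return { before, delivered.load() };
}
```

Two caveats for this design: the gate has to be opened before the producer is stopped, otherwise the blocked thread never returns, and because the class holds a mutex it is not copyable, so it would have to be handed to a callback API through a lambda that captures it by reference.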

Second, although I knew about the hardware sync feature, I didn't want to risk investing time and money in hardware (some cameras are well over 3 meters apart) that I couldn't prove would work with 20 devices at 6 fps. The white paper only shows up to 6 devices at 30 fps.
Plus, I am already 99% satisfied with the current configuration and am looking for an easy solution to this problem.

Third, the symptoms in #3829 and the ones I am experiencing are very similar. The reporter there is using Ubuntu, and I think some Linux machines can cut USB hub power (e.g. https://github.com/mvp/uhubctl), which would surely fix this problem, but a Windows machine usually does not have this capability.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented May 17, 2020

I'm glad that the links provided were helpful. :)

The link below has a method in Microsoft's documentation for power-cycling a USB port or hub on Windows.

https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/usbioctl/ni-usbioctl-ioctl_usb_hub_cycle_port

@MartyG-RealSense
Collaborator

Do you require further assistance please, or can this case be closed? Thanks!

@MojamojaK
Contributor Author

Not for now! Thanks!
Might do some extensive testing and reopen later though.

@MartyG-RealSense
Collaborator

Thanks so much for the update!

@MojamojaK
Contributor Author

Just a comment.
Seems like the latest firmware 5.12.5.0 is capable of automatically resetting and enabling re-enumeration of the cameras in some cases.
Really awesome!
