X_LINK_ERROR #408
Comments
Does this happen right away or after a period of time while it's running? |
After some time, around 8k iterations with detections, as far as I can remember. |
I have posted a similar issue luxonis/depthai-experiments#210 |
While running this example demo with
I will try |
We are working on what could be the same underlying issue. Not 100% sure though. @themarpe is on it. |
I appreciate it very much. If I may ask, is it hardware or software problem? |
@BlonskiP I ran it on mine (straight copy from the repo because I didn't have much time to try your version) and after an hour it was still running. Temps got up to ~58C. I don't think it's the ImageManip issue; maybe it is heat related if you get into the 80C range. |
@madgrizzle Thank you very much. I will try to get my hands on a Raspberry Pi fan as soon as I can anyway ;) |
Maybe a low-profile fan on the heatsink on the other side where the cameras are since it seems that part is getting really hot. |
So in terms of the heat of the DepthAI module - 85C is not a problem. The DepthAI SoM can run indefinitely at 105C die temperature. That said, I'm not sure if the Pi temperature could be an issue. Thoughts on this one @themarpe ? |
Hi @BlonskiP and @madgrizzle, I am a bit behind on this issue (and the one @madgrizzle brought up), but I think the same issue happens on the device side. My initial guess was ImageManip, as there is a lot of complexity there which could cause such a bug, but it's not common between these two issues. I'm still wrapping up ImageManip improvements and I'll attack these instability issues next week. Regarding HW vs SW: the one I observed while testing @madgrizzle's issue was SW, but in this case I'm not sure whether the host has any influence (can you reproduce on x86-64, @BlonskiP?). |
@themarpe I ran the yolo script as well on an x86-host (the same one I use for the gen2-triangulation) and had no problems... it ran for four hours without a glitch. I had spun up a RPI4 when testing gen2-triangulation issues so I started the yolo script on it during my lunch break and will check it tonight. |
It ran for ~6 hours with no problem on the RPI4. |
@themarpe I added some additional code (from the gen2-triangulation) for face detection and let it run overnight. When I got up, it had crashed with an rgb stream error. There were no face detections going on (no one in the room... except maybe a ghost?), so the ImageManip crop shouldn't have been called. I'll try to run the original script for a long time as well and see if it crashes. I do think there's a memory corruption problem going on, as you suspect, which is unfortunate as that's one of the hardest types of bugs to find (and I'm assuming it's on the closed-source side of things as well). Finally, when I took this script and added cropping, face rotation, and face reidentification to the pipeline as well, it crashed within about 20 seconds. It seems that when the pipeline gets busy, the crashing happens quicker. |
Cross posting: luxonis/depthai-experiments#210 (comment) |
It looks like this fix has increased stability, but this error still occurs :( |
Hi @BlonskiP |
I can confirm I get this error in the following environment:
host: Ubuntu 20.04.3 LTS with AMD CPU
Testing with camera1: fails after a few seconds of recognition. Sometimes shows saving... but not always.
python3 main.py --name frog
Creating pipeline...
Creating Color Camera...
Creating Face Detection Neural Network...
Creating Head pose estimation NN
Creating face recognition ImageManip/NN
[14442C10D12853D000] [8.516] [NeuralNetwork(10)] [warning] Network compiled for 4 shaves, maximum available 13, compiling for 6 shaves likely will yield in better performance
[14442C10D12853D000] [8.763] [NeuralNetwork(10)] [warning] The issued warnings are orientative, based on optimal settings for a single network, if multiple networks are running in parallel the optimal settings may vary
Saving face...
Saving face...
Saving face...
Saving face...
[14442C10D12853D000] [17.464] [system] [critical] Fatal error. Please report to developers. Log: 'class' '374'
Traceback (most recent call last):
  File "main.py", line 254, in <module>
    frameIn = frameQ.tryGet()
RuntimeError: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'frame' (X_LINK_ERROR)'
Similar error when just running the |
@hipitihop can you try using a latest |
@themarpe My current setup is as follows: Debug logs attached: debug-log-oak-d-lite.txt Updated: let me know if this is an incorrect folder structure. I run the install requirements from the top level but run the experiment from within the Updated: I now see that I'm using the main demo repo |
@hipitihop We're looking into this bug in the meantime. |
Indeed, with the Oak1 this does not crash. It does not seem to do any saving, but I might just need to clear previous data to start fresh for a given name. As for the Oak-D Lite: with
[2022-01-19 09:39:32.620] [debug] Python bindings - version: 2.10.0.0 from 2021-08-24 18:49:37 +0300 build: 2021-08-24 17:52:17 +0000
[2022-01-19 09:39:32.620] [debug] Library information - version: 2.10.0, commit: 57bb84ad209825f181744f2308b8ac6f52a37604 from 2021-08-24 18:49:14 +0300, build: 2021-08-24 17:43:07 +0000
[2022-01-19 09:39:32.623] [debug] Initialize - finished
Creating pipeline...
Creating Color Camera...
Creating Face Detection Neural Network...
Creating Head pose estimation NN
Creating face recognition ImageManip/NN
[2022-01-19 09:39:32.687] [debug] Resources - Archive 'depthai-bootloader-fwp-0.0.12.tar.xz' open: 1ms, archive read: 62ms
[2022-01-19 09:39:33.056] [debug] Resources - Archive 'depthai-device-fwp-7131affa2c01ecd34506e9c3dd8ea9198ed874f1.tar.xz' open: 1ms, archive read: 431ms
[2022-01-19 09:39:33.074] [debug] Device - OpenVINO version: 2021.2
[2022-01-19 09:39:33.080] [debug] Patching OpenVINO FW version from 2021.4 to 2021.2
[18443010A1D10A1300] [11.280] [system] [info] Memory Usage - DDR: 0.12 / 358.55 MiB, CMX: 2.09 / 2.50 MiB, LeonOS Heap: 6.26 / 77.56 MiB, LeonRT Heap: 2.83 / 23.94 MiB
[18443010A1D10A1300] [11.280] [system] [info] Temperatures - Average: 37.71 °C, CSS: 39.35 °C, MSS 36.77 °C, UPA: 37.94 °C, DSS: 36.77 °C
[18443010A1D10A1300] [11.280] [system] [info] Cpu Usage - LeonOS 7.40%, LeonRT: 2.06%
....
[18443010A1D10A1300] [11.722] [system] [error] Attempted to start Color camera - NOT detected!
[18443010A1D10A1300] [11.418] [system] [info] ImageManip internal buffer size '203904'B, shave buffer size '20480'B
[18443010A1D10A1300] [11.418] [system] [info] SIPP (Signal Image Processing Pipeline) internal buffer size '156672'B
[18443010A1D10A1300] [11.418] [system] [info] NeuralNetwork allocated resources: shaves: [0-12] cmx slices: [0-12]
[18443010A1D10A1300] [11.418] [system] [info] ColorCamera allocated resources: no shaves; cmx slices: [13-15]
[18443010A1D10A1300] [11.418] [system] [info] ImageManip allocated resources: shaves: [15-15] no cmx slices.
[18443010A1D10A1300] [11.432] [NeuralNetwork(10)] [info] Needed resources: shaves: 4, ddr: 1605632
[18443010A1D10A1300] [11.432] [NeuralNetwork(10)] [warning] Network compiled for 4 shaves, maximum available 13, compiling for 6 shaves likely will yield in better performance
[18443010A1D10A1300] [11.722] [system] [error] Attempted to start Color camera - NOT detected!
[18443010A1D10A1300] [11.475] [DetectionNetwork(3)] [info] Needed resources: shaves: 6, ddr: 2728832
[18443010A1D10A1300] [11.707] [NeuralNetwork(7)] [info] Needed resources: shaves: 6, ddr: 21632
[18443010A1D10A1300] [11.721] [NeuralNetwork(10)] [warning] The issued warnings are orientative, based on optimal settings for a single network, if multiple networks are running in parallel the optimal settings may vary
[18443010A1D10A1300] [11.721] [NeuralNetwork(10)] [info] Inference thread count: 2, number of shaves allocated per thread: 4, number of Neural Compute Engines (NCE) allocated per thread: 1
[18443010A1D10A1300] [11.722] [DetectionNetwork(3)] [info] Inference thread count: 2, number of shaves allocated per thread: 6, number of Neural Compute Engines (NCE) allocated per thread: 1
[18443010A1D10A1300] [11.723] [NeuralNetwork(7)] [info] Inference thread count: 2, number of shaves allocated per thread: 6, number of Neural Compute Engines (NCE) allocated per thread: 1
[18443010A1D10A1300] [12.281] [system] [info] Memory Usage - DDR: 143.71 / 358.55 MiB, CMX: 2.47 / 2.50 MiB, LeonOS Heap: 16.87 / 77.56 MiB, LeonRT Heap: 7.29 / 23.94 MiB
[18443010A1D10A1300] [12.281] [system] [info] Temperatures - Average: 38.94 °C, CSS: 40.28 °C, MSS 38.65 °C, UPA: 38.65 °C, DSS: 38.18 °C
[18443010A1D10A1300] [12.281] [system] [info] Cpu Usage - LeonOS 13.06%, LeonRT: 59.15%
[18443010A1D10A1300] [13.282] [system] [info] Memory Usage - DDR: 143.71 / 358.55 MiB, CMX: 2.47 / 2.50 MiB, LeonOS Heap: 16.87 / 77.56 MiB, LeonRT Heap: 7.29 / 23.94 MiB
|
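When comparing long runs, it can help to pull the temperature readings out of a debug log like the one above and watch how they change over time. The following is a minimal sketch, not an official tool: `parse_temperatures` is a hypothetical helper, and the regular expression is based only on the `Temperatures -` line format visible in this log (including the missing colon after `MSS`).

```python
import re

# Field names and layout are taken from the log lines in this thread only.
_TEMPERATURE_FIELD = re.compile(
    r"(Average|CSS|MSS|UPA|DSS):?\s+(-?\d+(?:\.\d+)?)\s*°C"
)


def parse_temperatures(line):
    """Extract readings from a 'Temperatures -' log line.

    Returns a dict mapping field name to degrees Celsius,
    or an empty dict for any other kind of line.
    """
    if "Temperatures -" not in line:
        return {}
    return {name: float(value) for name, value in _TEMPERATURE_FIELD.findall(line)}
```

Feeding every line of a saved log through this function and keeping the non-empty results gives a simple temperature series for the run.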
Hello @hipitihop, I believe that is a different issue - OAK-D-Lite uses camera sensors that weren't compatible with the firmware before ~2.11. So OAK-D-Lite using depthai 2.10 on any pipeline will error out with the same issue - camera not found. |
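A script can guard against this by checking the installed version before building a pipeline. The sketch below is illustrative only: `parse_version` and `meets_minimum` are hypothetical helpers, and the (2, 11) threshold is the approximate figure mentioned above rather than an exact cutoff.

```python
def parse_version(version):
    """Turn a dotted version string such as '2.10.0.0' into a tuple of ints."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'dev0' or a commit hash
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(version, minimum=(2, 11)):
    """Return True if `version` is at least `minimum` (hypothetical helper)."""
    return parse_version(version)[:len(minimum)] >= tuple(minimum)
```

With the package installed, this would typically be called as `meets_minimum(depthai.__version__)`; the version string also appears in the `Python bindings - version:` debug log line.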
Hello. |
@BlonskiP we've observed that the CM4 suffers from a thermal issue on the USB hub chip. |
Can you share more details, minimum reproducible example script and the log of the run with DEPTHAI_LEVEL=debug enabled? |
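The debug log is controlled by the `DEPTHAI_LEVEL` environment variable, which has to be set before the script starts. A minimal sketch of one way to capture such a log follows; `debug_env` and `run_with_debug_log` are hypothetical helpers, and the script and log file names are placeholders to replace with your own.

```python
import os
import subprocess
import sys


def debug_env(base=None):
    """Return a copy of the environment with DEPTHAI_LEVEL=debug set."""
    env = dict(os.environ if base is None else base)
    env["DEPTHAI_LEVEL"] = "debug"
    return env


def run_with_debug_log(script, log_path):
    """Run `script` with debug logging, writing stdout and stderr to `log_path`.

    Both arguments are placeholders; pass your own script and log file name.
    """
    with open(log_path, "w") as log_file:
        result = subprocess.run(
            [sys.executable, script],
            env=debug_env(),
            stdout=log_file,
            stderr=subprocess.STDOUT,
        )
    return result.returncode
```

For example, `run_with_debug_log("main.py", "debug-log.txt")` would produce a file that can be attached to the issue. Setting the variable directly in the shell before launching the script achieves the same thing.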
@themarpe |
@jasonm189 which Luxonis camera/product are you running the examples on? |
OAK-D. The issue happens only with that example, from what I've read it's a known issue with script node. |
@Erol444 on the above if you have anything like tracking list of issues or you can help with the example. |
@jasonm189 there was a sporadic error before we changed the script node's CPU: |
Yes, it still crashes after 30+ mins. |
@jasonm189 - Sorry about the trouble. And actually given that your setup seems to be the only remaining crashing case here could you make a new issue so we can have all the details of the setup in one place? And then tag @Erol444 and me in it (and this issue)? |
In my environment, the tiny_yolo v4 sample works with depthai library version 2.14, but from 2.15 onward it does not.
|
@kazyam53 |
Addressed by luxonis/depthai-core#616. Reran gen2-face-detection in experiments overnight; it ran for 7h without issues.
Hello,
I have an issue with running tiny-yolo-v4 with SpatialDetection.
I'm using copy+pasted demo from:
https://docs.luxonis.com/projects/api/en/latest/samples/SpatialDetection/spatial_tiny_yolo/
With tiny yolo blob:
https://artifacts.luxonis.com/artifactory/luxonis-depthai-data-local/network/tiny-yolo-v4_openvino_2021.2_6shave.blob
My only change was adding print(...) with iteration counter, fps and average chip temperature.
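For anyone reproducing this, the added instrumentation can be approximated as below. This is a sketch under assumptions, not the code from the report: `LoopStats` is a hypothetical helper, and the temperature value is passed in by the caller.

```python
import time


class LoopStats:
    """Track iteration count and average FPS for a polling loop."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()
        self.iterations = 0

    def tick(self):
        """Record one completed iteration."""
        self.iterations += 1

    def fps(self):
        """Average iterations per second since construction."""
        elapsed = self._clock() - self._start
        return self.iterations / elapsed if elapsed > 0 else 0.0

    def report(self, temperature_c):
        """Format one status line; `temperature_c` is supplied by the caller."""
        return f"iter={self.iterations} fps={self.fps():.1f} temp={temperature_c:.1f}C"
```

Call `tick()` once per loop pass and print `report(...)` periodically. In depthai 2.x the average chip temperature should be readable as `device.getChipTemperature().average`, but treat that call as an assumption and check it against the API reference for your version.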
My device is: OAK-D-CM4
device url: https://shop.luxonis.com/products/depthai-rpi-compute-module-4-edition
depthAi version:
2.11.1
Installed by:
python3 -m pip install git+https://github.com/luxonis/depthai-python.git@caf537b
without using venv.
Python
python3 --version
Python 3.7.3
Raspi system information
Error messages:
  File "yolo_detection_test_sp.py", line 151, in <module>
    boundingBoxMapping = xoutBoundingBoxDepthMappingQueue.get()
RuntimeError: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'boundingBoxDepthMapping' (X_LINK_ERROR)'
or on custom code
RuntimeError: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'RGB' (X_LINK_ERROR)'
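Since the failing stream differs between runs, logging which stream name appears in the exception can help when comparing reports. The sketch below extracts it from the message text; `parse_stream_error` is a hypothetical helper, and the pattern is based only on the message format shown in these tracebacks, so it may not match other depthai versions.

```python
import re

# Pattern derived from the error messages quoted in this thread.
_STREAM_ERROR = re.compile(
    r"Couldn't read data from stream: '(?P<stream>[^']+)' \((?P<code>[A-Z_]+)\)"
)


def parse_stream_error(message):
    """Return (stream_name, error_code), or None if the message does not match."""
    match = _STREAM_ERROR.search(message)
    if match is None:
        return None
    return match.group("stream"), match.group("code")
```

In a script this would be applied to `str(exc)` inside an `except RuntimeError as exc:` block around the queue read.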
Temperatures
Average chip temperature: 85C
Raspi temperature: 77C
Code