ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32 #2406

Open
mkohns opened this issue Feb 28, 2018 · 57 comments
Labels
Waiting for internal comment Waiting for comment from a member of the Raspberry Pi engineering team

Comments

@mkohns

mkohns commented Feb 28, 2018

Scenario:

Hardware:
RPI3, 4.14.22-v7+, /boot/.firmware_revision: v634741d4199871ab8bd5446a8e63b7e06c1885af (latest by today)
Device: 3D Printer, FTDI Fake Chip: FT232RL, 0403:6001, SerialNumber: A50285BI

Description:
The FTDI Chip of the 3D Printer offers /dev/ttyUSB0 for communication.
The kernel module ftdi_sio creates a serial device.
With a simple

cat /dev/ttyUSB0

the output of the printer can be monitored.
The call:

echo "M155 S1" > /dev/ttyUSB0

initiates an automated temperature report every second.
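
For anyone who wants to reproduce this from a script rather than from the shell, here is a minimal Python sketch of the same two steps (send `M155 S1`, then read whatever arrives). The helper names are made up for illustration, and baud-rate setup (e.g. via `termios`) is deliberately left out.

```python
import os
import select


def start_temperature_reports(fd: int, interval_s: int = 1) -> None:
    """Send the M155 auto-report command from this issue to an open fd."""
    os.write(fd, f"M155 S{interval_s}\n".encode("ascii"))


def read_available(fd: int, timeout_s: float = 2.0) -> bytes:
    """Return what the device sent within timeout_s, or b"" if it stayed silent."""
    ready, _, _ = select.select([fd], [], [], timeout_s)
    return os.read(fd, 4096) if ready else b""
```

On a setup like the one described, `fd` would presumably come from `os.open("/dev/ttyUSB0", os.O_RDWR | os.O_NOCTTY)`; an empty result from `read_available` after the reports have started is one way to notice that the read side has died.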

Case:
This scenario works perfectly on, e.g., a desktop PC with Ubuntu 17.10 (4.13.0-36-generic), and also under 3 different distros I tested the same way. The connection is also stable under Windows 7/10 (Cygwin).

Under OrangePI (armbian), BananaPi (armbian), and RPI1/2/3 (wheezy, jessie, stretch) the connection is unpredictably unstable and dies with

ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32

I already cross tested the following options (permutations) with no success:

max_usb_current=1
dwc_otg.microframe_schedule=0
dwc_otg.fiq_fsm_enable=0
dwc_otg.fiq_fix_enable=0 
dwc_otg.fiq_split_enable=0
dwc_otg.nak_holdoff_enable=0
dwc_otg.trans_backoff=3000
dwc_otg.fiq_fsm_mask=0x0 

The only setting that seems to fix this issue is

dwc_otg.speed=1

which is not an option, as it limits the whole bus (network card included) to USB 1.1.
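
When cross-testing permutations like these, it helps to confirm which `dwc_otg` options the running kernel actually received. A small sketch, assuming the options are passed on the kernel command line (the function name is hypothetical):

```python
def dwc_otg_params(cmdline: str) -> dict[str, str]:
    """Extract dwc_otg.<name>=<value> options from a kernel command line."""
    params = {}
    for token in cmdline.split():
        if token.startswith("dwc_otg.") and "=" in token:
            key, value = token.split("=", 1)
            params[key[len("dwc_otg."):]] = value
    return params
```

Feeding it the contents of `/proc/cmdline` after a reboot shows whether a change in `/boot/cmdline.txt` took effect.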

From my point of view, although it is a fake FTDI chip, this seems to be an ARM Linux kernel issue. On Intel/AMD Linux and Windows the communication is stable.

What would be the best way to investigate and get this issue solved?

References:
#1187
raspberrypi/firmware#88
https://raspberrypi.stackexchange.com/questions/1886/what-kernel-parameters-are-available-for-fixing-usb-problems

Simplification:
I simplified the testing scenario to temperature monitoring. Of course, several applications are available for managing 3D printers in all their complexity. I encountered communication issues with these applications on the RPI3.

@JamesH65
Contributor

@P33M Not sure there is much to go on here, but any comments?

Seems to be ARM related since desktops don't see the same issue.

@JamesH65 JamesH65 added the Waiting for internal comment Waiting for comment from a member of the Raspberry Pi engineering team label Apr 23, 2018
@arobbins805

Was this ever solved?

I am getting a similar issue on Ubuntu 16.04 LTS and 18.04 LTS. I need to repeatedly re-enumerate an FTDI chip. The device is recognized with the Product: FT230X Basic UART.

My syslog has the following lines:

[ +9.835078] ftdi_sio ttyUSB196: usb_serial_generic_read_bulk_callback - urb stopped: -32
[ +0.000090] ftdi_sio ttyUSB196: usb_serial_generic_read_bulk_callback - urb stopped: -32
[ +0.230177] usb 9-4.3.4: USB disconnect, device number 22
[ +0.000137] ftdi_sio ttyUSB196: error from flowcontrol urb

[Feb26 07:06] ftdi_sio ttyUSB1: usb_serial_generic_read_bulk_callback - urb stopped: -32
[ +0.000066] ftdi_sio ttyUSB1: usb_serial_generic_read_bulk_callback - urb stopped: -32
[ +0.139339] usb 3-4.4.4: USB disconnect, device number 105
[ +0.000141] ftdi_sio ttyUSB1: error from flowcontrol urb

[ +9.969193] ftdi_sio ttyUSB174: usb_serial_generic_read_bulk_callback - urb stopped: -32
[ +0.000114] ftdi_sio ttyUSB174: usb_serial_generic_read_bulk_callback - urb stopped: -32
[ +0.013751] usb 9-4.4.2: USB disconnect, device number 126
[ +0.000137] ftdi_sio ttyUSB174: error from flowcontrol urb

I can provide more detailed info if people are interested in this bug.
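
For comparing logs like these across machines, a short Python sketch that pulls the driver, port, and status out of each line may be useful. On Linux, status 32 is `EPIPE`, which as a URB status conventionally means the endpoint stalled. The regex and function name are illustrative assumptions, not part of any tool.

```python
import errno
import re

URB_STOPPED = re.compile(
    r"(?P<driver>\S+) (?P<port>ttyUSB\d+): "
    r"usb_serial_generic_read_bulk_callback - urb stopped: (?P<status>-?\d+)"
)


def parse_urb_stopped(line: str):
    """Return (driver, port, errno name) for an 'urb stopped' line, else None."""
    match = URB_STOPPED.search(line)
    if match is None:
        return None
    status = int(match.group("status"))
    name = errno.errorcode.get(-status, "unknown")
    return match.group("driver"), match.group("port"), name
```

The same pattern also matches the `cp210x` and `ch341-uart` variants of the message, since only the driver prefix differs.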

@Lukanite

I think I'm seeing a similar issue. On the Pi Zero W, if I plug in an FTDI cable (TTL-232R-3V3) into the OTG port, when I try to use it, say, with:
screen /dev/ttyUSB0 115200
The Pi Zero W immediately slows to a crawl and remains unusably slow until the cable is disconnected (by unplugging it), upon which the kernel log prints:

Mar 27 22:06:17 pizerow kernel: [  207.609776] ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32
Mar 27 22:06:17 pizerow kernel: [  207.609799] ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32
Mar 27 22:06:17 pizerow kernel: [  207.644293] usb 1-1.2: USB disconnect, device number 3
Mar 27 22:06:17 pizerow kernel: [  207.651963] ftdi_sio ttyUSB0: FTDI USB Serial Device converter now disconnected from ttyUSB0
Mar 27 22:06:17 pizerow kernel: [  207.652106] ftdi_sio 1-1.2:1.0: device disconnected

Oddly enough, if I leave htop open in the background, its update rate slows to <1 update per minute, but when it does update, the load figure spikes to 4+ with no process seeming to be responsible.

This is occurring on kernel 4.14.98+ with Raspbian Stretch Lite. My Pi 3B+ on the same kernel and OS (albeit armv7) handles the cable just fine.

@Lukanite

Hmm, just tried a USB ACM device and it seems to exhibit the same issue, but ONLY if I connect it to the Pi Zero W with a USB hub. If I use a USB OTG adapter to directly connect the device, it's usable and fine.

However, if I try connecting the FTDI cable directly, it just causes the Pi Zero W to reboot. Once it comes back up, though, it's fine.

Maybe it has something to do with hubs?

@arobbins805

I discovered what was causing this 'error' for me. My program was leaving the /dev/ttyUSBN serial port open. I rebooted and changed my software to close the port after use (Previously, it was left open in one of the error cases). After doing this, I no longer saw the dmesg or syslog errors. Hopefully this helps.
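
A minimal Python sketch of the "always close the port, even in the error case" pattern described above. This is not the poster's program, and the helper name is hypothetical; it only shows one way to guarantee the close.

```python
import os
from contextlib import contextmanager


@contextmanager
def open_serial(path: str):
    """Open a serial device node and close it on every exit path."""
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    try:
        yield fd
    finally:
        # Runs on normal exit and when the body raises, so the port
        # is never left open after an error.
        os.close(fd)
```

Used as `with open_serial("/dev/ttyUSB0") as fd: ...`, the descriptor is released before the device is unplugged or re-enumerated, which is the situation the later comments identify as the trigger.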

@Lukanite

Lukanite commented Apr 2, 2019

Yep, I also see that it only happens if the device is disconnected while the port is open. Sometimes this causes all of the USB ports to stop working, though

@mrx23dot

mrx23dot commented Apr 23, 2019

I have the same issue. My app requires the COM port to be continuously open, and I plug the device directly into the Pi3, without a hub. No current changes during run. Any solution?

@JamesH65
Contributor

@spl237 related to the lxpanel issues perhaps?

@tinkerdudeno1

Like many others, I can confirm this issue with the fake ftdi disconnecting. Here's my story:

  • Got a brand new Rpi 4
  • Installed docker and ran a service which is constantly listening and occasionally sending signals to the ftdi (which is connected to a bus on my heating system).
  • It almost instantly disconnects with usb_serial_generic_read_bulk_callback - urb stopped: -32 on dmesg.
  • Funny thing is: The exact same image/setup works fine on my old Rpi 3 sitting right next to it. I literally unplugged-replugged with the exact same docker image on the Pis back and forth and one would work, the other one would go crazy and disconnect. Thought it MUST be their different kernels (Rpi4: 5.4.69-v7l+; Rpi3: 4.19.97-v7+) and found this issue after endless googling.
  • Downgrading to usb 1 (with dwc_otg.speed=1 in /boot/cmdline.txt as suggested in RPI 1 B+ FT232 disconnecting #1187) in the Pi 4 did not work for me.

In my desperation, I tried another (fake) ftdi I had flying around and guess what: Everything is just fine on the Pi 4, no reconnects and no need to downgrade to usb 1. Guess there are different quality fake ftdi out there. So if you -- like me -- don't want to spend 20 euros on a genuine ftdi (which would almost definitely mitigate your problem), trying different ftdi from different batches/sellers might be worth a try.

One more thing: Just connecting the ftdi which just sits there with no data coming through will not cause it to reconnect. It's when you begin pushing data over the chip, it will go crazy and disconnect.

So I'm afraid this issue is very hard to track down. Even with the same setup and a seemingly identical fake ftdi, the behavior is unpredictable.

@otavio
Contributor

otavio commented Oct 22, 2020

I confirm the very same issue as @tinkerdudeno1; the same setup works on rpi3 and fails on rpi4. In our case, we are using the 5.4.72 kernel on both.

@bangom

bangom commented Nov 16, 2020

I've hit the same error when using mini USB to UART converter with FT232 chip (https://www.gme.cz/ftdi-prevodnik-s-mini-usb-a-spi) under heavier load:

[  743.655103] ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32
[ 2463.978860] ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32
[ 2557.336232] ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32

RPI: Raspberry Pi 3 Model B Rev 1.2
Kernel: 5.4.75-v7+

The converter is directly connected to a USB port of the Raspberry.

Downgrading to usb 1 with dwc_otg.speed=1 in /boot/cmdline.txt helped mitigate USB disconnects.

@SamuelGold

Raspberry Pi 4B, Kernel 5.4.79-v7l+ and the same problem:
[ 2911.620320] cp210x ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32
[ 3434.996100] cp210x ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32

@bombasticbob

bombasticbob commented Feb 16, 2021

I am exhibiting problems like this when doing rapid IO with genuine FTDI ICs on devices that are under development.

They show up as errno=EINVAL after a poll or write operation, and appear to be cumulative; that is, closing the application "fixes" it for a while, but the problem shows up more rapidly after I restart the application. The serial link is run at 115k baud to control devices, which are rapidly polled. I've used FTDI serial for a lot of things with x86 but not for ARM and so I wasn't expecting problems.

If you look HERE you'll see that the private member 'prev_status' is defined as 'char' (which is unsigned by default for ARM, unless some compiler option has been altered to change that). It's being "math"d with 'status' which is an unsigned char. Normally you would think this is OK but the data types _were_ declared differently, and I'm not sure whether compiler data type promotion would affect it.

I looked for this type of problem specifically because I've seen weirdness in my own code when porting it to ARM, related to signed vs unsigned char. It's probably benign, but it's the _kind_ of thing to look for.
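
To make the signed-versus-unsigned concern concrete, here is a small Python model of C integer promotion using `ctypes`. It is not the driver's code: the status value 0x80 and the helper name are made up for illustration, and whether the driver ever looks beyond the low byte is not something this sketch establishes.

```python
import ctypes


def promote(value: int, signed: bool) -> int:
    """Model promoting an 8-bit char to a 32-bit int (as a bit pattern)."""
    ctype = ctypes.c_byte if signed else ctypes.c_ubyte
    return ctype(value).value & 0xFFFFFFFF


status = 0x80  # hypothetical status byte with the top bit set

# prev_status stored in a signed char is sign-extended on promotion ...
diff_signed = promote(status, signed=True) ^ promote(status, signed=False)
# ... while an unsigned char is zero-extended.
diff_unsigned = promote(status, signed=False) ^ promote(status, signed=False)
```

In this model the two results differ only in the upper 24 bits (`0xFFFFFF00` versus `0`), so a comparison that masks to the low byte sees no difference. That would be consistent with the "probably benign" guess above.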

Other ARM-related things would have to do with atomics and translation of userland to kernel addressing for various purposes.

Someone familiar with porting drivers from x86 to ARM may know of other things to look for. This was the only thing I spotted that caused me to raise my eyebrow even slightly.

uname: Linux rpi4test 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux
(yes it's an RPi 4 but the 3 seems to do it as well)

@Sennevds

I had/have the same problem. I have a Creality Ender 3 V2 3D printer which uses a serial connection over USB to connect to a Pi 3 B+. It worked perfectly for several weeks until I moved the printer (and the Pi) to a built-in cabinet. Since the move I got the above errors every hour. I could reconnect, but the print had already stopped, of course. Now I have moved the Pi to a cabinet below the printer and it works perfectly again, so it looks like interference creates this problem for me. I'm not sure if this is exactly the same problem (I have the same errors in my dmesg), but I thought this insight could help identify the problem.

@Stifael

Stifael commented Feb 24, 2021

I experience a similar issue but with a different raspi-setup.

Raspi: Stretch (version 9)
Kernel: 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019
FTDI: FT4232H Q (https://ftdichip.com/products/usb-com485-plus4/)

I currently experience two problems with this setup, but I am not sure if they depend on each other, and I am also only able to reproduce the latter:

  1. Communication problems can occur during normal operation. Even though the hardware setup has not been changed and the system was running for several months, the system can enter a state where communication is no longer possible. Not even rebooting the raspi helps.
  2. Hot Plugging the FTDI. It can happen that not only the communication stops, but also that the raspi crashes.

The second problem results in the usb_serial_generic_read_bulk_callback - urb-message.
Since I have no clue what causes the first issue, I tried to tackle the second problem first in the hope that the two issues are somewhat dependent on each other. For that, I wrote a very simple cpp-executable that opens 4 ports and writes "hello" to each ttyUSBx port sequentially. When running the executable and test hot-plugging, the raspi can enter a state where it can no longer recover from the communication loss and in the worst case results in a crash.
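
The actual test program is in the attached zip. For readers who want a quick equivalent without building it, the following Python sketch does roughly what is described (write "hello" to each port in turn) and records the first error per port. The function name and return shape are assumptions made for this illustration.

```python
import os


def write_hello_round_robin(fds, rounds=1):
    """Write b"hello" to each fd in turn; return {fd: errno or None}."""
    errors = {fd: None for fd in fds}
    for _ in range(rounds):
        for fd in fds:
            if errors[fd] is not None:
                continue  # this port already failed, skip it
            try:
                os.write(fd, b"hello")
            except OSError as exc:
                errors[fd] = exc.errno
    return errors
```

The descriptors would come from opening the four `ttyUSBx` nodes; hot-plugging while this runs in a loop should reproduce the per-port failures.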

findings

  • writing to just one ttyUSBx port lowers the chances of receiving the usb_serial_generic_read_bulk_callback - urb error message
  • setting dwc_otg.speed=1 further improves the stability of the communication
  • the same setup but with a Syslogic CompactS runs without issues

ftditest.zip

@HuberDe

HuberDe commented Jun 5, 2022

Hi,

just found this issue after googling my problem on my pi4. I'm running zigbee2mqtt on my pi using a zigbee hardware dongle. This worked great for the last year. For the past few days, zigbee2mqtt has been crashing several times per day. I did not make any changes on the software or hardware side, so I assume it could be a hardware aging issue?

Denis

@alfredopalhares

I too am having the same error as above, when using Klipper on an Ender 5 3D printer

[  111.775735] ch341-uart ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32
[  130.875367] ch341-uart ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32

I have tried:

  • Adding dwc_otg.speed=1
  • Updating the system
  • Changed power supply
  • Wrapping the USB communications cable in aluminum foil

@antonmeyer

similar issue
cp210x ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32

RP4 5.10.103-v8+ #1529 SMP PREEMPT Tue Mar 8 12:26:46 GMT 2022 aarch64 GNU/Linux

reading at 9600 baud continuously

@dchauran

dchauran commented Nov 23, 2022

Frequent disconnects in OctoPrint with a CH340 (MKS Robin Nano v1.2)

ch341-uart ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32
  • Rpi 3b+
  • Using a very short shielded cable
  • No undervoltage issues, power supply is more than adequate
  • dwc_otg.speed=1 seemingly has no effect (though it still says a full speed device is connected, but I am unsure if that is relevant or not)

@raspberrypi raspberrypi deleted a comment Nov 24, 2022
@synman

synman commented Dec 12, 2022

I too am having this problem with an ESP32 board with an integrated FTDI controller connecting to my pi4. It happens on both the USB3 and USB2 ports.

Dec 12 17:16:03 octopi-qbp kernel: ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32
Dec 12 17:16:03 octopi-qbp kernel: ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32

On the ESP32 side, the uart keeps the port open and sends and receives without issue.

On the pi side, not so pretty. Once the urb stopped -32 message pops in my syslog the pi loses its receive channel entirely. It is still able to send data no problem, but the receive side is dead.

The only way to recover is to disconnect / reconnect (via software).

Downgrading to usb 1.1 does not solve my issue.
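
The comment above does not say how the software reconnect is done. One common approach on Linux is to unbind and rebind the device through sysfs, sketched below in Python. The function name is hypothetical, the bus ID (for example `1-1.2`, as seen in the disconnect messages earlier in this thread) has to be looked up for the actual device, and writing to these files normally requires root.

```python
from pathlib import Path


def rebind_usb_device(bus_id: str,
                      driver_dir: str = "/sys/bus/usb/drivers/usb") -> None:
    """Unbind and rebind a USB device by bus ID to force a re-probe."""
    driver = Path(driver_dir)
    (driver / "unbind").write_text(bus_id)  # detach the device from the driver
    (driver / "bind").write_text(bus_id)    # attach it again
```

Any application holding the `ttyUSB` node open still has to reopen it afterwards, since the old descriptor refers to the removed port.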

@synman

synman commented Dec 13, 2022

Coming back (quickly) to this. I had a breakthrough of sorts.

I failed to mention I also had a CyberPower UPS connected to my pi. For reasons I do not yet understand, after I disconnected the UPS, my problems with the ESP32 board went away.

So now I don't know what to think. I got my ESP32 board working properly, but now I'm SoL when it comes to monitoring my UPS locally.

@klack

klack commented Feb 1, 2023

I am also having this issue on 2 out of 3 printers. They are on a Raspberry Pi 4 running in docker.

In the past, I had an undervoltage issue that affected those same 2 printers. Upgrading the power supply solved that issue. But now I have this one.

I am going to try:
dwc_otg.speed=1
and
Covering the power pin on my usb cable so it does not power the printer

@SimpleSimonLA

SimpleSimonLA commented Mar 20, 2023

For me, the issue disappeared after I changed the USB hub - still one without an external power supply.

@klack

klack commented Mar 20, 2023

Ah seems like there are multiple reasons for this error. For me I guess it was the Raspberry Pi trying to power 3 printer boards at once.

@mertzt89

mertzt89 commented Mar 20, 2023 via email

@alfredopalhares

I have "solved" this issue by switching to CANBus communication with the board.

@wpietri

wpietri commented Apr 2, 2023

I'm also having this problem. My details:

  • Debian GNU/Linux 11, aka latest Raspbian
  • Linux beacon1 5.15.84-v8+ #1613 SMP PREEMPT Thu Jan 5 12:03:08 GMT 2023 aarch64 GNU/Linux
  • DeskPi Lite with Pi 4
  • Digital Yacht Receiver, known good and stable for years until I swapped in this Pi

The receiver produces a stream of data at 38400 baud as it is received from passing ships. This works for hours or days at a time, failing such that select returns nothing for the device and sees no issues, even though data is still coming in. An strace of the reading process looks like this:

[pid 467848] 07:02:27 pselect6(5, [3 4], [], [], {tv_sec=9, tv_nsec=999995000}, NULL) = 1 (in [3], left {tv_sec=9, tv_nsec=999978426})
[pid 467848] 07:02:27 read(3, "C", 1)   = 1
[pid 467848] 07:02:27 pselect6(5, [3 4], [], [], {tv_sec=9, tv_nsec=999995000}, NULL) = 1 (in [3], left {tv_sec=9, tv_nsec=999978593})
[pid 467848] 07:02:27 read(3, "\r", 1)  = 1
[pid 467848] 07:02:27 pselect6(5, [3 4], [], [], {tv_sec=9, tv_nsec=999995000}, NULL) = 1 (in [3], left {tv_sec=9, tv_nsec=999978574})
[pid 467848] 07:02:27 read(3, "\n", 1)  = 1
[pid 467848] 07:02:27 write(1, "!AIVDM,1,1,,A,E>kb9O9Rh@@@@@@@@@"..., 67) = 67
[pid 467848] 07:02:27 pselect6(5, [3 4], [], [], {tv_sec=9, tv_nsec=999991000}, NULL) = 0 (Timeout)
[pid 467848] 07:02:37 pselect6(5, [3 4], [], [], {tv_sec=9, tv_nsec=999987000}, NULL) = 0 (Timeout)
[pid 467848] 07:02:47 pselect6(5, [3 4], [], [], {tv_sec=9, tv_nsec=999994000}, NULL) = 0 (Timeout)
[pid 467848] 07:02:57 pselect6(5, [3 4], [], [], {tv_sec=9, tv_nsec=999987000}, NULL) = 0 (Timeout)
[pid 467848] 07:03:07 pselect6(5, [3 4], [], [], {tv_sec=9, tv_nsec=999987000}, NULL) = 0 (Timeout)
[pid 467848] 07:03:17 pselect6(5, [3 4], [], [], {tv_sec=9, tv_nsec=999987000}, NULL) = 0 (Timeout)
[pid 467848] 07:03:27 pselect6(5, [3 4], [], [], {tv_sec=9, tv_nsec=999987000}, NULL) = 0 (Timeout)

The relevant kernel log is:

Apr  2 07:02:29 beacon1 kernel: [1004434.516651] ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32

The frequency of the problem seems to be increasing over time:

Thu 23 Mar 20:20:54 PDT 2023: same - restarting
Mon 27 Mar 22:33:23 PDT 2023: same - restarting
Tue 28 Mar 08:52:11 PDT 2023: same - restarting
Thu 30 Mar 00:59:14 PDT 2023: same - restarting
Fri 31 Mar 09:56:41 PDT 2023: same - restarting
Fri 31 Mar 13:43:54 PDT 2023: same - restarting
Sat 1 Apr 20:02:04 PDT 2023: same - restarting
Sat 1 Apr 21:57:11 PDT 2023: same - restarting
Sat 1 Apr 23:58:19 PDT 2023: same - restarting
Sun 2 Apr 02:53:27 PDT 2023: same - restarting
Sun 2 Apr 04:57:35 PDT 2023: same - restarting
Sun 2 Apr 07:03:42 PDT 2023: same - restarting
Sun 2 Apr 10:07:53 PDT 2023: same - restarting
Sun 2 Apr 15:11:09 PDT 2023: same - restarting
Sun 2 Apr 17:09:16 PDT 2023: same - restarting
Sun 2 Apr 20:00:24 PDT 2023: same - restarting

The error message sometimes happens without data flow disruption, but every data flow disruption occurs within a couple seconds of a log message like that.
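
The "same - restarting" log above suggests a watchdog around the reader. A minimal Python sketch of that idea follows; it is not the poster's script, and the function name and thresholds are assumptions. It mirrors the `pselect6` loop in the strace and gives up after a chosen number of consecutive silent timeouts, so the caller can reopen the port.

```python
import os
import select


def read_with_watchdog(fd, timeout_s, max_silent_timeouts):
    """Yield data chunks; return once the port has been silent too long."""
    silent = 0
    while silent < max_silent_timeouts:
        ready, _, _ = select.select([fd], [], [], timeout_s)
        if ready:
            chunk = os.read(fd, 4096)
            if not chunk:
                return  # EOF: the other side went away
            silent = 0  # data arrived, reset the silence counter
            yield chunk
        else:
            silent += 1  # same symptom as the repeated (Timeout) lines
```

With a 10-second timeout as in the strace, `max_silent_timeouts=3` would flag a stall after about half a minute; the right threshold depends on how quiet the receiver legitimately gets.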

Glad to debug further as requested.

@popy2k14

popy2k14 commented Apr 2, 2023

I have now been using the Dell wyse 5070 (x86 based) for two weeks. It replaced my odroid (arm based), with the same usb devices attached and HA restored.

The issue is gone!

So it's an arm based kernel issue!

@klack

klack commented Apr 2, 2023

> For me, the issue disappeared after I changed the USB hub - still one without an external power supply.

This ended up being the real fix for me.

@popy2k14

popy2k14 commented Apr 2, 2023

It's rather a workaround! There is a serious bug in the kernel causing this.

@bunder2015

bunder2015 commented May 24, 2023

I ran into this issue with a pi 4b running the latest 64bit pi OS, connected to a USB hub that is connected to a UPS serial monitor and 3 ARM boards' serial consoles. The pi is powered by the official charger, and the USB hub is also powered. It was working fine for a couple weeks and then one of the consoles started hanging up with this urb error. After restarting putty, it would hang up again after 30 seconds, and I wound up rebooting the pi to restore connectivity.

I can try unpowering the hub, as I shouldn't really need to power it for my devices.

Edit: unfortunately my hub doesn't work when unpowered. Perhaps the hub is faulty.

@McNugget6750

Confirming the same issue on RPI3

@jbeale1

jbeale1 commented Aug 20, 2023

I had the same problem:
ch341-uart ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32
when communicating with a CH341-UART chip in a knockoff Arduino-UNO-alike board. This was not from a Pi, but on a Linux Fitlet2 box (Linux Mint, 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 running on an Intel J3455) so it is not limited to ARM CPUs. The problem was solved by connecting the CH341 cable direct to the linux box port, instead of through a 4-port unpowered hub (Sabrent brand). So maybe something about the USB 2.0 TT (transaction translator) in the hub?

@popy2k14

@jbeale1 Interesting. Never had this issue since I switched to Dell wyse 5070 (x86 based).

@ForrestFire0

I have this issue on rpi 4 as well with cp120x and cp340x devices. Seems to be an issue. Given how common these FTDI clones are it seems like this should be a top priority, but it has not been addressed for 5 years!

@McNugget6750

Certainly not a solution but I ended up hard-wiring my 3d printer to the hardware serial port of the Raspberry Pi which works without issues. Sad to see that this cannot be resolved.
I have noticed, though, that my Ubuntu laptop also experiences occasional disconnects of these chips with the same error message. Maybe it's a mass-produced hardware fault and not a driver issue after all?

@klack

klack commented Dec 28, 2023 via email

@bunder2015

I recently acquired some Pi 5's and I'm still running into this issue. I haven't tried buying a new usb hub yet though.

@McNugget6750

This is not a solution but I ended up connecting my 3D printer directly to the RX/TX pins of the Raspberry Pi and so far it has been absolutely stable.
It's heartbreaking that this is a common occurrence. I will say though that I have seen this on Ubuntu laptops as well so maybe this is a kernel issue or deeply rooted in the drivers and not specifically related to the Raspberry Pi.

@popy2k14

Wow, the issue is still not solved!?
A year and 3 months on my Dell Wyse (x86) and never had this issue again.

@RaggioRaggio

RaggioRaggio commented Aug 24, 2024

Hi Everyone,

I have the same issue with an RPi3 Model B, running RaspbianOS 64-bit, communicating via USB to a Creality Mainboard 4.2.7 (using the CH341 chip). The RPi is well-supplied and cooled, with no warnings about reduced performance or throttling.

I encountered the following error:

ch341-uart ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32

This error appeared after I connected a USB 2.0 webcam (AUKEY LM-PC1E) to the RPi hub for Crowsnest, which caused a communication loss with Klipper mid-print and ultimately led to a print failure. I believe the issue is related to USB bus traffic, particularly when the bus is shared with other bandwidth-consuming devices. With the webcam disconnected, the issue does not happen.

Without reverting to USB 1.1 (dwc_otg.speed=1), I found that reducing the baud rate for the CH341 from the default 250000 to a more conservative and standard 115200 mitigated the issue. I'm currently running progressively longer tests and gaining more confidence in the setup. The baud rate can be configured when compiling the Klipper firmware and it's still plenty, so it's very easy to try.
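
As a rough feel for what the lower rate costs, here is a back-of-the-envelope Python sketch. It assumes 8N1 framing (1 start bit, 8 data bits, 1 stop bit, so 10 bits on the wire per byte), which the comment above does not state.

```python
def bytes_per_second(baud: int, bits_per_frame: int = 10) -> float:
    """Maximum payload throughput of a UART at the given baud rate."""
    return baud / bits_per_frame


fast = bytes_per_second(250000)  # default rate mentioned above
slow = bytes_per_second(115200)  # reduced rate
```

Under that assumption the link drops from 25000 to 11520 bytes per second, a bit under half the original throughput.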

I hope this helps someone. Cheers!

@popy2k14

@RaggioRaggio thx for the workaround. But I think that's a kernel/driver issue.

@RaggioRaggio

@popy2k14 Thank you for the reply. Yes, I can confirm that it's a kernel/driver issue.

The long-duration tests still failed, and after trying other workarounds, I reverted my system back to the 32-bit distribution. Not only is it more responsive, with less CPU load, but HDMI and touch also have less lag with KlipperScreen. The WiFi connection is more stable, and many other random errors and warnings have also disappeared from the dmesg list. The setup is exactly the same, and it has now been working error-free at a full 250000 baud for over 12 hours printing.
It all seems related to the 64-bit kernel/driver, as the setup is unchanged from before.

@Fail-Safe

@RaggioRaggio Does the switch back to 32-bit still hold true for you? I've been running into this same situation as others have described (the urb stopped: -32 error) and have lost several prints at >12 hours as a result.

I'm going to switch to a 32-bit distro if you still believe that might be the key to all this.

@RaggioRaggio

@Fail-Safe 32-bit solved any issue with my configuration, and I definitely recommend you give it a try if you have the time.

Just to recap my configuration:

  • RPi3 Model B, 2024-07-04 Raspberry Pi OS Lite (32-bit)
  • KIAUH, latest with Mainsail (Moonraker, Mobileraker, Spoolman, Crowsnest, KlipperScreen)
  • AUKEY LM-PC1E, streaming 1080p at 5-10 fps, leveraging camera-streaming with adaptive-mjpeg
  • Creality Mainboard 4.2.7 (using the CH341 chip) + CRTouch
  • BTT ADXL345 V2.0
  • Waveshare 7INCH HDMI Display-C (1024x600) with USB Capacitive Touch, an old first release

I haven't shut down or restarted any service in the last two days, across many small/medium print jobs, and there is not a single error in the dmesg stream. With 64-bit I had an error (urb stopped: -32) every 1 to 2 hours. Hope this helps!

@P33M
Contributor

P33M commented Aug 29, 2024

The proximate cause as to why the USB serial driver stops responding is that the generic callback doesn't handle the case where the IN endpoint enters the Halted state. Devices can set this autonomously but should do so only in response to some internal catastrophic error condition. It's possible that fake FTDI chips and the CH341 misdetect UART RX jabber/framing errors as fatal. Hubs may also report STALL errors for IN endpoints that suffer transaction errors on the FS/LS segment of the bus (between the UART and the hub port).

However, this doesn't explain why swapping to a 32-bit kernel will fix this on a Pi 3. The exact same host driver code gets run in both cases, but not in FIQ context on 64-bit - so generally things happen slower. There is no change to the error handling pathways, so why the change in behaviour?
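
The Halted-endpoint behaviour described above can be pictured with a toy state model. This is only an illustration in Python, not the kernel's code: the class and method names are hypothetical, and `clear_halt` stands in for a recovery step (the standard USB way to leave the Halted state is a CLEAR_FEATURE(ENDPOINT_HALT) request) that, according to the comment above, the generic callback does not perform.

```python
import errno


class ToyBulkInReader:
    """Toy model of a read loop that stops on a stalled (Halted) IN endpoint."""

    def __init__(self):
        self.reading = True
        self.log = []

    def on_urb_complete(self, status: int) -> None:
        if status == 0:
            return  # data processed, read would be resubmitted
        if status == -errno.EPIPE:
            # Matches the message in this thread: reading stops and
            # nothing in this model restarts it on its own.
            self.log.append(f"urb stopped: {status}")
            self.reading = False

    def clear_halt(self) -> None:
        """Hypothetical recovery: clear the halt and resume reading."""
        self.reading = True
```

In this picture, the symptom reported earlier (sending still works, receiving is dead until a reconnect) is simply the state after a single `-EPIPE` completion with no recovery step.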

@Fail-Safe

Fail-Safe commented Aug 31, 2024

> 32-bit solved any issue with my configuration, and I definitely recommend you give it a try if you have the time.

Per your recommendation, I made the time and switched back to 32-bit Raspberry Pi OS Lite on my RPi3 Model B+. I was so tired of mid-print failures and wasted filament!

Fast-forward... Loaded up Klipper and all its surrounds again and I have had exactly zero (0) reoccurrences of this issue since the switch back to the 32-bit OS. (!!!) Printing is back to predictable and consistent, thankfully.

Hopefully it stays this way. I will report back if the issue does reoccur. Many thanks, @RaggioRaggio, for your findings on this! 👍🏻

@RaggioRaggio

RaggioRaggio commented Sep 1, 2024

@Fail-Safe Happy to hear that the switch back to 32-bit also solved it for you! I really hope it'll stay this way!

@P33M Unfortunately I don't know the inner workings of the kernel / serial driver in detail. What I've observed is that the 32-bit system runs my software stack with 2-3% less CPU than the 64-bit one. Further, I haven't experienced a single wifi connection drop (before, it happened two or three times a week), a freeze of the touch drivers (about once a week), or the many warnings and errors in the dmesg stream. Maybe it was my install that had issues (although I had done it with the Raspberry sd card tool), but I have the feeling that 64-bit on the RPi 3 is not as well-refined as 32-bit: maybe it's some issue with the FIQ, or some caveat in a few low-level drivers, that at the end of the day causes connection drops with internal ICs as well. I don't know if this issue happens on more "modern" RPis, which are somewhat more native to 64-bit, but 32-bit on an RPi 3 really seems more performant and stable.
