RPi0W: SD card errors while running around 40 LEDs #224

Open
pinheadmz opened this issue Sep 19, 2017 · 18 comments
Labels
notice Issues that are solved/do not require input, but preserved and marked of interest to users.

Comments

@pinheadmz

Running the test sudo ./test -c produces SD card errors in dmesg:

[33490.769029] mmcblk0: error -110 transferring data, sector 137216, nr 16, cmd response 0x900, card status 0xc00

More verbose log here: https://pastebin.com/AZUijQmW

This error will pop up every few minutes if I leave the test running just by itself. With my other processes running (full project: https://github.com/pinheadmz/ClockJr) these errors pop up as frequently as every few seconds.

My project displays the "rainbow wheel" effect for a few seconds then stops and goes blank. A few seconds later I will see this mmcblk0 error in dmesg.

I've been through FOUR SD CARDS, all different brands. It's not the card.

@penfold42
Contributor

Change the DMA channel to 10.

The default DMA channel 5 now clashes with recent OS versions.

@pinheadmz
Author

This seems to have abated the problem for now, thanks! Is there a command to figure out which DMA channels are reserved by the OS?

My system info:

Machine model: Raspberry Pi Zero W Rev 1.1
Linux version 4.9.35+ (dc4@dc4-XPS13-9333) (gcc version 4.9.3 (crosstool-NG crosstool-ng-1.22.0-88-g8460611) ) #1014 Fri Jun 30 14:34:49 BST 2017

@penfold42
Contributor

Not that I know of.

This snippet of code might be a good start:
https://stackoverflow.com/questions/29628602/how-to-allocate-dma-channel-in-user-space

@pietrodn
Contributor

I had the very same problem as OP with DMA channel 5. I think that the documentation should be updated and the code should use DMA channel 10 by default.

I do not know which DMA channels are safe to use, so I'm not posting a pull request right now. However, this forum post says:

Avoid channels 0, 1, 2, 3, 6, 7. The GPU uses 1, 3, 6, 7. The frame buffer uses 0 and the SD card uses 2.
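
As a rough sketch, the forum post's list can be combined with the channel 5 reports from this issue to see which channels remain. The total of 16 channels and the name `candidate_channels` are assumptions made for illustration, and a channel appearing in the result is not confirmed safe.

```python
# Channels the quoted forum post says to avoid (GPU, frame buffer, SD card).
FORUM_AVOID = {0, 1, 2, 3, 6, 7}
# Channel 5 is added because of the SD card errors reported in this issue.
REPORTED_BAD = {5}


def candidate_channels(total: int = 16):
    """Channels named in neither list; NOT a guarantee that they are safe."""
    excluded = FORUM_AVOID | REPORTED_BAD
    return [ch for ch in range(total) if ch not in excluded]


print(candidate_channels())
```

Channel 10 is among the remaining candidates.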

@totterfree

Going to confirm that after running into the same issue with the latest version of raspbian, switching the DMA to 10 fixed my problem. All this after 15 Pi W's, 15 SD cards, multiple power supply swaps, and much head>desking. Thank you for the solution! I would also recommend that the DMA be switched to 10 in the docs. While 5 "should be acceptable" it clearly isn't. So a solution that works is better than a solution that should work in my opinion. Also, Thanks for the hard work in general. This library is totally awesome!

@pietrodn
Contributor

@totterfree I just submitted a pull request to change the defaults and add a note in the documentation. IMHO it's important for the default settings to be safe.

@penfold42
Contributor

Do you know how safe DMA 10 is?

5 was safe ...
... on older firmware or kernel (not sure which determines the dma allocation)

@pietrodn
Contributor

We used dma=10 in production on two Raspberry Pi 3 Model B mounted on two drones, controlling both flight and lighting. There were no filesystem problems and the drones kept flying :)

@pietrodn
Contributor

pietrodn commented Dec 27, 2017

Ok, I found some official documentation.

The Raspberry Pi 3 Model B, in the latest Raspbian, has a file called /proc/device-tree/soc/dma@7e007000/brcm,dma-channel-mask.

According to the Linux kernel documentation, this file contains the

Bit mask representing the channels
not used by the firmware in ascending order,
i.e. first channel corresponds to LSB.

For my Raspberry Pi 3 Model B, this file contains four bytes: 00 00 7f 34 (you can view it with the xxd utility).
This corresponds, in base 2, to the number 0000 0000 0000 0000 0111 1111 0011 0100.

I guess that the first 2 bytes (the first 16 zeroes) are not significant since the RPI's DMA has only 16 channels.
In my interpretation of the kernel docs, the reserved channels are: 0, 8, 9, 12, 14, 15, i.e., those corresponding to a zero in the bitmask.

Update (2018-01-01): sorry, I started counting from the most significant bit, the opposite of what the specification says... the reserved channels should be: 0, 1, 3, 6, 7, 15. The strange thing is that 5 isn’t marked as reserved.
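
The decoding can be sketched in a few lines of Python. This assumes the property is a single 32-bit big-endian cell (the usual device-tree encoding) and that only the low 16 bits matter, as guessed above. The function name `decode_dma_mask` is made up for illustration.

```python
from pathlib import Path

# Path quoted in the comment above; it may differ on other models or kernels.
MASK_PATH = Path("/proc/device-tree/soc/dma@7e007000/brcm,dma-channel-mask")
NUM_CHANNELS = 16  # assumption: only the low 16 bits are meaningful


def decode_dma_mask(raw: bytes, num_channels: int = NUM_CHANNELS):
    """Return (free, reserved) channel lists; bit 0 (LSB) is channel 0."""
    mask = int.from_bytes(raw[:4], byteorder="big")
    free = [ch for ch in range(num_channels) if mask & (1 << ch)]
    reserved = [ch for ch in range(num_channels) if not mask & (1 << ch)]
    return free, reserved


if __name__ == "__main__":
    # Use the real file when present, otherwise the bytes quoted above.
    raw = MASK_PATH.read_bytes() if MASK_PATH.exists() else bytes.fromhex("00007f34")
    free, reserved = decode_dma_mask(raw)
    print("free:    ", free)
    print("reserved:", reserved)
```

For the bytes 00 00 7f 34 this gives reserved channels 0, 1, 3, 6, 7, 15 and lists 5 among the free ones.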

@penfold42
Contributor

And this is where the Pi is a bit crap.
5 is free according to that and we know this is a bad choice these days...

At least it’s documented now in the readme!

@PandorasFox
Contributor

oh man, this was causing a crazy amount of errors on my pi 3b. I was thinking my sdcard was dying, haha.

Is it a problem with all pi's on their newest firmwares now, or just the 3b/zero W? If it's only a subset, it might be worth it to just have no default set, if there's no guaranteed safe default.

@penfold42
Contributor

I suspect it’s all models on newer firmware (or kernel - I don’t know which is responsible)

@PandorasFox
Contributor

PandorasFox commented Jan 1, 2018

I'll try to downgrade my kernel/firmware later and see if I can bisect it to a specific version, if I have time. My suspicion is that it's baked into the kernel rather than done at the firmware level, but they're coupled anyways.

edit: my internet for the holidays is awful; guess I'll have to try again in a few weeks

@totterfree

I can confirm this happens to recently purchased raspberry pi zero w's on the latest lite image. We switched to DMA 10. No more corrupt SD cards so far. Every so often, we see a blip or two in animations after they've been running for awhile, but that could be something going on in our application. Kind of curious to try 0, 8, 9, 12, 14, or 15 to see if one of those smooths things out

@Gadgetoid added the notice label on Mar 6, 2018

@hallard

hallard commented Apr 29, 2018

Guys,
I spent all this week fighting this issue too, swapping the Pi Zero, the SD card and my shield.
Since I build everything from scratch, I thought I was up to date. I finally identified that the problem was coming from the WS2812 driver. Then I came here and got it: even though I had compiled the new version, my script still used the old setup, with DMA 5.

# Create NeoPixel object: 2 LEDs, brightness 64, GRB LEDs (4th argument is the DMA channel, still 5 here)
strip = Adafruit_NeoPixel(2, gpio_led, 800000, 5, False, 64, 0, ws.WS2811_STRIP_GRB)

I think the code should raise a Python warning or error if the DMA channel used is 5, something like "the LEDs will not work, but your SD card and OS are safe". We can't guarantee that there isn't sample code or other code with DMA set to 5 somewhere, so as a precaution it should at least fire a big warning.
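
A minimal sketch of what such a warning could look like, using only the standard `warnings` module. The helper `check_dma_channel` and the `KNOWN_BAD_DMA` set are hypothetical and not part of the library.

```python
import warnings

# Assumption for illustration: only channel 5 is flagged, because it is the
# one reported in this issue to cause SD card errors.
KNOWN_BAD_DMA = {5}


def check_dma_channel(dma: int) -> int:
    """Warn loudly if a channel known to cause trouble is requested."""
    if dma in KNOWN_BAD_DMA:
        warnings.warn(
            f"DMA channel {dma} is reported to clash with the SD card on "
            "recent OS versions; consider channel 10 instead.",
            RuntimeWarning,
            stacklevel=2,
        )
    return dma
```

A script could wrap the value before handing it to the constructor, for example by passing `check_dma_channel(5)` as the fourth argument of `Adafruit_NeoPixel`.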

Anyway after 2 days, found my issue ;-)

@pietrodn
Contributor

The problem is that we have no way to query the OS for which DMA channels are free to use.
If we ever get to that, I think the correct behavior when the user tries to write to an unsafe DMA channel would be to abort the program immediately with a meaningful error message.
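
If such a query existed, the abort could be sketched as below. The set `free_channels` stands in for whatever the OS would report, and `require_free_channel` is a hypothetical helper name; nothing like it exists in the library today.

```python
import sys


def require_free_channel(dma: int, free_channels: set) -> None:
    """Exit with a clear message if dma is not in the set of usable channels."""
    if dma not in free_channels:
        usable = ", ".join(str(ch) for ch in sorted(free_channels))
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(f"DMA channel {dma} is not safe to use; pick one of: {usable}")
```

Exiting before any DMA setup happens means a bad channel never gets the chance to touch the SD card.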

@PandorasFox
Contributor

yeah, some versions use it, some don't; it's pretty much all undocumented, right? The defaults are just kinda shots in the dark so there's no real point in warning at all; it's kinda expected to read the readme and stuff.

tombettany added a commit to KanoComputing/kano-peripherals that referenced this issue Jun 19, 2018
Switch to use DMA channel 10 for LED control as per issue
jgarff/rpi_ws281x#224.
@kefabean

phew, so glad I found this thread. was battling to fix this for hours!

Richard-Kirby added a commit to Richard-Kirby/water-softener-minder that referenced this issue Oct 30, 2021