Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dma::WriteDma::write is unsound. #270

Closed
thalesfragoso opened this issue Sep 27, 2020 · 9 comments · Fixed by #274
Closed

dma::WriteDma::write is unsound. #270

thalesfragoso opened this issue Sep 27, 2020 · 9 comments · Fixed by #274

Comments

@thalesfragoso
Copy link
Member

The trait Static doesn't provide the invariants necessary for a buffer to be safe to use with DMA, it actually doesn't provide any invariant at all, since it's public and safe to implement.

The quickest solution would be to mark the trait as unsafe and document it. But a better solution would be to move this API to use the embedded-dma traits.

This is all breaking changes and we should also discuss if we want to yank previous versions with this problem.

@TheZoq2
Copy link
Member

TheZoq2 commented Sep 29, 2020

Thanks for the report, is this the same unsoundness that was discussed in rust-embedded/embedonomicon#64? I'm not sure what the best practice for DMA unsoundness is, I imagine not a whole lot of people use DMA so maybe yanking things causes more harm than good

@mitchmindtree
Copy link
Contributor

Thanks for helping me resolve this @thalesfragoso :)

I imagine not a whole lot of people use DMA

@TheZoq2 I just wanted to mention that the SPI DMA is especially useful for working with large numbers of LEDs which I'd imagine is not an uncommon use case (this is our use case anyway). But I do agree that yanking might cause more harm than good either way.

@TheZoq2
Copy link
Member

TheZoq2 commented Sep 29, 2020

Oh yea, I use the SPI DMA for just that myself :)

I haven't had time to read up on the specifics here, but I'd say wether or not to yank depends on how hard it is to run into this unsoundness. If it happens every time you use DMA, then we should probably yank, otherwise I guess it would depend on the severity

@mitchmindtree
Copy link
Contributor

Good point! We spent a couple of days investigating noisey SPI LEDs on our PCB prototype, trying to work out what was going on by using an oscilloscope to check all the clocks and data lines but could not find the issue. It turns out that the unsoundness somehow lead to an issue where writing to the SPI DMA would cause packet loss in our Ethernet, and similarly Ethernet activity would result in the LEDs lighting up in seemingly random ways.

Here's the story I posted to the stm32-rs matrix last night for the record:

We seem to have found the source of the noisey LED behaviour on our PCB and it turns out to be pretty strange! We noticed that our LEDs lit up seemingly randomly whenever there was Ethernet activity. E.g. we could send a message from my laptop to the board once per second, and very reliably a random group of LEDs would light up. For some context, these are ~150 SPI addressable LEDs.
We started checking all the clock, Ethernet and SPI pins across the whole board with an oscilloscope expecting to see a noisey line, but alas everything looked very clean
Even the SPI lines both at the MCU and at the other end of the 20 mm cable looked really clean, though we noticed that there was some odd 1s on the SPI line even though we were just writing zeroes. I figured it was just the end frame which expects some Fs, but then remembered that around the same time we received our new PCBs was also around the same time I got the SPI3 DMA working for the F107
Turns out that if I write to SPI3 Synchronously rather than via the DMA, the LED noise goes away!
Not only does it go away, it also solved an odd issue that we were running into where our MCU would drop lots of TCP packets while running the main firmware ( which uses SPI3 ), but would work flawlessly in the bootloader
Seems there is some funky interaction between the SPI3 DMA and the Ethernet DMA on the F107?
/ story time

Following this, @thalesfragoso dug into the issue with me and we discovered that my DmaBuffer struct was the cause of the issues. The type was a wrapper around a heapless::Vec<u8, U1024> that implemented the necessary traits for use as a DMA buffer including Static. Seeing as no unsafe was required, it didn't occur to me it would lead to any issues, and while I did think Static was an odd name I thought it just implied that the buffer should be owned as it was very similar to the std Borrow trait and the buffer was passed by value to the WriteDma::write.

After removing this type in favour of using a &'static mut [u8], all of our strange LED noise issues are gone and we no longer appear to get the Ethernet packet loss that we were getting before!

@TheZoq2
Copy link
Member

TheZoq2 commented Sep 29, 2020

Oh wow, that sounds like a horrible bug to track down 😰. I'll try and have a look at this tomorrow evening

Sidenote: Maybe that could explain why the backlight on one half of my keyboard flickers while pressing keys on the other half 🤔

@mitchmindtree
Copy link
Contributor

I wonder if this could be addressed (in stm32f1xx-hal at least) by requiring that the Static trait's borrow method returns a &'static B rather than &B?

@TheZoq2
Copy link
Member

TheZoq2 commented Sep 29, 2020

A friend of mine also suggested just that. I've got to admit I'm a bit out of my comfort zone with DMA soundness issues, but that might work.

I think a better solution is to just use embedded-dma as suggested

@thalesfragoso
Copy link
Member Author

I wonder if this could be addressed (in stm32f1xx-hal at least) by requiring that the Static trait's borrow method returns a &'static B rather than &B?

I think that would make things very inflexible, the new embedded-dma API is a better solution imo, and I don't think it would be that hard to change to it. I might do a PR in the coming days (not sure if before the weekend though).

@TheZoq2
Copy link
Member

TheZoq2 commented Sep 30, 2020

Yea, I agree that we should use embedded-dma.

I might do a PR in the coming days (not sure if before the weekend though).

Sounds good, I could do it but I think you have more DMA experience than me :) Though let me know if you can't find the time, it feels like we should try to get this done fairly quickly given the severity :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants