Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STM: Clear stale data from STM SPI peripheral before starting data transfer #14534

Closed
wants to merge 4 commits into from

Conversation

rvasquez6089
Copy link

Summary of changes

Note: This is a resubmission. I accidentally closed my previous PR.
This change fixes a bug in in STM SPI API. I was having trouble with the SPI peripheral reading in data after resetting the micro controller using the reset pin. It would seem that after a reset by the reset pin.... stale data would be left in the read buffer of the SPI peripheral.
This would cause the RXNE flag of the SPI peripheral to be high.
This would cause the spi_master_write() function to incorrectly read the receive buffer before valid data was in the buffer. I was able to verify this bug with an oscilloscope.
This pull request checks the state of the RXNE flag and clears the receive buffer appropriately before starting an SPI data transfer.
I'm not sure if this is a silicon error or a hal error, but this code fixes the issue!

It has preliminary support for the STM32H7 series, as well as 16 and 32bit transfers.

I have only been able to test this code on a STM32F767ZI on SPI 2. More testing will be needed on other platforms to verify this fix.

Impact of changes

Migration actions required

Documentation

None.

Pull request type

[X] Patch update (Bug fix / Target update / Docs update / Test update / Refactor)
[] Feature update (New feature / Functionality change / New API)
[] Major update (Breaking change E.g. Return code change / API behaviour change)

Test results

[] No Tests required for this change (E.g docs only update)
[X] Covered by existing mbed-os tests (Greentea or Unittest)
[] Tests / results supplied as part of this PR

Reviewers


@ciarmcom ciarmcom added the release-type: patch Indentifies a PR as containing just a patch label Apr 12, 2021
@ciarmcom
Copy link
Member

@rvasquez6089, thank you for your changes.
@ARMmbed/mbed-os-maintainers please review.

@ciarmcom ciarmcom requested a review from a team April 12, 2021 17:00
@0xc0170 0xc0170 changed the title Bug Fix: Resubmission Clear stale data from STM SPI peripheral before starting data transfer STM: Clear stale data from STM SPI peripheral before starting data transfer Apr 13, 2021
@0xc0170 0xc0170 requested a review from a team April 13, 2021 07:27
@LMESTM
Copy link
Contributor

LMESTM commented Apr 14, 2021

@rvasquez6089 thanks for analysing the issue you have faced and coming up with a proposed fix.

We could suppose that a reset would avoid any stale data ... if not the system reset, then the peripheral reset should do ...
but it seems that the peripheral reset is only done during free:

__HAL_RCC_SPI1_FORCE_RESET();

Could you try to test a peripheral reset sequence during init as well for your peripheral under test and see if that helps ?
Basically by changing

__HAL_RCC_SPI2_CLK_ENABLE();
with the few lines below:

#if defined SPI2_BASE
if (spiobj->spi == SPI_2) {
__HAL_RCC_SPI2_CLK_ENABLE();
__HAL_RCC_SPI2_FORCE_RESET();
__HAL_RCC_SPI2_RELEASE_RESET();
spiobj->spiIRQ = SPI2_IRQn;
}
#endif

?
Thanks in advance your feedback

@rvasquez6089
Copy link
Author

@LMESTM I will give this a try and report back next week. 👍

@rvasquez6089
Copy link
Author

@LMESTM

I had to change the order of the reset and enable. It seems like the reset functions disable the peripheral in some way so it is best to enable after resetting them.

 #if defined SPI2_BASE
if (spiobj->spi == SPI_2) {
__HAL_RCC_SPI2_FORCE_RESET();
__HAL_RCC_SPI2_RELEASE_RESET();
__HAL_RCC_SPI2_CLK_ENABLE();
spiobj->spiIRQ = SPI2_IRQn;
}
#endif

I added this change to all of the SPI peripherals. Check my commits for the full change.

Copy link
Collaborator

@jeromecoutant jeromecoutant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good! :-)
SPI tests are all OK on my side for all STM32 families.

Just request to correct space alignment and remove unused code.

Thx!

Comment on lines 801 to 825
//#if TARGET_STM32H7
// if (!LL_SPI_IsActiveFlag_RXP(SPI_INST(obj)))///Clears stale data in the receive buffer is there is any.
//#else /* TARGET_STM32H7 */
// /* Wait for RXNE flag before reading */
// if (LL_SPI_IsActiveFlag_RXNE(SPI_INST(obj)))///Clears stale data in the receive buffer is there is any.
//#endif /* TARGET_STM32H7 */
// {
// if (LL_SPI_IsActiveFlag_RXNE(SPI_INST(obj)))
// {
// /* Read received data */
// if (bitshift == 1)
// {
// LL_SPI_ReceiveData16(SPI_INST(obj));
//#ifdef HAS_32BIT_SPI_TRANSFERS
// } else if (bitshift == 2) {
// LL_SPI_ReceiveData32(SPI_INST(obj));
//#endif
// }
// else{
// LL_SPI_ReceiveData8(SPI_INST(obj));
// }
// }
// }


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this code

@@ -184,41 +183,53 @@ static void _spi_init_direct(spi_t *obj, const spi_pinmap_t *pinmap)
#if defined SPI1_BASE
// Enable SPI clock
if (spiobj->spi == SPI_1) {
__HAL_RCC_SPI1_FORCE_RESET();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alignment issue :-)

@LMESTM
Copy link
Contributor

LMESTM commented Apr 19, 2021

@rvasquez6089 so do you confirm that the reset sequence is fixing the issue you're facing ?

@rvasquez6089
Copy link
Author

rvasquez6089 commented Apr 19, 2021

@LMESTM The issue stills seems to be there intermittently, but the frequency of issue has gone down significantly. I will check that my user code isn't causing this issue. However, I don't see how that could be the case when using the high level mbed SPI library.
I think one more solution would be clear the input buffer in the init code as well as force reset the peripheral. I will try that next. As well as post screenshots from my oscilloscope.
Edit: I am currently doing a full write up on this. I will post later today in this thread.

@rvasquez6089
Copy link
Author

rvasquez6089 commented Apr 20, 2021

All,
After doing extended testing.....All I can say is that this problem continues to get more and more mysterious to me.
I have created a public repository with sample code that recreates my findings below. My findings below were done with proprietary company code so I can't share it. I hope that this public repository is of use to you.

I have also found that adding a 100us second delay after changing the SPI frequency increases the chances of the SPI send/receiving data in the correct mode. I hoping that someone on the STM team can shed light on my findings. I found the results to be very random . I have tested my sample code on 3 identical boards just to verify there isn't any damage to my MCU or other parts of my test setup.

Lastly, here are some pre compiled binary of the sample code with the changes below if it helps. 👍
mbed-os-spi-testing clear recieve buffer fix.zip
mbed-os-spi-testing mbed-os master 6.9.0+.zip

System Information

mbed cli version: 1.10.2
Compiler: GNU Arm Embedded Toolchain Version 9-2019-q4-major
Operating System: Windows 10 Pro Build 18636

Here are the output waveforms from the STM32F767.

Original Code: Tag mbed-os-6.9.0 + Output waveforms.

Channel 1 (Yellow) CLK
Channel 2 (Green) MOSI
Channel 3 (Blue) MISO
Channel 4 (Pink) CS

After Reprogramming.

I get this waveform after programming the chip with an ST-LINKV3 Mini
It is clear that the uC takes time after each byte the move the data out of the receive register. Everything works as expected after a fresh reprogram. 👍
scope_53
I am also printing out the received on the 3rd byte data.
image

After Reset with Reset Button

I get this waveform after pressing the reset button on my custom development board.
The uC does not take time to move data out of the receive register. Same code as before, same mbed-os tag. But the code does not run the same way after pressing the reset button. 👎 I'm not sure what is the different in running the code after a programming or resetting the microcontroller.
scope_55
I am also printing out the received on the 3rd byte data.
image

Sometimes the transfer doesn't even complete before the chip select line goes high again. This is a random occurrence.
scope_54

Method 1: Clear the receive buffer before each transfer. rvasquez6089/mbed-os commit ddc9fc8

I get this waveform after programming the chip with an ST-LINKV3 Mini
Still works great after reprogramming 👍

scope_57
image

After Reset with Reset Button

I get this waveform after pressing the reset button on the development board. Everything still works as expected. 👍 There doesn't appear be a significant added delay from the extra branch in the code either

scope_53
image

Method 2: Suggested fix from @LMESTM rvasquez6089/mbed-os commit 6f6443f (Force reset in the _spi_init_direct)

After Reprogramming.

Works the same after programming. 👍
It is clear that the uC takes time after each byte the move the data out of the receive register.
scope_58

After Reset with Reset Button

I get this waveform after pressing the reset button on my custom development board. Same as before.
The uC does not take time to move data out of the receive register. It doesn't even wait for the data to finish transmitting before raising the chip select line.
scope_59
image

@LMESTM
Copy link
Contributor

LMESTM commented Apr 20, 2021

@rvasquez6089 hmm, I don't really have a clue. It may take some time before we find time to reproduce locally and investigate further. for now, I will ask internally if that rings a bell to anyone.

I am just wondering if you shouldn't move the CS initialization before the SPI driver creation to make sure that CS is never asserted before your actual code starts. Even if without clock, I wouldn't expect anything to happen ...

DigitalOut CS_OUT(PC_13, 1);
DigitalOut CS1(PG_7, 1);
SPI spi2(PC_3, PC_2, PD_3);

EDIT: first input from SPI experts, the the MOSI would be expected LOW after and before each sequence. So there might be some glitch related to wrong pins default states / configuration. Maybe we should use pull_down for MOSI ... so, after testing CS1 init change above, may you also try to change pull_non to pull_down and see if it helps ?
https://github.com/ARMmbed/mbed-os/blob/master/targets/TARGET_STM/stm_spi_api.c#L229

thanks

@rvasquez6089
Copy link
Author

EDIT: first input from SPI experts, the the MOSI would be expected LOW after and before each sequence. So there might be some glitch related to wrong pins default states / configuration. Maybe we should use pull_down for MOSI ... so, after testing CS1 init change above, may you also try to change pull_non to pull_down and see if it helps ?

I will give this a try early next week and post the results.

@jeromecoutant
Copy link
Collaborator

Hi @rvasquez6089

first input from SPI experts, the the MOSI would be expected LOW after and before each sequence. So there might be some glitch related to wrong pins default states / configuration. Maybe we should use pull_down for MOSI ... so, after testing CS1 init change above, may you also try to change pull_non to pull_down and see if it helps ?
https://github.com/ARMmbed/mbed-os/blob/master/targets/TARGET_STM/stm_spi_api.c#L229

Did you have time to check with the pull down configuration?

Thx

@rvasquez6089
Copy link
Author

I will get back you next week with results. I am/was in the process of finishing a project for a customer.

@ciarmcom
Copy link
Member

This pull request has automatically been marked as stale because it has had no recent activity. @rvasquez6089, please carry out any necessary work to get the changes merged. Thank you for your contributions.

@ciarmcom ciarmcom added stale Stale Pull Request and removed stale Stale Pull Request labels May 10, 2021
@ciarmcom ciarmcom added stale Stale Pull Request and removed stale Stale Pull Request labels May 21, 2021
@jeromecoutant
Copy link
Collaborator

I will give this a try early next week and post the results.

Any chance to find some testing time ? :-)

@ciarmcom ciarmcom added stale Stale Pull Request and removed stale Stale Pull Request labels May 28, 2021
@rvasquez6089
Copy link
Author

rvasquez6089 commented Jun 2, 2021

@jeromecoutant Not yet. I've been peeling F767ZI chips from Nucleo boards to put on customer projects due to chip shortage. 😆 I'm all out of Nulceo's to do testing on. I will eventually get back to this PR. I want to get the problem solved in a way that meets the Mbed foundation standards. For now I'm just shipping code from my fork to customers.
I will get back with more information and waveforms when I have more time. Hopefully before the end of June!

@0xc0170
Copy link
Contributor

0xc0170 commented Jun 2, 2021

@rvasquez6089 thanks for the update. We can close this pull request and you can reopen once updated. I'll close it now.

@0xc0170 0xc0170 closed this Jun 2, 2021
@mergify mergify bot removed devices: st needs: work release-type: patch Indentifies a PR as containing just a patch stale Stale Pull Request labels Jun 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants