Some TOOL1LC boards give incorrect temperature readings using 3.6.0-beta.2 firmware #1060

Closed
dc42 opened this issue Nov 18, 2024 · 4 comments
Assignees
Labels
bug Bug that has been reproduced Done - Needs Testing
Milestone

Comments

@dc42
Collaborator

dc42 commented Nov 18, 2024

When running 3.6.0-beta.2 firmware, some TOOL1LC boards give incorrect temperature readings for thermistors connected to TEMP0, for example about 2°C instead of about 21°C. Reported by Andy S and also seen by me. This forum post is probably related: https://forum.duet3d.com/topic/36964/3-6-0-beta-2-issue-1lc-inconsistent-temp-reading

In my toolchanger the version 1.2a board at CAN address 22 exhibits this issue but the version 0.6 board at address 23 does not. If I swap the CAN addresses and restart, the issue moves to address 23, i.e. it stays with the board hardware, so the difference in behaviour is not caused by differences in board configuration.

@dc42 dc42 added the bug Bug that has been reproduced label Nov 18, 2024
@dc42 dc42 added this to the 3.6.0 milestone Nov 18, 2024
@dc42 dc42 self-assigned this Nov 18, 2024
@dc42
Collaborator Author

dc42 commented Nov 18, 2024

Further investigation: using commit 4cf3 of the Duet3Expansion project and commit 40c9 of CANlib, the issue occurs when commit 4980 of CoreN2G is used but not when the earlier commit fa3a is used. The difference is that in the later commit, the DMA descriptors and write-back sections are put in a separate RAM section just after the CAN message buffers, which is cleared to zero by the startup code, instead of in bss. Here are some possible causes of the issue:

  1. Moving the DMA writeback address has upset the DMA controller in some way so that DMA transfers from the ADC on the SAMC21 sometimes yield incorrect results.
  2. The area of memory that the DMA descriptors have been moved to gets corrupted by something, causing incorrect DMA from the ADC.
  3. Moving the DMA descriptors has caused the address of the TEMP0 filter buffer to move, and the address it moved to is getting corrupted.

Changing the DMA channel used by the ADC from 2 to 5 does not solve the problem, which makes it less likely that possible cause 2 above is the reason.
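For readers unfamiliar with the mechanism described above, here is a minimal sketch of how DMA descriptor and write-back tables can be placed in a dedicated RAM section with GCC. It is not the CoreN2G code. The section name `.DmaBuffers`, the channel count, the descriptor layout, the 16-byte alignment and all identifiers are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical channel count, chosen only for this illustration. */
#define NUM_DMA_CHANNELS 12

/* Assumed descriptor layout (16 bytes); not taken from the issue. */
typedef struct {
    uint16_t btctrl;    /* block transfer control */
    uint16_t btcnt;     /* block transfer count */
    uint32_t srcaddr;   /* source address */
    uint32_t dstaddr;   /* destination address */
    uint32_t descaddr;  /* address of next descriptor */
} DmaDescriptor;

/* Put the tables in a named section on ELF targets; elsewhere only align them.
 * The section name ".DmaBuffers" is an assumption. */
#if defined(__ELF__)
#define DMA_SECTION __attribute__((section(".DmaBuffers"), aligned(16)))
#else
#define DMA_SECTION __attribute__((aligned(16)))
#endif

DmaDescriptor dma_descriptors[NUM_DMA_CHANNELS] DMA_SECTION;
DmaDescriptor dma_writeback[NUM_DMA_CHANNELS] DMA_SECTION;

/* Sanity check that could run early at startup: verifies size and alignment. */
void check_dma_tables(void)
{
    assert(sizeof(DmaDescriptor) == 16u);
    assert(((uintptr_t)dma_descriptors % 16u) == 0u);
    assert(((uintptr_t)dma_writeback % 16u) == 0u);
}
```

In a real firmware build the linker script decides where such a section lands in RAM, and the startup code has to clear it explicitly because it is no longer part of bss.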

@dc42
Collaborator Author

dc42 commented Nov 18, 2024

Further investigation revealed that even if the DMA descriptors are left in bss the problem still exists. Removing the startup code to initialise the [empty] DmaBuffers section then fixes the problem.

@dc42
Collaborator Author

dc42 commented Nov 18, 2024

Solved the problem on my toolchanger by reducing the optimisation level of the startup code. With the old code, the compiler turned the loops into calls to memcpy and memset. Adding __attribute__((optimize("-fno-tree-loop-distribute-patterns"))) to prevent that didn't fix it, but changing the optimisation level of ResetHandler to -Os did (the project setting is -O3). I also changed the optimisation level of InitClocks to -Os. A new binary for others to test is at https://www.dropbox.com/scl/fo/kaxbpbsyq2lxbae4tmzpk/AAPVUAh6CO_FMO-cG9ttX6U?rlkey=f0aesppc8d2p7bzwwi2s1jayo&dl=0
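As an illustration of the per-function override mentioned above, the following sketch applies GCC's `optimize` function attribute to two stand-in startup loops. The macro `STARTUP_OPT` and the functions `clear_words` and `copy_words` are hypothetical names, not the actual ResetHandler code. The sketch only shows the syntax and does not explain why -Os avoids the fault.

```c
#include <assert.h>
#include <stdint.h>

/* The optimize attribute is GCC-specific, so it is applied only when
 * compiling with GCC. STARTUP_OPT is a hypothetical macro name. */
#if defined(__GNUC__) && !defined(__clang__)
#define STARTUP_OPT __attribute__((optimize("Os")))
#else
#define STARTUP_OPT
#endif

/* Stand-in for a startup loop that zeroes a RAM region word by word. */
STARTUP_OPT
void clear_words(uint32_t *start, uint32_t *end)
{
    assert(start <= end);
    while (start < end) {
        *start++ = 0u;
    }
}

/* Stand-in for a startup loop that copies initialised data into RAM. */
STARTUP_OPT
void copy_words(const uint32_t *src, uint32_t *dst, uint32_t *dst_end)
{
    assert(dst <= dst_end);
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}
```

The attribute changes the optimisation settings for the annotated function only, so the rest of the project can stay at its normal level.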

@dc42
Collaborator Author

dc42 commented Dec 4, 2024

@dc42 dc42 closed this as completed Dec 4, 2024