tests: nrf: posix: portability.posix.common.tls.newlib fails on nrf9160dk_nrf9160 #31721

Closed
PerMac opened this issue Jan 27, 2021 · 17 comments
Labels
area: Tests Issues related to a particular existing or missing test bug The issue is a bug, or the PR is fixing a bug platform: nRF Nordic nRFx priority: low Low impact/importance bug

Comments

@PerMac
Member

PerMac commented Jan 27, 2021

Describe the bug
The test portability.posix.common.tls.newlib from tests/posix/common/ fails on nrf9160dk_nrf9160

To Reproduce
Steps to reproduce the behavior:

  1. Have nrf9160dk connected
  2. call ./scripts/twister --device-testing -T tests/posix/common/ -p nrf9160dk_nrf9160 --device-serial /dev/ttyACM0 -v -v
  3. See error

Expected behavior
The test passes

Impact
Not clear

Logs and console output
The part containing FAIL status:

START - test_posix_realtime
POSIX clock set APIs
Assertion failed at WEST_TOPDIR/zephyr/tests/posix/common/src/clock.c:75: test_posix_realtime: (rts.tv_nsec not equal to mts.tv_nsec)
Nanoseconds not equal
FAIL - test_posix_realtime

Environment (please complete the following information):

  • OS: Ubuntu 18.04
  • Toolchain: Zephyr SDK
  • Commit Version used: Zephyr v2.5.0-rc1-11-gf91e9f
@PerMac PerMac added bug The issue is a bug, or the PR is fixing a bug platform: nRF Nordic nRFx area: Tests Issues related to a particular existing or missing test labels Jan 27, 2021
@nashif nashif added the priority: low Low impact/importance bug label Jan 30, 2021
@ioannisg
Member

ioannisg commented Feb 2, 2021

I will take a look.

@PerMac
Member Author

PerMac commented Feb 2, 2021

I think this test is unstable in fact. It eventually passes after some retries.

@ioannisg
Member

ioannisg commented Feb 3, 2021

I can't reproduce this, @PerMac. Are you sure you keep seeing the error?
I am on Zephyr v2.5.0-rc2.

@ioannisg
Member

ioannisg commented Feb 3, 2021

My suggestion is to close this ticket.

@ioannisg
Member

ioannisg commented Feb 3, 2021

> I think this test is unstable in fact. It eventually passes after some retries

It does not fail for me (but I only tested with GNU Arm Embedded).

@PerMac
Member Author

PerMac commented Feb 3, 2021

I still see this with v2.5.0-rc2. Sometimes some scenarios from this test pass, but most of the time at least one fails, and it is always the realtime test case. I use Zephyr SDK 0.12.2.

@pabigot
Collaborator

pabigot commented Feb 3, 2021

I've run the reproducing command on v2.5.0-rc2-8-gcf946d3365 five times and all tests pass on my rev 0.8.5 nrf9160dk_nrf9160. Perhaps it's device-specific (bad crystal)? Also using SDK 0.12.2.

@PerMac
Member Author

PerMac commented Feb 3, 2021

I will close the ticket then. It might be, as you suggest, Peter, an issue with a particular board (if it matters, I have rev 0.9.0 on my desk).

@PerMac
Member Author

PerMac commented Mar 3, 2021

@pabigot I reopened the issue because this test started failing again recently. It works with v2.5.0. Last time, I was able to bisect with inverted good/bad logic and found that 544475d was the commit that (unnoticed) fixed the issue.

@ioannisg
Member

ioannisg commented Mar 3, 2021

so is this a regression then?

@pabigot
Collaborator

pabigot commented Mar 3, 2021

> so is this a regression then?

Possibly, or it was just chance that it passed in #31721 (comment). It does fail for me today; I'll see if I can bisect.

@PerMac
Member Author

PerMac commented Mar 3, 2021

Yes, I believe it is a regression.
I tried to bisect to find where the issue was introduced but had no success. The test was already unstable, which makes bisecting by hand a nightmare. However, I think we can narrow down the range to check:
The test passed in our internal CI for zephyr-v2.5.0-268-g3fe33 (though only after 1 retry).
From zephyr-v2.5.0-441-g5de3f onwards it keeps failing, and retries do not help.
Our CI is set to do 2 retries.
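
As a rough illustration of why retries can hide an intermittent failure and why hand bisection of such a test is unreliable, the sketch below assumes every run fails independently with the same probability. That is a simplifying assumption, not something measured in this thread, and the function names and example numbers are hypothetical.

```python
import math


def pass_probability(p_fail, retries):
    """Chance that at least one of (1 + retries) attempts passes.

    Assumes each attempt fails independently with probability p_fail.
    """
    if not 0.0 <= p_fail <= 1.0 or retries < 0:
        raise ValueError("need 0 <= p_fail <= 1 and retries >= 0")
    return 1.0 - p_fail ** (1 + retries)


def reps_to_detect(p_fail, confidence):
    """Smallest number of runs n such that the chance of seeing at least
    one failure, 1 - (1 - p_fail)**n, reaches the requested confidence.
    """
    if not 0.0 < p_fail < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("p_fail and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))
```

Under these assumptions, a test that fails half the time still gets through a CI with 2 retries with probability `pass_probability(0.5, 2)`, which is 0.875. In the other direction, a commit whose test fails only 10% of the time needs `reps_to_detect(0.1, 0.95)`, that is 29 runs, before a clean result is 95% convincing, so a single pass at a bisect step says very little.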

@PerMac
Member Author

PerMac commented Mar 3, 2021

I don't think so, @pabigot. Please check my comments above. To recap: I found that the test was last actually fixed by 544475d.
It worked (with retries) up to and including zephyr-v2.5.0-268-g3fe33.
It has not passed, even with retries, since zephyr-v2.5.0-441-g5de3f.

@pabigot
Collaborator

pabigot commented Mar 3, 2021

I hit a failure at zephyr-v2.5.0-108-g84e4e62c2db5 so I think the problem may have been reintroduced earlier. I'll report back when I know more.

@pabigot
Collaborator

pabigot commented Mar 3, 2021

My bisect between v2.5.0-rc2-8-gcf946d3365 and zephyr-v2.5.0-693-gd19741f1ec located the failure at zephyr-v2.5.0-7-g91946ef21c.

zephyr-v2.5.0-6-gdd4322154067 passed ten (10) reps.
zephyr-v2.5.0-6-gdd4322154067 also passed thirty (30) reps (total 40)

zephyr-v2.5.0-7-g91946ef21c fails quickly.

zephyr-v2.5.0-268-g3fe33 failed at rep 8 so the problem was visible there, even if it wasn't revealed in local testing.

All this with SDK 0.12.3.

zephyr-v2.5.0-7-g91946ef21c (91946ef) is not supposed to change any behavior, so this may be due to something as subtle as code placement and caching.

I've got a nice script that does the necessary testing for a generic twister failure at a particular commit. It could probably be used with git bisect run to automate the narrowing down process. I'll clean it up and get it posted somewhere.
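
The script mentioned above is not shown in this thread. The following is only a minimal sketch of what such a helper could look like, assuming a flaky test should count as bad as soon as any one of several repetitions fails. The names `classify` and `main` and the file name used below are hypothetical. It relies on the `git bisect run` convention that exit code 0 marks a commit good, codes 1 to 127 (except 125) mark it bad, and anything above 127 aborts the bisect.

```python
import subprocess
import sys

GOOD = 0     # every repetition passed
BAD = 1      # at least one repetition failed
ABORT = 255  # usage error: stop the bisect instead of guessing


def classify(cmd, reps):
    """Run cmd up to reps times and stop at the first non-zero exit.

    Build errors also produce a non-zero exit, so this sketch reports
    them as BAD rather than skipping the commit.
    """
    for rep in range(1, reps + 1):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"rep {rep}/{reps} failed (exit {result.returncode})",
                  file=sys.stderr)
            return BAD
    return GOOD


def main(argv):
    """argv is [REPS, CMD, ARGS...]; returns an exit code for git bisect."""
    if len(argv) < 2 or not argv[0].isdigit() or int(argv[0]) < 1:
        print("usage: REPS CMD [ARGS...]", file=sys.stderr)
        return ABORT
    return classify(argv[1:], int(argv[0]))
```

Saved as, say, `repeat_test.py` with `sys.exit(main(sys.argv[1:]))` as its last line, it could be driven by `git bisect run python3 repeat_test.py 10` followed by the twister command from the issue description. The repetition count is a judgment call: too few repetitions will mark a flaky commit as good and send the bisect in the wrong direction.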

@ioannisg
Member

This does not really seem to fail; I've tested it with 2.6.0-RC1 and with current master.
Closing for now. @PerMac, please reopen if you test this and it fails for nrf9160.

@PerMac
Member Author

PerMac commented May 20, 2021

After patching nrf53 with #35455, I see these tests failing most of the time on nrf5340dk_nrf5340_cpuappns. I haven't seen them fail on the non-ns version yet. I created a separate issue, #35509, for timer-related instability in tests.


5 participants