Ztest "1cpu" cases don't retarget interrupts on x86_64 #21216

Closed
andyross opened this issue Dec 5, 2019 · 2 comments
Labels: area: SMP, area: Tests, area: X86_64, bug, priority: low, Stale

Comments

@andyross
Contributor

andyross commented Dec 5, 2019

On SMP, many tests weren't designed to work with multiple CPUs and make use of a "1cpu" ztest variant. The way this works is fairly crude: it spawns a thread which locks interrupts and spins, forcing the test to operate on only one CPU.

But with x86_64 in particular, with the default IO-APIC destination settings for an interrupt ("fixed" delivery to the "lowest priority" "physical" CPU), the HPET timer interrupt sometimes gets directed to the locked CPU. Since that CPU is spinning with interrupts locked, the interrupt doesn't get handled, and the test will fail (usually hanging).

This has ping-ponged in the source tree. The original (single CPU) targeting was replaced in commit 5a9a33b with a "logical" delivery to a CPU/APIC ID of 0xff, which (in qemu at least) works to broadcast the interrupt to all CPUs. But this failed on UP2 hardware and got reverted in commit 005aff7, accidentally introducing the bug detailed here, and had to be re-reverted in 23bddde.
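
For readers unfamiliar with the two routing styles mentioned above, here is a minimal sketch of how they differ in a 64-bit IO-APIC redirection table entry. The bit positions follow the standard 82093AA-style layout (vector in bits 0-7, delivery mode in bits 8-10, destination mode in bit 11, destination in bits 56-63). The macro and function names are made up for illustration and are not Zephyr's, and the "fixed" delivery mode used in the logical case is an assumption rather than something taken from the commits.

```c
#include <stdint.h>

/* Illustrative names only; these are not Zephyr's ioapic definitions. */
#define RTE_DELIVERY_FIXED      (0x0ull << 8)   /* bits 8-10: delivery mode */
#define RTE_DELIVERY_LOWEST_PRI (0x1ull << 8)
#define RTE_DEST_PHYSICAL       (0x0ull << 11)  /* bit 11: destination mode */
#define RTE_DEST_LOGICAL        (0x1ull << 11)
#define RTE_DEST_SHIFT          56              /* bits 56-63: destination */

/* Pack the routing-related fields of one redirection table entry. */
static uint64_t rte_encode(uint8_t vector, uint64_t delivery,
                           uint64_t dest_mode, uint8_t dest)
{
    return (uint64_t)vector | delivery | dest_mode |
           ((uint64_t)dest << RTE_DEST_SHIFT);
}

/* Single-CPU targeting: physical mode, destination is one APIC ID. */
static uint64_t rte_single_cpu(uint8_t vector, uint8_t apic_id)
{
    return rte_encode(vector, RTE_DELIVERY_FIXED, RTE_DEST_PHYSICAL, apic_id);
}

/* Logical mode with destination 0xff. Assumption: fixed delivery mode;
 * which CPUs actually match depends on how each local APIC is set up. */
static uint64_t rte_logical_all(uint8_t vector)
{
    return rte_encode(vector, RTE_DELIVERY_FIXED, RTE_DEST_LOGICAL, 0xff);
}
```

With single-CPU targeting, the interrupt is only safe as long as the targeted CPU is not the one that the "1cpu" spinner has locked.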

This is fairly rare in practice: just one test fails with notable frequency. But really this needs some kind of architectural solution. I can see two good ones:

  1. Augment the arch layer on SMP platforms with a "mask and disable delivery of interrupts" API that can be used to disable interrupts for long term tasks like this.

  2. Deprecate the "1cpu" ztest feature and make all test cases SMP-safe by design. This is a ton of work, and some tests make some hard assumptions about preemption behavior (e.g. it's really not possible to get kernel/sched/preempt to work at all without knowing that everything happens on one CPU).
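
To make option 1 a bit more concrete, here is a toy model in plain C of what such an API might look like from the caller's side. Everything here is hypothetical: the `sketch_` functions, the two-CPU starting mask, and the "lowest-numbered eligible CPU" routing policy are assumptions for illustration, and no hardware is programmed.

```c
#include <stdint.h>

/* Hypothetical model: one bit per CPU that may receive device interrupts.
 * Assumption: a 2-CPU system with both CPUs eligible at start. */
static uint32_t sketch_irq_eligible_cpus = 0x3;

/* Hypothetical arch hook: stop delivering device interrupts to `cpu`. */
static void sketch_irq_exclude_cpu(unsigned int cpu)
{
    sketch_irq_eligible_cpus &= ~(1u << cpu);
}

/* Hypothetical arch hook: allow delivery to `cpu` again. */
static void sketch_irq_include_cpu(unsigned int cpu)
{
    sketch_irq_eligible_cpus |= (1u << cpu);
}

/* Toy routing policy: the lowest-numbered eligible CPU, or -1 if none. */
static int sketch_irq_pick_target(void)
{
    for (int cpu = 0; cpu < 32; cpu++) {
        if (sketch_irq_eligible_cpus & (1u << cpu)) {
            return cpu;
        }
    }
    return -1;
}
```

In this model, the "1cpu" spinner thread would call `sketch_irq_exclude_cpu()` for its own CPU before locking interrupts and `sketch_irq_include_cpu()` when the test case finishes, so a timer interrupt is never routed to a CPU that cannot service it.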

@andyross added the bug label Dec 5, 2019
@jhedberg added the has-pr and priority: low labels and removed the has-pr label Dec 10, 2019
@carlescufi added the area: Tests, area: SMP and area: X86_64 labels Apr 30, 2020
@andyross
Contributor Author

Some notes about triage: this isn't a severe bug, but the fix is going to be quite involved on x86, as we don't have a framework for dealing with interrupt targeting in the way that would be needed. For traditional drivers like HPET (which is all that's needed now) we can fix this with some ioapic code, but in the general case any MSI interrupt is going to need some kind of support at the device and/or PCI layer. Nontrivial, and not likely to be fixed this cycle.

@github-actions

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.
