fix INTRNG interrupt release #1281

ehem · 2024-06-07T21:41:00Z

SSIA.

While it is good intr_isrc_deregister() exists, it hasn't ever actually been used. Due to a driver which needs this, fix issues found by reviewing the source and finding solutions.

ehem · 2024-06-07T21:45:24Z

@markjdb prefers intr_event_destroy() returning an error for ie == NULL. Problem is this is answering the wrong question for most callers. Since this is called from cleanup/error paths, the caller wants to know Has everything been properly cleaned up? Whereas returning EINVAL is answering Has everything been cleaned up and was everything in **perfect** condition?

I'm pretty sure most, if not all callers will want the answer to the first since EBUSY means memory is likely being leaked. Whereas for INTRNG, the event being NULL simply means no event was ever allocated and things are fine.

ehem · 2024-06-07T22:08:49Z

Putting this a different way, given a random struct intr_irqsrc *isrc;, the following must always terminate no matter the initial state of isrc:

while(intr_isrc_deregister(isrc)!=0)
    sleep(1);

(this would be a hypothetical PIC which implemented lazy release of interrupts)

If the filter hadn't yet been removed from the event, the first call to intr_isrc_deregister() would successfully release isrc->isrc_irq, but then return EBUSY from intr_event_destroy(). For subsequent calls currently isrc_free_irq() would return EINVAL and the loop would never terminate.

ehem · 2024-06-11T23:03:58Z

This was simply rebased to aid keeping track of local branches, everything is identical to the original. Notably the CI pass is still valid.

bsdimp

So I'm inclined to just close this as not desired. There's so much here that needs to change, I'm not sure that there's anything left

bsdimp · 2024-06-15T05:00:40Z

sys/kern/subr_intr.c

+	if (error != 0)
+		return (error);
+
+	return (isrc_event_destroy(isrc));


As noted above, this routine is never called. Why do we need to call isrc_event_destroy() here? That's new and not quite what the commit message says. How did you test this?

Notice how the event is created by isrc_event_create()? Given that, it is INTRNG's job to destroy the event, yet this call was missing. I simply needed to examine intr_isrc_deregister() and notice the missing call. The driver is sitting in my tree due it needing intr_isrc_deregister() to actually work.

bsdimp · 2024-06-15T05:00:46Z

sys/kern/subr_intr.c

-	isrc->isrc_irq = INTR_IRQ_INVALID;	/* just to be safe */
+	if (irq_sources[isrc->isrc_irq] == isrc)
+		irq_sources[isrc->isrc_irq] = NULL;
+	isrc->isrc_irq = INTR_IRQ_INVALID;	/* mark removed */



We shouldn't be papering over programing errors. It's a programming error when isrc_irq is >= intr_nirq. That normally can't happen. Likewise with setting things to NULL in this way. This is a bad change. NAK.

Hard to say for sure, though, since there's not actually any callers of this in the tree. This is only called from intr_isrc_deregister, which is global, but has no actual callers. So how did you test this?

But this is unsafe. If they don't match, it must be an error.

We shouldn't be papering over programing errors. It's a programming error when isrc_irq is >= intr_nirq. That normally can't happen. Likewise with setting things to NULL in this way. This is a bad change. NAK.

A hypothetical scenario, but one which I believe should be handled as it could nominally be used. Imagine a highly hot-plug bus on a system with >1000 processors. On such a hot-plug bus it could nominally be valid to have one thread which calls intr_isrc_deregister() at 1 second intervals and a separate thread doing intr_teardown_irq() calls. In such a case the sequence:

intr_isrc_deregister(); intr_teardown_irq(); intr_isrc_deregister();

Issue is the first intr_isrc_deregister() call would get EBUSY from intr_event_destroy() since the filter wasn't yet removed. Presently the second would get EINVAL from isrc_free_irq() since the isrc had already been removed from the table by the first call. When it comes down to it, either isrc_free_irq() or isrc_event_destroy() must tolerate being called multiple times with the same isrc.

Hard to say for sure, though, since there's not actually any callers of this in the tree. This is only called from intr_isrc_deregister, which is global, but has no actual callers. So how did you test this?

By reading the source and concluding this was a nominally valid course of action. I've got a driver which is expected to call intr_isrc_deregister() during normal operation. As such I'm examining intr_isrc_deregister() and discovering it is rather buggy. The x86 equivalent driver is unable to release interrupts and is thus forced to keep them on a list. Whereas INTRNG has the function partially implemented, but untested.

I have no expectation to need this exact functionality, but when trying to fix the problems with intr_isrc_deregister() this issue shows up. Presently I'm unable to submit the driver since 3 key areas it needs working aren't in usable shape.

bsdimp · 2024-06-15T05:01:22Z

sys/kern/kern_intr.c

@@ -531,7 +531,7 @@ intr_event_destroy(struct intr_event *ie)
 {

 	if (ie == NULL)
-		return (EINVAL);
+		return (0);


This change is wrong. We want this to be an error. It's a programming error to call this with NULL.

% find sys -type f -print0 | xargs -0 grep -eintr_event_destroy -l sys/netpfil/pf/if_pfsync.c sys/netpfil/pf/pf_ioctl.c sys/netpfil/pf/pflow.c sys/kern/kern_intr.c sys/kern/subr_intr.c sys/sys/interrupt.h %

The ones in sys/netpfil/pf/ are unlikely to trigger this. The only present way for them to trigger this is for something to scribble zeros over memory or some other sort of memory corruption. My concern is they're all calling intr_event_destroy() during shutdown, then either doing MPASS(rc == 0); or returning the error to the caller. If this isn't a debug kernel then the error could simply be lost (sys/netpfil/pf/if_pfsync.c:vnet_pfsync_uninit()). If this is a debug kernel, is a panic() truly a desirable result? I suspect these callers would instead prefer:

if (ie == NULL) { printf("ERROR: %s(): called with NULL event\n", __func__); return (0); }

As the rest of their shutdowns can still occur.

The situation for INTRNG (sys/kern/subr_intr.c) is very different. INTRNG does lazy event allocation. After intr_isrc_register() returns, ->isrc_event is still NULL. ->isrc_event doesn't become non-NULL until isrc_add_handler() is called. As such:

struct intr_irqsrc *isrc; int rc; intr_isrc_register(isrc); rc = intr_isrc_deregister(isrc); printf("intr_isrc_deregister() returned: %d\n", rc);

Will reliably output intr_isrc_deregister() returned: 22. So there is a disagreement in the implementation about what intr_event_destroy(NULL) should do/return. Since I'm looking at a driver which needs intr_isrc_deregister() to work, this is blocking for me.

I suppose you could modify isrc_event_destroy() to do if (isrc->isrc_event != NULL) return(0);. My concern with this is you're testing the same value twice and giving exactly opposite results for a problematic value. I cannot abide this, it is just too ugly (I will hold my nose and ignore it if someone else implements, but I will not implement).

Another viable approach would be to remove lazy event allocation from INTRNG. I don't think lazy event allocation significantly helps INTRNG. I also observe major downsides to lazy event allocation (both here and other places you must first test ->isrc_event, you cannot assume non-NULL). I think it was a mistake to allow PowerPC and INTRNG to use lazy event allocation.
Issue is this would basically invalidate the premise behind INTR_SOLO and someone would be rather annoyed with that (D35607).

Actually, one other approach I would be willing to implement. It would be acceptable to me to revert 39888ed and then have isrc_event_destroy() check for isrc->isrc_event == NULL. It is the double checking for NULL which I have a problem with.

I though would be quite happy if lazy event allocation was deprecated, and assured removing that would be accepted.

In different terms, this situation is highly similar to c1287a3, Yes, it might be an error, but handing the error off to the caller drastically escalates the damage. Whereas handling it mitigates things.

sys/kern/subr_intr.c

bsdimp · 2024-06-15T05:06:26Z

@markjdb prefers intr_event_destroy() returning an error for ie == NULL. Problem is this is answering the wrong question for most callers. Since this is called from cleanup/error paths, the caller wants to know Has everything been properly cleaned up? Whereas returning EINVAL is answering Has everything been cleaned up and was everything in **perfect** condition?

I'm pretty sure most, if not all callers will want the answer to the first since EBUSY means memory is likely being leaked. Whereas for INTRNG, the event being NULL simply means no event was ever allocated and things are fine.

I agree with mark. All the callers want no errors at all. They all assume that it's OK to tear down this intr_event. EBUSY is an error here (not a try again considtion). It's for when the handlers haven't been unregistered. This leads to a panic in most of the callers as a programming error. I disagree with this analysis. That makes the whole basis for this change in question. And the change itself isn't just this, but has other bits in it as well. I'm inclined to just close this and something we don't want, at least not in the current form.

ehem · 2024-06-15T20:43:40Z

That makes the whole basis for this change in question. And the change itself isn't just this, but has other bits in it as well. I'm inclined to just close this and something we don't want, at least not in the current form.

Wrong. The one and only basis for this pull is that intr_deregister_source() is unreliable and really should be fixed.

ehem · 2024-06-18T00:23:51Z

An alternative patch set to fix the issue: altfix As GitHub functions on branch names, I could readily push that on top of "intrfix" for this pull.

It is nominally feasible to omit the first patch in that series. Problem is that leaves INTR_SOLO in a broken state. The premise of INTR_SOLO is to try to avoid allocating the event, which is incompatible with removing lazy allocation. The last 3 are also optional for solving the issue, though the fourth looks like a kind of Good Idea.

ehem · 2024-06-22T01:39:28Z

Okay, try this. Note the later commits are cleaning out some bits which I would tend to, but they are optional to the goal of this pull request.

I suspect INTR_SOLO retains its level of functionality. The worth may be degraded, but as near as I can tell it still does what it did before (I'm unable to test since there are no test cases).

I'm trying to check whether isrc->isrc_handlers can be removed. Looks like it should work, but testing failed. I'm left suspicious the PIC drivers may have placed some steps which should have been in pic_enable_intr() hooks into pic_setup_intr() hooks.

ehem · 2024-06-22T05:04:58Z

It actually only takes the two at the front to achieve my primary objective. With those intr_isrc_deregister() would genuinely function and fully release the portions allocated by INTRNG.

The as not yet submitted driver would simply use intr_isrc_deregister() and emit a warning if it gets an error code. It simply won't handle recovery beyond that. Yet I feel it is my duty to allow other driver writers to go further on their recovery approaches.

ehem · 2024-06-25T02:34:08Z

Okay, figured out what was going on with ->isrc_handlers, so that can now be nuked. The goal for this pull is achieved at 3068c82, though the cleanup in 56aaa29 is kind of nice too.

I would like to purge INTR_SOLO, but that is not a requirement for this to be a success. The main goal is simply for:

intr_isrc_register(isrc);
intr_isrc_deregister(isrc);

To fully cleanup the structure with no leaks. Notably this requires ->isrc_event to be released by INTRNG (which allocates it). I would prefer for it to be possible to recover after EBUSY, but that is not actually a requirement for my goal.

sys/kern/subr_intr.c

ehem · 2024-06-28T05:22:17Z

I finally realized INTR_SOLO isn't actually incompatible with always allocating events. It is doing using INTR_SOLO and event-style handlers at the same time which is incompatible.

As a result removing lazy event allocation works fine. This then means there is no need to check for ->isrc_event being NULL before calling intr_event_destroy(). The only need is to actually call isrc_event_destroy() so the event is released.

This version has been tested. The final commit isn't absolutely needed, but I do wish to express the point that with no activity on INTR_SOLO in more than 5 years it is overdue for removal. Its presence confuses developers into thinking it might be the future (it is not) and is thereby wasting a great deal of developer time.

ehem · 2024-07-23T20:19:51Z

I'm planning to update to FreeBSD's HEAD soon. There will be a rebase in a few hours.

ehem · 2024-07-23T22:51:15Z

In fact, the two at the front really do fix the problem sufficiently. If the event is always allocated, then intr_event_destroy() won't ever be called with a NULL event. Then it simply becomes an issue of needing isrc_event_destroy() to be fixed and be called.

I still feel intr_event_destroy() returning success and doing a printf() is the way to go. Since the function is called from cleanup paths, you don't want to abort those if no event was ever allocated or cleanup was already done.

ehem · 2024-08-02T19:41:03Z

Given how things look now, I plan to rebase in a bit. Some of what is in here now looks worthy of being split into a separate branch.

ehem · 2024-12-11T19:07:00Z

The issue 5aa7e6e is aimed at is intr_event_destroy() is a shutdown/cleanup function. As such only urgent errors should be returned. The event already being NULL is legitimately bad, but if the caller is cleaning up due to other error situation there is likely no recovery path. Whereas if printf() is done there is hope the system won't immediately need to panic().

In theory simply including the string @strejda can summon people. The main goal here is intr_isrc_deregister() leaks memory. INTRNG may allocate for ->isrc_event, but presently intr_isrc_deregister() fails to clean that up.

There is minimal benefit in delaying event allocation. Worse, this reduces opportunities for merging architectural interrupt structures together. Since the event is now created before being added to the interrupt table, there is no longer any need for locking. A few spots also no longer need to check for ->isrc_event being NULL, clean those up. Differential Revision: https://reviews.freebsd.org/D40166

SSIA. If the event is still valid when deregister is called, it needs to be destroyed. Differential Revision: https://reviews.freebsd.org/D38599

Having the table entry already wiped is acceptable. Otherwise an error may cause the function to be called again and there will be an infinite loop. Differential Revision: https://reviews.freebsd.org/D38599

A NULL event likely indicates the event was either already released or never allocated. In such case the event has already been cleaned up and that is what the caller needs to know. This gets called from cleanup paths and cleanup should not be abandoned if the event is invalid. Emit a warning message as this indicates inconsistent state. There may well simply be the precursor to a page-fault panic(). Differential Revision: https://reviews.freebsd.org/D38602

Report errors returned by intr_event_destroy(). While these should be highly unlikely, they are worth of being reported nonetheless.

ehem force-pushed the intrfix branch from d5549d2 to 50614ed Compare June 11, 2024 00:29

bsdimp requested changes Jun 15, 2024

View reviewed changes

ehem force-pushed the intrfix branch 3 times, most recently from 50614ed to 21f51c4 Compare June 21, 2024 01:09

ehem force-pushed the intrfix branch from 21f51c4 to 3256e6c Compare June 22, 2024 04:27

ehem force-pushed the intrfix branch from 3256e6c to 0d2326a Compare June 25, 2024 02:28

ehem requested a review from bsdimp June 25, 2024 02:34

ehem commented Jun 28, 2024

View reviewed changes

sys/kern/subr_intr.c Outdated Show resolved Hide resolved

ehem force-pushed the intrfix branch from 0d2326a to 0c92234 Compare June 28, 2024 05:15

ehem force-pushed the intrfix branch from 0c92234 to 535f54c Compare July 23, 2024 22:35

ehem force-pushed the intrfix branch 2 times, most recently from 7e6a34c to fd060ad Compare August 5, 2024 20:33

ehem mentioned this pull request Aug 5, 2024

Minimal collection for running on Xen/ARM64 #1358

Draft

3 tasks

ehem force-pushed the intrfix branch from fd060ad to 51d603d Compare October 25, 2024 21:08

ehem force-pushed the intrfix branch 2 times, most recently from 02ba14a to 7ca6d44 Compare November 6, 2024 20:45

ehem force-pushed the intrfix branch from 7ca6d44 to 032c7cf Compare December 11, 2024 19:00

ehem force-pushed the intrfix branch 6 times, most recently from 7f63cf4 to 02ea9ac Compare December 19, 2024 21:18

ehem force-pushed the intrfix branch from 02ea9ac to d19901a Compare December 23, 2024 18:07

ehem mentioned this pull request Dec 23, 2024

Intrng: adjusting pic_init_secondary for further use cases #1280

Closed

ehem force-pushed the intrfix branch 3 times, most recently from 18963a7 to dec13a4 Compare January 6, 2025 17:23

ehem force-pushed the intrfix branch 2 times, most recently from e2252cc to 9f2fcd9 Compare January 15, 2025 19:52

ehem force-pushed the intrfix branch 2 times, most recently from dddecf4 to 0bc2651 Compare January 29, 2025 22:31

ehem force-pushed the intrfix branch 7 times, most recently from 089aca8 to ab41b68 Compare February 7, 2025 17:58

ehem added 5 commits February 10, 2025 10:15

intrng: destroy event when deregistering source

e6ebaee

SSIA. If the event is still valid when deregister is called, it needs to be destroyed. Differential Revision: https://reviews.freebsd.org/D38599

intrng: have isrc_free_irq() return success if already free

f8856d3

Having the table entry already wiped is acceptable. Otherwise an error may cause the function to be called again and there will be an infinite loop. Differential Revision: https://reviews.freebsd.org/D38599

kern/intr: declare intr_event_destroy() __result_use_check

2868866

Report errors returned by intr_event_destroy(). While these should be highly unlikely, they are worth of being reported nonetheless.

ehem force-pushed the intrfix branch from ab41b68 to 2868866 Compare February 10, 2025 18:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix INTRNG interrupt release #1281

fix INTRNG interrupt release #1281

ehem commented Jun 7, 2024

ehem commented Jun 7, 2024

ehem commented Jun 7, 2024

ehem commented Jun 11, 2024

bsdimp left a comment

bsdimp Jun 15, 2024

ehem Jun 15, 2024

bsdimp Jun 15, 2024

ehem Jun 15, 2024 •

edited

Loading

bsdimp Jun 15, 2024

ehem Jun 15, 2024

ehem Jun 15, 2024

ehem Aug 3, 2024

bsdimp commented Jun 15, 2024

ehem commented Jun 15, 2024

ehem commented Jun 18, 2024

ehem commented Jun 22, 2024

ehem commented Jun 22, 2024

ehem commented Jun 25, 2024

ehem commented Jun 28, 2024

ehem commented Jul 23, 2024

ehem commented Jul 23, 2024

ehem commented Aug 2, 2024

ehem commented Dec 11, 2024

fix INTRNG interrupt release #1281

Are you sure you want to change the base?

fix INTRNG interrupt release #1281

Conversation

ehem commented Jun 7, 2024

ehem commented Jun 7, 2024

ehem commented Jun 7, 2024

ehem commented Jun 11, 2024

bsdimp left a comment

Choose a reason for hiding this comment

bsdimp Jun 15, 2024

Choose a reason for hiding this comment

ehem Jun 15, 2024

Choose a reason for hiding this comment

bsdimp Jun 15, 2024

Choose a reason for hiding this comment

ehem Jun 15, 2024 • edited Loading

Choose a reason for hiding this comment

bsdimp Jun 15, 2024

Choose a reason for hiding this comment

ehem Jun 15, 2024

Choose a reason for hiding this comment

ehem Jun 15, 2024

Choose a reason for hiding this comment

ehem Aug 3, 2024

Choose a reason for hiding this comment

bsdimp commented Jun 15, 2024

ehem commented Jun 15, 2024

ehem commented Jun 18, 2024

ehem commented Jun 22, 2024

ehem commented Jun 22, 2024

ehem commented Jun 25, 2024

ehem commented Jun 28, 2024

ehem commented Jul 23, 2024

ehem commented Jul 23, 2024

ehem commented Aug 2, 2024

ehem commented Dec 11, 2024

ehem Jun 15, 2024 •

edited

Loading