Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix INTRNG interrupt release #1281

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

fix INTRNG interrupt release #1281

wants to merge 5 commits into from

Conversation

ehem
Copy link
Contributor

@ehem ehem commented Jun 7, 2024

SSIA.

While it is good intr_isrc_deregister() exists, it hasn't ever actually been used. Due to a driver which needs this, fix issues found by reviewing the source and finding solutions.

@ehem
Copy link
Contributor Author

ehem commented Jun 7, 2024

@markjdb prefers intr_event_destroy() returning an error for ie == NULL. Problem is this is answering the wrong question for most callers. Since this is called from cleanup/error paths, the caller wants to know Has everything been properly cleaned up? Whereas returning EINVAL is answering Has everything been cleaned up and was everything in **perfect** condition?

I'm pretty sure most, if not all callers will want the answer to the first since EBUSY means memory is likely being leaked. Whereas for INTRNG, the event being NULL simply means no event was ever allocated and things are fine.

@ehem
Copy link
Contributor Author

ehem commented Jun 7, 2024

Putting this a different way, given a random struct intr_irqsrc *isrc;, the following must always terminate no matter the initial state of isrc:

while(intr_isrc_deregister(isrc)!=0)
    sleep(1);

(this would be a hypothetical PIC which implemented lazy release of interrupts)

If the filter hadn't yet been removed from the event, the first call to intr_isrc_deregister() would successfully release isrc->isrc_irq, but then return EBUSY from intr_event_destroy(). For subsequent calls currently isrc_free_irq() would return EINVAL and the loop would never terminate.

@ehem
Copy link
Contributor Author

ehem commented Jun 11, 2024

This was simply rebased to aid keeping track of local branches, everything is identical to the original. Notably the CI pass is still valid.

Copy link
Member

@bsdimp bsdimp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm inclined to just close this as not desired. There's so much here that needs to change, I'm not sure that there's anything left

if (error != 0)
return (error);

return (isrc_event_destroy(isrc));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted above, this routine is never called. Why do we need to call isrc_event_destroy() here? That's new and not quite what the commit message says. How did you test this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice how the event is created by isrc_event_create()? Given that, it is INTRNG's job to destroy the event, yet this call was missing. I simply needed to examine intr_isrc_deregister() and notice the missing call. The driver is sitting in my tree due it needing intr_isrc_deregister() to actually work.

isrc->isrc_irq = INTR_IRQ_INVALID; /* just to be safe */
if (irq_sources[isrc->isrc_irq] == isrc)
irq_sources[isrc->isrc_irq] = NULL;
isrc->isrc_irq = INTR_IRQ_INVALID; /* mark removed */

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be papering over programing errors. It's a programming error when isrc_irq is >= intr_nirq. That normally can't happen. Likewise with setting things to NULL in this way. This is a bad change. NAK.

Hard to say for sure, though, since there's not actually any callers of this in the tree. This is only called from intr_isrc_deregister, which is global, but has no actual callers. So how did you test this?

But this is unsafe. If they don't match, it must be an error.

Copy link
Contributor Author

@ehem ehem Jun 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be papering over programing errors. It's a programming error when isrc_irq is >= intr_nirq. That normally can't happen. Likewise with setting things to NULL in this way. This is a bad change. NAK.

A hypothetical scenario, but one which I believe should be handled as it could nominally be used. Imagine a highly hot-plug bus on a system with >1000 processors. On such a hot-plug bus it could nominally be valid to have one thread which calls intr_isrc_deregister() at 1 second intervals and a separate thread doing intr_teardown_irq() calls. In such a case the sequence:

intr_isrc_deregister();
intr_teardown_irq();
intr_isrc_deregister();

Issue is the first intr_isrc_deregister() call would get EBUSY from intr_event_destroy() since the filter wasn't yet removed. Presently the second would get EINVAL from isrc_free_irq() since the isrc had already been removed from the table by the first call. When it comes down to it, either isrc_free_irq() or isrc_event_destroy() must tolerate being called multiple times with the same isrc.

Hard to say for sure, though, since there's not actually any callers of this in the tree. This is only called from intr_isrc_deregister, which is global, but has no actual callers. So how did you test this?

By reading the source and concluding this was a nominally valid course of action. I've got a driver which is expected to call intr_isrc_deregister() during normal operation. As such I'm examining intr_isrc_deregister() and discovering it is rather buggy. The x86 equivalent driver is unable to release interrupts and is thus forced to keep them on a list. Whereas INTRNG has the function partially implemented, but untested.

I have no expectation to need this exact functionality, but when trying to fix the problems with intr_isrc_deregister() this issue shows up. Presently I'm unable to submit the driver since 3 key areas it needs working aren't in usable shape.

@@ -531,7 +531,7 @@ intr_event_destroy(struct intr_event *ie)
{

if (ie == NULL)
return (EINVAL);
return (0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is wrong. We want this to be an error. It's a programming error to call this with NULL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

% find sys -type f -print0 | xargs -0 grep -eintr_event_destroy -l
sys/netpfil/pf/if_pfsync.c
sys/netpfil/pf/pf_ioctl.c
sys/netpfil/pf/pflow.c
sys/kern/kern_intr.c
sys/kern/subr_intr.c
sys/sys/interrupt.h
% 

The ones in sys/netpfil/pf/ are unlikely to trigger this. The only present way for them to trigger this is for something to scribble zeros over memory or some other sort of memory corruption. My concern is they're all calling intr_event_destroy() during shutdown, then either doing MPASS(rc == 0); or returning the error to the caller. If this isn't a debug kernel then the error could simply be lost (sys/netpfil/pf/if_pfsync.c:vnet_pfsync_uninit()). If this is a debug kernel, is a panic() truly a desirable result? I suspect these callers would instead prefer:

if (ie == NULL) {
        printf("ERROR: %s(): called with NULL event\n", __func__);
        return (0);
}

As the rest of their shutdowns can still occur.

The situation for INTRNG (sys/kern/subr_intr.c) is very different. INTRNG does lazy event allocation. After intr_isrc_register() returns, ->isrc_event is still NULL. ->isrc_event doesn't become non-NULL until isrc_add_handler() is called. As such:

struct intr_irqsrc *isrc;
int rc;
intr_isrc_register(isrc);
rc = intr_isrc_deregister(isrc);
printf("intr_isrc_deregister() returned: %d\n", rc);

Will reliably output intr_isrc_deregister() returned: 22. So there is a disagreement in the implementation about what intr_event_destroy(NULL) should do/return. Since I'm looking at a driver which needs intr_isrc_deregister() to work, this is blocking for me.

I suppose you could modify isrc_event_destroy() to do if (isrc->isrc_event != NULL) return(0);. My concern with this is you're testing the same value twice and giving exactly opposite results for a problematic value. I cannot abide this, it is just too ugly (I will hold my nose and ignore it if someone else implements, but I will not implement).

Another viable approach would be to remove lazy event allocation from INTRNG. I don't think lazy event allocation significantly helps INTRNG. I also observe major downsides to lazy event allocation (both here and other places you must first test ->isrc_event, you cannot assume non-NULL). I think it was a mistake to allow PowerPC and INTRNG to use lazy event allocation.
Issue is this would basically invalidate the premise behind INTR_SOLO and someone would be rather annoyed with that (D35607).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, one other approach I would be willing to implement. It would be acceptable to me to revert 39888ed and then have isrc_event_destroy() check for isrc->isrc_event == NULL. It is the double checking for NULL which I have a problem with.

I though would be quite happy if lazy event allocation was deprecated, and assured removing that would be accepted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In different terms, this situation is highly similar to c1287a3, Yes, it might be an error, but handing the error off to the caller drastically escalates the damage. Whereas handling it mitigates things.

sys/kern/subr_intr.c Outdated Show resolved Hide resolved
@bsdimp
Copy link
Member

bsdimp commented Jun 15, 2024

@markjdb prefers intr_event_destroy() returning an error for ie == NULL. Problem is this is answering the wrong question for most callers. Since this is called from cleanup/error paths, the caller wants to know Has everything been properly cleaned up? Whereas returning EINVAL is answering Has everything been cleaned up and was everything in **perfect** condition?

I'm pretty sure most, if not all callers will want the answer to the first since EBUSY means memory is likely being leaked. Whereas for INTRNG, the event being NULL simply means no event was ever allocated and things are fine.

I agree with mark. All the callers want no errors at all. They all assume that it's OK to tear down this intr_event. EBUSY is an error here (not a try again considtion). It's for when the handlers haven't been unregistered. This leads to a panic in most of the callers as a programming error. I disagree with this analysis. That makes the whole basis for this change in question. And the change itself isn't just this, but has other bits in it as well. I'm inclined to just close this and something we don't want, at least not in the current form.

@ehem
Copy link
Contributor Author

ehem commented Jun 15, 2024

That makes the whole basis for this change in question. And the change itself isn't just this, but has other bits in it as well. I'm inclined to just close this and something we don't want, at least not in the current form.

Wrong. The one and only basis for this pull is that intr_deregister_source() is unreliable and really should be fixed.

@ehem
Copy link
Contributor Author

ehem commented Jun 18, 2024

An alternative patch set to fix the issue: altfix As GitHub functions on branch names, I could readily push that on top of "intrfix" for this pull.

It is nominally feasible to omit the first patch in that series. Problem is that leaves INTR_SOLO in a broken state. The premise of INTR_SOLO is to try to avoid allocating the event, which is incompatible with removing lazy allocation. The last 3 are also optional for solving the issue, though the fourth looks like a kind of Good Idea.

@ehem ehem force-pushed the intrfix branch 3 times, most recently from 50614ed to 21f51c4 Compare June 21, 2024 01:09
@ehem
Copy link
Contributor Author

ehem commented Jun 22, 2024

Okay, try this. Note the later commits are cleaning out some bits which I would tend to, but they are optional to the goal of this pull request.

I suspect INTR_SOLO retains its level of functionality. The worth may be degraded, but as near as I can tell it still does what it did before (I'm unable to test since there are no test cases).

I'm trying to check whether isrc->isrc_handlers can be removed. Looks like it should work, but testing failed. I'm left suspicious the PIC drivers may have placed some steps which should have been in pic_enable_intr() hooks into pic_setup_intr() hooks.

@ehem
Copy link
Contributor Author

ehem commented Jun 22, 2024

It actually only takes the two at the front to achieve my primary objective. With those intr_isrc_deregister() would genuinely function and fully release the portions allocated by INTRNG.

The as not yet submitted driver would simply use intr_isrc_deregister() and emit a warning if it gets an error code. It simply won't handle recovery beyond that. Yet I feel it is my duty to allow other driver writers to go further on their recovery approaches.

@ehem
Copy link
Contributor Author

ehem commented Jun 25, 2024

Okay, figured out what was going on with ->isrc_handlers, so that can now be nuked. The goal for this pull is achieved at 3068c82, though the cleanup in 56aaa29 is kind of nice too.

I would like to purge INTR_SOLO, but that is not a requirement for this to be a success. The main goal is simply for:

intr_isrc_register(isrc);
intr_isrc_deregister(isrc);

To fully cleanup the structure with no leaks. Notably this requires ->isrc_event to be released by INTRNG (which allocates it). I would prefer for it to be possible to recover after EBUSY, but that is not actually a requirement for my goal.

@ehem ehem requested a review from bsdimp June 25, 2024 02:34
sys/kern/subr_intr.c Outdated Show resolved Hide resolved
@ehem
Copy link
Contributor Author

ehem commented Jun 28, 2024

I finally realized INTR_SOLO isn't actually incompatible with always allocating events. It is doing using INTR_SOLO and event-style handlers at the same time which is incompatible.

As a result removing lazy event allocation works fine. This then means there is no need to check for ->isrc_event being NULL before calling intr_event_destroy(). The only need is to actually call isrc_event_destroy() so the event is released.

This version has been tested. The final commit isn't absolutely needed, but I do wish to express the point that with no activity on INTR_SOLO in more than 5 years it is overdue for removal. Its presence confuses developers into thinking it might be the future (it is not) and is thereby wasting a great deal of developer time.

@ehem
Copy link
Contributor Author

ehem commented Jul 23, 2024

I'm planning to update to FreeBSD's HEAD soon. There will be a rebase in a few hours.

@ehem
Copy link
Contributor Author

ehem commented Jul 23, 2024

In fact, the two at the front really do fix the problem sufficiently. If the event is always allocated, then intr_event_destroy() won't ever be called with a NULL event. Then it simply becomes an issue of needing isrc_event_destroy() to be fixed and be called.

I still feel intr_event_destroy() returning success and doing a printf() is the way to go. Since the function is called from cleanup paths, you don't want to abort those if no event was ever allocated or cleanup was already done.

@ehem
Copy link
Contributor Author

ehem commented Aug 2, 2024

Given how things look now, I plan to rebase in a bit. Some of what is in here now looks worthy of being split into a separate branch.

@ehem ehem force-pushed the intrfix branch 2 times, most recently from 7e6a34c to fd060ad Compare August 5, 2024 20:33
@ehem ehem force-pushed the intrfix branch 2 times, most recently from 02ba14a to 7ca6d44 Compare November 6, 2024 20:45
@ehem
Copy link
Contributor Author

ehem commented Dec 11, 2024

The issue 5aa7e6e is aimed at is intr_event_destroy() is a shutdown/cleanup function. As such only urgent errors should be returned. The event already being NULL is legitimately bad, but if the caller is cleaning up due to other error situation there is likely no recovery path. Whereas if printf() is done there is hope the system won't immediately need to panic().

In theory simply including the string @strejda can summon people. The main goal here is intr_isrc_deregister() leaks memory. INTRNG may allocate for ->isrc_event, but presently intr_isrc_deregister() fails to clean that up.

@ehem ehem force-pushed the intrfix branch 6 times, most recently from 7f63cf4 to 02ea9ac Compare December 19, 2024 21:18
@ehem ehem force-pushed the intrfix branch 3 times, most recently from 18963a7 to dec13a4 Compare January 6, 2025 17:23
@ehem ehem force-pushed the intrfix branch 2 times, most recently from e2252cc to 9f2fcd9 Compare January 15, 2025 19:52
@ehem ehem force-pushed the intrfix branch 2 times, most recently from dddecf4 to 0bc2651 Compare January 29, 2025 22:31
@ehem ehem force-pushed the intrfix branch 7 times, most recently from 089aca8 to ab41b68 Compare February 7, 2025 17:58
ehem added 5 commits February 10, 2025 10:15
There is minimal benefit in delaying event allocation.  Worse, this
reduces opportunities for merging architectural interrupt structures
together.

Since the event is now created before being added to the interrupt
table, there is no longer any need for locking.

A few spots also no longer need to check for ->isrc_event being
NULL, clean those up.

Differential Revision: https://reviews.freebsd.org/D40166
SSIA.  If the event is still valid when deregister is called, it needs
to be destroyed.

Differential Revision: https://reviews.freebsd.org/D38599
Having the table entry already wiped is acceptable.  Otherwise an error
may cause the function to be called again and there will be an infinite
loop.

Differential Revision: https://reviews.freebsd.org/D38599
A NULL event likely indicates the event was either already released or
never allocated.  In such case the event has already been cleaned up and
that is what the caller needs to know.  This gets called from cleanup
paths and cleanup should not be abandoned if the event is invalid.

Emit a warning message as this indicates inconsistent state.  There may
well simply be the precursor to a page-fault panic().

Differential Revision: https://reviews.freebsd.org/D38602
Report errors returned by intr_event_destroy().  While these should be
highly unlikely, they are worth of being reported nonetheless.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants