-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
template: fix panic in change_mode=script on client restart #24057
Conversation
e0526f9
to
b335f6c
Compare
b335f6c
to
16b1886
Compare
16b1886
to
f0fa711
Compare
@@ -1220,29 +1209,6 @@ OUTER: | |||
t.Fatal(t, "should have received an event") | |||
} | |||
} | |||
|
|||
// remove the reference to the task handle and update the template contents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to reviewers: this is the test we added in #23917 that we knew even at the time didn't really test anything useful. Now that we've fixed the problem by fixing the architecture, this doesn't even have anything left to test so I'm removing it. There's no way to "remove the task handle" anymore, as the template manager doesn't have a reference to it at all. The code for this PR is covered in task_runner_hook_test.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense to remove this, but I don't see a test file called task_runner_hook_test.go
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, that's a typo, it's this test: https://github.com/hashicorp/nomad/blob/main/client/allocrunner/taskrunner/template_hook_test.go#L277 Note that only an integration test that spins up a real task and can restart a client can actually exercise this bug. So that falls under the general pile of "things we'd really like to do when we have solid E2E upgrade/restart testing".
@@ -80,6 +83,9 @@ func (h *DriverHandle) ExecStreaming(ctx context.Context, | |||
command []string, | |||
tty bool, | |||
stream drivers.ExecTaskStream) error { | |||
if h == nil { | |||
return ErrTaskNotRunning | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes aren't really necessary now, but they'll provide belt-and-suspenders against future bugs.
f0fa711
to
bb60ebe
Compare
When we introduced change_mode=script to templates, we passed the driver handle down into the template manager so we could call its `Exec` method directly. But the lifecycle of the driver handle is managed by the taskrunner and isn't available when the template manager is first created. This has led to a series of patches trying to fixup the behavior (#15915, #15192, #23663, #23917). Part of the challenge in getting this right is using an interface to avoid the circular import of the driver handle. But the taskrunner already has a way to deal with this problem using a "lazy handle". The other template change modes already use this indirectly through the `Lifecycle` interface. Change the driver handle `Exec` call in the template manager to a new `Lifecycle.Exec` call that reuses the existing behavior. This eliminates the need for the template manager to know anything at all about the handle state. Fixes: #24051
bb60ebe
to
8e9cb77
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
return | ||
} | ||
_, exitCode, err := tm.handle.Exec(script.Timeout, script.Command, script.Args) | ||
_, exitCode, err := tm.config.Lifecycle.Exec(script.Timeout, script.Command, script.Args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this feels so much nicer 👏
@@ -1220,29 +1209,6 @@ OUTER: | |||
t.Fatal(t, "should have received an event") | |||
} | |||
} | |||
|
|||
// remove the reference to the task handle and update the template contents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense to remove this, but I don't see a test file called task_runner_hook_test.go
?
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
When we introduced
change_mode=script
to templates, we passed the driver handle down into the template manager so we could call itsExec
method directly. But the lifecycle of the driver handle is managed by the taskrunner and isn't available when the template manager is first created. This has led to a series of patches trying to fixup the behavior (#15915, #15192, #23663, #23917). Part of the challenge in getting this right is using an interface to avoid the circular import of the driver handle.But the taskrunner already has a way to deal with this problem using a "lazy handle". The other template change modes already use this indirectly through the
Lifecycle
interface. Change the driver handleExec
call in the template manager to a newLifecycle.Exec
call that reuses the existing behavior. This eliminates the need for the template manager to know anything at all about the handle state.Fixes: #24051