-
Notifications
You must be signed in to change notification settings - Fork 595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
controller indicates ready only after verifying proxy configured work #1720
Conversation
Only mark the controller ready after it can assume that the proxy has configuration available, either immediately if the proxy has a database or after the first config push if the proxy has no database.
Licenses differ between commit 2609005350500696c825a5647805f0f482b39a57 and base:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is my idea for testing this bit of functionality:
- Implement a unit test with a mock proxy, and then:
- after initialization, ensure that ready == false
- report readiness in the mock
- ensure that ready == true
- modify the existing integration test harness to block until the proxy reports readiness. This will check only the "positive" half of the readiness check, though, but the negative case should be sufficiently covered by the unit test.
Several tags were incorrect in the changelog footer links. Align tags with their actual names and the appropriate comparison versions.
Tests more or less follow #1720 (review). It doesn't actually block testing on readiness, but does confirm that readiness is eventually good, which is roughly equivalent. Most of the other tests imply readiness because they fail if config didn't post, so it doesn't make much sense to block them. I'm still a bit iffy on that strategy since IMO the negative case in integration is what matters for verifying the feature. That seems like it'd require some weird refactoring where we pull the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, just typo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continuing the discussion from https://github.com/Kong/kubernetes-ingress-controller/pull/1720/files#discussion_r691494074 here because apparently GH doesn't allow responding to that thread anymore.
What I meant is that we could just define the func ConfigReady(_ *http.Request) error
function as a plain old top-level function in the global scope of the file, and then use ConfigReady
as an argument to mgr.AddReadyzCheck
. ConfigReady
won't work directly as a value that will be accepted as an argument thereto (because of a type mismatch), so it'd need a type cast: healthz.Checker(ConfigReady)
.
- Add a dedicated value for having applied config at least once - Add a lock for the config applied flag - Use dedicate True/False test assertions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor semantics
#1720 (comment) might work to go to the inline discussion? That one still shows replies available to me, but yeah, the GH UI has gotten confused. As I read the last comment though, it looks like that suggestion was about using the var to make it a Checker, not about doing it to make the dynamic-scoped function, so without further comment on the new inline version, I'm assuming that's resolved and handles both concerns (using the proxy instance and not allowing variable substitution) and have marked the thread such. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 💯
Co-authored-by: Michał Flendrich <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
What this PR does / why we need it:
Only mark the controller ready after it can assume that the proxy has configuration available, either immediately if the proxy has a database or after the first config push if the proxy has no database.
Replaces the no-op Ping function with a dynamically-scoped
ConfigReady()
function.ConfigReady()
returns a nil error if the previously instantiated Proxy'sIsReady()
function returnstrue
or returns a "proxy not yet configured" error ifIsReady()
returnsfalse
IsReady()
returnstrue
if:proxy.dbmode == "off"
andlen(proxy.lastConfigSHA) > 0
(i.e. we have applied config at least once).proxy.dbmode
is any other value (we assume config is probably already available in the DB). This doesn't handle the initial start edge case where the database is empty, but is usually good enough. Handling the edge case requires some complex nonsense with figuring out if the cluster controller leader has applied configuration (since non-leaders never apply config themselves).It returns
false
otherwise.Which issue this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close that issue when PR gets merged): fixes #1504Special notes for your reviewer:
I haven't the foggiest how you test this consistently and meaningfully.
Unit tests seem not particularly useful since we'd need to stuff a fake value into
lastConfigSHA
rather than allowing it to populate naturally. Effectively all we confirm is that string comparisons and slice length checks work as expected.Integration tests are complicated by many factors:
lastConfigSHA
. The controller will sync even before you've created any Ingresses and such. We can only make it not sync by either not starting the proxy (more or less what we had before, but not easily doable in our test environment because we don't get to let Kubernetes restart the container if the initial connection attempt fails) or by creating resources that result in invalid configuration before anything else applies (not sure how to do this with the test parallelization), confirm not ready, rectify the error, and confirm ready.A manual test of the image does at least confirm that the controller becomes ready, but even in manual runs I don't know how I can confirm its behavior that it doesn't become ready when it shouldn't.
PR Readiness Checklist:
Complete these before marking the PR as
ready to review
:CHANGELOG.md
release notes have been updated to reflect any significant (and particularly user-facing) changes introduced by this PR