-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Configurable MaxQueryTime and DefaultQueryTime #3777
Conversation
@urbas we are interested as well by this PR, could you rebase it? |
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, and I think it is definitly a feature we would like to use.
I just wonder if we could put a max on the jitter time because if we set very high values, the jitter might be quite high (ex: 24h -> more than 1h30 of jitter)
agent/consul/rpc.go
Outdated
} else if queryOpts.MaxQueryTime <= 0 { | ||
queryOpts.MaxQueryTime = defaultQueryTime | ||
if queryOpts.MaxQueryTime <= 0 { | ||
queryOpts.MaxQueryTime = s.config.DefaultQueryTime | ||
} | ||
|
||
// Apply a small amount of jitter to the request. | ||
queryOpts.MaxQueryTime += lib.RandomStagger(queryOpts.MaxQueryTime / jitterFraction) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If MaxQueryTime is set to a very big value, jitter might be quite high.
Limited for now to 37s, but if the value is very high (for instance 24h), then value might be quite high... do you think we might put a max on it ? (such as 60s)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pierresouchay: do you mind if I add the jitter feature in another pull request? The config is changing a lot and rebasing causes conflicts that are tedious to resolve. It would help if I kept the PRs small.
124e045
to
4d8fc44
Compare
I am not owner of repository, it was just a remark ;) |
@urbas this sounds like a good idea to me. are you still interested in making this change? And if so, could you update your PR? |
71a018b
to
15e7c5c
Compare
Done. However, a unit test failed. It looks unrelated to my change. The test is waiting max 7 seconds for the secondary DC to get CA roots from the primary. This might not be enough time for the transition to happen. Note that the timeout of 7 seconds comes from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your work! This looks good, I noticed that the docs are missing on https://www.consul.io/docs/agent/options.html. Could you also add a test that proves your configuration is working?
@@ -512,8 +512,6 @@ func TestLeader_SecondaryCA_TransitionFromPrimary(t *testing.T) { | |||
func TestLeader_SecondaryCA_UpgradeBeforePrimary(t *testing.T) { | |||
t.Parallel() | |||
|
|||
maxRootsQueryTime = 500 * time.Millisecond |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is still needed, because it changes how long the roots endpoint is blocking before it returns in the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it to the server configurations. Hopefully that does the trick.
Any updates? |
Thank you for updating your PR. I think the only thing that is missing now is the documentation for the new config options in https://www.consul.io/docs/agent/options.html. |
Updated tests to use max and default query time.
Codecov Report
@@ Coverage Diff @@
## master #3777 +/- ##
==========================================
- Coverage 65.63% 65.58% -0.05%
==========================================
Files 443 443
Lines 53307 53317 +10
==========================================
- Hits 34988 34968 -20
- Misses 14094 14116 +22
- Partials 4225 4233 +8
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your work, this looks great!
This change moves constants
maxQueryTime
anddefaultQueryTime
inrpc.go
into the configuration.This provides support for use cases where we're okay with blocking queues that take longer than 10m. For example, AWS ELB allows 1h timeouts. This allows lowering the number of re-connections by orders of magnitude.
Configuring
maxQueryTime
is also useful for use cases where establishing new connections is at a premium (either because of power or data limitations).