-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable cache by default for DNS ServiceDiscoverer #2518
Conversation
Motivation: When we poll DNS in the background the cache is not important because we always schedule the next query after cache expires. It only helps with concurrent resolutions of the same address, which in practice should be minimal if users properly reuse clients. Disadvantage of having a cache is that if it gets poisoned with invalid or stale entries there is no way to clear the cache. Cancelling the events publisher and re-subscribing doesn't help because re-subscribe always hits the cache. See apple#2514. Modifications: - Disable cache by default; - Provide API to opt-in for caching (can be useful if users resolve per new connection instead of polling in the background); - Deprecate `minTTL`, remove `maxTTL` builder methods, introduce `ttl(min, max, cache)` instead; - Invoke `ttlCache.prepareForResolution(name)` only for scheduled queries, keep it unchanged when cancel/re-subscribe to correctly offset ttl; - Ignore empty list inside `MinTtlCache.get(...)`; - Make `DefaultDnsClient` logging more consistent; - Enhance testing; Result: Caching is disabled by default. Polling is driven by TTL. In case of re-subscribe, we always send a new query. Users have API to configure min/max polling intervals and caching.
* @return {@code this}. | ||
*/ | ||
DnsServiceDiscovererBuilder maxTTL(int maxTTLSeconds); | ||
DnsServiceDiscovererBuilder ttl(int minSeconds, int maxSeconds, boolean cache); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we start using a new method, consider using Duration
instead of int seconds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, should we add a different builder for the caching enable/disable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RFC defines TTL in seconds and returned result from the server is always in seconds. Feels more natural for DNS to think about the configuration the same way. It's hard to imagine when caching for 5.5 seconds will make more sense vs 5 or 6 seconds.
For a separate method for caching enable/disable is a good question. I keep thinking about what will be the best for users in both modes that we offer: poll DNS vs resolve per new connection. For the polling cache is usually not required, but a short caching for <5 seconds can help when you have multiple clients for the same IP. For the 2nd case cache is more important, but for how long we should cache by default is a question. We can have 2 "global-a" resolvers: one for polling, one for per-connection resolutions OR we can have one with larger polling interval and smaller caching interval. WDYT about having 2 method like:
.pollTtl(min, max)
.cacheTtl(min, max)
Users who want to disable cache will be able to set cacheTtl(0, 0)
, for polling the min allowed number can be either 1 or 0.
We can discuss more offline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed offline, I went with:
.ttl(min, max)
.ttl(min, max, minCache, maxCache)
Thanks for brainstorming!
...tty/src/main/java/io/servicetalk/dns/discovery/netty/DefaultDnsServiceDiscovererBuilder.java
Outdated
Show resolved
Hide resolved
...k-dns-discovery-netty/src/main/java/io/servicetalk/dns/discovery/netty/DefaultDnsClient.java
Outdated
Show resolved
Hide resolved
Motivation: Gradle can run test classes in different order. In case it runs `DnsServiceDiscovererBuilderProviderTest` after any other test, the numbers are off because existing `TestDnsServiceDiscovererBuilderProvider` is applied to any builder. Modifications: - Apply `TestDnsServiceDiscovererBuilderProvider` logic only for `DnsServiceDiscovererBuilderProviderTest`; Result: Tests pass regardless of the execution order.
Build failures will be resolved after #2523 is merged. |
...very-netty/src/main/java/io/servicetalk/dns/discovery/netty/DnsServiceDiscovererBuilder.java
Show resolved
Hide resolved
Motivation: Having no cache works well for happy path and for poisoned cache use-case described in apple#2518. Unfortunately, users may leak new client instances and add excess load on DNS. Having a short-lived cache, similar to OpenJDK default, will help to reduce number of outgoing DNS queries. Modifications: - Increase `DEFAULT_MAX_TTL_CACHE_SECONDS` from 0 to 30 seconds; - Use `Math.min(pollTtl, cacheTtl)` inside `ttl(int, int)` overload to avoid `minCacheSeconds > minSeconds` and `maxCacheSeconds > maxSeconds`; - Log `DEFAULT_TTL_POLL_JITTER_SECONDS` at debug level; - Stop using `DEFAULT_MIN_TTL_POLL_SECONDS` for `srvHostNameRepeatInitialDelay` because they are not related; Result: `DnsServiceDiscoverer` caches results by default for up to 30 seconds, still respecting the original TTL from a DNS server if it's lower than 30 seconds.
Motivation: Having no cache works well for happy path and for poisoned cache use-case described in apple#2518. Unfortunately, users may leak new client instances and add excess load on DNS. Having a short-lived cache, similar to OpenJDK default, will help to reduce number of outgoing DNS queries. Modifications: - Increase `DEFAULT_MAX_TTL_CACHE_SECONDS` from 0 to 30 seconds; - Use `Math.min(pollTtl, cacheTtl)` inside `ttl(int, int)` overload to avoid `minCacheSeconds > minSeconds` and `maxCacheSeconds > maxSeconds`; - Log `DEFAULT_TTL_POLL_JITTER_SECONDS` at debug level; - Stop using `DEFAULT_MIN_TTL_POLL_SECONDS` for `srvHostNameRepeatInitialDelay` because they are not related; - Adjust tests to disable default cache because we manage time manually; Result: `DnsServiceDiscoverer` caches results by default for up to 30 seconds, still respecting the original TTL from a DNS server if it's lower than 30 seconds.
Motivation: Having no cache works well for happy path and for poisoned cache use-case described in #2518. Unfortunately, users may leak new client instances and add excess load on DNS. Having a short-lived cache, similar to OpenJDK default, will help to reduce number of outgoing DNS queries. Modifications: - Increase `DEFAULT_MAX_TTL_CACHE_SECONDS` from 0 to 30 seconds; - Use `Math.min(pollTtl, cacheTtl)` inside `ttl(int, int)` overload to avoid `minCacheSeconds > minSeconds` and `maxCacheSeconds > maxSeconds`; - Log `DEFAULT_TTL_POLL_JITTER_SECONDS` at debug level; - Stop using `DEFAULT_MIN_TTL_POLL_SECONDS` for `srvHostNameRepeatInitialDelay` because they are not related; - Adjust tests to disable default cache because we manage time manually; Result: `DnsServiceDiscoverer` caches results by default for up to 30 seconds, still respecting the original TTL from a DNS server if it's lower than 30 seconds.
Motivation:
When we poll DNS in the background the cache is not important because we always schedule the next query after cache expires. It only helps with concurrent resolutions of the same address, which in practice should be minimal if users properly reuse clients. The disadvantage of having a cache is that if it gets poisoned with invalid or stale entries there is no way to clear the cache. Cancelling the events publisher and re-subscribing doesn't help because re-subscribe always hits the cache. See #2514.
Modifications:
minTTL
, removemaxTTL
builder methods, introducettl(min, max)
andttl(min, max, minCache, maxCache)
instead;ttlCache.prepareForResolution(name)
only for scheduled queries, keep it unchanged when cancel/re-subscribe to correctly offset ttl;MinTtlCache.get(...)
;DefaultDnsClient
logging more consistent;id
in the returnedServiceDiscoverer.toString()
from the builder;Result:
Caching is disabled by default. Polling is driven by TTL. In case of re-subscribe, we always send a new query. Users have API to configure min/max polling intervals and caching.