DNS resolution of memcached service fails #1591
Comments
It looks like it either can't reach the DNS server, or doesn't get a result back. The hostname there is a bit odd -- should it include the hyphenated IP address like that? I would expect just |
For what it's worth, I see exactly the same behavior on minikube, with Flux deployed using the Helm chart.
Flux components versions:
All pods are green and running. The logs:
I'll try to read a bit more about how the service discovery works in memcache and see if I can come up with an explanation... |
I can reach the memcache container from the fluxd container by hitting the hostname passed to fluxd:
Telnet session from the fluxd container gives (again truncated by me):
|
Digging some more shows that the cache does seem to be used. From a memcache telnet session, dumping an item gives:
So I would say the memcache server list is properly provisioned with the valid hostname, and some "ghost" hostnames are trying to get in and are rejected because they can't be resolved... It most likely fails here: How they get there is a mystery to me so far... @squaremo do you have any clue how those weird records could get there? Should we add a note to the FAQ to let people know that this is "ok" and doesn't break the cache? @dilshad18 could you check in your own setup whether you have something in memcache? (If you are not used to memcache, I followed this blog post and it helped me get used to it.) |
Yes, that sounds like a good diagnosis. Filling in some details, after looking in the Kubernetes docs re service discovery: The way it's set up in the example deployment (and chart, and Weave Cloud config, ...) is that memcached has a headless service. According to https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#services (also see https://github.com/kubernetes/dns/blob/master/docs/specification.md), in DNS there will be:
With the arguments given in the example deployment of fluxd, it'll query for the SRV records, then the memcache client code (linked above) will look up the IP of each host mentioned. So where it's failing is in that second step -- it can't resolve some or all of the hosts it got from the SRV records. But the question remains: how did those unresolvable endpoints get there in the first place? That I don't know :-( |
BTW it is entirely fine to give the memcached service a clusterIP (i.e., don't set it to |
The following error occurs in a Flux installation inside a minikube instance:
component=memcached err="error updating memcache servers: lookup 172-17-0-3.flux-memcached.flux.svc.cluster.local. on 10.96.0.10:53: no such host"
This is a rather fresh installation and only a few files have been applied. We updated a single file; it tries to update that and fails.