
[Possible bugs] Errors on a new deploy #279

Open · fiskhest opened this issue Jul 31, 2023 · 4 comments
Labels: bug (Something isn't working), pending answer

@fiskhest commented Jul 31, 2023

Describe the bug

mailu-postfix sporadically reports:

Jul 31 19:31:49 mail postfix/smtp[8523]: warning: DNSSEC validation may be unavailable
Jul 31 19:31:49 mail postfix/smtp[8523]: warning: reason: dnssec_probe 'ns:.' received a response that is not DNSSEC validated

mailu-dovecot sporadically reports:

Jul 31 15:45:07 lmtp(10822): Error: SSL context initialization failed, disabling SSL: Can't load SSL certificate (ssl_cert setting): The certificate is empty

Environment

  • k3s

Additional context
These two errors pop up in the logs occasionally. However, I am not seeing anything actually broken: every use case I test works, and I cannot discern which operations within the setup trigger them or what pattern in the workloads they follow.
Are they safe to disregard, or should they be cause for concern?

@fiskhest added the bug label Jul 31, 2023
@fastlorenzo (Collaborator) commented

@fiskhest (Author) commented Aug 11, 2023

Thanks for the suggestion; I did indeed notice issues with this when migrating my previous Mailu 1.8 setup.
I had to troubleshoot and enable forwarding (the coredns workaround from #144, Mailu/Mailu#2619 (comment)) to get mailu-admin to start. This is what my stack has reported since I got the setup running (albeit with these intermittent errors, as described above).

laptop -> pihole upstream dns (with DNSSEC enabled)

❯ dig example.org. A +adflag @192.168.27.221 | grep "flags:.*ad"
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

mailu-dovecot -> coredns (k3s dns, which forwards to pihole)

mailu-dovecot-7cd57fd5bf-6rfdn:/app# dig example.org. A +adflag @10.43.0.10 | grep "flags:.*ad"
;; flags: qr aa rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

After posting this issue, I noticed (maybe related) connection issues to the upstream DNSBL servers in the mailu-rspamd logs. While troubleshooting that manually, I saw that the rspamd container still did not get the AD flag. I've worked around it for now by patching the rspamd deployment with dnsPolicy: ClusterFirstWithHostNet (a rough sketch of the patch follows after the dig output below), and when verifying manually I can see that it now resolves with DNSSEC validation:

rspamd:/app# dig example.org. A +adflag @10.43.0.10 | grep "flags:.*ad"
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
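
For reference, the patch I applied was roughly the following (the deployment name and namespace are from my setup and may well differ in yours):

kubectl patch deployment mailu-rspamd -n mailu --type merge \
  -p '{"spec":{"template":{"spec":{"dnsPolicy":"ClusterFirstWithHostNet"}}}}'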

But I still see stuff like this in logs:

2023-08-12 01:31:32 #170(controller) <7f6o1j>; monitored; rspamd_monitored_propagate_error: servfail on resolving multi.surbl.org, disable object
2023-08-12 01:32:24 #170(controller) <imhkkk>; monitored; rspamd_monitored_dns_cb: DNS query blocked on multi.uribl.com (127.0.0.1 returned), possibly due to high volume
2023-08-12 01:13:42 #170(controller) <c9hxbz>; monitored; rspamd_monitored_dns_cb: DNS reply returned 'no error' for zen.spamhaus.org while 'no records with this name' was expected when querying for '1.0.0.127.zen.spamhaus.org'(likely DNS spoofing or BL internal issues)

When writing this up, a thought struck me: should a working DNS setup be able to dig 1.0.0.127.zen.spamhaus.org. A +adflag @x.y.z.i and see ad in the returned flags? That does not seem to work anywhere in my stack.
I also just noticed, while trying a couple more ideas, that I still seem to have some DNS issue, as this only works intermittently:

rspamd:/app# dig 1.0.0.127.zen.spamhaus.org +short
127.255.255.254
rspamd:/app# dig 1.0.0.127.zen.spamhaus.org +short
# no result
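
A quick way to see how flaky it is, just repeating the same query in a loop (the resolver IP is my cluster's coredns service and will differ elsewhere):

rspamd:/app# for i in 1 2 3 4 5; do dig +short 1.0.0.127.zen.spamhaus.org @10.43.0.10; done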

Corefile (coredns v1.9.4)

  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        hosts /etc/coredns/NodeHosts {
          ttl 60
          reload 15s
          fallthrough
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
    import /etc/coredns/custom/*.server

@fastlorenzo (Collaborator) commented

Hmm, I'll use #326 as a basis and make the option available for all containers; that should let you apply the workaround to any container as needed.
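
Something along these lines in values.yaml, purely to illustrate the idea (these key names are not the final chart interface):

  rspamd:
    dnsPolicy: ClusterFirstWithHostNet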

@fastlorenzo self-assigned this Jun 28, 2024
@nextgens commented

This is a bad idea.

The RBLs are blocking @fiskhest because he uses a shared resolver upstream and they receive too many queries from it. It has nothing to do with DNSSEC. The right fix is to run your own recursive resolver.

The postfix warning is the only worrying one here... and odds are it is due to the upstream recursive resolver sometimes not doing DNSSEC recursion (maybe when the entry is cached, or when it's under high load?). The fix is the same though: do not rely on a recursive resolver you do not control; run your own.
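
For example, a minimal unbound configuration for a dedicated recursive resolver could look roughly like this (a sketch only; the listen interface, allowed network and trust-anchor path are assumptions to adjust for your cluster):

# minimal unbound.conf sketch for a dedicated recursive resolver
server:
    interface: 0.0.0.0
    # only answer queries from the pod/cluster network (adjust the CIDR)
    access-control: 10.0.0.0/8 allow
    # enable DNSSEC validation via the root trust anchor
    auto-trust-anchor-file: "/var/lib/unbound/root.key"
    # no forward-zone configured: unbound performs full recursion itself

Running the resolver next to the mail stack means the RBLs see queries coming from your own address rather than a shared upstream, and DNSSEC validation happens under your control.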
