
[Possible bugs] Errors on a new deploy #279

Open · fiskhest opened this issue Jul 31, 2023 · 4 comments
Labels: bug (Something isn't working), pending answer

@fiskhest commented Jul 31, 2023

Describe the bug

mailu-postfix sporadically reports:

Jul 31 19:31:49 mail postfix/smtp[8523]: warning: DNSSEC validation may be unavailable
Jul 31 19:31:49 mail postfix/smtp[8523]: warning: reason: dnssec_probe 'ns:.' received a response that is not DNSSEC validated

mailu-dovecot sporadically reports:

Jul 31 15:45:07 lmtp(10822): Error: SSL context initialization failed, disabling SSL: Can't load SSL certificate (ssl_cert setting): The certificate is empty

Environment

  • k3s

Additional context
These two errors pop up in the logs occasionally. However, I am not seeing anything actually broken: every use case I test works, and I cannot discern which operations within the setup trigger them or what pattern in the workloads they follow.
Are they safe to disregard, or should they be cause for concern?

@fiskhest added the bug label Jul 31, 2023
@fastlorenzo (Collaborator) commented

@fiskhest (Author) commented Aug 11, 2023

Thanks for the suggestion; I did indeed notice issues with this when migrating my previous Mailu 1.8 setup.
I had to troubleshoot and enable forwarding (the coredns workaround from #144, Mailu/Mailu#2619 (comment)) to get mailu-admin to start. This is what my stack has reported since I got the setup running (albeit with these intermittent errors, as described above).

laptop -> pihole upstream dns (with DNSSEC enabled)

❯ dig example.org. A +adflag @192.168.27.221 | grep "flags:.*ad"
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

mailu-dovecot -> coredns (k3s dns, which forwards to pihole)

mailu-dovecot-7cd57fd5bf-6rfdn:/app# dig example.org. A +adflag @10.43.0.10 | grep "flags:.*ad"
;; flags: qr aa rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

After posting this issue, I noticed (maybe related) connection issues to the upstream DNSBL servers in the mailu-rspamd logs. While troubleshooting that manually, I saw that the rspamd container still did not get the AD flag. I've worked around it for now by patching the rspamd deployment with dnsPolicy: ClusterFirstWithHostNet (a rough sketch of the patch follows after the dig output below), and when verifying manually I can see that it now resolves with DNSSEC validation:

rspamd:/app# dig example.org. A +adflag @10.43.0.10 | grep "flags:.*ad"
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
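
For reference, the patch I applied was roughly the following (the deployment name and namespace are from my setup and may well differ in yours):

kubectl patch deployment mailu-rspamd -n mailu --type merge \
  -p '{"spec":{"template":{"spec":{"dnsPolicy":"ClusterFirstWithHostNet"}}}}'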

But I still see stuff like this in logs:

2023-08-12 01:31:32 #170(controller) <7f6o1j>; monitored; rspamd_monitored_propagate_error: servfail on resolving multi.surbl.org, disable object
2023-08-12 01:32:24 #170(controller) <imhkkk>; monitored; rspamd_monitored_dns_cb: DNS query blocked on multi.uribl.com (127.0.0.1 returned), possibly due to high volume
2023-08-12 01:13:42 #170(controller) <c9hxbz>; monitored; rspamd_monitored_dns_cb: DNS reply returned 'no error' for zen.spamhaus.org while 'no records with this name' was expected when querying for '1.0.0.127.zen.spamhaus.org'(likely DNS spoofing or BL internal issues)

When writing this up, a thought struck me: should a working DNS setup be able to dig 1.0.0.127.zen.spamhaus.org. A +adflag @x.y.z.i and see ad in the returned flags? That does not seem to work anywhere in my stack.
I also just noticed, while trying a couple more ideas, that I still seem to have some DNS issue, as this only works intermittently:

rspamd:/app# dig 1.0.0.127.zen.spamhaus.org +short
127.255.255.254
rspamd:/app# dig 1.0.0.127.zen.spamhaus.org +short
# no result
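
A quick way to see how flaky it is, just repeating the same query in a loop (the resolver IP is my cluster's coredns service and will differ elsewhere):

rspamd:/app# for i in 1 2 3 4 5; do dig +short 1.0.0.127.zen.spamhaus.org @10.43.0.10; done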

Corefile (coredns v1.9.4)

  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        hosts /etc/coredns/NodeHosts {
          ttl 60
          reload 15s
          fallthrough
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
    import /etc/coredns/custom/*.server

@fastlorenzo (Collaborator) commented

Hmm, I'll use #326 as a basis and make the option available for all containers; that should let you apply the workaround to any container as needed.
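
Something along these lines in values.yaml, purely to illustrate the idea (these key names are not the final chart interface):

  rspamd:
    dnsPolicy: ClusterFirstWithHostNet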

@fastlorenzo self-assigned this Jun 28, 2024
@nextgens commented

This is a bad idea.

The RBLs are blocking @fiskhest because he uses a shared resolver upstream and they receive too many queries from it. It has nothing to do with DNSSEC. The right fix is to run your own recursive resolver.

The postfix warning is the only worrying one here... and odds are it is due to the upstream recursive resolver sometimes not doing DNSSEC recursion (maybe when the entry is cached, or when it's under high load?). The fix is the same though: do not rely on a recursive resolver you do not control; run your own.
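
For example, a minimal unbound configuration for a dedicated recursive resolver could look roughly like this (a sketch only; the listen interface, allowed network and trust-anchor path are assumptions to adjust for your cluster):

# minimal unbound.conf sketch for a dedicated recursive resolver
server:
    interface: 0.0.0.0
    # only answer queries from the pod/cluster network (adjust the CIDR)
    access-control: 10.0.0.0/8 allow
    # enable DNSSEC validation via the root trust anchor
    auto-trust-anchor-file: "/var/lib/unbound/root.key"
    # no forward-zone configured: unbound performs full recursion itself

Running the resolver next to the mail stack means the RBLs see queries coming from your own address rather than a shared upstream, and DNSSEC validation happens under your control.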
