Fixed large number of SERVFAILs #413

Merged: 2 commits, Aug 5, 2024


@phillip-stephens phillip-stephens commented Aug 2, 2024

Resolves #411

Description

While chasing down an issue in the IPv6 recursive resolution, I was having a hard time understanding why we made some of the DNS lookups we did.

That led me to a section of code in the cachedRetryingLookup function whose purpose I couldn't explain, nor could I see how it avoided breaking iterative lookups.
At first glance it looks like it's trying to save a lookup for the nameservers, but it then returns that NS result regardless of what the original Question.Type was. This breaks the recursion for some domains; see the Testing section.

Removing it seems to solve the issue reported in #411. It appears to have been added as a performance enhancement, but I cannot explain how it avoids breaking resolution for some domains. Additionally, removing it improves performance, and I don't see how removal could reduce accuracy.
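The failure mode described above can be sketched in a few lines of Go. This is only an illustration under stated assumptions, not the ZDNS code: the types Record, Question and nsCache, the functions lookupWithShortcut and lookupWithoutShortcut, and the example.com data are all hypothetical names invented for this sketch.

```go
package main

import "fmt"

// Record is a simplified resource record (hypothetical type for this sketch).
type Record struct {
	Name, Type, Answer string
}

// Question is a simplified DNS question (hypothetical type for this sketch).
type Question struct {
	Name, Type string
}

// nsCache maps a name to cached NS records. It stands in for a real cache
// and is an assumption of this sketch, not ZDNS's cache structure.
type nsCache map[string][]Record

// lookupWithShortcut mimics the problematic pattern: on a cache hit it
// returns the NS records without checking which type was asked for.
func lookupWithShortcut(c nsCache, q Question, wire func(Question) []Record) []Record {
	if ns, ok := c[q.Name]; ok {
		return ns // wrong whenever q.Type is not "NS"
	}
	return wire(q)
}

// lookupWithoutShortcut mirrors the idea of the fix: drop the shortcut
// and always ask the wire for the type that was actually requested.
func lookupWithoutShortcut(q Question, wire func(Question) []Record) []Record {
	return wire(q)
}

func main() {
	// A warm cache, as it would be late in a large scan.
	c := nsCache{"example.com": {{Name: "example.com", Type: "NS", Answer: "ns1.example.net."}}}

	// A fake "wire" that answers with a record of the requested type.
	wire := func(q Question) []Record {
		return []Record{{Name: q.Name, Type: q.Type, Answer: "192.0.2.1"}}
	}

	q := Question{Name: "example.com", Type: "A"}
	fmt.Println("with shortcut:   ", lookupWithShortcut(c, q, wire)[0].Type)
	fmt.Println("without shortcut:", lookupWithoutShortcut(q, wire)[0].Type)
}
```

With a cold cache both functions behave the same, which would be consistent with a single lookup succeeding while the same name fails late in a large scan.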

Testing

This always works: echo "www.meteo.it" | ./zdns A --iterative

echo "www.meteo.it" | ./zdns A --iterative
{"data":{"answers":[{"answer":"www.meteo.it.cdn.cloudflare.net.","class":"IN","name":"www.meteo.it","ttl":300,"type":"CNAME"},{"answer":"104.22.3.199","class":"IN","name":"www.meteo.it.cdn.cloudflare.net","ttl":300,"type":"A"},{"answer":"172.67.4.5","class":"IN","name":"www.meteo.it.cdn.cloudflare.net","ttl":300,"type":"A"},{"answer":"104.22.2.199","class":"IN","name":"www.meteo.it.cdn.cloudflare.net","ttl":300,"type":"A"}],"protocol":"udp","resolver":"173.245.59.31:53"},"duration":0.27174049,"name":"www.meteo.it","status":"NOERROR","timestamp":"2024-08-02T21:15:43Z"}

However, the lookup fails when resolving 6k domains with www.meteo.it towards the back of the list, which gives the cache plenty of time to fill and this code path a chance to be hit.

main (repeated 3 times with same results)

make && head -n 6000 ../top_sites| ./zdns A --iterative | grep "www.meteo.it"
go build -o zdns
{"data":{"protocol":"","resolver":""},"duration":15.032074932,"name":"www.meteo.it","status":"SERVFAIL","timestamp":"2024-08-02T21:18:33Z"}

The result is a SERVFAIL. I ran the same test against v1.1.0 and v1.0.0 and got similar results to main (either SERVFAIL or an ITERATIVE_TIMEOUT error), so it seems this has been there for a while.

Phillip/411 (repeated 3 times)

➜  zdns git:(phillip/411) ✗ make && head -n 6000 ../top_sites| ./zdns A --iterative | grep "www.meteo.it"
go build -o zdns
{"data":{"answers":[{"answer":"www.meteo.it.cdn.cloudflare.net.","class":"IN","name":"www.meteo.it","ttl":300,"type":"CNAME"},{"answer":"172.67.4.5","class":"IN","name":"www.meteo.it.cdn.cloudflare.net","ttl":300,"type":"A"},{"answer":"104.22.2.199","class":"IN","name":"www.meteo.it.cdn.cloudflare.net","ttl":300,"type":"A"},{"answer":"104.22.3.199","class":"IN","name":"www.meteo.it.cdn.cloudflare.net","ttl":300,"type":"A"}],"protocol":"udp","resolver":"173.245.59.31:53"},"duration":0.346084432,"name":"www.meteo.it","status":"NOERROR","timestamp":"2024-08-02T21:20:28Z"}

Performance

Looking at that code, I can't think of an accuracy issue with removing it, since going to the wire should be at least as accurate as the cache, and more accurate if the cache is being polluted or used incorrectly. I was worried, though, that there'd be a marked drop in performance without this caching. In the benchmark, however, the runtime roughly halved (65.77s to 31.01s) and the number of failed domains dropped from 243 to 72, although timeouts rose from 5 to 30.

main

➜  zdns git:(main) ✗ make benchmark
go build -o zdns
cd ./benchmark && go run main.go stats.go
Benchmarking ZDNS, Resolving 7000 domains... 100% |███████████████████████████████████| (7000/7000, 106 it/s)
Benchmark took:                                                       65.77s
Min resolution time:                                                 23.61µs
Max resolution time:                                                  15.58s
Average resolution time:                                            759.68ms

Ten longest resolutions:
	www.sportmediaset.mediaset.it:                                15.58s
	www.hellomagazine.com:                                        15.55s
	eu12.proxysite.com:                                           15.51s
	wing.coupang.com:                                             15.51s
	www.skyscanner.es:                                            15.50s
	secim.ntv.com.tr:                                             15.48s
	laser247.online:                                              15.46s
	brunch.co.kr:                                                 15.45s
	download.vidbox.online:                                       15.44s
	y2mate.nu:                                                    15.00s

Domains resolved successfully:                                     6752/7000
Domains that timed out:                                                    5
	daryo.uz
	hi.tuberon.space
	myaadhaar.uidai.gov.in
	www.movilnet.com.ve
	y2mate.nu

Domains that failed:                                                     243

	0gomovies.la:                                               SERVFAIL
	1filmy4wep.bet:                                             SERVFAIL
	1qby-rjuv1r--api.dl-api.xyz:                                NXDOMAIN
	1w2u2tq7p1.jejstxlvca.net:                                  SERVFAIL
	20.allhen.online:                                           SERVFAIL
	5movierulz.pics:                                            SERVFAIL
	8171.bisp.gov.pk:                                           SERVFAIL
	abcnews.go.com:                                             SERVFAIL
	adliran.ir:                                                 SERVFAIL
	adme.media:                                                 SERVFAIL
	adult.contents.fc2.com:                                     SERVFAIL
	ahara.kar.nic.in:                                           SERVFAIL
	allen.in:                                                   SERVFAIL
	alpler.x1.tr.travian.com:                                   NXDOMAIN
	alpler.x3.turkey.travian.com:                               NXDOMAIN
	animeflix.live:                                             NXDOMAIN
	aposta.la:                                                  SERVFAIL
...

Phillip/411

➜  zdns git:(phillip/411) ✗ make benchmark
go build -o zdns
cd ./benchmark && go run main.go stats.go
Benchmarking ZDNS, Resolving 7000 domains... 100% |████████████████████████████████████| (7000/7000, 226 it/s)
Benchmark took:                                                       31.01s
Min resolution time:                                                 20.56µs
Max resolution time:                                                  15.00s
Average resolution time:                                            288.36ms

Ten longest resolutions:
	enrollment.aiou.edu.pk:                                       15.00s
	www.uidai.gov.in:                                             15.00s
	hi.tuberon.space:                                             15.00s
	www.myutiitsl.com:                                            15.00s
	www.imna.ir:                                                  15.00s
	myaadhaar.uidai.gov.in:                                       15.00s
	www.irctc.co.in:                                              15.00s
	uidai.gov.in:                                                 15.00s
	www.hamshahrionline.ir:                                       15.00s
	www.turkiye.gov.tr:                                           15.00s

Domains resolved successfully:                                     6898/7000
Domains that timed out:                                                   30
	adliran.ir
	daryo.uz
	enrollment.aiou.edu.pk
	faberlic.com
	france3-regions.francetvinfo.fr
	hi.tuberon.space
	m.digi24.ro
	moviesda9.me
	myaadhaar.uidai.gov.in
	namnak.com
	newtoki329.com
	nexoscans.net
	njavtv.com
	nregade3.nic.in
	okxxx1.com
	pixabay.com
	tathya.uidai.gov.in
	uidai.gov.in
	us.etrade.com
	www.hamshahrionline.ir
	www.imna.ir
	www.irctc.co.in
	www.khabaronline.ir
	www.lidl.pl
	www.myutiitsl.com
	www.pussyboy.net
	www.thetrainline.com
	www.turkiye.gov.tr
	www.uidai.gov.in
	www.xvideos53.com

Domains that failed:                                                      72

	0gomovies.la:                                               NXDOMAIN
	1qby-rjuv1r--api.dl-api.xyz:                                NXDOMAIN
	alpler.x1.tr.travian.com:                                   NXDOMAIN
	alpler.x3.turkey.travian.com:                               NXDOMAIN
	animeflix.live:                                             NXDOMAIN
	awardbonus.life:                                             REFUSED
	braidedpunkies.top:                                         NXDOMAIN
	bsebmatric.org:                                              REFUSED

@phillip-stephens

I've done some additional testing on this. With the change we get more NOERROR statuses (7700 to 7902 in the run below) and no SERVFAILs (down from 213). Opening up for review.

Ubuntu VM with 22 cores

main

$ head -n 8000 benchmark/10k_crux_top_domains.input| ./zdns A --iterative  --threads=250 --output-file=main.out

~/zdns on  main! ⌚ 16:13:15
$ cat main.out | jq ".status" | sort | uniq -c
   7700 "NOERROR"
     50 "NXDOMAIN"
     29 "REFUSED"
    213 "SERVFAIL"
      8 "TIMEOUT"

Phillip/411

$ cat 411.out | jq ".status" | sort | uniq -c
      4 "ITERATIVE_TIMEOUT"
   7902 "NOERROR"
     52 "NXDOMAIN"
     31 "REFUSED"
     11 "TIMEOUT"

@phillip-stephens phillip-stephens marked this pull request as ready for review August 5, 2024 18:20
@phillip-stephens phillip-stephens requested a review from a team as a code owner August 5, 2024 18:20
@phillip-stephens phillip-stephens requested a review from zakird August 5, 2024 18:20
@zakird zakird merged commit 4b4d6b8 into main Aug 5, 2024
3 of 4 checks passed
Successfully merging this pull request may close these issues.

Erroneous SERVFAIL errors when scanning larger numbers of domains