Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added context check in iterativeLookup, solves cache contention issue #407

Merged
merged 1 commit into from
Jul 25, 2024

Conversation

phillip-stephens
Copy link
Contributor

@phillip-stephens phillip-stephens commented Jul 25, 2024

Resolves #404

Description

More details in #404, but in short main saw ZDNS slow to a crawl and pprof trace of goroutines showed most were waiting on a cache shard lock. I used git bisect to find the first bad version and it fortunately wasn't the large CLI + library split PR (c192044) but a much smaller commit's change-set.

Turned out, the first bad version was 67026d, where I tried to improve timeout handling. Specifically, here where I removed the timeout check in cachedRetryingLookup. I believe what was happening was that as long as we never went to the wire (where we do check the context again)but stayed just hitting the cache, this execution loop would never time out and the thread would be stuck:
image

By adding a check on the context in iterativeLookup, we can break the cycle.

Testing

Command - head -n 7000 10k_crux.input | ./zdns A --iterative --threads=100 --timeout=20 --iteration-timeout=20 --output-file="out.log

Time to Crawl 7k Domains
v1.1.0: - 1 min. 50 sec.
main: - > 10 mins, gave up at this point. There were 50/7k domains yet to be resolved
e5e2d8 - This PR: - 1 min 26 secs

@phillip-stephens phillip-stephens marked this pull request as ready for review July 25, 2024 20:01
@phillip-stephens phillip-stephens requested a review from a team as a code owner July 25, 2024 20:01
@zakird zakird merged commit 4783e42 into main Jul 25, 2024
3 checks passed
@zakird zakird deleted the phillip/404-fix-cache-contention branch July 25, 2024 21:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Regression, Bug] Multi routine Iteration resolving got infinite loop
2 participants