Deadlock, Goroutines count reaching sky high #820

Closed
JaskaranSM opened this issue Feb 24, 2023 · 17 comments

@JaskaranSM

This bug was not present in the v1.47 build I was using. However, I updated the client today as suggested in #813 (comment). After adding around 10 torrents, the runtime started showing goroutine spikes; it looks like a deadlock caused by the new changes.

Goroutines: 45570
Alloc: 2.4 GB
TotalAlloc: 795.1 GB
HeapAlloc: 2.4 GB
NumGC: 1594
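
For anyone wanting to track the same numbers: the field names above match `runtime.MemStats`, so a snapshot like this can probably be collected with the standard `runtime` package. This is only a minimal sketch; the `runtimeStats` type and `snapshotStats` function are hypothetical names, not something from the reporter's application or the torrent library.

```go
package main

import (
	"fmt"
	"runtime"
)

// runtimeStats is a hypothetical container for the counters quoted above.
type runtimeStats struct {
	Goroutines int
	Alloc      uint64 // bytes of allocated heap objects
	TotalAlloc uint64 // cumulative bytes allocated for heap objects
	HeapAlloc  uint64 // same value as Alloc
	NumGC      uint32 // completed GC cycles
}

// snapshotStats reads the current goroutine count and memory statistics.
func snapshotStats() runtimeStats {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return runtimeStats{
		Goroutines: runtime.NumGoroutine(),
		Alloc:      m.Alloc,
		TotalAlloc: m.TotalAlloc,
		HeapAlloc:  m.HeapAlloc,
		NumGC:      m.NumGC,
	}
}

func main() {
	s := snapshotStats()
	fmt.Printf("Goroutines: %d\nAlloc: %d B\nTotalAlloc: %d B\nHeapAlloc: %d B\nNumGC: %d\n",
		s.Goroutines, s.Alloc, s.TotalAlloc, s.HeapAlloc, s.NumGC)
}
```

Logging this periodically makes a leak visible as a goroutine count that keeps climbing instead of settling after torrents finish.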
@anacrolix
Owner

Thanks for the report. Are you seeding/uploading a lot? This commit sticks out as a likely candidate: 2cb7121.

@JaskaranSM
Author

Any update on this? After 11 days of uptime, the application's goroutine count has reached the stats below:

Alloc: 1.6 GB | TAlloc: 2.9 TB | GC: 14803 | GR: 85654 | TH: 15 | CPU: 25.00% | DL: 2.5 MB | UP: 0 B

@JaskaranSM
Author

Tested: the goroutine leak is not present with 1.48.0 + e8971ea.

@anacrolix
Owner

If you could provide a stack trace, that would be amazing! Also, if you want to try running with 2cb7121 reverted, that would confirm whether it's the cause.

@anacrolix
Owner

The goroutine profile mentioned at https://pkg.go.dev/runtime/pprof#Profile, and exposed via https://pkg.go.dev/net/http/pprof, is an easy way to capture that. If you aren't familiar with it, I have a helper that I use to do it: https://github.com/anacrolix/envpprof.

@anacrolix
Owner

@JaskaranSM I believe I have a fix, if the above suspect is indeed the issue. A stack trace would have confirmed it. Please try out this commit: 3c8d702.

@anacrolix anacrolix self-assigned this May 3, 2023
@JaskaranSM
Author

Tested the latest dev branch. When I tried to go get the aforementioned commit hash directly, it resulted in an unknown revision error (GitHub says it's not attached to a branch and could be from outside the repo), so I switched to dev because it includes that commit. Here are the results of a quick test:
The client seems to close goroutines fine.
The download performance seems to have improved somehow (I have yet to look at the full commit history of dev).
I have also attached the pprof graphs. Envpprof is a nice lib; manually adding pprof handlers in every application is a hassle, so great work there as well.
profiles.zip

@anacrolix
Owner

Thank you! I don't know why doing it the way envpprof does it isn't the default. I've been doing it this way for nearly 10 years.

@JaskaranSM
Author

Client panicked for some reason on a torrent.
logs: here
Magnet: here

@BriungRi

I can confirm I am observing this issue when running Erigon v2.43.0, which seems to depend on v1.48.1-0.20230219022425-e8971ea0f1bf.

Here's what my goroutine pprof looks like:
Screenshot 2023-05-15 at 10 36 00 AM

Attaching the pprof file as well (zipped so that it is a file format GitHub supports):
goroutine.pprof.zip

@anacrolix
Owner

@BriungRi please run after go get github.com/anacrolix/torrent@dev. This should be fixed.

@anacrolix
Owner

anacrolix commented May 16, 2023

> Client panicked for some reason on a torrent. logs: here Magnet: here

@JaskaranSM Thank you very much!

@anacrolix
Owner

@JaskaranSM Are you using any non-standard transports? The panic you show should only occur with WebRTC, or perhaps webseeding. I'll have a fix soon regardless.

anacrolix added a commit that referenced this issue May 16, 2023
@anacrolix
Owner

I will tentatively close this: it is fixed in dev and will be in v1.51.0. Let me know if it's not fixed in either of those places!

@AskAlexSharov
Collaborator

@anacrolix Hi. Is it possible to backport this fix to v1.48.1? We still support go1.19, and v1.49.0 of the torrent lib requires go1.20, so we can't upgrade for now.

v1.48.1 doesn't seem to have the netip-addrport.go file.

@anacrolix
Owner

I just recently restored compatibility with go 1.19 in v1.51.3 (I also needed it for another project I'm working on). Could you try that?

@AskAlexSharov
Collaborator

It works, thank you.
