Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cmd/opampsupervisor] Leaking goroutine detected in e2e tests #30931

Closed
crobert-1 opened this issue Jan 31, 2024 · 9 comments · Fixed by #32477
Closed

[cmd/opampsupervisor] Leaking goroutine detected in e2e tests #30931

crobert-1 opened this issue Jan 31, 2024 · 9 comments · Fixed by #32477
Labels
ci-cd CI, CD, testing, build issues cmd/opampsupervisor priority:p1 High

Comments

@crobert-1
Copy link
Member

Component(s)

cmd/opampsupervisor

Describe the issue you're reporting

Failure CI link: https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/7715775080/job/21031293410?pr=30809

goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 89 in state chan send, with github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor.newOpAMPServer.func1 on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor.newOpAMPServer.func1({0xaf2ff0, 0xc00007caa0}, {0xaf2d50?, 0xc0000664c0?})
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/cmd/opampsupervisor/e2e_test.go:87 +0x8f
github.com/open-telemetry/opamp-go/server.ConnectionCallbacksStruct.OnConnected(...)
	/home/runner/go/pkg/mod/github.com/open-telemetry/[email protected]/server/callbacks.go:41
github.com/open-telemetry/opamp-go/server.(*server).handleWSConnection(0xc0001[31](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/7715775080/job/21031293410?pr=30809#step:8:32)[36](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/7715775080/job/21031293410?pr=30809#step:8:37)0, {0xaf2ff0, 0xc00007caa0}, 0xc0003182c0, {0xaf2cf0, 0xc0000121f8})
	/home/runner/go/pkg/mod/github.com/open-telemetry/[email protected]/server/serverimpl.go:209 +0x182
created by github.com/open-telemetry/opamp-go/server.(*server).httpHandler
	/home/runner/go/pkg/mod/github.com/open-telemetry/[email protected]/server/serverimpl.go:188 +0x[37](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/7715775080/job/21031293410?pr=30809#step:8:38)b
]
exit status 1
FAIL	github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor	4.124s
@crobert-1 crobert-1 added the needs triage New item requiring triage label Jan 31, 2024
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@evan-bradley
Copy link
Contributor

Thanks @crobert-1. I'll look into this.

@crobert-1
Copy link
Member Author

Just to check, I added goleak locally to the opamp-go/server package and found a similar leak exists there. If you are able to confirm it's an issue in opamp-go we can file an issue there and ignore for now here.

Failing on the test server/TestServerReceiveSendMessageWithCompression

Failure output:

 Goroutine 74 in state IO wait, with internal/poll.runtime_pollWait on top of the stack:
internal/poll.runtime_pollWait(0x486ba6d0, 0x72)
	/usr/local/Cellar/go/1.21.0/libexec/src/runtime/netpoll.go:343 +0x85
internal/poll.(*pollDesc).wait(0xc000035d80?, 0xc000380000?, 0x0)
	/usr/local/Cellar/go/1.21.0/libexec/src/internal/poll/fd_poll_runtime.go:84 +0x27
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/Cellar/go/1.21.0/libexec/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000035d80, {0xc000380000, 0x1000, 0x1000})
	/usr/local/Cellar/go/1.21.0/libexec/src/internal/poll/fd_unix.go:164 +0x27a
net.(*netFD).Read(0xc000035d80, {0xc000380000?, 0x20?, 0x14b8fe0?})
	/usr/local/Cellar/go/1.21.0/libexec/src/net/fd_posix.go:55 +0x25
net.(*conn).Read(0xc000068258, {0xc000380000?, 0xc00029a000?, 0x5c?})
	/usr/local/Cellar/go/1.21.0/libexec/src/net/net.go:179 +0x45
bufio.(*Reader).fill(0xc000187560)
	/usr/local/Cellar/go/1.21.0/libexec/src/bufio/bufio.go:113 +0x103
bufio.(*Reader).Peek(0xc000187560, 0x2)
	/usr/local/Cellar/go/1.21.0/libexec/src/bufio/bufio.go:151 +0x53
github.com/gorilla/websocket.(*Conn).read(0xc0000fadc0, 0xc0000be270?)
	/Users/crobert/go/pkg/mod/github.com/gorilla/[email protected]/conn.go:378 +0x26
github.com/gorilla/websocket.(*Conn).advanceFrame(0xc0000fadc0)
	/Users/crobert/go/pkg/mod/github.com/gorilla/[email protected]/conn.go:824 +0x6d
github.com/gorilla/websocket.(*Conn).NextReader(0xc0000fadc0)
	/Users/crobert/go/pkg/mod/github.com/gorilla/[email protected]/conn.go:1034 +0x13e
github.com/gorilla/websocket.(*Conn).ReadMessage(0xc0000c6c48?)
	/Users/crobert/go/pkg/mod/github.com/gorilla/[email protected]/conn.go:1120 +0x13
github.com/open-telemetry/opamp-go/server.(*server).handleWSConnection(0xc000194d20, {0x163e878, 0xc0000255e0}, 0xc0000fadc0, {0x163d210, 0xc000012c78})
	/Users/crobert/dev/opamp-go/server/serverimpl.go:216 +0x19e
created by github.com/open-telemetry/opamp-go/server.(*server).httpHandler in goroutine 94
	/Users/crobert/dev/opamp-go/server/serverimpl.go:188 +0x36e

@crobert-1
Copy link
Member Author

+1 freq: https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/7835160544/job/21380129589?pr=31141

@evan-bradley
Copy link
Contributor

I looked into this a bit and I suspect it may be resolved by this PR: open-telemetry/opamp-go#213. I think we should probably disable the goleak check for now.

evan-bradley added a commit that referenced this issue Feb 8, 2024
**Description:**

Temporarily disable the goleak check to work around
#30931.

Co-authored-by: Evan Bradley <[email protected]>
@evan-bradley
Copy link
Contributor

I'll leave this issue open for now as a reminder to check for this when adding the tests back.

Copy link
Contributor

github-actions bot commented Apr 9, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@evan-bradley
Copy link
Contributor

I checked this again now that we're on opamp-go 0.14 and the issue is still present. Unless there is something we're doing wrong in the E2E tests, it will need a separate investigation in the library.

@crobert-1
Copy link
Member Author

I filed open-telemetry/opamp-go#271 against the opamp-go repo to track another leak that was detected.

evan-bradley pushed a commit that referenced this issue Apr 17, 2024
**Description:** Fix leaks in e2e tests

**Link to tracking Issue:** resolve #30931
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci-cd CI, CD, testing, build issues cmd/opampsupervisor priority:p1 High
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants