Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: proxy-upload semaphore_wait timeout #111

Merged
merged 1 commit into from
Nov 12, 2024

Conversation

supervacuus
Copy link
Collaborator

@supervacuus supervacuus commented Nov 12, 2024

While testing the crashpad_handler HTTP-proxy-based report upload on macOS without a registered mitmproxy certificate, I saw that we don't get any error logs from the crashpad_handler and never get into the completion handler.

This was a bug: I incorrectly specified the dispatch_semaphore_wait() timeout. The timeout it accepts is not a relative offset but an absolute timeout! So, by passing a 10-second offset, the timeout expects 10 seconds after an arbitrary absolute start point (which almost assuredly already passed).

I am still trying to understand why this wasn't an issue when testing the initial implementation, which I had to do so many times due to the finicky proxy config of NSURLSession.

@supervacuus
Copy link
Collaborator Author

I am still trying to understand why this wasn't an issue when testing the initial implementation, which I had to do so many times due to the finicky proxy config of NSURLSession.

One attempt to answer this is that a significant change in the GCD previously allowed the task to be completed (and updated refs) even though semaphore waiting led to exiting the surrounding function (and autorelease pool), whereas now it stops the task. It must have always been racy, but since it could complete the effect of essentially not waiting, it must have been negligible. I wrote this in May 2023, so I still tested it on macOS 13. We could run a proxy integration test on one of the macOS 13 GHA runner images.

Copy link
Member

@JoshuaMoelans JoshuaMoelans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@supervacuus
Copy link
Collaborator Author

One attempt to answer this is that a significant change in the GCD previously allowed the task to be completed (and updated refs) even though semaphore waiting led to exiting the surrounding function (and autorelease pool), whereas now it stops the task. It must have always been racy, but since it could complete the effect of essentially not waiting, it must have been negligible. I wrote this in May 2023, so I still tested it on macOS 13. We could run a proxy integration test on one of the macOS 13 GHA runner images.

A more sensible explanation would be the timing until the process exits. If the crashpad_handler terminates fast enough (e.g., simply due to machine difference) after timing out waiting on the dispatch_semaphore, then GCD would immediately stop all running tasks. This is happening on your machine. The request might start but will never be completed.

However, if that exit path takes longer to execute, the task will continue to run in the background. Since the completion handler closes over the function's scope in the autorelease pool, nothing will be freed because only the local references of the surrounding function will be dropped, and the task still has live references via its completion handler. In that case, the upload could be finished successfully. If the timing is correct, the completion handler could even write sync_rv and the response body in time since they could both be valid and written before being evaluated in the caller.

@supervacuus supervacuus merged commit b95f5c6 into getsentry Nov 12, 2024
8 checks passed
@supervacuus supervacuus deleted the fix/upload_semaphore_wait_timeout branch November 12, 2024 12:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants