Data race in Emitter and unfortunate handling of threads #774
Comments
Thank you for reporting this @hakonk! I think we might be able to get around the issue by having a second lock to access `isSending`. If you have any other ideas or are open to contributing a solution for this, it's very welcome!
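To picture the "second lock" idea, here is a minimal sketch with made-up names (this is not the actual `Emitter` code): every read and write of the flag goes through one dedicated lock, including an atomic test-and-set for the send guard.

```swift
import Foundation

// Hypothetical names throughout; not the actual Emitter implementation.
final class LockedFlagSketch {
    private let isSendingLock = NSLock()
    private var _isSending = false

    // All reads and writes of the flag are synchronized by the same lock,
    // so Thread Sanitizer no longer sees an unprotected access.
    var isSending: Bool {
        get {
            isSendingLock.lock()
            defer { isSendingLock.unlock() }
            return _isSending
        }
        set {
            isSendingLock.lock()
            defer { isSendingLock.unlock() }
            _isSending = newValue
        }
    }

    // Atomic test-and-set: only one caller can move the flag from false to true.
    func beginSendingIfIdle() -> Bool {
        isSendingLock.lock()
        defer { isSendingLock.unlock() }
        if _isSending { return false }
        _isSending = true
        return true
    }
}
```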
We had a look at the particular data race, and it is not very problematic considering it's a boolean value. After looking some more at the code, it seems the main issue is that a global concurrent dispatch queue is used to serialize and send events. This is generally discouraged because it can create too many threads, see for example: https://forums.swift.org/t/how-to-currently-offload-stateless-blocking-work-hidden-executor-api/59128/2.
The screenshot from the Time Profiler in Instruments shows that several threads are employed to perform work in the Snowplow library. In this example we sent 10 events instead of 100 as in the example app above. If I understand the literature correctly, several threads will be created even if they don't block for long, so the issue is not really the blocking calls, even though that is unfortunate. One way to improve this is simply to use private serial queues. That way you get mutual exclusion and offloading of the main queue. Also, instead of waiting synchronously for the retry delay, it could be scheduled asynchronously.
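A minimal sketch of that approach, with illustrative names and a placeholder send result (this is not the actual SnowplowTracker code): a private serial queue provides both the mutual exclusion and the offloading, and the retry delay is scheduled with `asyncAfter` rather than sleeping a thread.

```swift
import Foundation

// Illustrative names only; not the SnowplowTracker implementation.
final class SerialQueueEmitterSketch {
    // Private serial queue: callers are offloaded, and `isSending` is only
    // ever touched from this one queue, so no lock is needed.
    private let emitQueue = DispatchQueue(label: "com.example.emitter")
    private var isSending = false
    private let retryDelay: TimeInterval = 5

    func flush() {
        emitQueue.async { self.sendGuard() }
    }

    private func sendGuard() {
        // Safe without a lock: only ever executed on `emitQueue`.
        guard !isSending else { return }
        isSending = true
        attemptEmit()
    }

    private func attemptEmit() {
        // ... serialize and send the pending events here ...
        let succeeded = Bool.random() // placeholder for the network result

        if succeeded {
            isSending = false
        } else {
            // Schedule the retry instead of Thread.sleep: no pool thread is blocked.
            emitQueue.asyncAfter(deadline: .now() + retryDelay) {
                self.isSending = false
                self.flush()
            }
        }
    }
}
```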
While the data race isn't that problematic in terms of concurrent read/write, I guess it could create race conditions that result in bugs, so that may be worth looking into.
Hi @hakonk, thanks for the additional investigation and suggestions! We have put together a PR that addresses the issue with unprotected access to the `isSending` property. I understand that your suggestions go a step further to improve the thread handling. We'll keep that in mind in future refactoring of the tracker internals.
Describe the bug
While running an app that makes use of SnowplowTracker (v5.0.0) with Thread Sanitizer enabled in Xcode, I have on some occasions encountered data races in the `Emitter` class of the SnowplowTracker library in a production app. More specifically, the data races involve concurrent access to `Emitter.isSending`.
To Reproduce
Here is a minimal example that demonstrates the issue, consisting of `TestSnowplowApp.swift` (the app entry point) and a `ContentView` struct in `ContentView.swift`.
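A rough sketch of such an app might look like the following; the namespace and collector URL are placeholders, and the `Snowplow.createTracker(namespace:endpoint:)` and `Structured` calls assume the v5 Swift API.

```swift
import SwiftUI
import SnowplowTracker

@main
struct TestSnowplowApp: App {
    var body: some Scene {
        WindowGroup { ContentView() }
    }
}

struct ContentView: View {
    // Placeholder namespace and collector endpoint, for illustration only.
    let tracker: TrackerController? = Snowplow.createTracker(
        namespace: "test",
        endpoint: "https://collector.example.com"
    )

    var body: some View {
        Button("Send a bunch of requests") {
            // Firing many events in quick succession exercises the Emitter's
            // concurrent dispatch and makes the race visible to TSan.
            for i in 0..<100 {
                _ = tracker?.track(Structured(category: "demo", action: "tap-\(i)"))
            }
        }
    }
}
```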
1. Enable Thread Sanitizer in the Diagnostics tab of the app's scheme and add a Runtime Issue breakpoint with "Thread Sanitizer" selected.
2. Run the app in the simulator in debug mode and press the button "Send a bunch of requests" several times until the breakpoint hits.
3. Observe that the breakpoint is reached.
Expected behavior
Concurrent access internally in the library should not trigger a TSan breakpoint.
Device information (please complete the following information):
Additional context
Here is an excerpt from `Emitter.swift:360-382` that I believe sheds some light on the issue.

This code is inherently not thread safe because `isSending` is read in `sendGuard` prior to obtaining the lock, and it is also read in many other places. Therefore one may still get a data race when `isSending` is written to, even if that mutation takes place after the lock has been obtained. Also, since `sendGuard` is dispatched on the global dispatch queue, which is a concurrent dispatch queue, one can be almost certain the code will execute on several threads.

Another issue, which isn't directly tied to the data race we're seeing, is that on line 437 in `Emitter.swift`, the thread is put to sleep for 5 seconds. This is unfortunate because, as I understand it, one can end up putting several threads belonging to the concurrent dispatch queue's pool to sleep. While I'm not sure how this affects Swift Concurrency's forward progress principle, I can't imagine it contributes in a positive manner.

I hope this issue sufficiently describes the problem. Otherwise I'm happy to provide more context.
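For illustration, the pattern described above boils down to something like the following sketch (not the actual `Emitter.swift` code; the lock type and method bodies are assumptions): `isSending` is read before the lock is taken, the guard is dispatched on the global concurrent queue, and the failure path sleeps a pool thread for 5 seconds.

```swift
import Foundation

// A sketch of the problematic pattern, not the real Emitter implementation.
final class RacyEmitterSketch {
    private var isSending = false
    private let lock = NSLock() // the real code uses a lock of some kind; NSLock is an assumption

    func flush() {
        // Global concurrent queue: sendGuard may run on many threads at once.
        DispatchQueue.global(qos: .default).async {
            self.sendGuard()
        }
    }

    private func sendGuard() {
        // Unsynchronized read: races with the write performed under the lock below.
        if isSending { return }

        lock.lock()
        let shouldSend = !isSending
        if shouldSend { isSending = true }
        lock.unlock()

        if shouldSend { attemptEmit() }
    }

    private func attemptEmit() {
        // ... send events; on failure, the retry path waits synchronously:
        Thread.sleep(forTimeInterval: 5) // blocks one of the pool's threads
        isSending = false
    }
}
```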