-
Notifications
You must be signed in to change notification settings - Fork 667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvement to the Batch Span Processor #949
Comments
Indeed this code looks problematic. Notice how all the notify calls are within
What is more, we don't even have a boolean to wait on. This could even result in a deadlock there if the flush completes before we start to wait for it. I guess both problems could be solved at once by replacing the flush_condition with a threading.Event. |
@snehilchopra
Can you explain a bit more on how this would cause a deadlock, and why a theading.Event could solve this instead of |
This is the sequence of events that would deadlock:
T1 will now wait forever because the flush token has already been dequed. Notifying a condition only wakes up threads that are already waiting. |
i guess it is not really a deadlock since thread T1 will continue once the timeout on the flush condition elapses. another problem is that calls to |
Hi @lzchen. This is to handle spurious wake-ups. We check on a condition if the thread woke up spuriously. If the condition is satisfied, break out of the loop. Else, go back to waiting. |
Closed with #1062 |
Context
The python Batch Span Processor uses two threads, namely the main thread and a worker thread, in a synchronized manner to export batches of spans to the configured exporter.
However, it seems that the implementation might run into some unexpected behavior.
Problem
1.) On lines 163, 216, and 241, I think there is a high possibility that these notify calls actually miss waking up the worker thread.
In other words, it might be possible that the
notify
calls are made BEFORE the worker thread actually starts towait
.2.) Moreover, I believe spurious wake-ups with correct timeout recalculation seems to be missing too, as seen in line 245. Although I do realize that they rarely ever occur, it might be a good idea to account for them for robustness.
Solution
I have just implemented the batch span processor in C++ addressing these concerns.
For preventing notify calls from being missed, they could be wrapped in a while loop with a flag to check if the worker thread actually woke up.
Moreover, for spurious wake-ups with timeout recalculation, something like this could be done for python.
Please let me know what you think about this! Thanks!
@mauriciovasquezbernal @reyang @Oberon00
The text was updated successfully, but these errors were encountered: