-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document when groupBy can deadlock because of beckpressure #3443
Comments
That's true, In Akka Stream there is some notes about this.
|
Just so as to not flood the issues-section I found that the groupBy operator blocks/deadlocks in a case where the flatMap retries. Using backoff on the retry fixes this situation. e.g. Flux.just(
Map.entry(1,1),
Map.entry(2,1), //error from this blocks everything
Map.entry(3,3),
Map.entry(2,2)
).groupBy(Map.Entry::getKey)
.flatMap(grp ->
grp.switchMap(e -> Mono.just(e).flatMapMany(this::factory).retry())
)
<T> Flux<T> factory(Map.Entry<Integer, Integer> entry) {
if (!Objects.equals(entry.getKey(), entry.getValue()))
return Flux.error(...);
return Flux.just(...);
} |
@OlegDokuka I want to get started with contributing to this. Could you point out where the documentation needs to be added? |
Hey, @NamrataGuptaRoy. There seem to be 4 Also, we have a mention of the problems in the reference documentation. Perhaps adding a bit more examples to the warning section would be helpful. Perhaps best would be to start with the list of considered additions here in the issue and once we consider the list is cohesive you could open a PR? Thanks, I look forward to your ideas! |
Going through few issues and threads related to groupBy, I feel, it is known that groupBy works well with LOW cardinality, but what quantity is considered LOW (its a relative term) is unclear among users, this results in deadlock situations. I have the below proposal for the documentation: In the other scenario, where cardinality > concurrency (no.of groups > no. of subscribers), Example using concatMap:
In the above example:
The test gets stuck after printing these elements. Explanation: Initially groupBy requests 5 elements
concatMap has concurrency 1, therefore group with key 'a' is the only group subscribed. Next, groupBy requests for additional 3 items (2 spaces are already occupied in buffer)
Out of these "apple" is consumed rest remain in the buffer Next, groupBy requests for additional 1 item (4 spaces are already occupied in buffer)
Now, nothing from the buffer belongs to group a, hence no more consumption happens by concatMap and it remains open. In the same example, if the data was in slightly different order, for example :
The same test would PASS successfully (the same concatMap would be able to receive a complete signal, complete one group subscribe to next group and so on). |
Pardon the huge comment !
Although, I would not put the entire example and note at all the places, only the relevant parts in the respective operator documentations and probably the example in the reference documentation
Let me know your inputs and suggestions, and whether this looks good :) |
Thanks @NamrataGuptaRoy - the example is really good and easy to follow! Please go ahead and submit a PR for the reference documentation update with the example. You can also combine this with Javadoc updates you speak of. The PR should target the I am only unclear about the retry + backoff case, so that can be another PR/issue. Do we know why retry deadlocks, but a retry with backoff does not? |
Thanks @chemicL ! I will work on creating the PRs.
Example of deadlock with above method:
If used with backoff, it throws the error and sends the complete signal.
Also, in the scenario where retry succeeds after a couple of times, the process will eventually be completed even without a backoff specified (even when used alongwith groupBy).
Example of NO deadlock with groupBy + retry without backoff:
Let me know your thoughts! |
Oh, that's what you mean. I was confused by the term "backoff" which implied some notion of delay in my head. Right, so the problems can happen when you retry indefinitely. That absolutely makes sense. Yet, I think is a topic for another discussion and is not directly connected to
Perhaps something regarding |
Yes @chemicL I agree, this is not directly connected to groupBy. (assuming this interpretation of @MikkelHJuul 's comment is correct) |
…eactor#3443) This documents the scenarios where backpressure can lead to deadlock in groupBy and how it could be avoided.
Thanks, @NamrataGuptaRoy ! With #3872 now merged I think this case is now resolved. Congrats on your contribution! 🎉 |
recent activity related to groupBy deadlock (#3442 #3427) adds extra motivation to clarify how groupBy has to be used:
Documentation Issue
The text was updated successfully, but these errors were encountered: