Fix groupBy initial request off by one #2450

OlegDokuka · 2020-10-20T10:31:48Z

This fixes the following misalignment:

drainLoop in GroupByMain calls s.request(e) once e groups have been emitted downstream. At that moment each group has at least 1 element enqueued in it.
When UnicastGroupedFlux consumes elements for the first time, it requests the first element and calls main.s.request(e), where e includes that first element.
In other words, GroupByMain has already fulfilled the demand for the first element, and UnicastGroupedFlux then fulfills it a second time. As a result, each group adds 1 extra unit of demand, which eventually ends in an overflow.
To avoid that, this PR adds an isFirstRequest check that removes the redundant demand for the first element on the first requestN.
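
To make the accounting above concrete, here is a minimal sketch that only counts requests. It is not the FluxGroupBy code: the class GroupByDemandModel, the method surplusDemand, and its parameters are hypothetical names, and it assumes each group replenishes upstream one element at a time.

```java
// Simplified model of the demand accounting described in this PR.
// Hypothetical names throughout; this is NOT the actual FluxGroupBy implementation.
final class GroupByDemandModel {

    /**
     * Returns how many more elements were requested from upstream than were
     * consumed by the groups (0 means demand and consumption are aligned).
     */
    public static long surplusDemand(int groups, int elementsPerGroup, boolean skipFirstOnFirstRequest) {
        if (groups < 0 || elementsPerGroup < 1) {
            throw new IllegalArgumentException("groups >= 0 and elementsPerGroup >= 1 required");
        }
        long requested = 0;
        long consumed = 0;
        for (int g = 0; g < groups; g++) {
            // Main drain loop: a group was emitted downstream, so 1 is requested upstream.
            // The group already holds the element that created it.
            requested += 1;

            boolean isFirstRequest = true;
            for (int i = 0; i < elementsPerGroup; i++) {
                consumed += 1;
                long e = 1; // the group replenishes what it has just consumed
                if (isFirstRequest) {
                    isFirstRequest = false;
                    if (skipFirstOnFirstRequest) {
                        e -= 1; // the first element was already covered by the main drain loop
                    }
                }
                requested += e;
            }
        }
        return requested - consumed;
    }

    public static void main(String[] args) {
        System.out.println("without check: " + surplusDemand(3, 4, false)); // prints 3
        System.out.println("with check:    " + surplusDemand(3, 4, true));  // prints 0
    }
}
```

In this model the surplus without the check equals the number of groups, which matches the "1 extra demand per group" described above; with the check the surplus is zero.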

Signed-off-by: Oleh Dokuka [email protected]

codecov-io · 2020-10-20T11:09:43Z

Codecov Report

Merging #2450 (65d2ebd) into master (40715df) will decrease coverage by 0.07%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##             master    #2450      +/-   ##
============================================
- Coverage     83.75%   83.67%   -0.08%     
+ Complexity     4716     4698      -18     
============================================
  Files           391      389       -2     
  Lines         32292    32177     -115     
  Branches       6207     6190      -17     
============================================
- Hits          27046    26924     -122     
- Misses         3532     3540       +8     
+ Partials       1714     1713       -1

| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| .../main/java/reactor/core/publisher/FluxGroupBy.java | `84.57% <100.00%> (+0.52%)` | `5.00 <0.00> (ø)` |
| ...n/java/reactor/core/publisher/MonoMaterialize.java | `60.00% <0.00%> (-15.00%)` | `3.00% <0.00%> (ø%)` |
| ...c/main/java/reactor/core/publisher/SinksSpecs.java | `75.00% <0.00%> (-5.27%)` | `1.00% <0.00%> (ø%)` |
| ...ava/reactor/core/publisher/MonoFirstWithValue.java | `57.89% <0.00%> (-5.27%)` | `7.00% <0.00%> (-1.00%)` |
| ...actor/core/publisher/FluxOnBackpressureLatest.java | `81.00% <0.00%> (-3.00%)` | `4.00% <0.00%> (ø%)` |
| ...ava/reactor/core/publisher/FluxFirstWithValue.java | `68.83% <0.00%> (-2.60%)` | `7.00% <0.00%> (-1.00%)` |
| ...in/java/reactor/core/publisher/FluxWindowWhen.java | `84.43% <0.00%> (-1.89%)` | `4.00% <0.00%> (ø%)` |
| ...ava/reactor/core/publisher/FluxHandleFuseable.java | `69.86% <0.00%> (-1.07%)` | `5.00% <0.00%> (ø%)` |
| .../java/reactor/core/publisher/UnicastProcessor.java | `90.95% <0.00%> (-0.91%)` | `87.00% <0.00%> (ø%)` |
| .../java/reactor/core/publisher/EmitterProcessor.java | `83.75% <0.00%> (-0.84%)` | `93.00% <0.00%> (-1.00%)` |
| ... and 26 more | | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d367f08...5ba8625.

simonbasle

Can you add tests, at least one that validates that it fixes #2352?

reactor-core/src/main/java/reactor/core/publisher/FluxGroupBy.java

OlegDokuka · 2020-10-20T12:21:50Z

> can you add tests, at least one that validates that it fixes #2352 ?

Actually, it fixes the buffer overflow, but it does not fix possible drops of elements when cancel races with onNext. I'm not sure it is possible to guarantee that no elements will be dropped in such a race.

But yeah, I added some tests, just forgot to push them

pavelkuchin · 2020-12-07T18:20:20Z

Hi @simonbasle @OlegDokuka,

Are there plans to merge it any time soon?

simonbasle · 2020-12-07T19:23:25Z

> Hi @simonbasle @OlegDokuka,
>
> Are there plans to merge it any time soon?

Thanks for the ping. I need to review this again with the added tests; it fell through the cracks.
Once that is done, it should be part of the January release, I think.

In the meantime, if you think you are affected (based on the description in this PR), it would be highly valuable for us to get your early feedback: check out the branch and build it locally to see if it improves things for you. The same goes for @Sage-Pierce. Note that @OlegDokuka has already stated it doesn't really fix the exact issue described in #2352.

pavelkuchin · 2020-12-07T19:40:52Z

> > Hi @simonbasle @OlegDokuka,
> > Are there plans to merge it any time soon?
>
> Thanks for the ping. I need to review this again, with the added tests, it fell through the cracks.
> Given that, it should be released in the January release I think.
>
> In the meantime, if you think you were affected (from the description in this PR) it would be highly valuable for us to get your early feedback (by checking out the branch and building it locally to try to see if it improves things for you). Same for @Sage-Pierce, also @OlegDokuka already stated it didn't really fix the exact issue described in #2352

Thanks @simonbasle! The issue we are having is similar to #2138. We are using a reactive-kafka consumer and processing a Kafka event stream using a combination of groupBy/take and other operators. At arbitrary moments the stream hangs in production and, based on the logs, it seems to be related to groupBy. We are trying to reproduce the issue in our end-to-end tests right now; once we can reproduce it consistently, I'll try the fix and provide feedback.

pavelkuchin · 2020-12-08T20:07:12Z

Hi! I've pulled the changes and applied them to v3.4.1 of reactor-core. Unfortunately, it doesn't fix the issue described in #2138. I've also tried running the twoGroupsLongAsyncMergeHidden2 test that @OlegDokuka added, and it never fails (in my environment it passes both with and without the fix).

Signed-off-by: Oleh Dokuka <[email protected]>

simonbasle · 2020-12-09T14:20:36Z

@pavelkuchin thanks for trying it out nonetheless.
@OlegDokuka I have reviewed the PR and re-uploaded it with a 3.3.x base branch. It looks good to me (although it is unrelated to the other groupBy issues currently open).

reactorbot · 2020-12-09T16:43:45Z

@simonbasle this PR seems to have been merged on a maintenance branch, please ensure the change is merge-forwarded to intermediate maintenance branches and up to master 🙇

OlegDokuka mentioned this pull request Oct 20, 2020

Flowable#groupBy race leads to a back-pressure issue ReactiveX/RxJava#7100

Open

OlegDokuka force-pushed the bugfix/#2352 branch from f211225 to c40e359 Compare October 20, 2020 10:46

simonbasle suggested changes Oct 20, 2020

reactor-core/src/main/java/reactor/core/publisher/FluxGroupBy.java

OlegDokuka requested a review from simonbasle October 20, 2020 16:39

simonbasle added this to the 3.4.2 milestone Dec 7, 2020

OlegDokuka added 2 commits December 9, 2020 14:09

fixes GroupBy overflow issue

8b0c1d8

Signed-off-by: Oleh Dokuka <[email protected]>

adds tests

5ba8625

Signed-off-by: Oleh Dokuka <[email protected]>

simonbasle force-pushed the bugfix/#2352 branch from 65d2ebd to 5ba8625 Compare December 9, 2020 14:18

simonbasle changed the base branch from master to 3.3.x December 9, 2020 14:18

simonbasle modified the milestones: 3.4.2, 3.3.13.RELEASE Dec 9, 2020

simonbasle added the type/bug and type/enhancement labels Dec 9, 2020

simonbasle changed the title ~~fixes GroupBy overflow issue~~ fix groupBy initial request off by one Dec 9, 2020

simonbasle approved these changes Dec 9, 2020

simonbasle changed the title ~~fix groupBy initial request off by one~~ Fix groupBy initial request off by one Dec 9, 2020

simonbasle merged commit 2ff31b3 into reactor:3.3.x Dec 9, 2020

simonbasle deleted the bugfix/#2352 branch December 9, 2020 16:43

simonbasle added a commit that referenced this pull request Dec 9, 2020

Merge #2450 into 3.4.2

c9f98ce

TunaYagci mentioned this pull request Apr 12, 2021

GroupBy hangs when used with TakeUntil on version 3.3.13.RELEASE #2675

Closed

Sage-Pierce mentioned this pull request Jun 29, 2021

Major Performance Degradation from 3.4.1 to 3.4.7 with groupBy and take(Duration) with flatMap #2730

Closed
