Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix groupBy initial request off by one #2450

Merged
merged 2 commits into from
Dec 9, 2020

Conversation

OlegDokuka
Copy link
Contributor

@OlegDokuka OlegDokuka commented Oct 20, 2020

This fixes the following misalignment:

  1. drainLoop in GroupByMain does s.request(e) when e groups are produced to the downstream. Each group at the moment has at least 1 element enqueued into it.
  2. When UnicastGroupedFlux consumes elements for the first time, it requests the first element and does main.s.request(e) where e covers the first element
  3. That said that GroupByMain fulfilled the demand for the first element and then UnicastGroupedFlux does the same for the second time. It leads that for each group we have 1 extra demand which then ends up with overflow.
  4. To avoid that, this PR adds isFirstRequest check which allows removing that redundant demand for the first element on the first requestN

Signed-off-by: Oleh Dokuka [email protected]

@codecov-io
Copy link

codecov-io commented Oct 20, 2020

Codecov Report

Merging #2450 (65d2ebd) into master (40715df) will decrease coverage by 0.07%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #2450      +/-   ##
============================================
- Coverage     83.75%   83.67%   -0.08%     
+ Complexity     4716     4698      -18     
============================================
  Files           391      389       -2     
  Lines         32292    32177     -115     
  Branches       6207     6190      -17     
============================================
- Hits          27046    26924     -122     
- Misses         3532     3540       +8     
+ Partials       1714     1713       -1     
Impacted Files Coverage Δ Complexity Δ
.../main/java/reactor/core/publisher/FluxGroupBy.java 84.57% <100.00%> (+0.52%) 5.00 <0.00> (ø)
...n/java/reactor/core/publisher/MonoMaterialize.java 60.00% <0.00%> (-15.00%) 3.00% <0.00%> (ø%)
...c/main/java/reactor/core/publisher/SinksSpecs.java 75.00% <0.00%> (-5.27%) 1.00% <0.00%> (ø%)
...ava/reactor/core/publisher/MonoFirstWithValue.java 57.89% <0.00%> (-5.27%) 7.00% <0.00%> (-1.00%)
...actor/core/publisher/FluxOnBackpressureLatest.java 81.00% <0.00%> (-3.00%) 4.00% <0.00%> (ø%)
...ava/reactor/core/publisher/FluxFirstWithValue.java 68.83% <0.00%> (-2.60%) 7.00% <0.00%> (-1.00%)
...in/java/reactor/core/publisher/FluxWindowWhen.java 84.43% <0.00%> (-1.89%) 4.00% <0.00%> (ø%)
...ava/reactor/core/publisher/FluxHandleFuseable.java 69.86% <0.00%> (-1.07%) 5.00% <0.00%> (ø%)
.../java/reactor/core/publisher/UnicastProcessor.java 90.95% <0.00%> (-0.91%) 87.00% <0.00%> (ø%)
.../java/reactor/core/publisher/EmitterProcessor.java 83.75% <0.00%> (-0.84%) 93.00% <0.00%> (-1.00%)
... and 26 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d367f08...5ba8625. Read the comment docs.

Copy link
Contributor

@simonbasle simonbasle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add tests, at least one that validates that it fixes #2352 ?

@OlegDokuka
Copy link
Contributor Author

OlegDokuka commented Oct 20, 2020

can you add tests, at least one that validates that it fixes #2352 ?

Actually, it fixes buffer overflow, but it does not fix possible drops of elements in case cancel onNext races. I'm not sure it is possible to guarantee that there will not bee any drops on such a racing.

But yeah, I added some tests, just forgot to push them

@OlegDokuka OlegDokuka requested a review from simonbasle October 20, 2020 16:39
@pavelkuchin
Copy link

pavelkuchin commented Dec 7, 2020

Hi @simonbasle @OlegDokuka,

Are there plans to merge it any time soon?

@simonbasle simonbasle added this to the 3.4.2 milestone Dec 7, 2020
@simonbasle
Copy link
Contributor

Hi @simonbasle @OlegDokuka,

Are there plans to merge it any time soon?

Thanks for the ping. I need to review this again, with the added tests, it fell through the cracks.
Given that, it should be released in the January release I think.

In the meantime, if you think you were affected (from the description in this PR) it would be highly valuable for us to get your early feedback (by checking out the branch and building it locally to try to see if it improves things for you). Same for @Sage-Pierce, also @OlegDokuka already stated it didn't really fix the exact issue described in #2352

@pavelkuchin
Copy link

Hi @simonbasle @OlegDokuka,
Are there plans to merge it any time soon?

Thanks for the ping. I need to review this again, with the added tests, it fell through the cracks.
Given that, it should be released in the January release I think.

In the meantime, if you think you were affected (from the description in this PR) it would be highly valuable for us to get your early feedback (by checking out the branch and building it locally to try to see if it improves things for you). Same for @Sage-Pierce, also @OlegDokuka already stated it didn't really fix the exact issue described in #2352

Thanks @simonbasle! The issue we are having similar to #2138. We are using reactive-kafka consumer and processing kafka events stream using combination of groupBy/take and other operators. At some arbitrary moments the stream hangs up on production and based on logs it seems to be related to groupBy. We are trying to reproduce the issue in our end-to-end tests right now, once we can reproduce it in consistent manner I'll try the fix and provide feedback.

@pavelkuchin
Copy link

Hi! I've pulled the changes and applied to v3.4.1 of reactor-core. Unfortunately it doesn't fix issue described in #2138. I've also tried to run twoGroupsLongAsyncMergeHidden2 test @OlegDokuka added, and it never fails (on my env it has passed with the fix and without).

Signed-off-by: Oleh Dokuka <[email protected]>
@simonbasle simonbasle changed the base branch from master to 3.3.x December 9, 2020 14:18
@simonbasle simonbasle modified the milestones: 3.4.2, 3.3.13.RELEASE Dec 9, 2020
@simonbasle simonbasle added type/bug A general bug type/enhancement A general enhancement labels Dec 9, 2020
@simonbasle simonbasle changed the title fixes GroupBy overflow issue fix groupBy initial request off by one Dec 9, 2020
@simonbasle
Copy link
Contributor

@pavelkuchin thanks for trying it out nonetheless.
@OlegDokuka I have reviewed the PR and re-uploaded it with a 3.3.x base branch, looks good to me (although it is unrelated to the other groupBy issues currently open)

@simonbasle simonbasle changed the title fix groupBy initial request off by one Fix groupBy initial request off by one Dec 9, 2020
@simonbasle simonbasle merged commit 2ff31b3 into reactor:3.3.x Dec 9, 2020
@reactorbot
Copy link

@simonbasle this PR seems to have been merged on a maintenance branch, please ensure the change is merge-forwarded to intermediate maintenance branches and up to master 🙇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug A general bug type/enhancement A general enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants