-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: long-running queries shouldn't block the main event loop #7420
Conversation
@confluentinc It looks like @vvcephei just signed our Contributor License Agreement. 👍 Always at your service, clabot |
@@ -296,6 +310,30 @@ public void tearDown() { | |||
REST_APP.getServiceContext().close(); | |||
} | |||
|
|||
@Test | |||
public void shouldStreamMultiplePushQueries() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works with and without my patch, but it doesn't hurt to have a test for it.
@@ -958,29 +996,24 @@ public void shouldListTopics() throws Exception { | |||
} | |||
|
|||
@Test | |||
public void shouldListQueries() { | |||
public void shouldListQueries() throws ExecutionException, InterruptedException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was flake-prone, since it assumes that it could serialize itself after other tests (i.e., wait until all other tests' queries have stopped) and that other tests wouldn't start new queries during its execution.
There doesn't seem to be a reason for this, so I changed the test just to verify the thing it wanted to verify, regardless of what other queries are running.
/** | ||
* This integration test is displaced from the rest-client package | ||
* to make use of utilities that are not available there. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty obnoxious.
The integration test needs to start a REST server, but in the rest-client test module, we don't have a dependency on the rest-server test module. And we can't add one because rest-server test module already depends on rest-client main module.
I can think of two solutions that are potentially better than putting it here: create a new "integration-test" module that depends on all the other modules, or relocate this integration test to the rest-server test module.
} | ||
}, vcf); | ||
}, | ||
false /*if this is true, worker execution blocks the main event loop*/, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of lines changed because I fixed the indentation, but this is what actually changed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems from the documentation that one worker task can block another if they are invoked from the same context. Not sure if that's what you're describing in the description (though it certainly would cause the bug referenced). Anyway, in our world, worker invocations are all independent.
Hey @vvcephei, the PR looks good. Did you look into the other places where
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. Just look at the other 2 files that call executeBlocking() to see if they need the false too.
Thanks, @spena ! I also saw those, but I had enough trouble with these integration tests that I lost my appetite to do it all in one chunk. I think I'll file some follow-on tickets to try and produce blocking in those contexts and fix it there. My instinct says that we probably never want to block the main event loop on the worker pool. Otherwise, why would we bother delegating to the worker pool? If you think it's ok to just change those other two usages without tests, I can certainly do that in this PR. Thanks, |
This reverts commit 3ca0294.
@vvcephei I think it's ok to change it without tests. We should not block the event loop as you said. |
Thanks, @spena ! Since I have a green build right now, I'll go ahead and capitalize on it by merging this PR. I'll follow up right away with a separate PR to change those other two methods. Thanks for the reviews, @spena and @AlanConfluent ! |
Description
Currently, due to #7358, long-running queries block the main server event loop, even though they execute on the worker pool instead of the event loop pool.
This patch corrects this flaw.
Testing done
Added integration tests to reproduce the bug (#7358).
Note that the newer HTTP2 API did not have the bug, but the integration test for the older API does hang and time out without the fix in place.
Reviewer checklist