… instead of trying shutdown first. EthScheduler executes a lot of tasks that wait for responses from the network and may have a significant number of tasks queued. Using shutdown would wait for all network responses and all queued tasks to complete before exiting, which almost always reaches the 2 minute timeout allowed before switching to shutdownNow. All tasks have to cope with being unexpectedly terminated (as would happen with a kill -9), so there's no reason to have this extra delay.
Feels a bit heavy-handed. My thought is that being better at cleanup should be our first approach: #841
It is heavy-handed, but Pantheon must be designed to handle being killed without any warning anyway, so making normal shutdown heavy-handed doesn't introduce any new requirements. It also frees us up to use blocking calls in our tasks, which would otherwise delay shutdown.
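To make the point about blocking calls concrete, here is a minimal sketch using only `java.util.concurrent`. It is not Pantheon's EthScheduler code: the class name `ForcedShutdownSketch`, the method name, and the latch standing in for a pending network response are all hypothetical. It shows that `shutdownNow()` interrupts a task stuck in a blocking call, so the executor terminates promptly instead of waiting for a response that may never arrive.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration; not taken from the Pantheon code base.
public class ForcedShutdownSketch {

  /** Returns true if the blocked task was interrupted and the executor terminated. */
  public static boolean blockedTaskIsInterrupted() {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    CountDownLatch started = new CountDownLatch(1);
    // Never counted down: stands in for a network response that does not arrive.
    CountDownLatch neverReleased = new CountDownLatch(1);
    AtomicBoolean interrupted = new AtomicBoolean(false);

    executor.execute(() -> {
      started.countDown();
      try {
        neverReleased.await(); // blocking call inside the task
      } catch (InterruptedException e) {
        interrupted.set(true); // the task copes with being terminated
      }
    });

    try {
      started.await(); // make sure the task is running before stopping
      executor.shutdownNow(); // interrupts the running task immediately
      boolean terminated = executor.awaitTermination(5, TimeUnit.SECONDS);
      return terminated && interrupted.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("interrupted and terminated: " + blockedTaskIsInterrupted());
  }
}
```

With plain `shutdown()` the same task would keep the executor alive until the latch was released, which is the delay this change is trying to avoid.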
Per your gist (https://gist.github.com/ajsutton/d8afc6fe7aeac94a2d82cfc98d6b8a6c) it looks like we are attempting to shut down in two separate threads. There are no more service threads in that stack so I wonder if we were actually shutting down in three threads and only one of the threads got notified.
Yeah I couldn't understand why that one was hanging. The |
@@ -55,9 +55,7 @@ public void shutdown_syncWorkerShutsDown() throws InterruptedException {
ethScheduler.stop();

assertThat(syncWorkerExecutor.isShutdown()).isTrue();
assertThat(syncWorkerExecutor.isTerminated()).isFalse();
Do we want to test for multiple tasks in the queue and assert they don't get executed?
Good idea. Added.
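The check discussed above could look roughly like the following sketch. It is not the test that was added to the PR: `QueuedTasksSketch` and `stopWithQueuedTasks` are hypothetical names, and a plain single-threaded executor stands in for the sync worker. One task holds the only worker thread on a latch, further tasks wait in the queue, and after `shutdownNow()` none of the queued tasks should have run.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not the test added in this PR.
public class QueuedTasksSketch {

  /** Returns {tasks handed back by shutdownNow, queued tasks that actually ran}. */
  public static int[] stopWithQueuedTasks(int queued) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    CountDownLatch started = new CountDownLatch(1);
    CountDownLatch block = new CountDownLatch(1); // never released
    AtomicInteger executedCount = new AtomicInteger();

    // Occupy the single worker thread so later tasks stay in the queue.
    executor.execute(() -> {
      started.countDown();
      try {
        block.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });

    for (int i = 0; i < queued; i++) {
      executor.execute(executedCount::incrementAndGet);
    }

    try {
      started.await();
      // Drains the queue and interrupts the blocked task.
      List<Runnable> neverRun = executor.shutdownNow();
      executor.awaitTermination(5, TimeUnit.SECONDS);
      return new int[] {neverRun.size(), executedCount.get()};
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    int[] result = stopWithQueuedTasks(3);
    System.out.println("returned unexecuted: " + result[0] + ", executed: " + result[1]);
  }
}
```

The blocking first task is what makes the assertion meaningful: without it, the worker could drain the queue before the stop call is reached.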
@@ -19,11 +19,9 @@
public class MockEthTask extends AbstractEthTask<Object> {

private boolean executed = false;
private CountDownLatch countdown;
If we are going to verify non-execution for forced shutdown we will need the unblockable tasks.
Turns out we needed it for EthSchedulerTest anyway - it just didn't need the countdown method.
PR description
EthScheduler executes a lot of tasks that wait for responses from the network and may have a significant number of tasks queued. Using shutdown would wait for all network responses and all queued tasks to complete before exiting, which almost always reaches the 2 minute timeout allowed before switching to shutdownNow. All tasks have to cope with being unexpectedly terminated (as would happen with a kill -9), so there's no reason to have this extra delay.
PantheonCommand now also shuts down Log4J correctly to ensure log messages aren't lost. We disable Log4J's shutdown handler via a system property in our start script (which allows logging to work in our shutdown hook), so we need to do this manually.