-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement workspace start cancellation #3701
Conversation
Build # 1611 - FAILED Please check console output at https://ci.codenvycorp.com/job/che-pullrequests-build/1611/ to view the results. |
* } | ||
* <pre>{@code | ||
* StripedLocks stripedLocks = new StripedLocks(16); | ||
* try (CloseableLock lock = stripedLocks.writeLock(myKey)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change CloseableLock
to UnLocker
Build # 1614 - FAILED Please check console output at https://ci.codenvycorp.com/job/che-pullrequests-build/1614/ to view the results. |
Build # 1619 - FAILED Please check console output at https://ci.codenvycorp.com/job/che-pullrequests-build/1619/ to view the results. |
# operations that require asynchronous execution e.g. starting/stopping/snapshotting | ||
|
||
# possible values are 'fixed', 'cached' | ||
che.workspace.pool.type=fixed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to overwrite these values from che.env? If not then why do we need these properties here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If not then why do we need these properties here?
I didn't do anything for overwriting them from che.env.
I have the opposite question, if i put the properties into this place i probably mean that this is a part of workspace server configuration so WHY do i have to put them somewhere else?
From you question it looks like either property is present in che.properties and somewhere else or it's missing from che.properties.
The properties are low-level configuration for workspace threads pool, they allow users to provide more dynamic configuration of starting/stopping/snapshotting q.
So where should i put these properties?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@evoevodin has taken the right approach.
The way it works is that anything in che.properties can be overridden in the che.env
with a naming convention. So in this case, an admin can enter CHE_WORKSPACE_POOL_TYPE=cached
and it will override the property.
However, we are only adding entries in the default che.env
which would be entries that we commonly expect an administrator to need to edit. For all other entries, which are more system operational internal entries, we just leave them in the che.properties and give admins a reference to it for investigation if they want.
I would include this property as an internal system property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
@@ -86,7 +86,6 @@ public void shouldNotWriteIntoSubConsumersAfterClosingCompositeConsumer() throws | |||
public Object[][] subConsumersExceptions() { | |||
return new Throwable[][] { | |||
{new ConsumerAlreadyClosedException("Error")}, | |||
{new ClosedByInterruptException()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you changed this would you be able to add tests also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
@@ -28,8 +28,9 @@ | |||
* <p>Workspace becomes starting only if it was {@link #STOPPED}. | |||
* The status map: | |||
* <pre> | |||
* STOPPED -> <b>STARTING</b> -> RUNNING (normal behaviour) | |||
* STOPPED -> <b>STARTING</b> -> STOPPED (failed to start) | |||
* STOPPED -> <b>STARTING</b> -> RUNNING (normal behaviour) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I missing something or this docs should include snapshotting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a definition for the STARTING status, there is now way you can go from STARTING to SNAPSHOTTING workspace, and only possible options are described in status doc, which are:
STOPPED -> STARTING -> RUNNING (regular flow)
STOPPED -> STARTING -> STOPPING (start cancellation)
STOPPED -> STARTING -> STOPPED (failed to start)
@Override | ||
void close(); | ||
default void close() { unlock(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to have unlock method inside of close if it is the same thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that it makes more sense to implement & use Unlocker.unlock
instead of Unlocker.close
.
Build success. https://ci.codenvycorp.com/job/che-pullrequests-build/1630/ |
Build success. https://ci.codenvycorp.com/job/che-pullrequests-build/1638/ |
f9ff29f
to
f87be46
Compare
Build success. https://ci.codenvycorp.com/job/che-pullrequests-build/1646/ |
Allows to cancel workspace start on the API level.
Allows to do things described by #3249, fixes #3742.
Implementation is based on deterministic points of time when components can check whether starting process is interrupted and react appropriately. The components like
WorkspaceRuntimes
&CheEnvironmentEngine
are refactored to respect cancellation of starting thread. The other components are refactored to propagate interruption correctly, e.g.DockerConnector
didn't propagate interrupted flag. So when we start the workspace starting task is saved and we keep it until workspace is started. If the task is canceled within thread interruption e.g.future.cancel(true)
start task tries to stop the execution and cleanup corresponding resources. The caller of start(e.g.WorkspaceManager
) will receiveEnvironmentStartInterruptedException
and may do whatever it needs to, for now workspace manager logs the info message that workspace start was directly interrupted and considers it as a normal behaviour.WorkspaceRuntimes#startAsync
now returnsCompletableFuture<WorkspaceRuntime>
instead of regular
Future
which allows to handle start result in the same thread and reduces amount of threads needed for start. AlsoWorkspaceRuntimes
doesn't exposeRuntimeDescriptor
anymore, it provides methods for getting & filling workspace with runtime information instead:WorkspaceRuntimes#getWorkspaces
is removed as it's not used and it dealswith stale workspace statuses which is not good.
Agents now launched after a single machine is launched that allows to fail
faster if there is an error that doesn't allow to start the machine. In current version
of che agents are launched after all the machines are launched.
Also there are new properties for workspace pool adjustment, allows to configure
workspace starting queue dynamically. The properties are:
che.workspace.pool.type
possible values are fixed, cachedche.workspace.pool.exact_size
optional. This property is ignored when pool type is different from fixed. Configures the exact size of the pool, if it's set multiplier property is ignored.If this property is not set(0, < 0, NULL) then pool sized to number of cores, it can be modified
within multiplier property
che.workspace.pool.cores_multiplier
optional. This property is ignored when pool type is different from fixed or exact pool size is set. If it's set the pool size will beN_CORES * multiplier