Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add config parameter to change stop timeout during daemon shutdown #23036

Merged

Conversation

yongtang
Copy link
Member

- What I did

This fix tries to add a daemon config parameter --shutdown-timeout that specifies the timeout value to stop containers gracefully (before SIGKILL). The default value is 15s.

- How I did it

The --shutdown-timeout parameter is added to daemon options and config file. It will also be updated during daemon reload.

- How to verify it

Additional test cases have been added to cover the change.

- Description for the changelog

Add --shutdown-timeout to dockerd to specify the timeout (in seconds) to stop containers gracefully before daemon exit.

- A picture of a cute animal (not mandatory but encouraged)

This fix is related to #22471 and #22566.

Signed-off-by: Yong Tang [email protected]

@thaJeztah
Copy link
Member

Ok, we're okay with adding this, so moving to code review.

can you rebase?

@yongtang yongtang force-pushed the 22471-daemon-shutdown-timeout-daemon branch from bd9183b to 023477b Compare July 15, 2016 23:45
@yongtang
Copy link
Member Author

Thanks @thaJeztah the pull request has been rebased.

@thaJeztah
Copy link
Member

sorry @yongtang, needs another rebase

@yongtang yongtang force-pushed the 22471-daemon-shutdown-timeout-daemon branch from 023477b to b39cb13 Compare July 29, 2016 14:13
@yongtang
Copy link
Member Author

Thanks @thaJeztah, the PR has been rebased.

@yongtang yongtang force-pushed the 22471-daemon-shutdown-timeout-daemon branch from b39cb13 to 80b0923 Compare August 13, 2016 21:50
@vdemeester
Copy link
Member

LGTM 🐸

@@ -121,6 +125,10 @@ type CommonConfig struct {
// may take place at a time for each push.
MaxConcurrentUploads *int `json:"max-concurrent-uploads,omitempty"`

// ShutdownTimeout is the timeout value (in seconds) the daemon will wait for the container
// to stop when daemon is being shutdown
ShutdownTimeout int `json:"shutdown-timeout,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time.Duration ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cpuguy83 One potential issue with time.Duration is that the default marshal/unmarshal of time.Duration is nano-seconds (cannot parse a string like "5s"). In other words, if user wants to specify 5 seconds, it will be "shutdown-timeout": 5000000000, which might not be ideal.

Maybe we could use a customized type, e.g.,

type Duration struct {
    time.Duration
}
func (d Duration) MarshalJSON() ([]byte, error) {

so that user could specify 5s instead of 5000000000?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect this to be handled on the client, not the API (even though I'm sure we probably do this elsewhere...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cpuguy83 The --shutdown-timeout is the global timeout and is handled in dockerd. It could be unmarshaled from --config-file=/etc/docker/daemon.json.

Another PR (#22566) is the per-container timeout which could be handled in client.

Maybe for this PR we use seconds , and for the other PR we could change the per-container timeout (in #22566) to time.Duration?

@yongtang yongtang force-pushed the 22471-daemon-shutdown-timeout-daemon branch from 80b0923 to d4111ab Compare August 25, 2016 19:08
@vdemeester
Copy link
Member

@yongtang needs a rebase 😓

@yongtang yongtang force-pushed the 22471-daemon-shutdown-timeout-daemon branch from d4111ab to bbc01a1 Compare September 4, 2016 22:37
@yongtang
Copy link
Member Author

yongtang commented Sep 4, 2016

Thanks @vdemeester. The pull request has been rebased.

@yongtang yongtang force-pushed the 22471-daemon-shutdown-timeout-daemon branch 2 times, most recently from 30da883 to 4ddbe01 Compare September 19, 2016 23:35
@vdemeester
Copy link
Member

LGTM 🐸

Copy link
Contributor

@mlaventure mlaventure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of nits.

LGTM otherwise.

Also, requires a rebase.

@@ -730,7 +736,7 @@ func (daemon *Daemon) Shutdown() error {
}

if daemon.containers != nil {
logrus.Debug("starting clean shutdown of all containers...")
logrus.Debugf("starting clean shutdown of all containers in %d seconds...", daemon.configStore.ShutdownTimeout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wording feels to me like we are only going to start shutting down the containers after ShutdownTimeout seconds have expired.

Maybe use something like start clean shutdown of all containers with a %d seconds timeout...?

cc @thaJeztah for better wording 😅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mlaventure @thaJeztah . I will temporarily settle with start clean shutdown of all containers with a %d seconds timeout... but let me know if there are other suggestions.

@@ -1125,6 +1126,7 @@ This is a full example of the allowed configuration options on Linux:
"cluster-advertise": "",
"max-concurrent-downloads": 3,
"max-concurrent-uploads": 5,
"shutdown-timeout": 15,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alignment issue

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mlaventure. Done.

@yongtang yongtang force-pushed the 22471-daemon-shutdown-timeout-daemon branch from 4ddbe01 to d991cf6 Compare October 17, 2016 20:30
@yongtang
Copy link
Member Author

The PR has been rebased and updated. Please let me know if there are other issues.

@mlaventure
Copy link
Contributor

LGTM if still 💚

ping @vdemeester @cpuguy83 @thaJeztah

@@ -64,6 +64,7 @@ Options:
--raw-logs Full timestamps without ANSI coloring
--registry-mirror value Preferred Docker registry mirror (default [])
--selinux-enabled Enable selinux support
--shutdown-timeout=15 Set the shutdown timeout value in seconds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't forget to update the man page as well https://github.com/docker/docker/blob/master/man/dockerd.8.md

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @thaJeztah. The man page has been updated.

@@ -1118,6 +1119,7 @@ This is a full example of the allowed configuration options on Linux:
"cluster-advertise": "",
"max-concurrent-downloads": 3,
"max-concurrent-uploads": 5,
"shutdown-timeout": 15,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only supported on Linux, or Windows as well? If so, the "Windows" example needs to be updated as well 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @thaJeztah. The docs has been updated.

@thaJeztah
Copy link
Member

left some docs notes already; @mlaventure IIRC, systemd will kill all processes if the systemd (unit-file) timeout is exceeded; do you think we should mention that in the documentation?

Also, this'll need changes in the bash and zsh completion scripts @albers @sdurrheimer

@mlaventure
Copy link
Contributor

@thaJeztah I wouldn't make it systemd specific. Most service manager would have a similar (configurable) behavior.

Mentioning that their service manager may kill docker regardless if the daemon is stopped via it wouldn't hurt though.

@yongtang yongtang force-pushed the 22471-daemon-shutdown-timeout-daemon branch from d991cf6 to 3945662 Compare October 18, 2016 00:44
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

ping @vdemeester PTAL

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs LGTM 🐸

@vdemeester
Copy link
Member

@yongtang needs a rebase though 👼

This fix tries to add a daemon config parameter `--shutdown-timeout`
that specifies the timeout value to stop containers gracefully
(before SIGKILL). The default value is 15s.

The `--shutdown-timeout` parameter is added to daemon options and
config file. It will also be updated during daemon reload.

Additional test cases have been added to cover the change.

This fix fixes moby#22471.

Signed-off-by: Yong Tang <[email protected]>
@yongtang
Copy link
Member Author

Thanks @vdemeester. The PR has been rebased and all tests passed.

@mlaventure
Copy link
Contributor

ping @thaJeztah @vdemeester final look before merging? :)

@vdemeester
Copy link
Member

Still LGTM 👼
merging 😝

@vdemeester vdemeester merged commit 4a4f028 into moby:master Oct 19, 2016
@yongtang
Copy link
Member Author

Thanks all for the review and help! 👍

@yongtang yongtang deleted the 22471-daemon-shutdown-timeout-daemon branch October 19, 2016 16:54
@albers
Copy link
Member

albers commented Oct 19, 2016

Added bash completion: #27550

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants