-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Docker Desktop on Windows: 'docker context inspect' is very slow #2420
Comments
It may not be limited to just |
Yep, I have confirmed that |
For some context, my machine is on VPN, a good 3000 miles away from the corporate network. So, any network calls that cross that gap are gonna end up being pretty slow--I'm sure that my observed 7-8 seconds is well above average. I looked up the global averages in our telemetry, |
@silvin-lubecki @gtardif looks like the issue is still not completely fixed. I have built the binary from the C:\code>\tools\timecmd.bat \scratch\docker-windows-amd64.exe context inspect
[
{
"Name": "default",
"Metadata": {
"StackOrchestrator": "swarm"
},
"Endpoints": {
"docker": {
"Host": "npipe:////./pipe/docker_engine",
"SkipTLSVerify": false
}
},
"TLSMaterial": {},
"Storage": {
"MetadataPath": "\u003cIN MEMORY\u003e",
"TLSPath": "\u003cIN MEMORY\u003e"
}
}
]
command took 0:0:2.41 (2.41s total)
C:\code>\tools\timecmd.bat \scratch\docker-windows-amd64.exe context inspect
[
{
"Name": "default",
"Metadata": {
"StackOrchestrator": "swarm"
},
"Endpoints": {
"docker": {
"Host": "npipe:////./pipe/docker_engine",
"SkipTLSVerify": false
}
},
"TLSMaterial": {},
"Storage": {
"MetadataPath": "\u003cIN MEMORY\u003e",
"TLSPath": "\u003cIN MEMORY\u003e"
}
}
]
command took 0:0:0.10 (0.10s total) It is enough to wait about 30 seconds between The docker.exe binary that comes with Docker Desktop for Windows 2.2.0.5 behaves the same way BTW. |
@karolz-ms curious; where do those MetadataPaths ( FWIW, with #2424 applied, I tried reproducing in a Linux container (I was curious if the difference was between docker context create faulty --docker host=tcp://42.42.42.41:4242
docker context create local --docker host=unix:///var/run/docker.sock
docker context use faulty
time docker context ls
NAME DESCRIPTION DOCKER ENDPOINT KUBERNETES ENDPOINT ORCHESTRATOR
default Current DOCKER_HOST based configuration unix:///var/run/docker.sock swarm
faulty * tcp://42.42.42.41:4242
local unix:///var/run/docker.sock
real 0m 0.33s
user 0m 0.04s
sys 0m 0.04s
time docker context inspect local
[
{
"Name": "local",
"Metadata": {},
"Endpoints": {
"docker": {
"Host": "unix:///var/run/docker.sock",
"SkipTLSVerify": false
}
},
"TLSMaterial": {},
"Storage": {
"MetadataPath": "/root/.docker/contexts/meta/25bf8e1a2393f1108d37029b3df5593236c755742ec93465bbafa9b290bddcf6",
"TLSPath": "/root/.docker/contexts/tls/25bf8e1a2393f1108d37029b3df5593236c755742ec93465bbafa9b290bddcf6"
}
}
]
real 0m 0.36s
user 0m 0.05s
sys 0m 0.03s
time docker context inspect faulty
[
{
"Name": "faulty",
"Metadata": {},
"Endpoints": {
"docker": {
"Host": "tcp://42.42.42.41:4242",
"SkipTLSVerify": false
}
},
"TLSMaterial": {},
"Storage": {
"MetadataPath": "/root/.docker/contexts/meta/9ec1600e60d2aba399d84db8163ec37fc67cd60096cb32a66f0a52280797ee95",
"TLSPath": "/root/.docker/contexts/tls/9ec1600e60d2aba399d84db8163ec37fc67cd60096cb32a66f0a52280797ee95"
}
}
]
real 0m 0.34s
user 0m 0.03s
sys 0m 0.05s |
Oh, never mind; looks like the cli/cli/command/defaultcontextstore.go Line 212 in 0f337f1
Same result when using / inspecting the default docker context use default
time docker context inspect
[
{
"Name": "default",
"Metadata": {
"StackOrchestrator": "swarm"
},
"Endpoints": {
"docker": {
"Host": "unix:///var/run/docker.sock",
"SkipTLSVerify": false
}
},
"TLSMaterial": {},
"Storage": {
"MetadataPath": "\u003cIN MEMORY\u003e",
"TLSPath": "\u003cIN MEMORY\u003e"
}
}
]
real 0m 0.29s
user 0m 0.04s
sys 0m 0.05s |
I can still consistently get long times on Windows with the current build. |
@thaJeztah @gtardif any news on this issue? You know, we are a Windows shop here |
The NetGetUserInfo call probably comes for getting the HomeDir of the user... we recently changed the way we get the homedir to the golang base library implementation. |
@simonferquel we went through a couple of changes there; #2101, which was later (partially?) reverted in (#2111); could you check which is the problematic one? I also have #2161 pending review, which should at least not call it by default |
Hmm, maybe we should ask for Microsoft guys here what is the correct (and most efficient) way to get the home dir on Windows using win32 APIs and plug homedir.Get onto that for Windows ? |
The most reliable way is to call |
Somebody overwriting that environment variable is asking for so much trouble that I think it's an acceptable risk to use it. 😄 |
So Lines 30 to 32 in d05846d
Lines 108 to 112 in d05846d
The
func UserHomeDir() (string, error) {
env, enverr := "HOME", "$HOME"
switch runtime.GOOS {
case "windows":
env, enverr = "USERPROFILE", "%userprofile%"
case "plan9":
env, enverr = "home", "$home"
}
if v := Getenv(env); v != "" {
return v, nil
} |
Hmm so do we have tooling to see the go stacktrace calling the problematic win32 API? |
I will try to find out and report back here |
@thaJeztah @simonferquel I was able to find the issue: it is in the These lookups causes problems on some versions of Windows OS, but much more importantly (and this is what this bug is about), this way of obtaining current user information for logging purposes is just too damn slow for a CLI utility, and probably not really necessary either. Hope this helps! |
How about pulling this fix? |
Great find; It's vendored as a dependency of the |
I see the fix was back ported to v1.x; kubernetes/klog#124, but is not yet in a release; kubernetes/klog@v1.0.0...release-1.x I may have one more suggestion for that fix; think it would make sense to lazily look up the username, given that on Linux looking up the user may also make network connections for external users (if cgo is enabled and the @dims any chance on tagging a v1.0.1 release? |
Actually perhaps it would be good to have a change in Go itself, and add support for the @karolz-ms perhaps there's interest from Microsoft's side to contribute such a change to Golang? (looking at the PR you linked, it looks like |
opened kubernetes/klog#143 to perform userName lookup lazily instead of on |
@thaJeztah Ack on 143, let me try to get some reviews on that first. |
@thaJeztah thanks for quick reply, much appreciated
This sounds like a good idea, although such work would be outside of my area of expertise. I will ask around, but do you happen to know about any Microsoft people who did contribute to Golang runtime in the past? |
Hmmm, no, not from the top of my head. I made a contribution myself once, but their flow is not a regular "github" flow, so I need to re-read the instructions again 😅. In this case, it'd probably be good to look if a reasonable "fallback" exists (that's generic enough), which probably should be done by someone with a better understanding of the Windows mechanisms than I have |
This should be fixed through #2516 |
There are definitely substantial improvements in the latest build from
Globally, the average for Thanks everyone! This is a huge help for us! |
Thanks @bwateratmsft!
The We're aware of this, but (admittedly) haven't come round to design a better approach (I was thinking of having an optional query parameter to limit the returned information to specific parts, but ideally the struct that's returned would also be more structured). Is there specific information that VSCode uses from the
Thanks! Erm, so it turns out there's a regression (#2533) related to the first change (#2424), I think that fix was not the "main contributor" to the |
That makes sense. I looked at where we call
Only the first of those three has any significant usage. We're querying the Docker server OSType, indirectly to find out the container OS, which we use to decide whether we should launch Curiously, Dockerode takes only ~10-20 ms to call that endpoint. Any ideas on the discrepancy? |
…ng/glog glog has the same issue as k8s.io/klog, and is calling `user.Current()` inside an `init()`; see kubernetes/klog@466fbb6 Calling `user.Current()` on Windows can result in remove connections being made to get the user's information, which can be a heavy call. See docker#2420 glog was only used in a single location in compose-on-kubernetes, so we may as well remove it. compose-on-kubernetes has been archived, so (for now) vendoring from my fork, but we should consider to temporarily un-archive it, and get this fix in upstream. Signed-off-by: Sebastiaan van Stijn <[email protected]>
…ng/glog glog has the same issue as k8s.io/klog, and is calling `user.Current()` inside an `init()`; see kubernetes/klog@466fbb6 Calling `user.Current()` on Windows can result in remove connections being made to get the user's information, which can be a heavy call. See docker#2420 glog was only used in a single location in compose-on-kubernetes, so we may as well remove it. compose-on-kubernetes has been archived, so (for now) vendoring from my fork, but we should consider to temporarily un-archive it, and get this fix in upstream. Signed-off-by: Sebastiaan van Stijn <[email protected]>
…ng/glog glog has the same issue as k8s.io/klog, and is calling `user.Current()` inside an `init()`; see kubernetes/klog@466fbb6 Calling `user.Current()` on Windows can result in remove connections being made to get the user's information, which can be a heavy call. See docker#2420 glog was only used in a single location in compose-on-kubernetes, so we may as well remove it. compose-on-kubernetes has been archived, so (for now) vendoring from my fork, but we should consider to temporarily un-archive it, and get this fix in upstream. Signed-off-by: Sebastiaan van Stijn <[email protected]>
…ng/glog glog has the same issue as k8s.io/klog, and is calling `user.Current()` inside an `init()`; see kubernetes/klog@466fbb6 Calling `user.Current()` on Windows can result in remove connections being made to get the user's information, which can be a heavy call. See docker#2420 glog was only used in a single location in compose-on-kubernetes, so we may as well remove it. Signed-off-by: Sebastiaan van Stijn <[email protected]>
…ang/glog glog has the same issue as k8s.io/klog, and is calling `user.Current()` inside an `init()`; see kubernetes/klog@466fbb6 Calling `user.Current()` on Windows can result in remove connections being made to get the user's information, which can be a heavy call. See docker#2420 glog was only used in a single location in compose-on-kubernetes, so we may as well remove it. Signed-off-by: Sebastiaan van Stijn <[email protected]>
…ang/glog glog has the same issue as k8s.io/klog, and is calling `user.Current()` inside an `init()`; see kubernetes/klog@466fbb6 Calling `user.Current()` on Windows can result in remove connections being made to get the user's information, which can be a heavy call. See docker#2420 glog was only used in a single location in compose-on-kubernetes, so we may as well remove it. Signed-off-by: Sebastiaan van Stijn <[email protected]>
…ng/glog glog has the same issue as k8s.io/klog, and is calling `user.Current()` inside an `init()`; see kubernetes/klog@466fbb6 Calling `user.Current()` on Windows can result in remove connections being made to get the user's information, which can be a heavy call. See docker#2420 glog was only used in a single location in compose-on-kubernetes, so we may as well remove it. Signed-off-by: Sebastiaan van Stijn <[email protected]>
Steps to reproduce the issue:
docker context inspect
Describe the results you received:
It consistently takes more than 1 second to run this command, often more than 2 seconds.
Describe the results you expected:
Should take 100 ms or less. The default context, pointing to Docker Desktop engine, is the only context on the machine and it should be really straightforward to "inspect" it.
Additional information you deem important (e.g. issue happens only occasionally):
I took a trace (available upon request) and noticed that the majority of time is spent waiting on results from NetUserGetInfo Windows API. On my machine this API is called for a default Docker Desktop context inspection, which has no user information. Ultimately this accounts for more than 2 seconds of latency (for that run of the command)
May be related to #1739
Output of
docker version
:Output of
docker info
:Additional environment details (AWS, VirtualBox, physical, etc.):
The text was updated successfully, but these errors were encountered: